ElasticsearchDocumentStore

Use an Elasticsearch database with Haystack.

ElasticsearchDocumentStore is a good fit if you want to compare different retrieval options (dense vs. sparse) and need a smooth transition from proof of concept (PoC) to production.

It supports approximate nearest neighbor (ANN) search.

Initialization

Install Elasticsearch and then start an instance. Haystack 2.0 supports Elasticsearch 8.

If you have Docker set up, we recommend pulling the Docker image and running it.

docker pull docker.elastic.co/elasticsearch/elasticsearch:8.11.1
docker run -p 9200:9200 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1024m -Xmx1024m" -e "xpack.security.enabled=false" docker.elastic.co/elasticsearch/elasticsearch:8.11.1

As an alternative, you can go to the Elasticsearch integration's GitHub repository and start a Docker container running Elasticsearch using the provided docker-compose.yml:

docker compose up
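
Whichever way you start the container, Elasticsearch can take a little while before it accepts requests. The following is a minimal sketch of a readiness check using only the Python standard library. It assumes the instance listens on http://localhost:9200 with security disabled, as in the docker run command above. The function name elasticsearch_is_up is hypothetical and not part of Haystack or the Elasticsearch client.

```python
import json
import urllib.error
import urllib.request


def elasticsearch_is_up(url: str = "http://localhost:9200", timeout: float = 2.0) -> bool:
    """Return True if the root endpoint answers with JSON that contains a "version" key.

    Assumption: security is disabled, so no credentials or TLS setup are needed.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            info = json.load(response)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, malformed URL, or a non-JSON response body
        return False
    return isinstance(info, dict) and "version" in info


if __name__ == "__main__":
    print("Elasticsearch reachable:", elasticsearch_is_up())
```

If the check returns False right after starting the container, waiting a few seconds and trying again is usually enough.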

Once you have a running Elasticsearch instance, install the elasticsearch-haystack integration:

pip install elasticsearch-haystack

Then, initialize an ElasticsearchDocumentStore object that's connected to the Elasticsearch instance and write Documents to it:

from haystack_integrations.document_stores.elasticsearch import ElasticsearchDocumentStore
from haystack import Document

document_store = ElasticsearchDocumentStore(hosts="http://localhost:9200")
document_store.write_documents([
    Document(content="This is first"),
    Document(content="This is second")
    ])
print(document_store.count_documents())

Supported Retrievers

ElasticsearchBM25Retriever: A keyword-based Retriever that fetches Documents matching a query from the Document Store.

ElasticsearchEmbeddingRetriever: Compares the query and Document embeddings and fetches the Documents most relevant to the query.
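
To make the difference between the two Retrievers more concrete, here is a toy sketch of sparse (keyword) versus dense (embedding) ranking in plain Python. It is not how Elasticsearch or Haystack implement retrieval: the keyword score is a simple term count standing in for BM25, and the two-dimensional vectors are hand-written placeholders for embeddings that a real embedding model would produce. All function names are hypothetical.

```python
import math


def keyword_score(query, text):
    """Count how many tokens in the text match a query term (crude stand-in for BM25)."""
    terms = set(query.lower().split())
    return sum(1 for token in text.lower().split() if token in terms)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy corpus: (content, embedding). The embeddings are made up for illustration.
docs = [
    ("This is first", [1.0, 0.0]),
    ("This is second", [0.0, 1.0]),
]


def rank_by_keywords(query):
    """Sparse-style ranking: order documents by keyword overlap with the query text."""
    return sorted(docs, key=lambda d: keyword_score(query, d[0]), reverse=True)


def rank_by_embedding(query_embedding):
    """Dense-style ranking: order documents by similarity to the query embedding."""
    return sorted(docs, key=lambda d: cosine_similarity(query_embedding, d[1]), reverse=True)


if __name__ == "__main__":
    print(rank_by_keywords("first")[0][0])
    print(rank_by_embedding([0.1, 0.9])[0][0])
```

The practical difference to keep in mind is the input: keyword-based retrieval works directly on the query text, while embedding-based retrieval needs the query converted to a vector first, and the Documents in the store need embeddings too.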