Version: 2.18

ElasticsearchDocumentStore

Use an Elasticsearch database with Haystack.

API reference: Elasticsearch
GitHub link: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/elasticsearch

ElasticsearchDocumentStore is a good fit if you want to compare the performance of different retrieval options (dense vs. sparse) and need a smooth transition from proof of concept (PoC) to production.

It supports approximate nearest neighbor (ANN) search.

Initialization

Install Elasticsearch and then start an instance. Haystack supports Elasticsearch 8.

If you have Docker set up, we recommend pulling the Docker image and running it.

shell
docker pull docker.elastic.co/elasticsearch/elasticsearch:8.11.1
docker run -p 9200:9200 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1024m -Xmx1024m" -e "xpack.security.enabled=false" docker.elastic.co/elasticsearch/elasticsearch:8.11.1

As an alternative, you can go to the Elasticsearch integration repository on GitHub and start a Docker container running Elasticsearch with the provided docker-compose.yml:

shell
docker compose up
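
Whichever way you start the container, Elasticsearch can take a little while before it accepts connections. The following is a minimal sketch of a readiness check, assuming the instance listens on http://localhost:9200 with security disabled, as in the docker run command above. The helper names `fetch_json` and `wait_for_elasticsearch` are hypothetical and not part of Haystack or Elasticsearch; only the Python standard library is used.

```python
import json
import time
import urllib.request


def fetch_json(url: str) -> dict:
    """Fetch a URL and parse the response body as JSON."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.load(response)


def wait_for_elasticsearch(
    url: str = "http://localhost:9200",  # assumed default from the docker run command
    attempts: int = 30,
    delay: float = 2.0,
    fetch=fetch_json,  # injectable so the helper can be exercised without a server
) -> str:
    """Poll the root endpoint until it answers, then return the reported version."""
    for _ in range(attempts):
        try:
            info = fetch(url)
            # The root endpoint reports the server version under version.number.
            return info["version"]["number"]
        except (OSError, ValueError, KeyError):
            # Connection refused, invalid JSON, or an unexpected payload: retry.
            time.sleep(delay)
    raise TimeoutError(f"Elasticsearch at {url} did not respond after {attempts} attempts")
```

Calling `wait_for_elasticsearch()` with the defaults blocks for up to about a minute and returns the version string reported by the instance, or raises `TimeoutError` if the instance never answers.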

Once you have a running Elasticsearch instance, install the elasticsearch-haystack integration:

shell
pip install elasticsearch-haystack

Then, initialize an ElasticsearchDocumentStore object connected to the Elasticsearch instance and write documents to it:

python
from haystack_integrations.document_stores.elasticsearch import ElasticsearchDocumentStore
from haystack import Document

document_store = ElasticsearchDocumentStore(hosts="http://localhost:9200")
document_store.write_documents([
    Document(content="This is first"),
    Document(content="This is second"),
])
print(document_store.count_documents())

Supported Retrievers

ElasticsearchBM25Retriever: A keyword-based Retriever that fetches documents matching a query from the Document Store.

ElasticsearchEmbeddingRetriever: Compares the query and document embeddings and fetches the documents most relevant to the query.
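
To make the difference between the two retrieval styles concrete, here is a conceptual sketch in plain Python. It is not how Elasticsearch or the Haystack Retrievers are implemented: it assumes the textbook BM25 formula with whitespace tokenization for the keyword side, and cosine similarity (one common choice) for the embedding side. The function names `bm25_scores` and `cosine_similarity` are hypothetical, and the toy vectors stand in for real embeddings.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.2, b: float = 0.75) -> list[float]:
    """Score each document against the query with a textbook BM25 formula."""
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Fall back to 1.0 so empty documents do not cause a division by zero.
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs or 1.0
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            # Document frequency: how many documents contain the term.
            df = sum(1 for other in tokenized if term in other)
            if df == 0:
                continue
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            tf = counts[term]
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Sparse: only the document containing the query keyword gets a non-zero score.
keyword_scores = bm25_scores("second", ["This is first", "This is second"])

# Dense: documents are ranked by how close their (toy) embeddings are to the query's.
query_embedding = [1.0, 0.0]
embedding_scores = [cosine_similarity(query_embedding, e) for e in ([0.0, 1.0], [1.0, 0.0])]
```

In both cases the second document ranks first, but for different reasons: the keyword score depends on exact term overlap, while the embedding score depends only on vector proximity and can match documents that share no words with the query.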