Version: 2.18

ElasticsearchDocumentStore

Use an Elasticsearch database with Haystack.


API reference	Elasticsearch
GitHub link	https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/elasticsearch

ElasticsearchDocumentStore is a good fit if you want to compare the performance of different retrieval options (dense vs. sparse) and aim for a smooth transition from proof of concept (PoC) to production.

It supports approximate nearest neighbor (ANN) search.

Initialization

Install Elasticsearch and then start an instance. Haystack supports Elasticsearch 8.

If you have Docker set up, we recommend pulling the Docker image and running it.

shell

docker pull docker.elastic.co/elasticsearch/elasticsearch:8.11.1
docker run -p 9200:9200 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1024m -Xmx1024m" -e "xpack.security.enabled=false" docker.elastic.co/elasticsearch/elasticsearch:8.11.1

Alternatively, you can use the docker-compose.yml provided in the Elasticsearch integration's GitHub repository to start a Docker container running Elasticsearch:

shell

docker compose up
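
Before moving on, it can help to confirm that the instance is actually reachable. The following is a minimal sketch that uses only the Python standard library. It assumes the instance listens on http://localhost:9200 over plain HTTP with security disabled, and that the root endpoint returns JSON containing a version.number field. The helper names parse_version and elasticsearch_version are made up for this illustration and are not part of Haystack.

```python
import json
import urllib.request


def parse_version(payload: bytes) -> str:
    """Extract version.number from the JSON served at the cluster root URL."""
    info = json.loads(payload)
    return info["version"]["number"]


def elasticsearch_version(url: str = "http://localhost:9200", timeout: float = 2.0):
    """Return the server version string, or None if the instance is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return parse_version(response.read())
    except (OSError, ValueError, KeyError, TypeError):
        # OSError covers refused connections and timeouts; the other exceptions
        # cover a response that is not the expected JSON document.
        return None


if __name__ == "__main__":
    version = elasticsearch_version()
    if version is None:
        print("Elasticsearch is not reachable yet")
    else:
        print(f"Elasticsearch {version} is up")
```

A container can take a little while to accept connections after it starts, so a None result right after startup does not necessarily mean the setup is wrong.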

Once you have a running Elasticsearch instance, install the elasticsearch-haystack integration:

shell

pip install elasticsearch-haystack

Then, initialize an ElasticsearchDocumentStore object connected to the Elasticsearch instance and write documents to it:

python

from haystack_integrations.document_stores.elasticsearch import (
    ElasticsearchDocumentStore,
)
from haystack import Document

document_store = ElasticsearchDocumentStore(hosts="http://localhost:9200")
document_store.write_documents(
    [Document(content="This is first"), Document(content="This is second")],
)
print(document_store.count_documents())

Supported Retrievers

ElasticsearchBM25Retriever: A keyword-based Retriever that fetches documents matching a query from the Document Store.

ElasticsearchEmbeddingRetriever: Compares the query and document embeddings and fetches the documents most relevant to the query.
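
To illustrate how the two Retrievers differ, here is a hedged sketch that queries the same store once by keyword and once by embedding. It assumes a running instance at http://localhost:9200 and the elasticsearch-haystack package installed. The index name retriever_demo, the helper names, and the three-dimensional toy vectors are all made up for this example; a real pipeline would compute embeddings with an embedding model.

```python
# Toy data: (content, embedding) pairs. The vectors are invented for
# illustration only and do not come from a real embedding model.
TOY_DATA = [
    ("This is first", [1.0, 0.0, 0.0]),
    ("This is second", [0.0, 1.0, 0.0]),
]
# Hypothetical embedding of a query that is close to the first document.
QUERY_EMBEDDING = [0.9, 0.1, 0.0]


def build_demo_store(hosts: str = "http://localhost:9200"):
    """Create a store on a separate demo index and fill it with the toy data."""
    # Imports are local so this module loads even where the integration
    # is not installed; they are only needed once the function is called.
    from haystack import Document
    from haystack.document_stores.types import DuplicatePolicy
    from haystack_integrations.document_stores.elasticsearch import (
        ElasticsearchDocumentStore,
    )

    # "retriever_demo" is an assumed index name, chosen to avoid mixing
    # these documents with the ones written in the Initialization example.
    store = ElasticsearchDocumentStore(hosts=hosts, index="retriever_demo")
    documents = [Document(content=text, embedding=vector) for text, vector in TOY_DATA]
    # OVERWRITE keeps the example re-runnable without duplicate-document errors.
    store.write_documents(documents, policy=DuplicatePolicy.OVERWRITE)
    return store


def run_retrievers(store):
    """Query the same store with both Retrievers and return their top results."""
    from haystack_integrations.components.retrievers.elasticsearch import (
        ElasticsearchBM25Retriever,
        ElasticsearchEmbeddingRetriever,
    )

    bm25 = ElasticsearchBM25Retriever(document_store=store, top_k=1)
    dense = ElasticsearchEmbeddingRetriever(document_store=store, top_k=1)

    # Sparse retrieval takes the query text itself.
    keyword_hits = bm25.run(query="first")["documents"]
    # Dense retrieval takes a query embedding with the same dimension
    # as the document embeddings.
    vector_hits = dense.run(query_embedding=QUERY_EMBEDDING)["documents"]
    return keyword_hits, vector_hits
```

With a running instance, calling run_retrievers(build_demo_store()) should return the document "This is first" from both Retrievers, since it contains the keyword and its toy embedding is the closest to the query vector.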
