
Get Started

Have a look at this page to learn how to quickly get up and running with Haystack. It contains instructions for installing, building a basic pipeline, preparing your files, and running a search.

# The most straightforward way to install the latest release of Haystack is through pip.
# This command will install everything needed for Pipelines that use an Elasticsearch Document Store.

pip install farm-haystack

# If you plan to use more advanced features such as Milvus, FAISS, Weaviate, OCR, or Ray,
# you will need to install the full version of Haystack.
# The following commands install the latest version from the main branch.

git clone
cd haystack
pip install -e '.[all]'

See what Haystack can do with our Explore the World demo.

# You can set up and run a local Haystack demo via Docker.
# To enable GPU acceleration, add the `-f docker-compose-gpu.yml` flag to both `docker-compose` commands.
# See our Quick Demo page for more details.

git clone
cd haystack
docker-compose pull
docker-compose up


For a full guide on how to install Haystack, see Installation.


You can effortlessly try out a running Haystack system by going to our hosted Explore the World demo, or starting up your own Haystack service via Docker! See Quick Demo for details.

The Building Blocks of Haystack

Here’s a sample of some Haystack code showing a question answering system using a retriever and a reader. For a working code example, check out our starter tutorial.

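# Import paths assume Haystack 1.x; adjust them to your installed version
from haystack.document_stores import ElasticsearchDocumentStore
from haystack.nodes import ElasticsearchRetriever, FARMReader
from haystack.pipelines import ExtractiveQAPipeline
from haystack.utils import clean_wiki_text, convert_files_to_dicts

# DocumentStore: holds all your data
document_store = ElasticsearchDocumentStore()

# Clean & load your documents into the DocumentStore
# (doc_dir is the path to the folder that holds your files)
dicts = convert_files_to_dicts(doc_dir, clean_func=clean_wiki_text)
document_store.write_documents(dicts)

# Retriever: a fast and simple algorithm to identify the most promising candidate documents
retriever = ElasticsearchRetriever(document_store)

# Reader: powerful but slower neural network trained for QA
model_name = "deepset/roberta-base-squad2"
reader = FARMReader(model_name)

# Pipeline: combines all the components
pipe = ExtractiveQAPipeline(reader, retriever)

# Voilà! Ask a question!
question = "Who is the father of Sansa Stark?"
prediction = pipe.run(query=question)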

Loading Documents into the DocumentStore

In Haystack, DocumentStores expect Documents in a dictionary format. They are loaded as follows:

document_store = ElasticsearchDocumentStore()
dicts = [
    {
        'content': DOCUMENT_TEXT_HERE,
        'meta': {'name': DOCUMENT_NAME, ...}
    }, ...
]
document_store.write_documents(dicts)
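
To build such a list from plain-text files on disk, a helper along the following lines could work. This is a minimal sketch using only the Python standard library: `files_to_dicts` is a hypothetical name rather than a Haystack function, and it assumes UTF-8 encoded `.txt` files sitting directly in one folder.

```python
from pathlib import Path


def files_to_dicts(doc_dir):
    """Read every .txt file in doc_dir into the dictionary format shown above.

    Hypothetical helper for illustration; not part of Haystack.
    """
    dicts = []
    for path in sorted(Path(doc_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8").strip()
        if not text:
            continue  # skip empty files
        dicts.append({"content": text, "meta": {"name": path.name}})
    return dicts
```

Each entry keeps the file name under `meta`, so an answer can later be traced back to the file it came from.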

When we talk about Documents in Haystack, we are referring specifically to the individual blocks of text that are held in the DocumentStore. You might want to use all the text in one file as a single Document, or split it into multiple Documents. How you split can have a big impact on both speed and accuracy.

If Haystack is running very slowly, try splitting your text into smaller Documents. If you want better accuracy, try concatenating text to make larger Documents. See Optimization for more details.
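
As a rough illustration of the splitting described above, the sketch below cuts one text into Documents of a fixed number of words. The name `split_into_documents`, the `words_per_doc` parameter, and the `split_id` metadata key are assumptions made for this example. Haystack has its own preprocessing utilities, and this plain-Python version only shows the idea.

```python
def split_into_documents(text, name, words_per_doc=100):
    """Split text into Documents of at most words_per_doc words each.

    Hypothetical helper for illustration; not part of Haystack.
    """
    if words_per_doc < 1:
        raise ValueError("words_per_doc must be at least 1")
    words = text.split()
    docs = []
    for start in range(0, len(words), words_per_doc):
        chunk = " ".join(words[start:start + words_per_doc])
        docs.append({
            "content": chunk,
            "meta": {"name": name, "split_id": len(docs)},
        })
    return docs


docs = split_into_documents("one two three four five", name="demo.txt", words_per_doc=2)
print([d["content"] for d in docs])
# ['one two', 'three four', 'five']
```

A smaller `words_per_doc` yields more, shorter Documents. A larger value moves toward the single-Document-per-file extreme.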

Running Search Queries

There are many different flavours of search that can be created using Haystack. But to give just one example of what can be achieved, let's look more closely at an Open Domain Question Answering (ODQA) Pipeline.

Querying in an ODQA system involves searching for an answer to a given question within the full document store. This process will:

  • make the Retriever filter for a small set of relevant candidate documents
  • get the Reader to process this set of candidate documents
  • return potential answers to the given question
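
The three steps above can be sketched in plain Python. This is only a toy illustration under stated assumptions: `tokenize`, `retrieve`, and `read` are hypothetical names, word overlap stands in for a real Retriever's ranking, and picking the best-matching sentence stands in for the neural Reader. None of it is Haystack's API.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question, documents, top_k=2):
    """Toy Retriever: keep the top_k documents sharing the most words with the question."""
    q = tokenize(question)
    scored = [(len(q & tokenize(doc["content"])), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]  # drop documents with no overlap
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def read(question, candidates):
    """Toy Reader: return the best-matching sentence from each candidate document."""
    q = tokenize(question)
    answers = []
    for doc in candidates:
        sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc["content"]) if s.strip()]
        best = max(sentences, key=lambda s: len(q & tokenize(s)))
        answers.append({"answer": best, "context": doc["content"]})
    return answers


documents = [
    {"content": "Sansa Stark is a daughter of Eddard Stark. She grows up in Winterfell."},
    {"content": "Winterfell is a castle in the North."},
    {"content": "Dragons hatch from eggs."},
]
question = "Who is the father of Sansa Stark?"
candidates = retrieve(question, documents, top_k=2)  # cheap filter over everything
answers = read(question, candidates)                 # costlier step on the short list only
```

The design point is that the expensive step only ever sees the `top_k` candidates, so its cost does not grow with the size of the document store.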

Usually, there are tight time constraints on querying and so it needs to be a lightweight operation. When documents are loaded, Haystack will precompute any of the results that might be useful at query time.
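
To make the idea of load-time precomputation concrete, here is a minimal sketch of an inverted index, a common structure for keyword search. It is an assumption chosen for illustration, not a description of what Haystack or any particular DocumentStore does internally, and `build_index` and `lookup` are hypothetical names.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Load time: map every word to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, doc in enumerate(documents):
        for word in re.findall(r"\w+", doc["content"].lower()):
            index[word].add(doc_id)
    return index


def lookup(index, query):
    """Query time: count matching query words per document using only the index."""
    hits = defaultdict(int)
    for word in set(re.findall(r"\w+", query.lower())):
        for doc_id in index.get(word, ()):
            hits[doc_id] += 1
    # Highest count first; ties broken by document id for a stable order
    return sorted(hits, key=lambda doc_id: (-hits[doc_id], doc_id))


documents = [
    {"content": "Arya names her direwolf Nymeria."},
    {"content": "Eddard is made Hand of the King."},
]
index = build_index(documents)          # done once, when documents are loaded
ranked = lookup(index, "Who is the Hand of the King?")  # cheap at query time
```

The work of scanning every document happens once in `build_index`. Each query then touches only the index entries for its own words.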

In Haystack, querying is performed with a Pipeline object which connects the reader to the retriever.

# Pipeline: Combines all the components
pipe = ExtractiveQAPipeline(reader, retriever)

# Voilà! Ask a question!
question = "Who is the father of Sansa Stark?"
prediction = pipe.run(query=question)

When the query is complete, you can expect to see results that look something like this:

    {   'answer': 'Eddard',
        'context': 's Nymeria after a legendary warrior queen. She travels '
                   "with her father, Eddard, to King's Landing when he is made "
                   'Hand of the King. Before she leaves,'
    }, ...
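
One simple way to present such a result is to mark the answer inside its context. The sketch below assumes each result is a plain dictionary with `answer` and `context` keys, as in the excerpt above. The objects Haystack actually returns may differ between versions, and `highlight` is a hypothetical helper.

```python
def highlight(result, marker="**"):
    """Return the context with the first occurrence of the answer wrapped in marker."""
    answer, context = result["answer"], result["context"]
    start = context.find(answer)
    if start == -1:
        return context  # answer not found verbatim; leave the context unchanged
    end = start + len(answer)
    return context[:start] + marker + answer + marker + context[end:]


result = {
    "answer": "Eddard",
    "context": "She travels with her father, Eddard, to King's Landing",
}
print(highlight(result))
# She travels with her father, **Eddard**, to King's Landing
```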

Custom Search Pipelines

Haystack provides many different building blocks for you to mix and match. They include:

  • Readers
  • Retrievers (sparse and dense)
  • DocumentStores
  • Summarizers
  • Generators
  • Translators

These can all be combined in any configuration you want. Have a look at our Pipelines page to see what's possible!
