
Introduction to Haystack

Haystack is an open-source framework for building search systems that work intelligently over large document collections. Learn more about Haystack and how it works.


Get Started

To skip the introductions and go directly to installing and creating a search app, see Get Started.

Haystack is an end-to-end framework that you can use to build powerful and production-ready pipelines for different search use cases. Whether you want to perform question answering or semantic document search, you can use the state-of-the-art NLP models in Haystack to provide unique search experiences and make it possible for your users to query in natural language. Haystack is built in a modular fashion so that you can combine the best technology from other open-source projects, like Hugging Face's Transformers, Elasticsearch, or Milvus.

The Building Blocks of Haystack

Haystack is geared towards building search systems that are customizable and production-ready. It provides several kinds of components you can use to build them.


Haystack offers nodes that perform different kinds of text processing. These are often powered by the latest transformer models. Code-wise, they are Python classes with methods you can directly call. For example, to perform question answering with a Reader node, all you need to do is provide it with documents and a question.

Working on this level with Haystack nodes is a hands-on approach. It gives you a very direct way of manipulating inputs and inspecting outputs. This can be useful for exploration, prototyping, and debugging.

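from haystack.nodes import FARMReader
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")
result = reader.predict(
    query="Which country is Canberra located in?",
    documents=documents,  # assumed: a list of Document objects prepared earlier
)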


Haystack is built on the idea that great systems are more than the sum of their parts. By combining the NLP power of different nodes, you can create powerful and customizable systems. The pipeline is the key to making this modular approach work.

When adding nodes to a pipeline, you can define how data flows through the system and which nodes perform their processing step when. On top of simplifying data flow logic, this also allows for complex routing options, such as those involving decision nodes.

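from haystack import Pipeline

# `retriever` and `reader` are assumed to be nodes initialized earlier
p = Pipeline()
p.add_node(component=retriever, name="Retriever", inputs=["Query"])
p.add_node(component=reader, name="Reader", inputs=["Retriever"])
result = p.run(query="What did Einstein work on?")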
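To make the data flow idea concrete, here is a toy sketch in plain Python. It is not Haystack's Pipeline implementation: `ToyPipeline` is a hypothetical class that only mimics the `add_node` and `run` calls, assumes each node reads from a single input, and runs nodes in the order they were added.

```python
# Toy illustration only: this is NOT Haystack's Pipeline implementation.
from typing import Any, Callable, Dict, List


class ToyPipeline:
    """Runs named nodes in insertion order, feeding each one its input's output."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Callable[[Any], Any]] = {}
        self.inputs: Dict[str, str] = {}

    def add_node(self, component: Callable[[Any], Any], name: str, inputs: List[str]) -> None:
        # Assumption: each node reads from exactly one upstream node (or "Query").
        source = inputs[0]
        if source != "Query" and source not in self.nodes:
            raise ValueError(f"Unknown input node: {source}")
        self.nodes[name] = component
        self.inputs[name] = source

    def run(self, query: str) -> Any:
        outputs: Dict[str, Any] = {"Query": query}
        last = "Query"
        for name, component in self.nodes.items():
            # Each node processes the output of the node it was wired to.
            outputs[name] = component(outputs[self.inputs[name]])
            last = name
        return outputs[last]


p = ToyPipeline()
p.add_node(component=str.lower, name="Lowercase", inputs=["Query"])
p.add_node(component=str.split, name="Tokenize", inputs=["Lowercase"])
result = p.run(query="What did Einstein work on?")
print(result)
```

The point of the sketch is that the `inputs` argument alone defines which node's output each step receives, so reordering or rerouting a system means changing the wiring rather than the nodes.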

Why Use Pipelines for Search?

The value of chaining together different nodes is clearest when looking at the Retriever-Reader Pipeline. It's one of the most common systems built in Haystack. It harnesses the reading comprehension power of the Reader but also applies it to large document bases with the help of the Retriever.

Readers, also known as Closed-Domain Question Answering systems in machine learning speak, are powerful models that closely analyze documents and perform the core task of question answering. The Readers in Haystack are trained from the latest transformer-based language models and can be significantly sped up using GPU acceleration. But it's not currently feasible to use the Reader directly on a large collection of documents.

The Retriever assists the Reader by acting as a lightweight filter that reduces the number of documents the Reader must process. It scans through all documents in the database, quickly identifies the relevant ones, and dismisses the irrelevant ones. It ends up with a small set of candidate documents that it passes on to the Reader.

from haystack.pipelines import ExtractiveQAPipeline
p = ExtractiveQAPipeline(reader, retriever)
result = p.run(query="What is the capital of Australia?")

A Retriever alone can't do question answering, and a Reader alone would be unacceptably slow on a large collection. The power of this system comes from combining the two nodes.
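The division of labor can be sketched in plain Python without any models. Everything here is a stand-in chosen for illustration: `retrieve` ranks documents by simple word overlap, and `read` is a crude substitute for a transformer Reader that returns the words of the best candidate that do not appear in the query. Neither reflects how Haystack's nodes work internally.

```python
# Toy illustration only: word overlap stands in for real retrieval and reading models.
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Cheap filter: keep the top_k documents sharing the most words with the query."""
    query_terms = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(query_terms & tokenize(d)), reverse=True)
    return ranked[:top_k]


def read(query: str, candidates: List[str]) -> str:
    """Stand-in for the expensive step: only ever sees the retrieved candidates."""
    query_terms = tokenize(query)
    best = max(candidates, key=lambda d: len(query_terms & tokenize(d)))
    words = re.findall(r"[A-Za-z0-9]+", best)
    return " ".join(w for w in words if w.lower() not in query_terms)


documents = [
    "Canberra is the capital of Australia.",
    "Sydney is the largest city in Australia.",
    "Paris is the capital of France.",
    "Bananas are rich in potassium.",
]
query = "What is the capital of Australia?"
candidates = retrieve(query, documents, top_k=2)
answer = read(query, candidates)
print(candidates, answer)
```

With four documents the saving is trivial, but the shape is the same at scale: the costly step runs on `top_k` candidates instead of the whole collection.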


Question Answering Tutorial

To start building your first question answering system, see our Introductory Tutorial.

The Retriever-Reader pipeline is by no means the only one that Haystack offers, but it's perhaps the most instructive for showing the gains from combining nodes. Many of the synergistic node combinations are covered by Ready Made Pipelines, but we're sure there are many still to be discovered!


The Agent is a very versatile, prompt-based component that uses a large language model and employs reasoning to answer complex questions beyond the capabilities of extractive or generative question answering. It's particularly useful for multi-hop question answering scenarios where it must combine information from multiple sources to arrive at an answer.

When the Agent receives a query, it forms a plan of action consisting of steps it has to complete. It starts by choosing the right tool, then uses the output of each tool as input for the next. It keeps using tools in a loop until it reaches the final answer.

The Agent can use Haystack pipelines, nodes, and web queries as tools to amplify its capabilities to solve the most complex search tasks.
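The tool loop described above can be sketched in plain Python. This is a toy under stated assumptions, not Haystack's Agent: `scripted_llm` is a hard-coded stand-in for a large language model, the `Tool:` / `Tool Input:` / `Observation:` text format and the `Lookup` tool are hypothetical, and only the final answer regular expression is taken from this page.

```python
# Toy illustration only: a scripted function stands in for the language model.
import re
from typing import Callable, Dict

FINAL_ANSWER_PATTERN = r"Final Answer\s*:\s*(.*)"
TOOL_PATTERN = r"Tool:\s*(\w+)\s*Tool Input:\s*(.*)"  # hypothetical output format

FACTS = {
    "capital of Australia": "Canberra",
    "founding year of Canberra": "1913",
}


def scripted_llm(transcript: str) -> str:
    """Pretend model: decides the next step from the observations seen so far."""
    observations = re.findall(r"Observation: (.*)", transcript)
    if len(observations) == 0:
        return "Tool: Lookup\nTool Input: capital of Australia"
    if len(observations) == 1:
        # The previous tool output becomes part of the next tool input.
        return f"Tool: Lookup\nTool Input: founding year of {observations[0]}"
    return f"Final Answer: {observations[-1]}"


def run_agent(query: str, llm_step: Callable[[str], str],
              tools: Dict[str, Callable[[str], str]], max_steps: int = 5) -> str:
    transcript = query
    for _ in range(max_steps):
        output = llm_step(transcript)
        final = re.search(FINAL_ANSWER_PATTERN, output)
        if final:
            return final.group(1).strip()
        call = re.search(TOOL_PATTERN, output)
        if call is None:
            raise ValueError(f"Could not parse a tool call from: {output!r}")
        tool_name, tool_input = call.groups()
        observation = tools[tool_name](tool_input.strip())
        transcript += f"\n{output}\nObservation: {observation}"
    raise RuntimeError("No final answer within max_steps")


tools = {"Lookup": lambda key: FACTS[key]}
answer = run_agent("In which year was the capital of Australia founded?", scripted_llm, tools)
print(answer)
```

The loop stops as soon as the model output matches the final answer pattern, which is why the pattern is a configurable part of the Agent.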

from haystack.agents import Agent

agent = Agent(
    prompt_node=prompt_node,  # assumed: a PromptNode configured earlier
    final_answer_pattern=r"Final Answer\s*:\s*(.*)",
)

hotpot_questions = [
    "What year was the father of the Princes in the Tower born?",
    "Name the movie in which the daughter of Noel Harrison plays Violet Trefusis.",
    "Where was the actress who played the niece in the Priest film born?",
    "Which author is English: John Braine or Studs Terkel?",
]


To deploy a search system, you need more than just a Python script. You need a service that can stay on, handle requests as they come in, and be callable by many different applications. For this, Haystack comes with a REST API designed to work in production environments.

When set up like this, you can load Pipelines from YAML files, interact with Pipelines via HTTP requests, and connect Haystack to user-facing GUIs.
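As a rough sketch of what interacting with a Pipeline over HTTP might look like from Python, the snippet below builds a POST request with a JSON body containing `query` and `params`. The helper `build_query_request` is hypothetical, and the URL is a placeholder assumption: substitute the address and path where your own REST API is served. The request is only constructed here, not sent.

```python
# Sketch only: the endpoint URL below is a placeholder, not a documented address.
import json
import urllib.request
from typing import Any, Dict, Optional


def build_query_request(url: str, query: str,
                        params: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying a query and its params."""
    payload = {"query": query, "params": params or {}}
    return urllib.request.Request(
        url=url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request(
    "http://localhost:8000/query",  # assumed placeholder URL
    "Who is the father of Arya Stark?",
)
print(request.get_method(), request.full_url)

# To actually send it against a running service:
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())
```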

curl -X 'POST' \
  '' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "query": "Who is the father of Arya Stark?",
  "params": {}
}'
