Get Started
This page shows you how to get up and running with Haystack quickly. It covers installation, running your first RAG pipeline, adding your own data, and pointers to further resources.
Build your first RAG application
Let's build your first Retrieval Augmented Generation (RAG) pipeline and see how Haystack answers questions.
First, install the minimal form of Haystack:
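pip install haystack-ai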
Were you already using Haystack 1.x?
Installing farm-haystack and haystack-ai in the same Python environment (virtualenv, Colab, or system) causes problems: the two packages may appear to work together but can fail in obscure ways. We suggest installing only one of them per Python environment. If both are installed in the same environment, remove them both first and then install only the one you need:
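For example, to keep only Haystack 2.x:

pip uninstall -y farm-haystack haystack-ai
pip install haystack-ai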
If you have any questions, please reach out to us on GitHub Discussions or Discord.
In the examples below, we show how to set an API key using a Haystack Secret. You can choose either OpenAI or Hugging Face as your LLM provider. For easier use, you can also set the API key as an environment variable (OPENAI_API_KEY or HF_API_TOKEN).
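For example, in a Unix shell (replace the placeholder values with your own keys):

export OPENAI_API_KEY="your-openai-api-key"
export HF_API_TOKEN="your-hugging-face-token"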
- OpenAI
- Hugging Face
Using OpenAIChatGenerator requires an OpenAI API key with sufficient quota. New users on the free tier may immediately encounter a 429 ("insufficient_quota") error when running this example. If you do not have enough OpenAI credits, try the Hugging Face tab instead.
# import necessary dependencies
from haystack import Pipeline, Document
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.retrievers import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.builders import ChatPromptBuilder
from haystack.utils import Secret
from haystack.dataclasses import ChatMessage
# create a document store and write documents to it
document_store = InMemoryDocumentStore()
document_store.write_documents([
    Document(content="My name is Jean and I live in Paris."),
    Document(content="My name is Mark and I live in Berlin."),
    Document(content="My name is Giorgio and I live in Rome.")
])
# A prompt corresponds to an NLP task and contains instructions for the model. Here, the pipeline will go through each Document to figure out the answer.
prompt_template = [
    ChatMessage.from_system(
        """
        Given these documents, answer the question.
        Documents:
        {% for doc in documents %}
            {{ doc.content }}
        {% endfor %}
        Question:
        """
    ),
    ChatMessage.from_user("{{question}}"),
    ChatMessage.from_system("Answer:")
]
# create the components, adding the necessary parameters
retriever = InMemoryBM25Retriever(document_store=document_store)
prompt_builder = ChatPromptBuilder(template=prompt_template, required_variables="*")
llm = OpenAIChatGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY"), model="gpt-4o-mini")
# Create the pipeline and add the components to it. The order doesn't matter.
# At this stage, the Pipeline validates the components without running them yet.
rag_pipeline = Pipeline()
rag_pipeline.add_component("retriever", retriever)
rag_pipeline.add_component("prompt_builder", prompt_builder)
rag_pipeline.add_component("llm", llm)
# Arrange the pipeline components in the order you need them. If a component has more than one input or output, indicate which output you want to connect to which input using the format ("component_name.output_name", "component_name.input_name").
rag_pipeline.connect("retriever", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "llm")
# Run the pipeline by specifying the first component in the pipeline and passing its mandatory inputs. Optionally, you can pass inputs to other components.
question = "Who lives in Paris?"
results = rag_pipeline.run(
    {
        "retriever": {"query": question},
        "prompt_builder": {"question": question},
    }
)
print(results["llm"]["replies"])
Using HuggingFaceAPIChatGenerator requires a Hugging Face API token.
You can get a free Hugging Face token to use the Serverless Inference API.
This API is rate-limited but perfect for experimentation.
# import necessary dependencies
from haystack import Pipeline, Document
from haystack.components.generators.chat import HuggingFaceAPIChatGenerator
from haystack.components.retrievers import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.builders import ChatPromptBuilder
from haystack.utils import Secret
from haystack.dataclasses import ChatMessage
# create a document store and write documents to it
document_store = InMemoryDocumentStore()
document_store.write_documents([
    Document(content="My name is Jean and I live in Paris."),
    Document(content="My name is Mark and I live in Berlin."),
    Document(content="My name is Giorgio and I live in Rome.")
])
# A prompt corresponds to an NLP task and contains instructions for the model. Here, the pipeline will go through each Document to figure out the answer.
prompt_template = [
    ChatMessage.from_system(
        """
        Given these documents, answer the question.
        Documents:
        {% for doc in documents %}
            {{ doc.content }}
        {% endfor %}
        Question:
        """
    ),
    ChatMessage.from_user("{{question}}"),
    ChatMessage.from_system("Answer:")
]
# create the components, adding the necessary parameters
retriever = InMemoryBM25Retriever(document_store=document_store)
prompt_builder = ChatPromptBuilder(template=prompt_template, required_variables="*")
llm = HuggingFaceAPIChatGenerator(
    api_type="serverless_inference_api",
    api_params={"model": "Qwen/Qwen2.5-7B-Instruct", "provider": "together"},
    token=Secret.from_env_var("HF_API_TOKEN")
)
# Create the pipeline and add the components to it. The order doesn't matter.
# At this stage, the Pipeline validates the components without running them yet.
rag_pipeline = Pipeline()
rag_pipeline.add_component("retriever", retriever)
rag_pipeline.add_component("prompt_builder", prompt_builder)
rag_pipeline.add_component("llm", llm)
# Arrange the pipeline components in the order you need them. If a component has more than one input or output, indicate which output you want to connect to which input using the format ("component_name.output_name", "component_name.input_name").
rag_pipeline.connect("retriever", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "llm")
# Run the pipeline by specifying the first component in the pipeline and passing its mandatory inputs. Optionally, you can pass inputs to other components.
question = "Who lives in Paris?"
results = rag_pipeline.run(
    {
        "retriever": {"query": question},
        "prompt_builder": {"question": question},
    }
)
print(results["llm"]["replies"])
Adding Your Data
Instead of running the RAG pipeline on example data, learn how you can add your own custom data using Document Stores.
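As a minimal sketch of what that looks like, you can write your own Document objects into the same InMemoryDocumentStore used above; the contents and meta values here are illustrative placeholders:

from haystack import Document
from haystack.document_stores.in_memory import InMemoryDocumentStore

# Create a store and add your own documents; meta can hold any metadata you like.
document_store = InMemoryDocumentStore()
document_store.write_documents([
    Document(content="Your own text goes here.", meta={"source": "my_notes.txt"}),
    Document(content="More of your own text.", meta={"source": "my_wiki"}),
])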