
Get Started

This page shows how to get up and running with Haystack quickly. It covers installation, running your first RAG pipeline, adding your own data, and further resources.

Build your first RAG application

Let's build your first Retrieval Augmented Generation (RAG) Pipeline and see how Haystack answers questions.

First, install the minimal form of Haystack:

pip install haystack-ai
Are you already using Haystack 1.x?

Warning

Installing farm-haystack (Haystack 1.x) and haystack-ai (Haystack 2.x) in the same Python environment (virtualenv, Colab, or system) causes conflicts.

The combination may appear to work or may fail in obscure ways, so we suggest installing only one of these packages per Python environment. If both are already installed in the same environment, uninstall both and then install only one of them:

pip uninstall -y farm-haystack haystack-ai
pip install haystack-ai
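
To check whether an environment already contains both packages before reinstalling, a small standard-library sketch like the following may help. The helper names `installed_haystack_packages` and `has_conflict` are hypothetical, and the check assumes the distributions are registered under the names used in the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the pip commands above.
HAYSTACK_PACKAGES = ("farm-haystack", "haystack-ai")


def installed_haystack_packages(names=HAYSTACK_PACKAGES):
    """Return a {name: version} dict for each listed package that is installed."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Not installed in this environment; skip it.
            continue
    return found


def has_conflict(found):
    """True when both the Haystack 1.x and 2.x packages are present."""
    return all(name in found for name in HAYSTACK_PACKAGES)


if __name__ == "__main__":
    found = installed_haystack_packages()
    print("Installed:", found)
    if has_conflict(found):
        print("Both packages found: uninstall both, then install only one.")
```

Run it with the same interpreter that your project uses, since the result depends on the active environment.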

If you have any questions, please reach out to us on GitHub Discussions or Discord.

The following code loads example data into the Document Store, builds a RAG pipeline, and asks a question based on that data. To run it, you only need an OpenAI API key set as the OPENAI_API_KEY environment variable.

Alternatively, you can start by using one of the ready-made pipeline templates.

import os

from haystack import Pipeline, Document
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders.answer_builder import AnswerBuilder
from haystack.components.builders.prompt_builder import PromptBuilder

# Set the environment variable OPENAI_API_KEY
os.environ['OPENAI_API_KEY'] = "Your OpenAI API Key"

# Write documents to InMemoryDocumentStore
document_store = InMemoryDocumentStore()
document_store.write_documents([
    Document(content="My name is Jean and I live in Paris."),
    Document(content="My name is Mark and I live in Berlin."),
    Document(content="My name is Giorgio and I live in Rome.")
])

# Build a RAG pipeline
prompt_template = """
Given these documents, answer the question.
Documents:
{% for doc in documents %}
    {{ doc.content }}
{% endfor %}
Question: {{question}}
Answer:
"""

retriever = InMemoryBM25Retriever(document_store=document_store)
prompt_builder = PromptBuilder(template=prompt_template)
llm = OpenAIGenerator()

rag_pipeline = Pipeline()
rag_pipeline.add_component("retriever", retriever)
rag_pipeline.add_component("prompt_builder", prompt_builder)
rag_pipeline.add_component("llm", llm)
rag_pipeline.connect("retriever", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "llm")

# Ask a question
question = "Who lives in Paris?"
results = rag_pipeline.run(
    {
        "retriever": {"query": question},
        "prompt_builder": {"question": question},
    }
)

print(results["llm"]["replies"])
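
To make the template step more concrete, here is a plain-Python approximation of the prompt once the retrieved documents and the question are filled in. PromptBuilder renders the template with Jinja2; the `render_prompt` helper below is a hypothetical stand-in that only mimics the loop with string formatting, so whitespace may differ from the real rendered prompt.

```python
def render_prompt(documents, question):
    """Approximate the rendered RAG prompt using plain string formatting."""
    # One line per retrieved document, mirroring the {% for doc in documents %} loop.
    doc_lines = "\n".join(documents)
    return (
        "Given these documents, answer the question.\n"
        "Documents:\n"
        f"{doc_lines}\n"
        f"Question: {question}\n"
        "Answer:"
    )


if __name__ == "__main__":
    # Document contents as they might come back from the retriever.
    retrieved = [
        "My name is Jean and I live in Paris.",
        "My name is Mark and I live in Berlin.",
    ]
    print(render_prompt(retrieved, "Who lives in Paris?"))
```

The resulting string is roughly what the generator receives as its prompt: the retriever supplies the documents, and the same question is passed to both the retriever and the prompt builder.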


Adding Your Data

Instead of running the RAG pipeline on example data, learn how you can add your own custom data using Document Stores.
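
As a minimal sketch of that idea, the following reads `.txt` files from a folder and writes them to the same kind of `InMemoryDocumentStore` used in the example above. The folder name `my_data`, the helper names `load_texts` and `write_to_store`, and the `file_name` metadata key are assumptions made for illustration only.

```python
from pathlib import Path


def load_texts(folder):
    """Return (file_name, text) pairs for every non-empty .txt file in `folder`."""
    pairs = []
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8").strip()
        if text:  # skip empty files
            pairs.append((path.name, text))
    return pairs


def write_to_store(pairs):
    """Wrap each text in a Document and write it to an in-memory store."""
    # Imported here so load_texts() stays usable without Haystack installed.
    from haystack import Document
    from haystack.document_stores.in_memory import InMemoryDocumentStore

    document_store = InMemoryDocumentStore()
    document_store.write_documents(
        [Document(content=text, meta={"file_name": name}) for name, text in pairs]
    )
    return document_store


if __name__ == "__main__":
    # "my_data" is a hypothetical folder name; point it at your own files.
    pairs = load_texts("my_data")
    print(f"Found {len(pairs)} text files")
    if pairs:
        store = write_to_store(pairs)
        print(f"Stored {store.count_documents()} documents")
```

The returned store can then be passed to `InMemoryBM25Retriever(document_store=...)` in place of the example store.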

