Name	AnswerBuilder
Folder Path	/builders/
Most Common Position in a Pipeline	In pipelines with a Generator, such as a RAG pipeline, after the Generator component to create GeneratedAnswer objects from its replies.
Inputs	"query": A query string; "replies": A list of strings that are replies from a Generator; "documents": (optional) A list of documents (for example, from the retriever); "meta": (optional) Metadata information
Outputs	"answers": A list of GeneratedAnswer objects

Overview

AnswerBuilder takes a query and the replies a Generator returns as input and parses them into GeneratedAnswer objects. Optionally, it also takes Documents and metadata from the Generator as inputs to enrich the GeneratedAnswer objects.

The optional pattern parameter defines how to extract answer texts from replies. It needs to be a regular expression with a maximum of one capture group. If a capture group is present, the text matched by the capture group is used as the answer. If no capture group is present, the whole match is used as the answer. If no pattern is set, the whole reply is used as the answer text.
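The three cases described above (capture group, no capture group, no pattern) can be illustrated with the standard re module. This is a minimal sketch and not Haystack's implementation: extract_answer is a hypothetical helper, and returning an empty string when the pattern does not match is an assumption, since the text above does not cover that case.

```python
import re
from typing import Optional


def extract_answer(reply: str, pattern: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the documented `pattern` behavior."""
    # No pattern: the whole reply is the answer text.
    if pattern is None:
        return reply

    compiled = re.compile(pattern)
    # The documentation allows at most one capture group.
    if compiled.groups > 1:
        raise ValueError("pattern must contain at most one capture group")

    match = compiled.search(reply)
    if match is None:
        # Assumption: nothing matched, so there is no answer text.
        return ""

    if compiled.groups == 1:
        # One capture group: use the text it matched.
        return match.group(1) or ""
    # No capture group: use the whole match.
    return match.group(0)


reply = "This is an argument. Answer: This is the answer."
print(extract_answer(reply, r"Answer: (.*)"))  # This is the answer.
print(extract_answer(reply, r"Answer: .*"))    # Answer: This is the answer.
print(extract_answer(reply))                   # the full reply, unchanged
```

The raw-string prefix (r"...") keeps backslashes in the pattern literal, which matters once the pattern contains escapes such as \d.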

The optional reference_pattern parameter can be set to a regular expression that parses document references from the replies, so that only the referenced documents are listed in the GeneratedAnswer objects. Haystack assumes that documents are referenced by their index in the list of input documents and that indices start at 1. For example, if you set reference_pattern to "\\[(\\d+)\\]" (the regular expression \[(\d+)\] written as a Python string with escaped backslashes), it finds "1" in the string "This is an answer[1]". If reference_pattern is not set, all input documents are listed in the GeneratedAnswer objects.
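The 1-based index lookup described above can be sketched with the standard re module. This is an illustration only, not Haystack's code: select_referenced is a hypothetical helper, plain strings stand in for Document objects, and silently skipping out-of-range indices is an assumption.

```python
import re
from typing import List, Optional


def select_referenced(
    reply: str,
    documents: List[str],
    reference_pattern: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: keep only the documents a reply refers to."""
    # No reference pattern: every input document is listed.
    if reference_pattern is None:
        return list(documents)

    selected = []
    for match in re.finditer(reference_pattern, reply):
        # References are 1-based; Python lists are 0-based.
        index = int(match.group(1)) - 1
        # Assumption: ignore references that point outside the list.
        if 0 <= index < len(documents):
            selected.append(documents[index])
    return selected


documents = ["doc A", "doc B", "doc C"]
reply = "Paris is the capital[1]. It lies on the Seine[3]."
print(select_referenced(reply, documents, r"\[(\d+)\]"))  # ['doc A', 'doc C']
print(select_referenced(reply, documents))                # ['doc A', 'doc B', 'doc C']
```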

Usage

On its own

The example below uses AnswerBuilder with a custom regular expression to parse a string that could be a reply from a Generator. Text outside the capture group is not included in the GeneratedAnswer object that the builder constructs.

from haystack.components.builders import AnswerBuilder

builder = AnswerBuilder(pattern="Answer: (.*)")
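# run() returns a dictionary with a list of GeneratedAnswer objects under "answers".
result = builder.run(query="What's the answer?", replies=["This is an argument. Answer: This is the answer."])
print(result["answers"])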

In a pipeline

Below is an example of a RAG pipeline that uses an AnswerBuilder to create GeneratedAnswer objects from the replies returned by a Generator. In addition to the text of the reply, these objects also hold the query, the referenced documents, and the metadata returned by the Generator.

import os

from haystack import Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders.answer_builder import AnswerBuilder
from haystack.components.builders.prompt_builder import PromptBuilder

prompt_template = """
    Given these documents, answer the question.\nDocuments:
    {% for doc in documents %}
        {{ doc.content }}
    {% endfor %}

    \nQuestion: {{query}}
    \nAnswer:
    """

p = Pipeline()
p.add_component(instance=InMemoryBM25Retriever(document_store=InMemoryDocumentStore()), name="retriever")
p.add_component(instance=PromptBuilder(template=prompt_template), name="prompt_builder")
# Depending on the Haystack version, api_key may need to be a Secret
# (for example, Secret.from_env_var("OPENAI_API_KEY")) instead of a plain string.
p.add_component(instance=OpenAIGenerator(api_key=os.environ.get("OPENAI_API_KEY")), name="llm")
p.add_component(instance=AnswerBuilder(), name="answer_builder")

# Retrieved documents feed both the prompt and the final answers.
p.connect("retriever", "prompt_builder.documents")
p.connect("prompt_builder", "llm")
p.connect("llm.replies", "answer_builder.replies")
p.connect("llm.meta", "answer_builder.meta")
p.connect("retriever", "answer_builder.documents")

query = "What is the capital of France?"
# The same query goes to every component that takes a "query" input.
result = p.run(
    {
        "retriever": {"query": query},
        "prompt_builder": {"query": query},
        "answer_builder": {"query": query},
    }
)
print(result)