Most common position in a pipeline	In query pipelines, after a component that returns a list of documents, such as a Retriever
Mandatory init variables	"token": The Hugging Face API token. Can be set with `HF_API_TOKEN` or `HF_TOKEN` env var.
Mandatory run variables	"documents": A list of documents; "query": A query string
Output variables	"answers": A list of `ExtractedAnswer` objects
API reference	Readers
GitHub link	https://github.com/deepset-ai/haystack/blob/main/haystack/components/readers/extractive.py

Overview

ExtractiveReader locates and extracts answers to a given query from the document text. It's used in extractive QA systems where you want to know exactly where the answer is located within the document. It's usually coupled with a Retriever that precedes it, but you can also use it with other components that fetch documents.

Readers assign a probability to each answer. This score ranges from 0 to 1 and indicates how well the answer matches the query. A probability close to 1 means the model has high confidence in the answer's relevance. The Reader sorts the answers by probability, with the highest listed first. You can limit the number of answers the Reader returns with the optional top_k parameter.

You can use the probability to set the quality expectations for your system. To do that, use the confidence_threshold parameter of the Reader to set a minimum probability for answers. For example, setting confidence_threshold to 0.7 means only answers with a probability higher than 0.7 are returned.
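
The ranking and thresholding described above can be sketched in plain Python. This is only an illustration of the behavior, not the Reader's actual implementation: the `ScoredAnswer` class and the `rank_and_filter` helper are hypothetical names, and the strict "higher than" comparison follows the wording of the text.

```python
from dataclasses import dataclass


@dataclass
class ScoredAnswer:
    """Hypothetical stand-in for an extracted answer: text plus a probability."""
    text: str
    probability: float


def rank_and_filter(answers, top_k, confidence_threshold=None):
    """Sort by probability (highest first), keep top_k, then apply the threshold."""
    ranked = sorted(answers, key=lambda a: a.probability, reverse=True)[:top_k]
    if confidence_threshold is not None:
        # Assumption: keep only answers strictly above the threshold.
        ranked = [a for a in ranked if a.probability > confidence_threshold]
    return ranked


candidates = [
    ScoredAnswer("Berlin", 0.12),
    ScoredAnswer("Paris", 0.93),
    ScoredAnswer("France", 0.71),
    ScoredAnswer("capital", 0.40),
]

kept = rank_and_filter(candidates, top_k=3, confidence_threshold=0.7)
print([a.text for a in kept])  # ['Paris', 'France']
```

With top_k=3 alone, three candidates would survive; the threshold then drops the one scored 0.40, so fewer than top_k answers can come back.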

By default, the Reader accounts for the case where no answer to the query is found in the document text (no_answer=True). In this case, it returns an additional ExtractedAnswer with no text and the probability that none of the top_k answers is correct. For example, if top_k=4, the system returns four answers and an additional empty one, each with a probability assigned. An empty answer with a probability of 0.5 means there is a 50% chance that none of the returned answers is correct. To receive only the actual top_k answers, set the no_answer parameter to False when initializing the component.
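
One way to picture the probability attached to the empty answer is the sketch below. It assumes the answer probabilities can be treated as independent, so the chance that all of them are wrong is the product of the individual "wrong" probabilities. The source text does not state the formula, so treat `no_answer_probability` as a hypothetical helper rather than the Reader's own calculation.

```python
import math


def no_answer_probability(answer_probabilities):
    """Probability that every answer is wrong, assuming independence."""
    return math.prod(1.0 - p for p in answer_probabilities)


# Probabilities of four returned answers (top_k=4), made up for illustration.
top_k_probabilities = [0.5, 0.2, 0.1, 0.05]

p_none = no_answer_probability(top_k_probabilities)
print(round(p_none, 3))  # 0.342
```

Under this assumption, a single confident answer pushes the empty answer's probability toward 0, while a set of uniformly weak answers pushes it toward 1.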

Models

We recommend the following models for use with ExtractiveReader:

Model URL	Description	Language
deepset/roberta-base-squad2-distilled (default)	A distilled model, relatively fast and with good performance.	English
deepset/roberta-large-squad2	A large model with good performance. Slower than the distilled one.	English
deepset/tinyroberta-squad2	A distilled version of roberta-large-squad2 model, very fast.	English
deepset/xlm-roberta-base-squad2	A base multilingual model with good speed and performance.	Multilingual

You can also view other question answering models on Hugging Face.

Usage

On its own

Below is an example that uses the ExtractiveReader outside of a pipeline. The Reader gets the query and the documents at runtime. It should return two answers and an additional third answer with no text and the probability that the top_k answers are incorrect.

from haystack import Document
from haystack.components.readers import ExtractiveReader

docs = [Document(content="Paris is the capital of France."), Document(content="Berlin is the capital of Germany.")]

reader = ExtractiveReader()
reader.warm_up()

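# Run the Reader and keep the result so the answers can be inspected.
result = reader.run(query="What is the capital of France?", documents=docs, top_k=2)
print(result["answers"])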

In a pipeline

Below is an example of a pipeline that retrieves documents from an InMemoryDocumentStore based on keyword search (using InMemoryBM25Retriever). It then uses the ExtractiveReader to extract the answer to our query from the top retrieved documents.

With the ExtractiveReader’s top_k set to 2, an additional, third answer with no text and the probability that the other top_k answers are incorrect is also returned.

from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.readers import ExtractiveReader

docs = [Document(content="Paris is the capital of France."),
        Document(content="Berlin is the capital of Germany."),
        Document(content="Rome is the capital of Italy."),
        Document(content="Madrid is the capital of Spain.")]
document_store = InMemoryDocumentStore()
document_store.write_documents(docs)

retriever = InMemoryBM25Retriever(document_store=document_store)
reader = ExtractiveReader()
reader.warm_up()

extractive_qa_pipeline = Pipeline()
extractive_qa_pipeline.add_component(instance=retriever, name="retriever")
extractive_qa_pipeline.add_component(instance=reader, name="reader")

extractive_qa_pipeline.connect("retriever.documents", "reader.documents")

query = "What is the capital of France?"
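# Both components need the query; the Reader receives the documents from the Retriever.
result = extractive_qa_pipeline.run(
    data={"retriever": {"query": query, "top_k": 3}, "reader": {"query": query, "top_k": 2}}
)
print(result["reader"]["answers"])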