
AnswerGenerator

The AnswerGenerator reads a set of documents and generates an answer to a question, word by word. While extractive question answering highlights the span of text that answers a query, generative question answering returns a novel text answer it has composed.

The best current approaches draw on both the knowledge the Answer Generator gained during language model pretraining (parametric memory) and the passages a Retriever provides to it (non-parametric memory). This approach is called retrieval-augmented generation (RAG).
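
To make the idea concrete, here is a minimal sketch of how retrieved passages and a question can be combined into a single prompt for a generative model. It is plain Python, not Haystack's internal implementation; `build_rag_prompt` and the sample passages are hypothetical and only illustrate the principle.

```python
from typing import List


def build_rag_prompt(question: str, passages: List[str]) -> str:
    """Combine retrieved passages (non-parametric memory) with the question."""
    context = " ".join(p.strip() for p in passages)
    return f"Context: {context}\n===\nQ: {question}\nA:"


# Hypothetical passages standing in for the output of a Retriever.
passages = [
    "Contrails form when water vapor from jet exhaust condenses and freezes.",
    "Persistent contrails can spread into cirrus clouds.",
]
prompt = build_rag_prompt("Why do airplanes leave contrails?", passages)
print(prompt)
```

The model then continues the text after `A:`, drawing on both the supplied context and whatever it learned during pretraining.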

Haystack supports retrieval-augmented generation through the Seq2SeqGenerator, RAGenerator, OpenAIAnswerGenerator, and PromptNode. To learn how to build it with a PromptNode, see the Creating a Generative QA Pipeline with PromptNode tutorial.

📘

PromptNode is recommended

We strongly recommend using PromptNode instead of OpenAIAnswerGenerator and the deprecated Seq2SeqGenerator and RAGenerator classes.

With Haystack, you can use an API to perform answer generation. GPT-3 from OpenAI is a large-scale generative language model that can be used in this question answering setting. It is infeasible to run a model of this size on local hardware, but you can query an instance of GPT-3 using an API. The OpenAIAnswerGenerator and PromptNode classes facilitate this, and like any other component in Haystack, you can use them in isolation or as part of a pipeline. Note that to use GPT-3, you need to sign up for an OpenAI account and obtain a valid API key.

Position in a Pipeline: After the Retriever. You can use it as a substitute for the Reader.
Input: Documents
Output: Answers
Classes: OpenAIAnswerGenerator, RAGenerator, Seq2SeqGenerator

🚧

Seq2SeqGenerator and RAGenerator deprecation

Seq2SeqGenerator and RAGenerator are deprecated and will be removed from Haystack in future versions. We recommend using PromptNode instead. See our GitHub discussion post for more details.

Pros

  • More appropriately phrased answers.
  • Able to synthesize information from different texts.
  • Can draw on latent knowledge stored in the language model.

Cons

  • Hard to trace which piece of information the Answer Generator bases its response on.
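
As a partial workaround for the traceability problem, you can at least list which documents were handed to the generator. The sketch below assumes each answer carries a `meta["doc_ids"]` list, as in the printed pipeline results further down this page. It works on a plain dict that mirrors those fields; `source_ids` is a hypothetical helper and the values are illustrative.

```python
from typing import Dict, List


def source_ids(answer: Dict) -> List[str]:
    """Return the IDs of the documents that were passed to the generator."""
    return list(answer.get("meta", {}).get("doc_ids", []))


# A plain dict mirroring the fields of a generative answer (illustrative values).
answer = {
    "answer": "Contrails form when exhaust water vapor condenses and freezes.",
    "type": "generative",
    "meta": {"doc_ids": ["doc-a", "doc-b"]},
}
print(source_ids(answer))
```

Note that this only tells you which documents were in the context, not which of them the model actually relied on.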

Answer Generator Classes

  • OpenAIAnswerGenerator: A class that uses the GPT-3 models hosted by OpenAI. It performs queries by making API calls but otherwise functions like any other Haystack component. You can specify the model you want it to use. To use OpenAIAnswerGenerator you need an API key from an active OpenAI account.
  • RAGenerator: A retrieval-augmented generator based on a Hugging Face transformers model. Its main advantages are a manageable model size and the fact that answer generation depends on the retrieved Documents. This means that the model can easily adjust to domain Documents even after training is finished.
  • Seq2SeqGenerator: A generic sequence-to-sequence generator based on Hugging Face's transformers. You can use it with any Hugging Face language model that extends GenerationMixin. See also How to Generate Text.

👍

Tutorial

Even though PromptNode is not an Answer Generator node, you can use it as an AnswerGenerator in a custom pipeline due to its versatile structure. See the tutorial on Creating a Generative QA Pipeline with PromptNode.

Usage

To initialize the OpenAIAnswerGenerator, run:

from haystack.nodes import OpenAIAnswerGenerator

generator = OpenAIAnswerGenerator(api_key=MY_API_KEY)
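
Rather than hard-coding the key, you may prefer to read it from the environment. This is a minimal sketch; the `load_api_key` helper and the `OPENAI_API_KEY` variable name are assumptions chosen for illustration, not something Haystack requires.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Read the API key from an environment variable and fail early if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


# Usage (uncomment once the variable is set in your shell):
# MY_API_KEY = load_api_key()
```

Failing early with a clear message is easier to debug than an authentication error returned by the API later on.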

To initialize a locally hosted AnswerGenerator, run:

from haystack.nodes import Seq2SeqGenerator

generator = Seq2SeqGenerator(model_name_or_path="vblagoje/bart_lfqa")

To use an AnswerGenerator in a pipeline, run:

from haystack.pipelines import GenerativeQAPipeline

pipeline = GenerativeQAPipeline(generator=generator, retriever=retriever)
result = pipeline.run(query='What are the best party games for adults?', params={"Retriever": {"top_k": 5}})

To run a stand-alone AnswerGenerator, run:

result = generator.predict(
    query='What are the best party games for adults?',
    documents=[doc1, doc2, doc3],  # any number of Document objects
    top_k=top_k
)

OpenAIAnswerGenerator

Using Azure OpenAI Service

In addition to working with APIs directly from OpenAI, you can use OpenAIAnswerGenerator with Azure OpenAI. For available models and versions for the service, check Azure documentation.

node = OpenAIAnswerGenerator(
    api_key=api_key,
    azure_base_url="https://<your-endpoint>.openai.azure.com",
    azure_deployment_name="<your-deployment-name>",
    api_version="2022-12-01",
    model="text-davinci-003",
    max_tokens=50,
    presence_penalty=0.1,
    frequency_penalty=0.1,
    top_k=3,
    temperature=0.9
)

Customizing the Prompt

If the OpenAIAnswerGenerator doesn't return the answers you expect, adjust the prompt it uses for generating answers. The prompt has a big impact on the accuracy and style of the answers. You can change the prompt by passing the prompt_template parameter when initializing OpenAIAnswerGenerator:

from haystack.nodes.prompt import PromptTemplate

# Write your own prompt
my_template = PromptTemplate(
    name="qa-more-reliable-answers",
    # Adjacent string literals are concatenated, so each piece needs its own trailing space or newline
    prompt_text="Please answer the question according to the above context. "
                "If the question cannot be answered from the context, reply with 'I cannot find the required information in the context.'"
                "\n===\nContext: {examples_context}\n===\n{examples}\n\n"
                "===\nContext: {context}\n===\n{query}",
)

node = OpenAIAnswerGenerator(
    api_key=api_key,
    model="text-davinci-003",
    max_tokens=50,
    presence_penalty=0.1,
    frequency_penalty=0.1,
    top_k=3,
    temperature=0.9,
    prompt_template=my_template, # pass your new prompt to the answer generator
)

You can also easily modify the few-shot examples the model gets. For more details and a full list of available parameters, see the API reference.
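
To see how the placeholders in such a template relate to the few-shot examples, the following sketch fills a template of the same shape with plain `str.format`. It is an illustration only, not how Haystack renders prompts internally; `fill_prompt` is a hypothetical helper, and the way examples and documents are joined here is an assumption.

```python
from typing import List, Tuple

TEMPLATE = (
    "Please answer the question according to the above context."
    "\n===\nContext: {examples_context}\n===\n{examples}\n\n"
    "===\nContext: {context}\n===\n{query}"
)


def fill_prompt(
    examples_context: str,
    examples: List[Tuple[str, str]],
    documents: List[str],
    query: str,
) -> str:
    """Fill the template with few-shot examples, document text, and the query."""
    examples_text = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return TEMPLATE.format(
        examples_context=examples_context,
        examples=examples_text,
        context=" ".join(documents),
        query=f"Q: {query}\nA:",
    )


prompt = fill_prompt(
    examples_context="In 2017, U.S. life expectancy was 78.6 years.",
    examples=[("What is human life expectancy in the United States?", "78 years.")],
    documents=["Contrails form when jet exhaust water vapor freezes."],
    query="Why do airplanes leave contrails in the sky?",
)
print(prompt)
```

Swapping in examples from your own domain changes what the model sees before the real question, which is why few-shot examples influence answer style so strongly.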

Example of OpenAIAnswerGenerator in a Pipeline

from haystack.pipelines import Pipeline
from haystack.nodes import OpenAIAnswerGenerator
from haystack.schema import Document

# These docs could also come from a retriever
# Here we explicitly specify them to avoid the setup steps for Retriever and DocumentStore
doc_1 = "Contrails are a manmade type of cirrus cloud formed when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail. The exhaust can also trigger the formation of cirrus by providing ice nuclei when there is an insufficient naturally-occurring supply in the atmosphere. One of the environmental impacts of aviation is that persistent contrails can form into large mats of cirrus, and increased air traffic has been implicated as one possible cause of the increasing frequency and amount of cirrus in Earth's atmosphere."
doc_2 = "Because the aviation industry is especially sensitive to the weather, accurate weather forecasting is essential. Fog or exceptionally low ceilings can prevent many aircraft from landing and taking off. Turbulence and icing are also significant in-flight hazards. Thunderstorms are a problem for all aircraft because of severe turbulence due to their updrafts and outflow boundaries, icing due to the heavy precipitation, as well as large hail, strong winds, and lightning, all of which can cause severe damage to an aircraft in flight. Volcanic ash is also a significant problem for aviation, as aircraft can lose engine power within ash clouds. On a day-to-day basis airliners are routed to take advantage of the jet stream tailwind to improve fuel efficiency. Aircrews are briefed prior to takeoff on the conditions to expect en route and at their destination. Additionally, airports often change which runway is being used to take advantage of a headwind. This reduces the distance required for takeoff, and eliminates potential crosswinds."

# Initialize the OpenAIAnswerGenerator
node = OpenAIAnswerGenerator(
    api_key=api_key,
    model="text-davinci-003",
    max_tokens=50,
    presence_penalty=0.1,
    frequency_penalty=0.1,
    top_k=3,
    temperature=0.9
)

# Let's create a pipeline with OpenAIAnswerGenerator
pipe = Pipeline()
pipe.add_node(component=node, name="generator", inputs=["Query"])

output = pipe.run(query="Why do airplanes leave contrails in the sky?", documents=[Document(doc_1), Document(doc_2)])
output["answers"]

# Printed results
[<Answer {'answer': ' Contrails are created when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail.', 'type': 'generative', 'score': None, 'context': None, 'offsets_in_document': None, 'offsets_in_context': None, 'document_id': None, 'meta': {'doc_ids': ['6a371f0bbb37c291befaaaf4704dc694', '2a2f7c49e1bec7864dd4bb447d5d0bfa'], 'doc_scores': [None, None], 'content': ["Contrails are a manmade type of cirrus cloud formed when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail. The exhaust can also trigger the formation of cirrus by providing ice nuclei when there is an insufficient naturally-occurring supply in the atmosphere. One of the environmental impacts of aviation is that persistent contrails can form into large mats of cirrus, and increased air traffic has been implicated as one possible cause of the increasing frequency and amount of cirrus in Earth's atmosphere.", 'Because the aviation industry is especially sensitive to the weather, accurate weather forecasting is essential. Fog or exceptionally low ceilings can prevent many aircraft from landing and taking off. Turbulence and icing are also significant in-flight hazards. Thunderstorms are a problem for all aircraft because of severe turbulence due to their updrafts and outflow boundaries, icing due to the heavy precipitation, as well as large hail, strong winds, and lightning, all of which can cause severe damage to an aircraft in flight. Volcanic ash is also a significant problem for aviation, as aircraft can lose engine power within ash clouds. On a day-to-day basis airliners are routed to take advantage of the jet stream tailwind to improve fuel efficiency. 
Aircrews are briefed prior to takeoff on the conditions to expect en route and at their destination. Additionally, airports often change which runway is being used to take advantage of a headwind. This reduces the distance required for takeoff, and eliminates potential crosswinds.'], 'titles': ['', '']}}>,
 <Answer {'answer': ' Airplanes leave contrails in the sky because water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail.', 'type': 'generative', 'score': None, 'context': None, 'offsets_in_document': None, 'offsets_in_context': None, 'document_id': None, 'meta': {'doc_ids': ['6a371f0bbb37c291befaaaf4704dc694', '2a2f7c49e1bec7864dd4bb447d5d0bfa'], 'doc_scores': [None, None], 'content': ["Contrails are a manmade type of cirrus cloud formed when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail. The exhaust can also trigger the formation of cirrus by providing ice nuclei when there is an insufficient naturally-occurring supply in the atmosphere. One of the environmental impacts of aviation is that persistent contrails can form into large mats of cirrus, and increased air traffic has been implicated as one possible cause of the increasing frequency and amount of cirrus in Earth's atmosphere.", 'Because the aviation industry is especially sensitive to the weather, accurate weather forecasting is essential. Fog or exceptionally low ceilings can prevent many aircraft from landing and taking off. Turbulence and icing are also significant in-flight hazards. Thunderstorms are a problem for all aircraft because of severe turbulence due to their updrafts and outflow boundaries, icing due to the heavy precipitation, as well as large hail, strong winds, and lightning, all of which can cause severe damage to an aircraft in flight. Volcanic ash is also a significant problem for aviation, as aircraft can lose engine power within ash clouds. On a day-to-day basis airliners are routed to take advantage of the jet stream tailwind to improve fuel efficiency. 
Aircrews are briefed prior to takeoff on the conditions to expect en route and at their destination. Additionally, airports often change which runway is being used to take advantage of a headwind. This reduces the distance required for takeoff, and eliminates potential crosswinds.'], 'titles': ['', '']}}>,
 <Answer {'answer': ' Contrails are formed when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail.', 'type': 'generative', 'score': None, 'context': None, 'offsets_in_document': None, 'offsets_in_context': None, 'document_id': None, 'meta': {'doc_ids': ['6a371f0bbb37c291befaaaf4704dc694', '2a2f7c49e1bec7864dd4bb447d5d0bfa'], 'doc_scores': [None, None], 'content': ["Contrails are a manmade type of cirrus cloud formed when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail. The exhaust can also trigger the formation of cirrus by providing ice nuclei when there is an insufficient naturally-occurring supply in the atmosphere. One of the environmental impacts of aviation is that persistent contrails can form into large mats of cirrus, and increased air traffic has been implicated as one possible cause of the increasing frequency and amount of cirrus in Earth's atmosphere.", 'Because the aviation industry is especially sensitive to the weather, accurate weather forecasting is essential. Fog or exceptionally low ceilings can prevent many aircraft from landing and taking off. Turbulence and icing are also significant in-flight hazards. Thunderstorms are a problem for all aircraft because of severe turbulence due to their updrafts and outflow boundaries, icing due to the heavy precipitation, as well as large hail, strong winds, and lightning, all of which can cause severe damage to an aircraft in flight. Volcanic ash is also a significant problem for aviation, as aircraft can lose engine power within ash clouds. On a day-to-day basis airliners are routed to take advantage of the jet stream tailwind to improve fuel efficiency. 
Aircrews are briefed prior to takeoff on the conditions to expect en route and at their destination. Additionally, airports often change which runway is being used to take advantage of a headwind. This reduces the distance required for takeoff, and eliminates potential crosswinds.'], 'titles': ['', '']}}>]
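
The generated answers above begin with a leading space, and with top_k=3 several of them can come out nearly identical. Here is a minimal post-processing sketch, assuming you have already pulled the answer strings out of the results (for example with `[a.answer for a in output["answers"]]`). `clean_answers` is a hypothetical helper, not part of Haystack, and the sample strings are shortened for illustration.

```python
from typing import Iterable, List


def clean_answers(answers: Iterable[str]) -> List[str]:
    """Strip surrounding whitespace and drop exact duplicates, keeping order."""
    seen = set()
    cleaned = []
    for text in answers:
        text = text.strip()
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned


raw = [
    " Contrails are created when water vapor condenses and freezes.",
    " Contrails are created when water vapor condenses and freezes.",
    " Airplanes leave contrails because exhaust water vapor freezes.",
]
unique = clean_answers(raw)
print(unique)
```

This only removes exact duplicates; paraphrased answers such as the three shown above would all be kept.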

Here's the prompt used in this example:

Please answer the question according to the above context.
===
Context: In 2017, U.S. life expectancy was 78.6 years.
===
Q: What is human life expectancy in the United States?
A: 78 years.
===
Context: Because the aviation industry is especially sensitive to the weather, accurate weather forecasting is essential. Fog or exceptionally low ceilings can prevent many aircraft from landing and taking off. Turbulence and icing are also significant in-flight hazards. Thunderstorms are a problem for all aircraft because of severe turbulence due to their updrafts and outflow boundaries, icing due to the heavy precipitation, as well as large hail, strong winds, and lightning, all of which can cause severe damage to an aircraft in flight. Volcanic ash is also a significant problem for aviation, as aircraft can lose engine power within ash clouds. On a day-to-day basis airliners are routed to take advantage of the jet stream tailwind to improve fuel efficiency. Aircrews are briefed prior to takeoff on the conditions to expect en route and at their destination. Additionally, airports often change which runway is being used to take advantage of a headwind. This reduces the distance required for takeoff, and eliminates potential crosswinds. Contrails are a manmade type of cirrus cloud formed when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail. The exhaust can also trigger the formation of cirrus by providing ice nuclei when there is an insufficient naturally-occurring supply in the atmosphere. One of the environmental impacts of aviation is that persistent contrails can form into large mats of cirrus, and increased air traffic has been implicated as one possible cause of the increasing frequency and amount of cirrus in Earth's atmosphere.
===
Q: Why do airplanes leave contrails in the sky?
A: