QuestionGenerator can come in handy if you want to use auto-suggested questions in your search app. You can run it in a pipeline or on its own. Learn how to use it.

The QuestionGenerator takes a Document as input and generates questions which it believes the Document can answer. This is almost the inverse of the Reader which takes a question and Documents as input and returns an Answer. QuestionGenerator models can be trained using question answering datasets.

Position in a Pipeline: At the very beginning of a querying Pipeline



In Haystack, the term Generator on its own sometimes refers to an AnswerGenerator, but not to a QuestionGenerator. The QuestionGenerator receives only Documents as input and returns questions as output, while the Generator or AnswerGenerator classes are an alternative to the Reader: they take a question and Documents as input and return an answer.


To run a stand-alone QuestionGenerator:

from haystack.nodes import QuestionGenerator

text = """Python is an interpreted, high-level, general-purpose programming language. Created by Guido van Rossum
and first released in 1991, Python's design philosophy emphasizes code
readability with its notable use of significant whitespace."""

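# Instantiate the node with its default model, then generate questions from the raw text
qg = QuestionGenerator()
result = qg.generate(text)
print(result)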

The output looks like this:

[' Who created Python?',
 ' When was Python first released?',
 " What is Python's design philosophy?"]

Ready-Made Pipelines

In Haystack, there are two pipeline configurations that are each encapsulated in their own class:

  • QuestionGenerationPipeline
  • QuestionAnswerGenerationPipeline

To learn more about these pipelines, have a look at Ready-Made Pipelines. To start using the pipelines, check out the Question Generation tutorial.

Use Case: Auto-Suggested Questions

Generated questions can help users find the information that they are looking for. Search engines now present auto-suggested questions alongside the top search results and even present suggested answers. It is possible to build the same functionality in Haystack using the QuestionGenerator.

After your Retriever has returned some candidate documents, you can run the QuestionGenerator to suggest more answerable questions. By presenting these generated questions to your users, you can give them a sense of other facts and topics that are present in the documents. You can go one step further by predicting answers to these questions with a Reader or Generator.
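
The flow described here can be sketched in a few lines. This is a minimal illustration rather than Haystack's own implementation: `suggest_questions` and `fake_generate` are hypothetical names, and the sketch assumes you have already pulled the plain text out of the retrieved Documents. The `generate` argument is any callable that maps a text to a list of questions, so `QuestionGenerator().generate` should fit in place of the stand-in.

```python
from typing import Callable, Iterable, List


def suggest_questions(
    documents: Iterable[str],
    generate: Callable[[str], List[str]],
    max_questions: int = 5,
) -> List[str]:
    """Collect up to `max_questions` unique questions across retrieved texts."""
    seen = set()
    suggestions: List[str] = []
    for text in documents:
        for question in generate(text):
            cleaned = question.strip()  # generated questions may carry a leading space
            key = cleaned.lower()       # compare case-insensitively to catch duplicates
            if not cleaned or key in seen:
                continue
            seen.add(key)
            suggestions.append(cleaned)
            if len(suggestions) >= max_questions:
                return suggestions
    return suggestions


# Hypothetical stand-in for QuestionGenerator().generate,
# so the sketch runs without downloading a model.
def fake_generate(text: str) -> List[str]:
    return [" Who created Python?", " When was Python first released?"]


retrieved_texts = ["first retrieved passage", "second retrieved passage"]
print(suggest_questions(retrieved_texts, fake_generate))
# prints ['Who created Python?', 'When was Python first released?']
```

De-duplicating across documents matters because passages about the same topic tend to yield the same questions, and capping the count keeps the suggestion list short enough to display.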

Use Case: Human in the Loop Annotation

A QuestionGenerator can enable different annotation workflows. For example, given a text corpus, you could use the QuestionGenerator to create questions and then use a Reader to predict their answers.

Correct QA pairs created in this manner might not be so effective in retraining your Reader model. However, correcting wrong QA pairs creates training samples that your model found challenging. These examples are likely to be impactful when it comes to retraining. This is also a quicker workflow than having annotators generate both a question and an answer.
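
As a rough sketch of this workflow, the snippet below proposes question-answer pairs, lets an annotator overwrite wrong answers, and keeps only the corrected pairs as retraining candidates. All names here (`QAPair`, `propose_pairs`, `corrected_samples`, and the stub functions) are hypothetical and not part of Haystack. In practice `generate` could be `QuestionGenerator().generate`, and `answer` a small wrapper you write around a Reader.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class QAPair:
    context: str
    question: str
    predicted_answer: str
    corrected_answer: Optional[str] = None  # filled in by an annotator if the prediction is wrong

    @property
    def was_corrected(self) -> bool:
        return (
            self.corrected_answer is not None
            and self.corrected_answer != self.predicted_answer
        )


def propose_pairs(
    texts: Iterable[str],
    generate: Callable[[str], List[str]],
    answer: Callable[[str, str], str],
) -> List[QAPair]:
    """Generate questions for each text and predict an answer for each question."""
    pairs: List[QAPair] = []
    for text in texts:
        for question in generate(text):
            cleaned = question.strip()
            pairs.append(QAPair(text, cleaned, answer(cleaned, text)))
    return pairs


def corrected_samples(pairs: Iterable[QAPair]) -> List[QAPair]:
    """Keep only the pairs an annotator had to fix: the ones the model got wrong."""
    return [pair for pair in pairs if pair.was_corrected]


# Hypothetical stubs so the sketch runs without any model.
def stub_generate(text: str) -> List[str]:
    return [" Who created Python?", " When was Python first released?"]


def stub_answer(question: str, text: str) -> str:
    return "Guido van Rossum"  # deliberately wrong for the second question


pairs = propose_pairs(["Python was created by Guido van Rossum in 1991."], stub_generate, stub_answer)
pairs[1].corrected_answer = "1991"  # the annotator fixes the wrong prediction
print([(p.question, p.corrected_answer) for p in corrected_samples(pairs)])
# prints [('When was Python first released?', '1991')]
```

Keeping the predicted and corrected answers side by side makes it easy to filter for the samples the model found challenging.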