Extract the output of a Generator to an Answer format, and build prompts.
Module answer_builder
AnswerBuilder
Converts a query and Generator replies into a `GeneratedAnswer` object.
AnswerBuilder parses Generator replies using custom regular expressions.
Check out the usage example below to see how it works.
Optionally, it can also take documents and metadata from the Generator to add to the `GeneratedAnswer` object.
AnswerBuilder works with both non-chat and chat Generators.
Usage example
from haystack.components.builders import AnswerBuilder
builder = AnswerBuilder(pattern="Answer: (.*)")
builder.run(query="What's the answer?", replies=["This is an argument. Answer: This is the answer."])
AnswerBuilder.__init__
def __init__(pattern: Optional[str] = None,
reference_pattern: Optional[str] = None)
Creates an instance of the AnswerBuilder component.
Arguments:
pattern
: The regular expression pattern to extract the answer text from the Generator. If not specified, the entire response is used as the answer. The regular expression can have at most one capture group. If present, the capture group text is used as the answer. If no capture group is present, the whole match is used as the answer. Examples:
`[^\n]+$` finds "this is an answer" in a string "this is an argument.\nthis is an answer".
`Answer: (.*)` finds "this is an answer" in a string "this is an argument. Answer: this is an answer".
reference_pattern
: The regular expression pattern used for parsing the document references. If not specified, no parsing is done, and all documents are referenced. References need to be specified as indices of the input documents and start at [1]. Example:
`\[(\d+)\]` finds "1" in a string "this is an answer[1]".
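The two `pattern` examples above can be checked with Python's standard `re` module. The following is a minimal sketch of the extraction rule as described (capture group if present, otherwise the whole match). `extract_answer` is a hypothetical helper written for illustration, not Haystack's own implementation, and returning an empty string when nothing matches is an assumption of this sketch.

```python
import re
from typing import Optional


def extract_answer(reply: str, pattern: Optional[str] = None) -> str:
    """Illustrative helper (not part of Haystack) applying the documented rule."""
    if pattern is None:
        # No pattern: the entire reply is used as the answer.
        return reply
    match = re.search(pattern, reply)
    if match is None:
        # Assumption for this sketch: no match yields an empty answer.
        return ""
    # One capture group: use its text. No capture group: use the whole match.
    return match.group(1) if match.re.groups == 1 else match.group(0)


print(extract_answer("this is an argument.\nthis is an answer", r"[^\n]+$"))
print(extract_answer("this is an argument. Answer: this is an answer", r"Answer: (.*)"))
```

Both calls print `this is an answer`, matching the examples in the argument description.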
AnswerBuilder.run
@component.output_types(answers=List[GeneratedAnswer])
def run(query: str,
replies: Union[List[str], List[ChatMessage]],
meta: Optional[List[Dict[str, Any]]] = None,
documents: Optional[List[Document]] = None,
pattern: Optional[str] = None,
reference_pattern: Optional[str] = None)
Turns the output of a Generator into `GeneratedAnswer` objects using regular expressions.
Arguments:
query
: The input query used as the Generator prompt.
replies
: The output of the Generator. Can be a list of strings or a list of `ChatMessage` objects.
meta
: The metadata returned by the Generator. If not specified, the generated answer will contain no metadata.
documents
: The documents used as the Generator inputs. If specified, they are added to the `GeneratedAnswer` objects. If both `documents` and `reference_pattern` are specified, the documents referenced in the Generator output are extracted from the input documents and added to the `GeneratedAnswer` objects.
pattern
: The regular expression pattern to extract the answer text from the Generator. If not specified, the entire response is used as the answer. The regular expression can have at most one capture group. If present, the capture group text is used as the answer. If no capture group is present, the whole match is used as the answer. Examples:
`[^\n]+$` finds "this is an answer" in a string "this is an argument.\nthis is an answer".
`Answer: (.*)` finds "this is an answer" in a string "this is an argument. Answer: this is an answer".
reference_pattern
: The regular expression pattern used for parsing the document references. If not specified, no parsing is done, and all documents are referenced. References need to be specified as indices of the input documents and start at [1]. Example:
`\[(\d+)\]` finds "1" in a string "this is an answer[1]".
Returns:
A dictionary with the following keys:
answers
: The answers received from the output of the Generator.
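To make the interplay of `documents` and `reference_pattern` concrete, the sketch below uses the standard `re` module to pull 1-based indices out of a reply and select the matching input documents. `referenced_documents` is a hypothetical helper, plain strings stand in for `Document` objects, and silently skipping out-of-range indices is an assumption of this sketch rather than documented behavior.

```python
import re
from typing import List, Optional


def referenced_documents(
    reply: str,
    documents: List[str],
    reference_pattern: Optional[str] = None,
) -> List[str]:
    """Illustrative helper (not part of Haystack)."""
    if reference_pattern is None:
        # No parsing is done: all documents count as referenced.
        return list(documents)
    selected = []
    for raw_index in re.findall(reference_pattern, reply):
        index = int(raw_index) - 1  # references start at [1]
        if 0 <= index < len(documents):  # assumption: ignore invalid indices
            selected.append(documents[index])
    return selected


docs = ["Joe lives in Berlin", "Joe is a software engineer"]
print(referenced_documents("Joe lives in Berlin [1].", docs, r"\[(\d+)\]"))
```

With the reply above, only the first document is selected. Without a `reference_pattern`, both documents are returned.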
Module prompt_builder
PromptBuilder
Renders a prompt filling in any variables so that it can send it to a Generator.
The prompt uses Jinja2 template syntax. The variables in the default template are used as PromptBuilder's input and are all optional. If they're not provided, they're replaced with an empty string in the rendered prompt. To try out different prompts, you can replace the prompt template at runtime by providing a template for each pipeline run invocation.
Usage examples
On its own
This example uses PromptBuilder to render a prompt template and fill it with `target_language` and `snippet`. PromptBuilder returns a prompt with the string "Translate the following context to spanish. Context: I can't speak spanish.; Translation:".
from haystack.components.builders import PromptBuilder
template = "Translate the following context to {{ target_language }}. Context: {{ snippet }}; Translation:"
builder = PromptBuilder(template=template)
builder.run(target_language="spanish", snippet="I can't speak spanish.")
In a Pipeline
This is an example of a RAG pipeline where PromptBuilder renders a custom prompt template and fills it with the contents of the retrieved documents and a query. The rendered prompt is then sent to a Generator.
from haystack import Pipeline, Document
from haystack.utils import Secret
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders.prompt_builder import PromptBuilder
# in a real world use case documents could come from a retriever, web, or any other source
documents = [Document(content="Joe lives in Berlin"), Document(content="Joe is a software engineer")]
prompt_template = """
Given these documents, answer the question.
Documents:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}
Question: {{query}}
Answer:
"""
p = Pipeline()
p.add_component(instance=PromptBuilder(template=prompt_template), name="prompt_builder")
p.add_component(instance=OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY")), name="llm")
p.connect("prompt_builder", "llm")
question = "Where does Joe live?"
result = p.run({"prompt_builder": {"documents": documents, "query": question}})
print(result)
Changing the template at runtime (prompt engineering)
You can change the prompt template of an existing pipeline, like in this example:
documents = [
Document(content="Joe lives in Berlin", meta={"name": "doc1"}),
Document(content="Joe is a software engineer", meta={"name": "doc2"}),
]
new_template = """
You are a helpful assistant.
Given these documents, answer the question.
Documents:
{% for doc in documents %}
Document {{ loop.index }}:
Document name: {{ doc.meta['name'] }}
{{ doc.content }}
{% endfor %}
Question: {{ query }}
Answer:
"""
p.run({
"prompt_builder": {
"documents": documents,
"query": question,
"template": new_template,
},
})
To replace the variables in the default template when testing your prompt, pass the new variables in the `variables` parameter.
Overwriting variables at runtime
To overwrite the values of variables, use `template_variables` during runtime:
language_template = """
You are a helpful assistant.
Given these documents, answer the question.
Documents:
{% for doc in documents %}
Document {{ loop.index }}:
Document name: {{ doc.meta['name'] }}
{{ doc.content }}
{% endfor %}
Question: {{ query }}
Please provide your answer in {{ answer_language | default('English') }}
Answer:
"""
p.run({
"prompt_builder": {
"documents": documents,
"query": question,
"template": language_template,
"template_variables": {"answer_language": "German"},
},
})
Note that `language_template` introduces the variable `answer_language`, which is not bound to any pipeline variable.
If not set otherwise, it uses its default value 'English'.
This example overwrites its value to 'German'.
Use `template_variables` to overwrite pipeline variables (such as `documents`) as well.
PromptBuilder.__init__
def __init__(template: str,
required_variables: Optional[List[str]] = None,
variables: Optional[List[str]] = None)
Constructs a PromptBuilder component.
Arguments:
template
: A prompt template that uses Jinja2 syntax to add variables. For example: `"Summarize this document: {{ documents[0].content }}\nSummary:"`. It's used to render the prompt. The variables in the default template are input for PromptBuilder and are all optional, unless explicitly specified. If an optional variable is not provided, it's replaced with an empty string in the rendered prompt.
required_variables
: An optional list of variables that must be provided as input to PromptBuilder. If a variable listed as required is not provided, an exception is raised.
variables
: An optional list of input variables to use in prompt templates instead of the ones inferred from the `template` parameter. For example, to use more variables during prompt engineering than the ones present in the default template, you can provide them here.
PromptBuilder.to_dict
def to_dict() -> Dict[str, Any]
Returns a dictionary representation of the component.
Returns:
Serialized dictionary representation of the component.
PromptBuilder.run
@component.output_types(prompt=str)
def run(template: Optional[str] = None,
template_variables: Optional[Dict[str, Any]] = None,
**kwargs)
Renders the prompt template with the provided variables.
It applies the template variables to render the final prompt. You can provide variables via pipeline kwargs.
To overwrite the default template, set the `template` parameter.
To overwrite pipeline kwargs, set the `template_variables` parameter.
Arguments:
template
: An optional string template to overwrite PromptBuilder's default template. If None, the default template provided at initialization is used.
template_variables
: An optional dictionary of template variables to overwrite the pipeline variables.
kwargs
: Pipeline variables used for rendering the prompt.
Raises:
ValueError
: If any of the required template variables is not provided.
Returns:
A dictionary with the following keys:
prompt
: The updated prompt text after rendering the prompt template.
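The precedence between pipeline kwargs and `template_variables`, and the `ValueError` for missing required variables, can be pictured with plain dictionaries. This is a rough sketch of the documented behavior only: `resolve_variables` is a hypothetical helper, it does not render Jinja2, the error message is invented, and it is not how Haystack implements `run`.

```python
from typing import Any, Dict, List, Optional


def resolve_variables(
    kwargs: Dict[str, Any],
    template_variables: Optional[Dict[str, Any]] = None,
    required_variables: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Illustrative helper (not part of Haystack)."""
    # Values in template_variables overwrite pipeline kwargs with the same name.
    merged = {**kwargs, **(template_variables or {})}
    missing = [name for name in (required_variables or []) if name not in merged]
    if missing:
        raise ValueError(f"Missing required input variables: {', '.join(missing)}")
    return merged


resolved = resolve_variables(
    kwargs={"query": "Where does Joe live?", "answer_language": "English"},
    template_variables={"answer_language": "German"},
    required_variables=["query"],
)
print(resolved["answer_language"])  # German
```

The merged dictionary is what a template would be rendered with: `query` comes from the pipeline kwargs, while `answer_language` takes the value passed in `template_variables`.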
Module chat_prompt_builder
ChatPromptBuilder
ChatPromptBuilder is a component that renders a chat prompt from a template string using Jinja2 templates.
It is designed to construct prompts for the pipeline using static or dynamic templates: Users can change
the prompt template at runtime by providing a new template for each pipeline run invocation if needed.
The template variables found in the init template string are used as input types for the component and are all
optional, unless explicitly specified. If an optional template variable is not provided as an input, it will be
replaced with an empty string in the rendered prompt. Use `variables` and `required_variables` to specify the input
types and required variables.
Usage example with static prompt template:
```python
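from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

template = [ChatMessage.from_user("Translate to {{ target_language }}. Context: {{ snippet }}; Translation:")]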
builder = ChatPromptBuilder(template=template)
builder.run(target_language="spanish", snippet="I can't speak spanish.")
```
Usage example of overriding the static template at runtime:
```python
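from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

template = [ChatMessage.from_user("Translate to {{ target_language }}. Context: {{ snippet }}; Translation:")]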
builder = ChatPromptBuilder(template=template)
builder.run(target_language="spanish", snippet="I can't speak spanish.")
msg = "Translate to {{ target_language }} and summarize. Context: {{ snippet }}; Summary:"
summary_template = [ChatMessage.from_user(msg)]
builder.run(target_language="spanish", snippet="I can't speak spanish.", template=summary_template)
```
Usage example with dynamic prompt template:
```python
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack import Pipeline
from haystack.utils import Secret
# no parameter init, we don't use any runtime template variables
prompt_builder = ChatPromptBuilder()
llm = OpenAIChatGenerator(api_key=Secret.from_token("<your-api-key>"), model="gpt-3.5-turbo")
pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.connect("prompt_builder.prompt", "llm.messages")
location = "Berlin"
language = "English"
system_message = ChatMessage.from_system("You are an assistant giving information to tourists in {{language}}")
messages = [system_message, ChatMessage.from_user("Tell me about {{location}}")]
res = pipe.run(data={"prompt_builder": {"template_variables": {"location": location, "language": language},
"template": messages}})
print(res)
# >> {'llm': {'replies': [ChatMessage(content="Berlin is the capital city of Germany and one of the most vibrant
# and diverse cities in Europe. Here are some key things to know...Enjoy your time exploring the vibrant and dynamic
# capital of Germany!", role=<ChatRole.ASSISTANT: 'assistant'>, name=None, meta={'model': 'gpt-3.5-turbo-0613',
# 'index': 0, 'finish_reason': 'stop', 'usage': {'prompt_tokens': 27, 'completion_tokens': 681, 'total_tokens':
# 708}})]}}

# reuse the same pipeline with a different template passed at runtime
messages = [system_message, ChatMessage.from_user("What's the weather forecast for {{location}} in the next {{day_count}} days?")]
res = pipe.run(data={"prompt_builder": {"template_variables": {"location": location, "day_count": "5"},
"template": messages}})
print(res)
# >> {'llm': {'replies': [ChatMessage(content="Here is the weather forecast for Berlin in the next 5
# days:
# Day 1: Mostly cloudy with a high of 22°C (72°F) and...so it's always a good idea to check for updates closer to
# your visit.", role=<ChatRole.ASSISTANT: 'assistant'>, name=None, meta={'model': 'gpt-3.5-turbo-0613',
# 'index': 0, 'finish_reason': 'stop', 'usage': {'prompt_tokens': 37, 'completion_tokens': 201, 'total_tokens':
# 238}})]}}
```
Note how in the example above, we can dynamically change the prompt template by providing a new template to the
run method of the pipeline.
ChatPromptBuilder.__init__
def __init__(template: Optional[List[ChatMessage]] = None,
required_variables: Optional[List[str]] = None,
variables: Optional[List[str]] = None)
Constructs a ChatPromptBuilder component.
Arguments:
template
: A list of `ChatMessage` instances. All user and system messages are treated as potentially having Jinja2 templates and are rendered with the provided template variables. If not provided, the template must be provided at runtime using the `template` parameter of the `run` method.
required_variables
: An optional list of input variables that must be provided at all times. If any of them is missing, an exception is raised.
variables
: A list of template variable names you can use in prompt construction. For example, if `variables` contains the string `documents`, the component will create an input called `documents` of type `Any`. These variable names are used to resolve variables and their values during pipeline execution. The values associated with variables from the pipeline runtime are then injected into template placeholders of a prompt text template that is provided to the `run` method. If not provided, variables are inferred from `template`.
ChatPromptBuilder.run
@component.output_types(prompt=List[ChatMessage])
def run(template: Optional[List[ChatMessage]] = None,
template_variables: Optional[Dict[str, Any]] = None,
**kwargs)
Executes the prompt building process.
It applies the template variables to render the final prompt. You can provide variables either via pipeline
(set through `variables` or inferred from `template` at initialization) or via additional template variables
set directly to this method. On collision, the variables provided directly to this method take precedence.
Arguments:
template
: An optional list of `ChatMessage` instances to overwrite ChatPromptBuilder's default template. If None, the default template provided at initialization is used.
template_variables
: An optional dictionary of template variables. These are additional variables users can provide directly to this method in contrast to pipeline variables.
kwargs
: Pipeline variables (typically resolved from a pipeline) which are merged with the provided template variables.
Raises:
ValueError
: If `chat_messages` is empty or contains elements that are not instances of `ChatMessage`.
Returns:
A dictionary with the following keys:
prompt
: The updated list of `ChatMessage` instances after rendering the found templates.
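The two conditions listed under Raises can be sketched as a small validation step. Everything here is an illustration under stated assumptions: `StubChatMessage` is a stand-in for `haystack.dataclasses.ChatMessage` so the snippet runs without Haystack installed, `validate_template` is a hypothetical helper, and the error messages are invented.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class StubChatMessage:
    """Stand-in for haystack.dataclasses.ChatMessage, for illustration only."""
    role: str
    content: str


def validate_template(template: List[Any]) -> None:
    """Hypothetical check mirroring the documented ValueError conditions."""
    if not template:
        raise ValueError("The template is empty.")
    if not all(isinstance(message, StubChatMessage) for message in template):
        raise ValueError("Every template element must be a chat message instance.")


# A list with one chat message passes the check.
validate_template([StubChatMessage(role="user", content="Tell me about {{location}}")])

# An empty list and a list of plain strings are both rejected.
for bad_template in ([], ["Tell me about {{location}}"]):
    try:
        validate_template(bad_template)
    except ValueError as error:
        print(f"Rejected: {error}")
```

Passing a list of plain strings instead of chat messages is an easy mistake when moving from PromptBuilder, which takes a string template, to ChatPromptBuilder.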
ChatPromptBuilder.to_dict
def to_dict() -> Dict[str, Any]
Returns a dictionary representation of the component.
Returns:
Serialized dictionary representation of the component.
ChatPromptBuilder.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "ChatPromptBuilder"
Deserialize this component from a dictionary.
Arguments:
data
: The dictionary to deserialize and create the component.
Returns:
The deserialized component.