PromptNode
PromptNode brings you the power of large language models. It's an easy-to-use, customizable node that you can run on its own or in your pipelines for various NLP tasks.
With PromptNode, you can use large language models directly or in your pipelines.
What are large language models?
Large language models are huge models trained on enormous amounts of data. Interacting with such a model resembles talking to another person. These models have general knowledge of the world. You can ask them anything, and they'll be able to answer.
Large language models are trained to perform many NLP tasks with little training data. What's astonishing about them is that a single model can perform various NLP tasks with good accuracy.
Some examples of large language models include flan-t5-base, Flan-PaLM, Chinchilla, and GPT-3 variants, such as text-davinci-003.
PromptNode is a very versatile node. It's used in query pipelines, but its position depends on what you want it to do. You can pass a template to specify the NLP task the PromptNode should perform and a model to use. For more information, see the Usage section.
Position in a Pipeline | Used in query pipelines. The position depends on the NLP task you want it to do. |
Input | Depends on the NLP task it performs. Some examples are query, documents, output of the preceding node. |
Output | Depends on the NLP task it performs. Some examples are answer, query, document summary. |
Classes | PromptNode |
Usage
You can use PromptNode as a stand-alone node or in a pipeline. If you don't specify the model you want to use for the node, it uses google/flan-t5-base.
Maximum Token Limits
Each LLM has an overall maximum token limit that it can process. This limit includes both the prompt (input) and the response (output).
The max_length parameter in PromptNode only sets the maximum number of tokens for the generated text output. Therefore, take the potential token length of the prompt into account when setting the parameter. The token length of the prompt plus the specified max_length together must not be larger than the overall number of tokens the LLM can process. For example, OpenAI's text-davinci-003 overall limit is 4096 tokens and for gpt-4-32k the limit is 32768 tokens.
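The budget arithmetic described above can be sketched in a few lines. This is a minimal illustration, not part of Haystack: the helper name max_output_tokens, the reserve parameter, and the MODEL_LIMITS dictionary are hypothetical, and the prompt token count is assumed to come from whatever tokenizer matches your model.

```python
# Hypothetical helper (not a Haystack function): compute the largest
# max_length that still fits next to a prompt of a known token length.
def max_output_tokens(model_limit: int, prompt_tokens: int, reserve: int = 0) -> int:
    """Return the number of tokens left for the generated output."""
    budget = model_limit - prompt_tokens - reserve
    if budget <= 0:
        raise ValueError("The prompt alone fills or exceeds the model's token limit.")
    return budget


# Overall limits quoted in the text above.
MODEL_LIMITS = {"text-davinci-003": 4096, "gpt-4-32k": 32768}

# A 3000-token prompt leaves 1096 tokens for the answer on text-davinci-003.
print(max_output_tokens(MODEL_LIMITS["text-davinci-003"], prompt_tokens=3000))
```

You would then pass a value no larger than this result as max_length when you initialize the node.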
Stand Alone
Just initialize the node and ask a question. The model has general knowledge about the world, so you can ask it anything:
from haystack.nodes import PromptNode
# Initialize the node:
prompt_node = PromptNode()
# Run a prompt
prompt_node("What is the capital of Germany?")
# Here's the output:
['berlin']
With a Template
PromptNode can use a PromptTemplate that contains the prompt for the model.
For better results, use a task-specific PromptTemplate. You can pass additional variables like documents or questions to the node. The template combines all inputs into one or more prompts:
from haystack.nodes import PromptNode, PromptTemplate
from haystack import Document
# Initialize the node
prompt_node = PromptNode()
# You can check what out-of-the-box PromptTemplates are available:
prompt_node.get_prompt_template_names()
# Here's the output:
['question-answering',
'question-answering-per-document',
'question-answering-with-references',
'question-answering-with-document-scores',
'question-generation',
'conditioned-question-generation',
'summarization',
'question-answering-check',
'sentiment-analysis',
'multiple-choice-question-answering',
'topic-classification',
'language-detection',
'translation',
'zero-shot-react']
# Specify the template using the `prompt` method
# and pass your documents and questions:
prompt_node.prompt(prompt_template="question-answering",
documents=[Document("Berlin is the capital of Germany."), Document("Paris is the capital of France.")],
query="What is the capital of Germany?")
# Here's the output:
[<Answer {'answer': 'Berlin', 'type': 'generative', 'score': None, 'context': None, 'offsets_in_document': None, 'offsets_in_context': None, 'document_ids': ['1a7644ef76698b7a1c6ed23c357fa598', 'f225a94f83349e8776d6fb89ebfb41b8'], 'meta': {'prompt': 'Given the context please answer the question. Context: Berlin is the capital of Germany. Paris is the capital of France.; Question: What is the capital of Germany?; Answer:'}}>]
You can also create your own templates and pass variables and functions to them. For more information, see the Prompt Templates section.
With a Model Specified
By default, the PromptNode uses the google/flan-t5-base model. But you can change it to any of these models:
- Hugging Face transformers (all text and text2text-generation models)
- OpenAI InstructGPT models, including ChatGPT and GPT-4
- Azure OpenAI InstructGPT models
- Cohere Command and Generation models
- Anthropic Claude
Here's how you set the model:
from haystack.nodes import PromptNode
# Initialize the node passing the model:
prompt_node = PromptNode(model_name_or_path="google/flan-t5-xl")
# Go ahead and ask a question:
prompt_node("What is the best city in Europe to live in?")
Streaming
You can enable streaming in PromptNode. Streaming will output LLM responses word by word rather than waiting for the entire response to be generated before outputting everything at once.
To enable streaming, you can specify one or both of the following parameters:
- stream is a boolean switch that simply enables streaming.
- stream_handler needs to be a subclass of TokenStreamingHandler.
Both parameters are specified in the PromptNode init constructor. You can override them per PromptNode request by providing them as kwargs.
Here's how to quickly enable streaming:
from haystack.nodes.prompt import PromptNode
pn = PromptNode("gpt-3.5-turbo", api_key="<api_key_goes_here>", model_kwargs={"stream":True})
prompt = "What are the three most interesting things about Berlin? Be elaborate and use numbered list"
pn(prompt)
The default streaming is used when stream is on and stream_handler is not specified.
When you provide the stream_handler, streaming is enabled. You can register your custom handler to output responses within a custom execution.
Here's how you register a custom handler:
from haystack.nodes.prompt import PromptNode
from haystack.nodes.prompt.invocation_layer.handlers import TokenStreamingHandler
class MyCustomTokenStreamingHandler(TokenStreamingHandler):
    def __call__(self, token_received, **kwargs) -> str:
        # here is your custom logic for each token
        return token_received
custom_handler = MyCustomTokenStreamingHandler()
pn = PromptNode("gpt-3.5-turbo", api_key="<api_key_goes_here>", model_kwargs={"stream_handler": custom_handler})
prompt = "What are the three most interesting things about Berlin? Be elaborate and use a numbered list."
pn(prompt)
You can use your implementation of the TokenStreamingHandler for all invocation layers that support streaming. For example, if you switch from OpenAI to Hugging Face transformers, you can use the same custom TokenStreamingHandler for both.
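As one idea for the custom logic, a handler can collect tokens so that the full response is available after streaming finishes. The sketch below is a stand-alone stand-in: CollectingStreamHandler is a hypothetical class that only mirrors the __call__(token_received, **kwargs) signature shown above, and the token list simulates what a model would stream. To use it with PromptNode, it would need to subclass TokenStreamingHandler.

```python
# Hypothetical stand-in: in Haystack, this class would subclass
# TokenStreamingHandler. It is kept dependency-free here for illustration.
class CollectingStreamHandler:
    def __init__(self):
        self.tokens = []

    def __call__(self, token_received, **kwargs) -> str:
        # Store each token as it arrives, then pass it through unchanged.
        self.tokens.append(token_received)
        return token_received

    def text(self) -> str:
        # Reassemble everything streamed so far.
        return "".join(self.tokens)


handler = CollectingStreamHandler()
# Simulated stream; a real model would call the handler once per token.
for token in ["Berlin", " is", " lively", "."]:
    handler(token)
print(handler.text())
```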
In a Pipeline
The real power of PromptNode shows when you use it in a pipeline. Look at the example to get an idea of what's possible.
Examples
Long-Form Question Answering
Long-form QA is one use of the PromptNode, but certainly not the only one. In this QA type, PromptNode handles complex questions by synthesizing information from various documents to retrieve an answer.
from haystack.pipelines import Pipeline
from haystack.nodes import AnswerParser, PromptNode, PromptTemplate
from haystack.schema import Document
# Let's create a custom LFQA prompt using PromptTemplate
lfqa_prompt = PromptTemplate(name="lfqa",
prompt_text="""Synthesize a comprehensive answer from the following topk most relevant paragraphs and the given question.
Provide a clear and concise response that summarizes the key points and information presented in the paragraphs.
Your answer should be in your own words and be no longer than 50 words.
\n\n Paragraphs: {join(documents)} \n\n Question: {query} \n\n Answer:""",
output_parser=AnswerParser(),)
# These docs could also come from a retriever
# Here we explicitly specify them to avoid the setup steps for Retriever and DocumentStore
doc_1 = "Contrails are a manmade type of cirrus cloud formed when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, leaving behind a visible trail. The exhaust can also trigger the formation of cirrus by providing ice nuclei when there is an insufficient naturally-occurring supply in the atmosphere. One of the environmental impacts of aviation is that persistent contrails can form into large mats of cirrus, and increased air traffic has been implicated as one possible cause of the increasing frequency and amount of cirrus in Earth's atmosphere."
doc_2 = "Because the aviation industry is especially sensitive to the weather, accurate weather forecasting is essential. Fog or exceptionally low ceilings can prevent many aircraft from landing and taking off. Turbulence and icing are also significant in-flight hazards. Thunderstorms are a problem for all aircraft because of severe turbulence due to their updrafts and outflow boundaries, icing due to the heavy precipitation, as well as large hail, strong winds, and lightning, all of which can cause severe damage to an aircraft in flight. Volcanic ash is also a significant problem for aviation, as aircraft can lose engine power within ash clouds. On a day-to-day basis airliners are routed to take advantage of the jet stream tailwind to improve fuel efficiency. Aircrews are briefed prior to takeoff on the conditions to expect en route and at their destination. Additionally, airports often change which runway is being used to take advantage of a headwind. This reduces the distance required for takeoff, and eliminates potential crosswinds."
# Let's initiate the PromptNode (api_key must hold your OpenAI API key)
node = PromptNode("text-davinci-003", default_prompt_template=lfqa_prompt, api_key=api_key)
pipe = Pipeline()
pipe.add_node(component=node, name="prompt_node", inputs=["Query"])
output = pipe.run(query="Why do airplanes leave contrails in the sky?", documents=[Document(doc_1), Document(doc_2)])
[a.answer for a in output["answers"]]
# Here's the answer:
["Contrails are manmade clouds formed when water vapor from the exhaust of a jet engine condenses on particles, which come from either the surrounding air or the exhaust itself, and freezes, creating a visible trail. Increased air traffic has been linked to the greater frequency and amount of these cirrus clouds in Earth's atmosphere."]
Multiple PromptNodes and a Retriever
You can have multiple PromptNodes in your pipeline that reuse the PromptModel to save resources:
from haystack import Pipeline
from haystack.nodes.prompt import PromptNode, PromptModel
# You'd also need to import a Retriever and a DocumentStore,
# we're skipping this in this example
top_k = 10
query = "Who is Paul Atreides' father?"
prompt_model = PromptModel()
node = PromptNode(prompt_model, default_prompt_template="question-generation", output_variable="query")
node2 = PromptNode(prompt_model, default_prompt_template="question-answering-per-document")
# You'd also need to initialize a Retriever with a DocumentStore
# We're skipping this step in this example to simplify it
pipe = Pipeline()
pipe.add_node(component=retriever, name="retriever", inputs=["Query"])
pipe.add_node(component=node, name="prompt_node", inputs=["retriever"])
pipe.add_node(component=node2, name="prompt_node_2", inputs=["prompt_node"])
output = pipe.run(query=query, params={"retriever": {"top_k": top_k}})
dict(zip(output["query"], output["answers"]))
PromptTemplates
PromptNode comes with out-of-the-box prompt templates ready for you to use. A template corresponds to an NLP task and contains instructions for the model. Here are some of the templates you can specify for the PromptNode:
- question-answering: This template joins all documents into one document and passes one prompt to the model, instructing it to perform question answering on the joined document.
- question-answering-per-document: This template performs question answering by passing one prompt per document. This may improve the output but is more resource intensive than the question-answering template.
- question-answering-with-references: This is the same as question-answering, but instructs the model to cite documents by adding references. It joins all documents into one and passes one prompt to the model. The model outputs answers together with references to the documents that contain them.
- question-answering-with-document-scores: Performs question answering, taking into account document scores stored in metadata. This is the template used by the PromptNode in the WebQAPipeline.
- question-generation: Generates a question based on your documents.
- conditioned-question-generation: Based on your documents, generates a question for the answer you provide.
- summarization: Summarizes documents.
- question-answering-check: Checks if the documents contain the answer to a question.
- sentiment-analysis: Analyzes the sentiment of the documents.
- multiple-choice-question-answering: From a set of options, chooses the one that best answers the question.
- topic-classification: Categorizes documents by their topic.
- language-detection: Returns the language of the documents.
- translation: Translates documents.
For a full list of ready-to-use templates, see PromptTemplate in the GitHub repo. You can also call get_predefined_prompt_templates() to get this list.
from haystack.nodes.prompt.prompt_template import get_predefined_prompt_templates
get_predefined_prompt_templates()
If you don't specify the template, the node tries to guess what task you want it to perform. By indicating the template, you ensure it performs the right task.
PromptTemplate Structure
Here's an example of the question-answering template:
PromptTemplate(
name="question-answering",
prompt_text="Given the context please answer the question. Context: {join(documents)}; Question: "
"{query}; Answer:",
output_parser=AnswerParser(),
),
prompt_text contains the prompt for the task you want the model to do. It also specifies input variables: documents and query. The variables are either primitives or lists of primitives.
At runtime, these variables must be present in the execution context of the node. You can apply functions to those variables. For example, you can combine the list of documents into a string by applying the join function. By doing this, only one prompt instead of len(documents) prompts is executed.
output_parser converts the output of the model to a Haystack Document, Answer, or Label object. There's a ready-to-use AnswerParser, which converts the output to the Haystack Answer object. Have a look at the API documentation for more information.
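To make the difference between one joined prompt and len(documents) prompts concrete, here is a small sketch in plain Python. It does not call Haystack: the template string, the {context} placeholder, and the use of str.format are assumptions made for illustration, and documents are plain strings joined with a single space.

```python
documents = [
    "Berlin is the capital of Germany.",
    "Paris is the capital of France.",
]
query = "What is the capital of Germany?"

# Illustrative template; {context} stands in for the rendered documents.
template = (
    "Given the context please answer the question. "
    "Context: {context}; Question: {query}; Answer:"
)

# Joining first produces a single prompt for the model.
joined_prompts = [template.format(context=" ".join(documents), query=query)]

# Without joining, there is one prompt per document.
per_document_prompts = [template.format(context=doc, query=query) for doc in documents]

print(len(joined_prompts), len(per_document_prompts))
```

The joined variant costs one model call, while the per-document variant costs one call for each document.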
Functions in Prompts
You can add functions to your template to control how the documents, the query, or any other variable are rendered. A simplified version of the question-answering template looks like this:
PromptTemplate(
name="question-answering-simple",
prompt_text="Please answer the question. "
"Context: {' - '.join([d.meta['name']+': '+d.content for d in documents])}; Question: {query}; Answer: ",
output_parser=AnswerParser(),
),
Function Format
The functions use the Python f-string format, so you can use any list comprehensions inside a function:
' '.join([d.meta['name']+': '+d.content for d in documents])
Other than strict f-string syntax, you can safely use the following backslash characters in the text parts of the prompt text: \n, \t, \r. To use them in f-string expressions, pick the corresponding PromptTemplate variable from the table below.
Double quotes (") are automatically replaced with single quotes (') in the prompt text. To use double quotes in the prompt text, use {double_quote} instead.
Special characters not allowed in prompt_text expressions | PromptTemplate variable to use instead |
---|---|
\n | new_line |
\t | tab |
\r | carriage_return |
" | double_quote |
Some of the ready-made templates also contain functions, for example:
PromptTemplate(
name="question-answering-with-references",
prompt_text="Create a concise and informative answer (no more than 50 words) for a given question "
"based solely on the given documents. You must only use information from the given documents. "
"Use an unbiased and journalistic tone. Do not repeat text. Cite the documents using Document[number] notation. "
"If multiple documents contain the answer, cite those documents like ‘as stated in Document[number], Document[number], etc.’. "
"If the documents do not contain the answer to the question, say that ‘answering is not possible given the available information.’\n"
"{join(documents, delimiter=new_line, pattern=new_line+'Document[$idx]: $content', str_replace={new_line: ' ', '[': '(', ']': ')'})} \n Question: {query}; Answer: ",
output_parser=AnswerParser(reference_pattern=r"Document\[(\d+)\]"),
),
Note that in this example, we're not using the str.join Python function but our own convenience function join.
join and to_strings Functions
Two functions that you may find useful are join and to_strings.
The join function joins all documents into a single string, where the content of each document is separated by the delimiter you specify.
Example:
"{join(documents, delimiter=new_line, pattern=new_line+'Document[$idx]: $content', str_replace={new_line: ' ', '[': '(', ']': ')'})}
The to_strings function extracts the content field of documents and returns a list of strings. In the example below, it renders each document by its name (document.meta["name"]) followed by a new line and the contents of the document:
"{to_strings(documents, pattern='$name'+new_line+'$content', str_replace={new_line: ' ', '[': '(', ']': ')'})}
Function Parameters
Parameter | Type | Description |
---|---|---|
documents | List | The documents whose rendering you want to format. Mandatory. |
pattern | String | The regex pattern used for parsing. Optional. You can use the following placeholders in pattern: - $content : The content of the document.- $idx : The index of the document in the list.- $id : The ID of the document.- $META_FIELD : The values of the metadata field called META_FIELD . |
delimiter | String Default: " " (single space) | Specifies the delimiter you want to use to separate documents. Used in the join function. Mandatory. |
str_replace | Dictionary of strings | Specifies the characters you want to replace. Use the format str_replace={"r":"R"} . Optional. |
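The parameters in the table can be approximated in plain Python to see how they interact. This is a rough sketch, not Haystack's implementation: to_strings_sketch and join_sketch are hypothetical names, documents are plain dictionaries standing in for Document objects, $idx is assumed to start at 1, and str_replace is assumed to apply to field values before they are placed into the pattern.

```python
from string import Template


def to_strings_sketch(documents, pattern="$content", str_replace=None):
    """Render each document with the pattern and return a list of strings."""
    str_replace = str_replace or {}
    rendered = []
    for idx, doc in enumerate(documents, start=1):  # assumption: 1-based index
        fields = {"idx": idx, "id": doc["id"], "content": doc["content"], **doc.get("meta", {})}
        cleaned = {}
        for key, value in fields.items():
            value = str(value)
            # Apply every replacement to the field value.
            for old, new in str_replace.items():
                value = value.replace(old, new)
            cleaned[key] = value
        rendered.append(Template(pattern).substitute(cleaned))
    return rendered


def join_sketch(documents, delimiter=" ", pattern="$content", str_replace=None):
    """Join the rendered documents into one string."""
    return delimiter.join(to_strings_sketch(documents, pattern, str_replace))


docs = [
    {"id": "a1", "content": "Berlin is the capital [of Germany].", "meta": {"name": "doc-berlin"}},
    {"id": "b2", "content": "Paris is the capital\nof France.", "meta": {"name": "doc-paris"}},
]
print(join_sketch(docs, delimiter="\n", pattern="Document[$idx]: $content",
                  str_replace={"\n": " ", "[": "(", "]": ")"}))
```

Replacing the square brackets in the content keeps them from being confused with the Document[number] references that the pattern adds.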
Output Parsers
AnswerParser
With AnswerParser, you can convert the plain string model output into proper Answer objects. It takes care of populating the Answer's fields, like adding the prompt to meta or referencing source document_ids. Using AnswerParser makes PromptNode publish its results in the answers key. This way, you can use PromptNode as a plug-in replacement for any answer-returning node, for example Reader or OpenAIAnswerGenerator.
Parameter | Type | Description |
---|---|---|
pattern | String | The regex pattern to use for parsing the answer. Examples: [^\n]+$ will find "this is an answer" in string "this is an argument.\nthis is an answer". Answer: (.*) will find "this is an answer" in string "this is an argument. Answer: this is an answer". If None, the whole string is used as the answer. If specified, the first group of the regex is used as the answer. If there is no group, the whole match is used as the answer. |
reference_pattern | String | The regex pattern to use for parsing the document references. Example: \[(\d+)\] will find "1" in string "this is an answer[1]". If None, no parsing is done and all documents are referenced. |
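The matching rules in the table can be tried out with Python's re module. The sketch below only mirrors the behavior described above and is not AnswerParser itself: extract_answer and extract_references are hypothetical helpers, and returning an empty string when nothing matches is an assumption.

```python
import re


def extract_answer(text, pattern=None):
    """Apply the rules from the table: no pattern, first group, or whole match."""
    if pattern is None:
        return text
    match = re.search(pattern, text)
    if match is None:
        return ""  # assumption: nothing matched, so there is no answer
    # Use the first group if the pattern defines one, else the whole match.
    return match.group(1) if match.groups() else match.group(0)


def extract_references(text, reference_pattern):
    """Return every document number captured by the reference pattern."""
    return re.findall(reference_pattern, text)


print(extract_answer("this is an argument.\nthis is an answer", r"[^\n]+$"))
print(extract_answer("this is an argument. Answer: this is an answer", r"Answer: (.*)"))
print(extract_references("as stated in Document[1] and Document[3]", r"Document\[(\d+)\]"))
```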
Adding Your Prompt
You can easily add your own template:
from haystack.nodes import PromptTemplate, PromptNode
# In `prompt_text`, tell the model what you want it to do.
PromptNode.add_prompt_template(PromptTemplate(name="sentiment-analysis-new",
prompt_text="Indicate the sentiment. Answer with positive, negative, or neutral. Context: {documents}; Answer:"))
For guidelines on how to construct the most efficient prompts, see Prompt Engineering Guidelines.
Setting a Default PromptTemplate
You can set a default template for a PromptNode instance. This way, you can reuse the same PromptNode in your pipeline for different tasks:
from haystack.nodes import PromptTemplate, PromptNode
from haystack.schema import Document
prompt_node = PromptNode()
sa = prompt_node.set_default_prompt_template("sentiment-analysis-new")
sa(documents=[Document("I am in love and I feel great!")])
# Node output:
['positive']
# You can then switch to another template to reuse the same PromptNode
# for a different task:
summarizer = sa.set_default_prompt_template("summarization")
Models
The default model for PromptModel and PromptNode is google/flan-t5-base, but you can use other LLMs that we specified earlier. To do this, specify the model's name and the API key.
Using OpenAI Models
You can replace the default model with a Flan-T5 model of a different size or a model by OpenAI.
This example uses a version of the GPT-3 model:
from haystack.nodes import PromptModel, PromptNode
openai_api_key = "<type your OpenAI API key>"
# Specify the model you want to use:
prompt_open_ai = PromptModel(model_name_or_path="text-davinci-003", api_key=openai_api_key)
# Make PromptNode use the model:
pn_open_ai = PromptNode(prompt_open_ai)
pn_open_ai("What's the coolest city to live in Germany?")
Using ChatGPT and GPT-4
You can also use the gpt-3.5-turbo, gpt-4, and gpt-4-32k models from OpenAI to build your own chat functionality. The API for these models includes three types of role: system, assistant, and user. To use ChatGPT, you simply initialize the PromptNode with the gpt-3.5-turbo model:
from haystack.nodes import PromptNode
openai_api_key = "<type your OpenAI API key>"
# Specify "gpt-3.5-turbo" as the model for PromptNode
prompt_node = PromptNode(model_name_or_path="gpt-3.5-turbo", api_key=openai_api_key)
Here's an example of how you can build a chat function that makes use of each role and keeps track of the chat flow:
messages = [{"role": "system", "content": "You are a helpful assistant"}]
def build_chat(user_input: str = "", assistant_input: str = ""):
    # Append the latest user and assistant turns to the shared history
    if user_input != "":
        messages.append({"role": "user", "content": user_input})
    if assistant_input != "":
        messages.append({"role": "assistant", "content": assistant_input})
def chat(input: str):
    build_chat(user_input=input)
    chat_gpt_answer = prompt_node(messages)
    build_chat(assistant_input=chat_gpt_answer[0])
    return chat_gpt_answer
Now you can use your chat() function:
chat("Who is Barack Obama married to?")
chat("And what year was she born?")
Using Azure OpenAI Service
In addition to working with APIs directly from OpenAI, you can use PromptModel with Azure OpenAI APIs. For available models and versions for the service, check Azure documentation.
from haystack.nodes import PromptModel, PromptNode
prompt_azure_open_ai = PromptModel(
model_name_or_path="text-davinci-003",
api_key="<your-azure-openai-key>",
model_kwargs={
"api_version": "2022-12-01",
"azure_base_url":"https://<your-endpoint>.openai.azure.com",
"azure_deployment_name": "<your-deployment-name>",
}
)
pn_azure_open_ai = PromptNode(prompt_azure_open_ai)
Using ChatGPT on Azure
You can use ChatGPT API on Azure. Here's an example of how you could do that:
import os
from haystack.nodes import PromptModel
api_key = os.environ.get("AZURE_API_KEY")
deployment_name = os.environ.get("AZURE_DEPLOYMENT_NAME")
base_url = os.environ.get("AZURE_BASE_URL")
azure_chat = PromptModel(
model_name_or_path="gpt-35-turbo",
api_key=api_key,
model_kwargs={
"azure_deployment_name": deployment_name,
"azure_base_url": base_url,
},
)
There are two parameters that you pass as model_kwargs:
- azure_deployment_name: the name of your Azure deployment.
- azure_base_url: the URL of the Azure OpenAI endpoint.
Using Hugging Face Models
You can specify parameters for Hugging Face models using model_kwargs. Check out all the available parameters in Hugging Face documentation.
Here's an example of how to set the temperature of the model:
from haystack.nodes import PromptNode
from transformers import GenerationConfig
# Using a dictionary
node = PromptNode(model_kwargs={"generation_kwargs": {"do_sample": True, "temperature": 0.6}})
# Using a GenerationConfig object from HuggingFace
node = PromptNode(model_kwargs={"generation_kwargs": GenerationConfig(do_sample=True, top_p=0.9, temperature=0.6)})
Using Hugging Face Inference API
To see the models that can be used with Hugging Face Inference API, use this command:
curl -s https://api-inference.huggingface.co/framework/text-generation-inference
To use the selected model, simply define a PromptNode with your Hugging Face token as the api_key and the selected model in model_name_or_path.
Using Different Models in One Pipeline
You can also specify different LLMs for each PromptNode in your pipeline. To do this, create one PromptModel per LLM and pass each one to the PromptNode that should use it.
from haystack import Document
from haystack.nodes import PromptTemplate, PromptNode, PromptModel
from haystack.pipelines import Pipeline
# This is to set up the OpenAI model:
from getpass import getpass
api_key_prompt = "Enter OpenAI API key:"
api_key = getpass(api_key_prompt)
# Specify the model you want to use:
prompt_open_ai = PromptModel(model_name_or_path="text-davinci-003", api_key=api_key)
# This sets up the default model:
prompt_model = PromptModel()
# Now let's make one PromptNode use the default model and the other one the OpenAI model:
node_default_model = PromptNode(prompt_model, default_prompt_template="question-generation", output_variable="questions")
node_openai = PromptNode(prompt_open_ai, default_prompt_template="question-answering")
pipe = Pipeline()
pipe.add_node(component=node_default_model, name="prompt_node1", inputs=["Query"])
pipe.add_node(component=node_openai, name="prompt_node_2", inputs=["prompt_node1"])
output = pipe.run(query="not relevant", documents=[Document("Berlin is the capital of Germany")])
output["results"]
Migrating from Previous PromptNode Versions
If you created PromptTemplates for previous PromptNode versions and you want to migrate to 1.15, here's what you need to do:
- Convert any $var into {var}.
- Convert any Shaper you used to modify PromptNode input (Shapers used before PromptNodes) into an f-string function:
  - For Shapers with the join_documents function, use the join function.
  - For Shapers with the value_to_list function, remove the Shapers altogether.
- Convert Shapers used to modify PromptNode output (Shapers after PromptNode) into output_parser. For any Shapers with the strings_to_answers function, use output_parser=AnswerParser.
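The first step, turning $var into {var}, can be automated for simple templates. The helper below is a hypothetical sketch and not part of Haystack: it only rewrites plain $name variables and leaves everything else, including the Shaper conversions from the other steps, for you to migrate by hand.

```python
import re


def migrate_prompt_text(old_prompt_text: str) -> str:
    """Rewrite $var placeholders to {var}; dollar amounts like $5 are left alone."""
    return re.sub(r"\$([A-Za-z_]\w*)", r"{\1}", old_prompt_text)


old = "Given the context please answer the question. Context: $documents; Question: $query; Answer:"
print(migrate_prompt_text(old))
```

Review the result before using it, for example to decide whether a list variable such as documents should be wrapped in join.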