Shaper
The Shaper component is best described as PromptNode's helper. It makes it possible to use the full potential of PromptNode and integrate it with Haystack. But its scope and functionality are not limited to PromptNode, and you can use Shaper independently.
| Position in a Pipeline | Before, after, or in between PromptNodes |
| Input | Depends on the function used |
| Output | Depends on the function used |
| Classes | Shaper |
Shaper and PromptNode
To understand Shaper, let's recall how PromptNode works. PromptNode injects the expected input variables into a user-defined PromptTemplate. It then uses the PromptTemplate to construct the prompt for the large language model (LLM).
In a pipeline, PromptNode receives these variables from the preceding node. It may happen that the variable names or shapes the PromptTemplate expects differ from the ones the PromptNode receives. Shaper solves this issue.
You can also use Shaper in the reverse situation. If the output of a PromptNode differs from the format the next node in the pipeline expects, Shaper can change it.
Let's look at a typical example: question answering. The PromptTemplate named question-answering expects the input variables $questions and $documents:
Given the context please answer the question. Context: $documents; Question: $questions; Answer:
To pass a question forward, Haystack pipelines use the variable query, not questions. Haystack Retrievers return a list of documents.
To make PromptNode generate one answer for each of the retrieved documents, you need to pass to the PromptTemplate one question and one document at a time.
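As a rough illustration of what "one question and one document at a time" means, the sketch below fills the template once per document. It uses Python's string.Template as a stand-in, because it happens to share the $variable syntax; how PromptNode actually performs the substitution is not shown here, and build_prompts is a hypothetical helper.

```python
from string import Template

# Template text taken from the question-answering example above.
QA_TEMPLATE = Template(
    "Given the context please answer the question. "
    "Context: $documents; Question: $questions; Answer:"
)


def build_prompts(questions, documents):
    """Hypothetical helper: fill the template once per (question, document) pair."""
    return [
        QA_TEMPLATE.substitute(questions=q, documents=d)
        for q, d in zip(questions, documents)
    ]


documents = ["The capital of France is Paris", "The capital of Germany is Berlin"]
# One copy of the question per document, so the two lists line up.
questions = ["Which city is the capital city?"] * len(documents)

prompts = build_prompts(questions, documents)
for prompt in prompts:
    print(prompt)
```

Each prompt pairs the same question with a different document, which is why the single query has to become a list as long as the list of documents.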
You want the Shaper to:
- Rename query to questions.
- Expand questions="your query" to questions=["your query", ..., "your query"] (a list of the same length as documents).
This is how you configure the Shaper to do the renaming and expansion:
from haystack.pipelines import Pipeline
from haystack.nodes import PromptNode, Shaper
from haystack.schema import Document

# Shaper expands the `query` variable into a list of identical queries (one per document)
# and stores that list in the `questions` variable
# (the variable used in the question-answering template)
shaper = Shaper(func="value_to_list", inputs={"value": "query", "target_list": "documents"}, outputs=["questions"])
node = PromptNode(default_prompt_template="question-answering")

pipe = Pipeline()
pipe.add_node(component=shaper, name="shaper", inputs=["Query"])
pipe.add_node(component=node, name="prompt_node", inputs=["shaper"])

output = pipe.run(
    query="Which city is the capital city?",
    documents=[
        Document("The capital of France is Paris"),
        Document("The capital of Germany is Berlin"),
    ],
)
print(output["results"])
The same pipeline defined in YAML:
components:
- name: shaper
  params:
    func: value_to_list
    inputs:
      target_list: documents
      value: query
    outputs:
    - questions
  type: Shaper
- name: prompt_node
  params:
    default_prompt_template: question-answering
  type: PromptNode
pipelines:
- name: query
  nodes:
  - inputs:
    - Query
    name: shaper
  - inputs:
    - shaper
    name: prompt_node
version: 1.14.0rc0
The output of the pipeline above is:
# Node output:
['Paris', 'Berlin']
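For intuition, the reshaping step in this pipeline can be approximated in plain Python. This is a sketch of the behavior described above, not Haystack's implementation; shape_query is a hypothetical name, and the documents are plain strings rather than Document objects.

```python
def value_to_list(value, target_list):
    """Repeat `value` once for every item in `target_list`."""
    return [value] * len(target_list)


def shape_query(pipeline_input):
    """Hypothetical stand-in for the Shaper step: read `query` and `documents`,
    and add the expanded list under the `questions` key."""
    shaped = dict(pipeline_input)  # copy so the caller's dict is left untouched
    shaped["questions"] = value_to_list(
        value=pipeline_input["query"], target_list=pipeline_input["documents"]
    )
    return shaped


shaped = shape_query(
    {
        "query": "Which city is the capital city?",
        "documents": [
            "The capital of France is Paris",
            "The capital of Germany is Berlin",
        ],
    }
)
print(shaped["questions"])
```

The original query and documents keys stay in place; only the new questions key is added for the next node to pick up.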
Usage
To initialize the Shaper, specify the function to use for modifying your variables. In this example, the node takes the value query and creates a list that contains this value as many times as it takes to match the length of the documents list. This list is passed down the pipeline under the key questions.
from haystack.nodes import Shaper

mapper = Shaper(
    func="value_to_list",
    inputs={"value": "query", "target_list": "documents"},
    outputs=["questions"],
)

The same node in YAML:
components:
  - name: mapper
    type: Shaper
    params:
      func: value_to_list
      inputs:
        value: query
        target_list: documents
      outputs:
        - questions
You can use multiple Shaper nodes in one pipeline.
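To make the idea of chaining concrete, here is a toy model in plain Python in which two Shaper-like steps wrap a stand-in for PromptNode: one expands the query before it, and one wraps the resulting strings afterwards. Everything here is hypothetical (expand_query, fake_prompt_node, wrap_answers, run_steps, and the dict used in place of an Answer); a real pipeline would use Shaper and PromptNode nodes instead.

```python
def expand_query(data):
    """First shaper-like step: repeat `query` once per document as `questions`."""
    return {**data, "questions": [data["query"]] * len(data["documents"])}


def fake_prompt_node(data):
    """Stand-in for PromptNode: 'answers' by returning the last word of each document."""
    results = [doc.split()[-1] for doc in data["documents"]]
    return {**data, "results": results}


def wrap_answers(data):
    """Second shaper-like step: wrap each result string in an answer-like dict."""
    return {**data, "answers": [{"answer": text} for text in data["results"]]}


def run_steps(data, steps):
    """Pass the data dict through each step in order, like nodes in a pipeline."""
    for step in steps:
        data = step(data)
    return data


output = run_steps(
    {
        "query": "Which city is the capital city?",
        "documents": [
            "The capital of France is Paris",
            "The capital of Germany is Berlin",
        ],
    },
    [expand_query, fake_prompt_node, wrap_answers],
)
print(output["answers"])
```

Each step only adds the key the next step needs, which mirrors how one Shaper can prepare the input of a node while another adapts its output.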
Functions
When initializing Shaper, you specify the function you want it to invoke. The supported functions are:
- rename
  Renames values without changing them.
  Example: This may come in handy if you use PromptNode at the beginning of the pipeline, where its input is the query. Haystack pipelines accept query as the input, while PromptNode needs question.
      - name: shaper
        type: Shaper
        params:
          func: rename
          inputs:
            value: query
          outputs: [question]
- value_to_list
  Turns a value into a list by repeating the value as many times as there are items in the target list. For example, if the target list has five items, the value is repeated five times.
  Example: If your PromptTemplate has two parameters, question and documents, and you want the question to be processed against each document, use Shaper with the value_to_list function. It creates a list in which the question is repeated as many times as there are documents. PromptNode then pairs the items of the two lists one by one.
      - name: QuestionsShaper
        type: Shaper
        params:
          func: value_to_list
          inputs:
            value: query
            target_list: documents
          outputs:
            - questions
- join_strings
  Takes a list of strings and changes it into a list that contains a single string. The single string contains all the original strings separated by the specified delimiter.
- join_documents
  Takes a list of documents and changes it into a list containing a single document. The single document contains the content of all the original documents separated by the specified delimiter.
  Example: If you have a pipeline with PromptNode and a PromptTemplate with two parameters, for example, question and documents, you can merge the documents into one to make sure PromptNode runs the question against all documents:
      - name: joinDocs
        type: Shaper
        params:
          func: join_documents
          inputs:
            documents: documents
          outputs:
            - documents
- join_lists
  Joins multiple lists into a single list.
- strings_to_answers
  Transforms a list of strings into a list of Answers.
  Example: This function may come in handy if PromptNode is the last node in a pipeline. The output of the PromptNode is a string, while deepset Cloud pipelines expect the Answer object. You may then add a Shaper with the strings_to_answers option at the end of the pipeline after PromptNode.
      - name: OutputAnswerShaper
        type: Shaper
        params:
          func: strings_to_answers
          inputs:
            strings: results # the results PromptNode returns
          outputs:
            - answers
- answers_to_strings
  Extracts the answer field of Answers and returns a list of strings.
  Example:
      - name: AnswerShaper
        type: Shaper
        params:
          func: answers_to_strings
          inputs:
            answers: answers
          outputs:
            - strings
- strings_to_documents
  Changes a list of strings into a list of documents.
- documents_to_strings
  Extracts the content field of each document you pass to it and puts it in a list of strings. Each item in this list is the content of the content field of one document.
After performing a function, Shaper passes the new or modified values further down the pipeline.
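The joining and conversion functions are easiest to picture on small inputs. The plain-Python sketch below mirrors the descriptions in the list above; it is not Haystack's code, the Doc dataclass is a minimal hypothetical stand-in for a Haystack document, and the default delimiter is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    """Minimal hypothetical stand-in for a document with a `content` field."""
    content: str


def join_strings(strings: List[str], delimiter: str = " ") -> List[str]:
    """Merge all strings into one, returned as a single-item list."""
    return [delimiter.join(strings)]


def join_documents(documents: List[Doc], delimiter: str = " ") -> List[Doc]:
    """Merge the contents of all documents into a single document."""
    return [Doc(content=delimiter.join(d.content for d in documents))]


def join_lists(lists: List[list]) -> list:
    """Concatenate several lists into one."""
    return [item for sublist in lists for item in sublist]


def strings_to_documents(strings: List[str]) -> List[Doc]:
    """Wrap each string in a document."""
    return [Doc(content=s) for s in strings]


def documents_to_strings(documents: List[Doc]) -> List[str]:
    """Pull the `content` field out of each document."""
    return [d.content for d in documents]


docs = strings_to_documents(["Paris is in France.", "Berlin is in Germany."])
merged = join_documents(docs, delimiter=" | ")
print(merged[0].content)
print(documents_to_strings(docs))
print(join_strings(["a", "b", "c"], delimiter=", "))
print(join_lists([[1, 2], [3], [4, 5]]))
```

Note that strings_to_documents and documents_to_strings undo each other in this sketch, which is the property that lets a Shaper convert between what one node produces and what the next one expects.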