Shaper

The Shaper component is best described as PromptNode's helper. It makes it possible to use the full potential of PromptNode and integrate it with Haystack. But its scope and functionality are not limited to PromptNode, and you can use Shaper independently.

Position in a Pipeline: Before, after, or in-between PromptNodes
Input: Depends on the function used
Output: Depends on the function used
Classes: Shaper

Shaper and PromptNode

To understand Shaper, let's recall how PromptNode works. PromptNode injects the expected input variables into a user-defined PromptTemplate. It then uses the PromptTemplate to construct the prompt for the large language model (LLM).

In a pipeline, PromptNode receives these variables from the preceding node. It may happen that the variable names or shapes the PromptTemplate expects differ from the ones the PromptNode receives. Shaper solves this issue.

You can also use Shaper in a reverse situation. If the output of a PromptNode differs from the format the next node in the pipeline expects, Shaper can change it.

Let's look at a typical example: question answering. The PromptTemplate named question-answering expects the input variables $questions and $documents:

Given the context please answer the question. Context: $documents;
Question: $questions; Answer:
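
To see how the two variables slot into this template, here is a minimal sketch that uses Python's standard string.Template as a stand-in for PromptTemplate. It only illustrates the $variable substitution. It is an assumption made for illustration, not how PromptNode fills the template internally.

```python
from string import Template

# Stand-in for the question-answering template text (not Haystack's PromptTemplate class).
template = Template(
    "Given the context please answer the question. "
    "Context: $documents; Question: $questions; Answer:"
)

# One question and one document are substituted at a time.
prompt = template.substitute(
    documents="The capital of France is Paris",
    questions="Which city is the capital city?",
)
print(prompt)
```

Both variables must be supplied under exactly the names the template uses, which is why the variable names in the pipeline matter.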

Haystack pipelines pass the question forward in the variable query, not questions, and Haystack Retrievers return a list of documents.

To make PromptNode generate one answer for each of the retrieved documents, the PromptTemplate needs one question and one document at a time, so the single query must become a list with one copy per document.
You want the Shaper to:

  • Rename query to questions.
  • Expand questions="your query" to questions=["your query", ..., "your query"] (a list of the same length as documents).

This is how you configure the Shaper to do the renaming and expansion:

from haystack.pipelines import Pipeline
from haystack.nodes import PromptNode, Shaper
from haystack.schema import Document

# Shaper expands the `query` variable into a list of identical queries
# (one per document) and stores that list in the `questions` variable,
# which is the variable the question-answering template uses.
shaper = Shaper(
    func="value_to_list",
    inputs={"value": "query", "target_list": "documents"},
    outputs=["questions"],
)

node = PromptNode(default_prompt_template="question-answering")
pipe = Pipeline()
pipe.add_node(component=shaper, name="shaper", inputs=["Query"])
pipe.add_node(component=node, name="prompt_node", inputs=["shaper"])

output = pipe.run(query="Which city is the capital city?", 
                  documents=[Document("The capital of France is Paris"), 
                             Document("The capital of Germany is Berlin")])

print(output["results"])

The same pipeline defined in YAML:

components:
- name: shaper
  params:
    func: value_to_list
    inputs:
      target_list: documents
      value: query
    outputs:
    - questions
  type: Shaper
- name: prompt_node
  params:
    default_prompt_template: question-answering
  type: PromptNode
pipelines:
- name: query
  nodes:
  - inputs:
    - Query
    name: shaper
  - inputs:
    - shaper
    name: prompt_node
version: 1.14.0rc0

The output of the pipeline above is:

# Node output:
['Paris', 'Berlin']
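
To make the effect of the Shaper in this pipeline concrete, here is a minimal plain-Python sketch that does not use Haystack. The helper value_to_list_sketch and the pipeline_data dictionary are hypothetical stand-ins for illustration, and the documents are plain strings rather than Document objects.

```python
# Plain-Python sketch of what the Shaper does in the example; not Haystack's implementation.
def value_to_list_sketch(value, target_list):
    """Repeat `value` once for every item in `target_list`."""
    return [value] * len(target_list)


# Stand-in for the variables flowing through the pipeline.
pipeline_data = {
    "query": "Which city is the capital city?",
    "documents": [
        "The capital of France is Paris",
        "The capital of Germany is Berlin",
    ],
}

# The Shaper reads `query` and `documents` and stores its result under `questions`.
pipeline_data["questions"] = value_to_list_sketch(
    value=pipeline_data["query"],
    target_list=pipeline_data["documents"],
)

# Each question now lines up with exactly one document.
pairs = list(zip(pipeline_data["questions"], pipeline_data["documents"]))
for question, document in pairs:
    print(f"Context: {document}; Question: {question}")
```

Because the two lists have the same length, the template can be filled once per document, which is what produces one answer per document.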

Usage

To initialize the Shaper, specify the function to use for modifying your variables. In this example, the node takes the value query, and creates a list that contains this value as many times as it takes to match the length of the documents list. This list is passed down the pipeline under the key questions.

from haystack.nodes import Shaper

mapper = Shaper(
    func="value_to_list",
    inputs={"value": "query", "target_list": "documents"},
    outputs=["questions"],
)

The same node in YAML:

components:
  - name: mapper
    type: Shaper
    params:
      func: value_to_list
      inputs:
        value: query
        target_list: documents
      outputs:
        - questions

You can use multiple Shaper nodes in one pipeline.
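
As a rough illustration of chaining, the following plain-Python sketch applies two shaping steps in sequence. The apply_shaper helper is hypothetical and only mimics the idea of mapping function arguments to pipeline variables through inputs and naming the result through outputs. The two functions are simplified re-implementations, not Haystack's code.

```python
# Hypothetical stand-in for a Shaper node, for illustration only.
def apply_shaper(data, func, inputs, outputs):
    """Call `func` with arguments looked up in `data`; store the result under outputs[0]."""
    kwargs = {param: data[variable] for param, variable in inputs.items()}
    shaped = dict(data)  # copy, so the incoming variables are left untouched
    shaped[outputs[0]] = func(**kwargs)
    return shaped


def join_strings(strings, delimiter=" "):
    """Merge a list of strings into a list holding one joined string."""
    return [delimiter.join(strings)]


def value_to_list(value, target_list):
    """Repeat `value` once for every item in `target_list`."""
    return [value] * len(target_list)


data = {
    "query": "Which city is the capital city?",
    "documents": ["Paris is in France.", "Berlin is in Germany."],
}

# First step: merge all document texts into a single-item list.
data = apply_shaper(data, join_strings, {"strings": "documents"}, ["documents"])

# Second step: expand the query to match the (now single-item) documents list.
data = apply_shaper(
    data, value_to_list, {"value": "query", "target_list": "documents"}, ["questions"]
)

print(data["documents"])
print(data["questions"])
```

The order matters here: the second step sizes the questions list from whatever the first step left in documents.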

Functions

When initializing Shaper, you specify the function you want it to invoke. The supported functions are:

  • rename
    Renames a value without changing it. Example: This comes in handy if you use PromptNode at the beginning of the pipeline, where its input is the query. Haystack pipelines accept query as the input, while PromptNode needs question.

    - name: shaper
      type: Shaper
      params:
        func: rename
        inputs:
          value: query
        outputs: [question]
    
  • value_to_list
    Use this function to turn a value into a list. The value is repeated as many times as there are items in the target list. For example, if the target list has five items, the value is repeated five times. Example: If your PromptTemplate has two parameters, question and documents, and you want the question to be processed against each document, use Shaper with the value_to_list function. It creates a list in which the question is repeated as many times as there are documents. PromptNode then pairs the items of the two lists by position, running the first question against the first document, the second against the second, and so on.

    - name: QuestionsShaper
      type: Shaper
      params:
        func: value_to_list
        inputs:
          value: query
        outputs:
          - questions
        params:
          target_list: [1, 2, 3, 4, 5] # any five-item list; only its length is used
    
  • join_strings
    Takes a list of strings and changes it into a list that contains a single string. The new list contains all the original strings separated by the specified delimiter.

  • join_documents
    Takes a list of documents and changes it into a list containing a single document. The new document contains the content of all the original documents, separated by the specified delimiter. Example: If your pipeline has a PromptNode whose PromptTemplate takes two parameters, for example question and documents, you can merge the documents into one to make sure PromptNode runs the question against all of them:

    - name: joinDocs
      type: Shaper
      params:
        func: join_documents
        inputs:
          documents: documents
        outputs:
          - documents
    
  • join_lists
    Joins multiple lists into a single list.

  • strings_to_answers
    Transforms a list of strings into a list of Answers.
    Example: This function may come in handy if PromptNode is the last node in a pipeline. The output of the PromptNode is a string, while deepset Cloud pipelines expect the Answer object. You may then add a Shaper with the strings_to_answers option at the end of the pipeline after PromptNode.

    - name: OutputAnswerShaper
      type: Shaper
      params:
        func: strings_to_answers
        inputs:
          strings: results # the results PromptNode returns
        outputs:
          - answers
    
  • answers_to_strings
    Extracts the answer text from each Answer and returns a list of strings. Example:

    - name: AnswerShaper
      type: Shaper
      params:
        func: answers_to_strings
        inputs:
          answers: answers
        outputs:
          - strings
    
  • strings_to_documents
    Changes a list of strings into a list of documents.

  • documents_to_strings
    Extracts the content field of each document you pass to it and returns a list of strings, one string per document.

After performing a function, Shaper passes the new or modified values further down the pipeline.
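
Several of the functions above come without an example, so here is a minimal plain-Python sketch of what the join and conversion functions do to their inputs. These are simplified re-implementations for illustration only. The Doc class is a hypothetical stand-in for a Haystack Document that models nothing but the content field, and the default delimiter is an assumption.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Doc:
    """Hypothetical stand-in for a Haystack Document; only `content` is modeled."""
    content: str


def join_strings(strings: List[str], delimiter: str = " ") -> List[str]:
    """Merge all strings into a list that holds one joined string."""
    return [delimiter.join(strings)]


def join_lists(lists: List[List[Any]]) -> List[Any]:
    """Flatten several lists into a single list, keeping their order."""
    return [item for sublist in lists for item in sublist]


def strings_to_documents(strings: List[str]) -> List[Doc]:
    """Wrap each string in a document."""
    return [Doc(content=text) for text in strings]


def documents_to_strings(documents: List[Doc]) -> List[str]:
    """Pull the content out of each document."""
    return [doc.content for doc in documents]


docs = strings_to_documents(["Paris", "Berlin"])
texts = documents_to_strings(docs)
merged = join_strings(texts, delimiter=", ")
combined = join_lists([texts, ["Rome"]])

print(merged)    # ['Paris, Berlin']
print(combined)  # ['Paris', 'Berlin', 'Rome']
```

Note that the join functions return a list with a single item rather than a bare string, which matches the descriptions above.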