A helper node with a variety of functions.
Module shaper
rename
def rename(value: Any) -> Tuple[Any]
Identity function. Can be used to rename values in the invocation context without changing them.
Example:
assert rename(1) == (1, )
value_to_list
def value_to_list(value: Any, target_list: List[Any]) -> Tuple[List[Any]]
Transforms a value into a list containing this value as many times as the length of the target list.
Example:
assert value_to_list(value=1, target_list=list(range(5))) == ([1, 1, 1, 1, 1], )
join_lists
def join_lists(lists: List[List[Any]]) -> Tuple[List[Any]]
Joins the passed lists into a single one.
Example:
assert join_lists(lists=[[1, 2, 3], [4, 5]]) == ([1, 2, 3, 4, 5], )
join_strings
def join_strings(strings: List[str], delimiter: str = " ") -> Tuple[str]
Transforms a list of strings into a single string. The content of this string is the content of all original strings separated by the delimiter you specify.
Example:
assert join_strings(strings=["first", "second", "third"], delimiter=" - ") == ("first - second - third", )
join_documents
def join_documents(documents: List[Document],
                   delimiter: str = " ") -> Tuple[List[Document]]
Transforms a list of Documents into a list containing a single Document. The content of this Document is the content of all the original Documents, separated by the delimiter you specify.
All metadata is dropped. (TODO: fix)
Example:
assert join_documents(
    documents=[
        Document(content="first"),
        Document(content="second"),
        Document(content="third")
    ],
    delimiter=" - "
) == ([Document(content="first - second - third")], )
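To make the note about dropped metadata concrete, here is a minimal sketch of the joining logic. The Doc dataclass and join_documents_sketch are hypothetical stand-ins written for this illustration; they are not the library's Document class or its actual implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Doc:
    """Hypothetical stand-in for Document (assumption, not the library class)."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def join_documents_sketch(documents: List[Doc], delimiter: str = " ") -> Tuple[List[Doc]]:
    # Only the content fields are joined; the metadata of the inputs is not carried over.
    joined = delimiter.join(doc.content for doc in documents)
    return ([Doc(content=joined)],)


docs = [Doc("first", {"page": 1}), Doc("second", {"page": 2}), Doc("third")]
(joined_docs,) = join_documents_sketch(docs, delimiter=" - ")
```

In this sketch, joined_docs holds a single Doc whose meta is empty, even though two of the inputs carried metadata.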
strings_to_answers
def strings_to_answers(strings: List[str]) -> Tuple[List[Answer]]
Transforms a list of strings into a list of Answers.
Example:
assert strings_to_answers(strings=["first", "second", "third"]) == ([
    Answer(answer="first"),
    Answer(answer="second"),
    Answer(answer="third"),
], )
answers_to_strings
def answers_to_strings(answers: List[Answer]) -> Tuple[List[str]]
Extracts the answer field of Answers and returns a list of strings.
Example:
assert answers_to_strings(
    answers=[
        Answer(answer="first"),
        Answer(answer="second"),
        Answer(answer="third")
    ]
) == (["first", "second", "third"],)
strings_to_documents
def strings_to_documents(
    strings: List[str],
    meta: Union[List[Optional[Dict[str, Any]]],
                Optional[Dict[str, Any]]] = None,
    id_hash_keys: Optional[List[str]] = None) -> Tuple[List[Document]]
Transforms a list of strings into a list of Documents. If you pass the metadata as a single dictionary, all Documents get the same metadata. If you pass the metadata as a list, it must have the same length as the list of strings, and each Document gets its own metadata. You can specify id_hash_keys only once; it is assigned to all Documents.
Example:
assert strings_to_documents(
    strings=["first", "second", "third"],
    meta=[{"position": i} for i in range(3)],
    id_hash_keys=['content', 'meta']
) == ([
    Document(content="first", meta={"position": 0}, id_hash_keys=['content', 'meta']),
    Document(content="second", meta={"position": 1}, id_hash_keys=['content', 'meta']),
    Document(content="third", meta={"position": 2}, id_hash_keys=['content', 'meta'])
], )
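The two ways of passing metadata described above can be sketched as follows. This is an illustration under stated assumptions: Doc and strings_to_documents_sketch are hypothetical names, id_hash_keys is left out for brevity, and raising ValueError on a length mismatch is an assumption about how the rule might be enforced.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple, Union


@dataclass
class Doc:
    """Hypothetical stand-in for Document (assumption, not the library class)."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def strings_to_documents_sketch(
    strings: List[str],
    meta: Union[List[Optional[Dict[str, Any]]], Optional[Dict[str, Any]]] = None,
) -> Tuple[List[Doc]]:
    if isinstance(meta, list):
        # One metadata entry per string, so the lengths must match.
        if len(meta) != len(strings):
            raise ValueError("meta must have one entry per string")
        all_meta = meta
    else:
        # A single dictionary (or None) applies to every document.
        all_meta = [meta] * len(strings)
    # Copy each dictionary so the documents do not share one mutable object.
    return ([Doc(content=s, meta=dict(m or {})) for s, m in zip(strings, all_meta)],)


(shared,) = strings_to_documents_sketch(["a", "b"], meta={"source": "notes"})
(per_doc,) = strings_to_documents_sketch(["a", "b"], meta=[{"position": 0}, {"position": 1}])
```

Here every Doc in shared carries the same metadata, while each Doc in per_doc carries the entry at its own index.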
documents_to_strings
def documents_to_strings(documents: List[Document]) -> Tuple[List[str]]
Extracts the content field of Documents and returns a list of strings.
Example:
assert documents_to_strings(
    documents=[
        Document(content="first"),
        Document(content="second"),
        Document(content="third")
    ]
) == (["first", "second", "third"],)
Shaper
class Shaper(BaseComponent)
Shaper is a component that can invoke arbitrary, registered functions on the invocation context (query, documents, and so on) of a pipeline. It then passes the new or modified variables further down the pipeline.
Using YAML configuration, the Shaper component is initialized with the function to invoke on the pipeline's invocation context.
For example, in the YAML snippet below:
components:
  - name: shaper
    type: Shaper
    params:
      func: value_to_list
      inputs:
        value: query
        target_list: documents
      outputs: [questions]
the Shaper component is initialized with a directive to invoke the function value_to_list on the variables query and documents and to store the result in the invocation context variable questions. All other invocation context variables are passed down the pipeline unchanged.
Shaper is especially useful for pipelines with PromptNodes, where we need to modify the invocation context to match the templates of PromptNodes.
You can use multiple Shaper components in a pipeline to modify the invocation context as needed.
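The mechanism described above can be sketched in plain Python. This is a simplified illustration, not the component's actual code: apply_shaper_sketch is a hypothetical helper, the invocation context is modeled as a plain dictionary, and value_to_list is reimplemented locally so the block runs without the library. The fallback role of params follows the description in the Arguments section of Shaper.__init__.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


def value_to_list(value: Any, target_list: List[Any]) -> Tuple[List[Any]]:
    # Local reimplementation for the sketch: repeat value once per target item.
    return ([value] * len(target_list),)


def apply_shaper_sketch(
    func: Callable[..., Tuple[Any, ...]],
    outputs: List[str],
    invocation_context: Dict[str, Any],
    inputs: Optional[Dict[str, str]] = None,
    params: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    # Fixed values go in first, so they act as fallbacks.
    kwargs = dict(params or {})
    # Values found in the invocation context take precedence over params.
    for arg_name, context_key in (inputs or {}).items():
        if context_key in invocation_context:
            kwargs[arg_name] = invocation_context[context_key]
    results = func(**kwargs)
    if len(results) != len(outputs):
        raise ValueError("outputs must name every value the function returns")
    # All other variables are passed on unchanged; outputs are added or overwritten.
    new_context = dict(invocation_context)
    new_context.update(zip(outputs, results))
    return new_context


context = {"query": "Who?", "documents": ["d1", "d2", "d3"]}
new_context = apply_shaper_sketch(
    value_to_list,
    outputs=["questions"],
    invocation_context=context,
    inputs={"value": "query", "target_list": "documents"},
)
```

After the call, new_context contains the original query and documents plus a questions list with one copy of the query per document.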
Shaper supports the following functions:
value_to_list
join_strings
join_documents
join_lists
strings_to_documents
documents_to_strings
See their descriptions in the code for details about their inputs, outputs, and other parameters.
Shaper.__init__
def __init__(func: str,
             outputs: List[str],
             inputs: Optional[Dict[str, Union[List[str], str]]] = None,
             params: Optional[Dict[str, Any]] = None,
             publish_outputs: Union[bool, List[str]] = True)
Initializes the Shaper component.
Some examples:
- name: shaper
  type: Shaper
  params:
    func: value_to_list
    inputs:
      value: query
      target_list: documents
    outputs:
      - questions
This node takes the content of query and creates a list that contains the value of query len(documents) times. This list is stored in the invocation context under the key questions.
- name: shaper
  type: Shaper
  params:
    func: join_documents
    inputs:
      documents: documents
    params:
      delimiter: ' - '
    outputs:
      - documents
This node overwrites the content of documents in the invocation context with a list containing a single Document whose content is the concatenation of all the original Documents. So if documents contained [Document("A"), Document("B"), Document("C")], this shaper overwrites it with [Document("A - B - C")].
- name: shaper
  type: Shaper
  params:
    func: join_strings
    params:
      strings: ['a', 'b', 'c']
      delimiter: ' . '
    outputs:
      - single_string
- name: shaper
  type: Shaper
  params:
    func: strings_to_documents
    inputs:
      strings: single_string
    params:
      meta:
        name: 'my_file.txt'
    outputs:
      - single_document
These two nodes, executed one after the other, first add a key called single_string to the invocation context that contains a . b . c, and then create another key called single_document that contains [Document(content="a . b . c", meta={'name': 'my_file.txt'})].
Arguments:
func: The function to apply.
inputs: Maps the function's input kwargs to the key-value pairs in the invocation context. For example, value_to_list expects the value and target_list parameters, so inputs might contain {'value': 'query', 'target_list': 'documents'}. It doesn't need to contain all keyword args, see params.
params: Maps the function's input kwargs to some fixed values. For example, value_to_list expects the value and target_list parameters, so params might contain {'value': 'A', 'target_list': [1, 1, 1, 1]}, and the node's output is ["A", "A", "A", "A"]. It doesn't need to contain all keyword args, see inputs. You can use params to provide fallback values for arguments of run that you're not sure exist. So if you need query to exist, you can provide a fallback value in params, which is used only if query is not passed to this node by the pipeline.
outputs: The keys under which to store the outputs in the invocation context. The length of outputs must match the number of outputs produced by the invoked function.
publish_outputs: Controls whether to publish the outputs to the pipeline's output. Set it to True (the default) to publish all outputs or to False to publish none. For example, if outputs = ["documents"], the result for publish_outputs = True looks like:
{
    "invocation_context": {
        "documents": [...]
    },
    "documents": [...]
}
For publish_outputs = False, the result looks like:
{
    "invocation_context": {
        "documents": [...]
    },
}
If you want to have finer-grained control, pass a list of the outputs you want to publish.
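The three publish_outputs modes can be sketched as a small function. This is a hedged illustration only: build_result_sketch is a hypothetical helper, and ignoring names in the list that are not among the node's outputs is an assumption, not documented behavior.

```python
from typing import Any, Dict, List, Union


def build_result_sketch(
    invocation_context: Dict[str, Any],
    outputs: List[str],
    publish_outputs: Union[bool, List[str]] = True,
) -> Dict[str, Any]:
    if publish_outputs is True:
        to_publish = list(outputs)  # publish every output
    elif publish_outputs is False:
        to_publish = []  # publish none
    else:
        # Assumption: only names that are actual outputs of this node are published.
        to_publish = [name for name in publish_outputs if name in outputs]
    result: Dict[str, Any] = {"invocation_context": invocation_context}
    for name in to_publish:
        result[name] = invocation_context[name]
    return result


ctx = {"documents": ["d1"], "questions": ["q1"]}
all_published = build_result_sketch(ctx, ["documents", "questions"], True)
none_published = build_result_sketch(ctx, ["documents", "questions"], False)
some_published = build_result_sketch(ctx, ["documents", "questions"], ["questions"])
```

In every mode the outputs remain available under invocation_context; publish_outputs only controls which of them also appear at the top level of the result.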