Transformers
haystack_integrations.components.classifiers.transformers.zero_shot_document_classifier
TransformersZeroShotDocumentClassifier
Performs zero-shot classification of documents based on given labels and adds the predicted label to their metadata.
The component uses a Hugging Face pipeline for zero-shot classification. Provide the model and the set of labels to be used for categorization during initialization. Additionally, you can configure the component to allow multiple labels to be true.
Classification is run on the document's content field by default. If you want it to run on another field, set the
classification_field to one of the document's metadata fields.
Available models for the task of zero-shot-classification include:
- valhalla/distilbart-mnli-12-3
- cross-encoder/nli-distilroberta-base
- cross-encoder/nli-deberta-v3-xsmall
Usage example
The following pipeline retrieves documents with a search step and then classifies them using predefined labels:
from haystack import Document
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.core.pipeline import Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.classifiers.transformers import TransformersZeroShotDocumentClassifier
documents = [
    Document(id="0", content="Today was a nice day!"),
    Document(id="1", content="Yesterday was a bad day!"),
]
document_store = InMemoryDocumentStore()
retriever = InMemoryBM25Retriever(document_store=document_store)
document_classifier = TransformersZeroShotDocumentClassifier(
    model="cross-encoder/nli-deberta-v3-xsmall",
    labels=["positive", "negative"],
)
document_store.write_documents(documents)
pipeline = Pipeline()
pipeline.add_component(instance=retriever, name="retriever")
pipeline.add_component(instance=document_classifier, name="document_classifier")
pipeline.connect("retriever", "document_classifier")
queries = ["How was your day today?", "How was your day yesterday?"]
expected_predictions = ["positive", "negative"]
for idx, query in enumerate(queries):
    result = pipeline.run({"retriever": {"query": query, "top_k": 1}})
    assert result["document_classifier"]["documents"][0].to_dict()["id"] == str(idx)
    assert (result["document_classifier"]["documents"][0].to_dict()["classification"]["label"]
            == expected_predictions[idx])
init
__init__(
model: str,
labels: list[str],
multi_label: bool = False,
classification_field: str | None = None,
device: ComponentDevice | None = None,
token: Secret | None = Secret.from_env_var(
["HF_API_TOKEN", "HF_TOKEN"], strict=False
),
huggingface_pipeline_kwargs: dict[str, Any] | None = None,
) -> None
Initializes the TransformersZeroShotDocumentClassifier.
See the Hugging Face website for the full list of zero-shot classification (NLI) models.
Parameters:
- model (str) – The name or path of a Hugging Face model for zero-shot document classification.
- labels (list[str]) – The set of possible class labels to classify each document into, for example, ["positive", "negative"]. The labels depend on the selected model.
- multi_label (bool) – Whether multiple candidate labels can be true. If False, the scores are normalized such that the sum of the label likelihoods for each sequence is 1. If True, the labels are considered independent and probabilities are normalized for each candidate by doing a softmax of the entailment score vs. the contradiction score.
- classification_field (str | None) – Name of the document's meta field to be used for classification. If not set, Document.content is used by default.
- device (ComponentDevice | None) – The device on which the model is loaded. If None, the default device is automatically selected. If a device/device map is specified in huggingface_pipeline_kwargs, it overrides this parameter.
- token (Secret | None) – The Hugging Face token to use as HTTP bearer authorization. Check your HF token in your account settings.
- huggingface_pipeline_kwargs (dict[str, Any] | None) – Dictionary containing keyword arguments used to initialize the Hugging Face pipeline for text classification.
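The two normalization modes described for multi_label can be illustrated with plain Python. This is a minimal sketch, not the component's code: the logits are made-up numbers, and the helper names softmax and normalize_scores are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def normalize_scores(entailment, contradiction, multi_label):
    """Turn per-label NLI logits into label scores (hypothetical helper).

    entailment and contradiction hold one logit per candidate label.
    """
    if multi_label:
        # Each label is judged on its own: softmax over (contradiction, entailment).
        return [softmax([c, e])[1] for e, c in zip(entailment, contradiction)]
    # Single-label mode: softmax over the entailment logits of all labels.
    return softmax(entailment)


# Made-up logits for the labels ["positive", "negative"].
entailment_logits = [2.0, 0.5]
contradiction_logits = [-1.0, 0.5]

single = normalize_scores(entailment_logits, contradiction_logits, multi_label=False)
multi = normalize_scores(entailment_logits, contradiction_logits, multi_label=True)

print(single)  # scores sum to 1 across labels
print(multi)   # independent scores, need not sum to 1
```

With multi_label set to False the scores compete with each other, so raising one lowers the rest. With True, each label gets its own probability and several labels can score high at once.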
warm_up
Initializes the component.
to_dict
Serializes the component to a dictionary.
Returns:
dict[str, Any] – Dictionary with serialized data.
from_dict
Deserializes the component from a dictionary.
Parameters:
- data (dict[str, Any]) – Dictionary to deserialize from.
Returns:
TransformersZeroShotDocumentClassifier – Deserialized component.
run
Classifies the documents based on the provided labels and adds the predicted labels to their metadata.
The classification results are stored in the classification dict within
each document's metadata. If multi_label is set to True, the scores for each label are available under
the details key within the classification dictionary.
Parameters:
- documents (list[Document]) – Documents to process.
- batch_size (int) – Batch size used for processing the content in each document.
Returns:
dict[str, Any] – A dictionary with the following key:
- documents: A list of documents with an added metadata field called classification.
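To make the result layout concrete, here is a small sketch that reads the classification entry from document metadata. Plain dictionaries stand in for Document.meta, all scores are invented, and the helpers group_by_label and labels_above are hypothetical. Only the label and details keys are taken from the description above.

```python
from collections import defaultdict

# Stand-ins for Document.meta after classification, keyed by document id.
# The "details" entry is only expected when multi_label is True.
classified_meta = {
    "0": {"classification": {"label": "positive", "details": {"positive": 0.95, "negative": 0.10}}},
    "1": {"classification": {"label": "negative", "details": {"positive": 0.20, "negative": 0.90}}},
    "2": {"classification": {"label": "positive"}},
}


def group_by_label(metas):
    """Map each predicted label to the ids of the documents that received it."""
    groups = defaultdict(list)
    for doc_id, meta in metas.items():
        groups[meta["classification"]["label"]].append(doc_id)
    return dict(groups)


def labels_above(meta, threshold):
    """Return all labels whose score in details reaches the threshold."""
    details = meta["classification"].get("details", {})
    return sorted(label for label, score in details.items() if score >= threshold)


print(group_by_label(classified_meta))
print(labels_above(classified_meta["0"], 0.5))
```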
haystack_integrations.components.extractors.transformers.named_entity_extractor
NamedEntityAnnotation
Describes a single NER annotation.
Parameters:
- entity (str) – Entity label.
- start (int) – Start index of the entity in the document.
- end (int) – End index of the entity in the document.
- score (float | None) – Score calculated by the model.
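The fields above are enough to recover the annotated text from a document. The sketch below uses a hypothetical stand-in dataclass rather than the real NamedEntityAnnotation, assumes that end is exclusive (as with Python slices), and uses made-up scores.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchAnnotation:
    """Hypothetical stand-in mirroring the documented fields."""
    entity: str
    start: int
    end: int  # assumed to be exclusive
    score: Optional[float] = None


text = "My name is Clara and I live in Berkeley, California."


def annotate(phrase, entity, score):
    """Build an annotation by locating the phrase in the text."""
    start = text.index(phrase)
    return SketchAnnotation(entity=entity, start=start, end=start + len(phrase), score=score)


annotations = [
    annotate("Clara", "PER", 0.99),
    annotate("Berkeley", "LOC", 0.98),
    annotate("California", "LOC", 0.97),
]

# Slice the original text with the stored indices to get each entity string.
entities = [(a.entity, text[a.start:a.end]) for a in annotations]
print(entities)
```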
TransformersNamedEntityExtractor
Annotates named entities in a collection of documents.
The component can be used with any token classification model from the Hugging Face model hub. Annotations are stored as metadata in the documents.
Usage example:
from haystack import Document
from haystack_integrations.components.extractors.transformers import TransformersNamedEntityExtractor
documents = [
    Document(content="I'm Merlin, the happy pig!"),
    Document(content="My name is Clara and I live in Berkeley, California."),
]
extractor = TransformersNamedEntityExtractor(model="dslim/bert-base-NER")
results = extractor.run(documents=documents)["documents"]
annotations = [TransformersNamedEntityExtractor.get_stored_annotations(doc) for doc in results]
print(annotations)
init
__init__(
*,
model: str,
pipeline_kwargs: dict[str, Any] | None = None,
device: ComponentDevice | None = None,
token: Secret | None = Secret.from_env_var(
["HF_API_TOKEN", "HF_TOKEN"], strict=False
)
) -> None
Create a Named Entity extractor component.
Parameters:
- model (str) – Name of the model or a path to the model on the local disk.
- pipeline_kwargs (dict[str, Any] | None) – Keyword arguments passed to the pipeline. The pipeline can override these arguments.
- device (ComponentDevice | None) – The device on which the model is loaded. If None, the default device is automatically selected. If a device/device map is specified in pipeline_kwargs, it overrides this parameter.
- token (Secret | None) – The API token to download private models from Hugging Face.
warm_up
Initialize the component.
Raises:
ComponentError – If the component fails to initialize successfully.
run
Annotate named entities in each document and store the annotations in the document's metadata.
Parameters:
- documents (list[Document]) – Documents to process.
- batch_size (int) – Batch size used for processing the documents.
Returns:
dict[str, Any] – Processed documents.
Raises:
ComponentError – If the model fails to process a document.
to_dict
Serializes the component to a dictionary.
Returns:
dict[str, Any] – Dictionary with serialized data.
from_dict
Deserializes the component from a dictionary.
Parameters:
- data (dict[str, Any]) – Dictionary to deserialize from.
Returns:
TransformersNamedEntityExtractor – Deserialized component.
initialized
Returns whether the extractor is ready to annotate text.
get_stored_annotations
get_stored_annotations(
document: Document,
) -> list[NamedEntityAnnotation] | None
Returns the document's named entity annotations stored in its metadata, if any.
Parameters:
- document (Document) – Document whose annotations are to be fetched.
Returns:
list[NamedEntityAnnotation] | None – The stored annotations.
haystack_integrations.components.generators.transformers.chat.chat_generator
default_tool_parser
Default implementation for parsing tool calls from model output text.
Uses DEFAULT_TOOL_PATTERN to extract tool calls.
Parameters:
- text (str) – The text to parse for tool calls.
Returns:
list[ToolCall] | None – A list containing a single ToolCall if a valid tool call is found, None otherwise.
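The contract described above (text in, a one-element list or None out) can be sketched as follows. Everything here is illustrative: the value of DEFAULT_TOOL_PATTERN is not shown in this reference, so SKETCH_TOOL_PATTERN is a hypothetical pattern, and SketchToolCall is a stand-in dataclass rather than Haystack's ToolCall.

```python
import json
import re
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class SketchToolCall:
    """Hypothetical stand-in for a tool call object."""
    tool_name: str
    arguments: dict = field(default_factory=dict)


# Hypothetical pattern: a JSON object wrapped in <tool_call> tags.
SKETCH_TOOL_PATTERN = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def parse_tool_call(text: str) -> Optional[list]:
    """Return a list with one tool call if the text contains a valid one, else None."""
    match = SKETCH_TOOL_PATTERN.search(text)
    if match is None:
        return None
    try:
        payload: Any = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict) or "name" not in payload:
        return None
    return [SketchToolCall(tool_name=payload["name"], arguments=payload.get("arguments", {}))]


output = 'Sure. <tool_call>{"name": "weather", "arguments": {"city": "Berlin"}}</tool_call>'
print(parse_tool_call(output))
print(parse_tool_call("No tool needed for this answer."))
```

A function with this shape, returning real ToolCall objects, is the kind of callable the tool_parsing_function parameter of TransformersChatGenerator accepts.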
TransformersChatGenerator
Generates chat responses using models from Hugging Face that run locally.
Use this component with chat-based models,
such as Qwen/Qwen3-0.6B or meta-llama/Llama-2-7b-chat-hf.
LLMs running locally may need powerful hardware.
Usage example
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.transformers import TransformersChatGenerator
generator = TransformersChatGenerator(model="Qwen/Qwen3-0.6B")
messages = [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
print(generator.run(messages))
{'replies':
[ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=
"Natural Language Processing (NLP) is a subfield of artificial intelligence that deals
with the interaction between computers and human language. It enables computers to understand, interpret, and
generate human language in a valuable way. NLP involves various techniques such as speech recognition, text
analysis, sentiment analysis, and machine translation. The ultimate goal is to make it easier for computers to
process and derive meaning from human language, improving communication between humans and machines.")],
_name=None,
_meta={'finish_reason': 'stop', 'index': 0, 'model':
'mistralai/Mistral-7B-Instruct-v0.2',
'usage': {'completion_tokens': 90, 'prompt_tokens': 19, 'total_tokens': 109}})
]
}
init
__init__(
model: str = "Qwen/Qwen3-0.6B",
task: (
Literal["text-generation", "text2text-generation", "image-text-to-text"]
| None
) = None,
device: ComponentDevice | None = None,
token: Secret | None = Secret.from_env_var(
["HF_API_TOKEN", "HF_TOKEN"], strict=False
),
chat_template: str | None = None,
generation_kwargs: dict[str, Any] | None = None,
huggingface_pipeline_kwargs: dict[str, Any] | None = None,
stop_words: list[str] | None = None,
streaming_callback: StreamingCallbackT | None = None,
tools: ToolsType | None = None,
tool_parsing_function: Callable[[str], list[ToolCall] | None] | None = None,
async_executor: ThreadPoolExecutor | None = None,
*,
enable_thinking: bool = False
) -> None
Initializes the TransformersChatGenerator component.
Parameters:
- model (str) – The Hugging Face text generation model name or path, for example, mistralai/Mistral-7B-Instruct-v0.2 or TheBloke/OpenHermes-2.5-Mistral-7B-16k-AWQ. The model must be a chat model supporting the ChatML messaging format. If the model is specified in huggingface_pipeline_kwargs, this parameter is ignored.
- task (Literal['text-generation', 'text2text-generation', 'image-text-to-text'] | None) – The task for the Hugging Face pipeline. Possible options:
  - text-generation: Supported by decoder models, like GPT.
  - text2text-generation: Deprecated as of Transformers v5; use text-generation instead. Previously supported by encoder-decoder models such as T5.
  - image-text-to-text: Supported by vision-language models.
  If the task is specified in huggingface_pipeline_kwargs, this parameter is ignored. If not specified, the component calls the Hugging Face API to infer the task from the model name.
- device (ComponentDevice | None) – The device for loading the model. If None, automatically selects the default device. If a device or device map is specified in huggingface_pipeline_kwargs, it overrides this parameter.
- token (Secret | None) – The token to use as HTTP bearer authorization for remote files. If the token is specified in huggingface_pipeline_kwargs, this parameter is ignored.
- chat_template (str | None) – Specifies an optional Jinja template for formatting chat messages. Most high-quality chat models have their own templates, but for models without this feature or if you prefer a custom template, use this parameter.
- generation_kwargs (dict[str, Any] | None) – A dictionary with keyword arguments to customize text generation. Some examples: max_length, max_new_tokens, temperature, top_k, top_p. See Hugging Face's documentation for more information: GenerationConfig. The only generation_kwargs set by default is max_new_tokens, which is set to 512 tokens.
- huggingface_pipeline_kwargs (dict[str, Any] | None) – Dictionary with keyword arguments to initialize the Hugging Face pipeline for text generation. These keyword arguments provide fine-grained control over the Hugging Face pipeline. In case of duplication, these kwargs override the model, task, device, and token init parameters. For kwargs, see Hugging Face documentation. In this dictionary, you can also include model_kwargs to specify the kwargs for model initialization.
- stop_words (list[str] | None) – A list of stop words. If the model generates a stop word, the generation stops. If you provide this parameter, don't specify the stopping_criteria in generation_kwargs. For some chat models, the output includes both the new text and the original prompt. In these cases, make sure your prompt has no stop words.
- streaming_callback (StreamingCallbackT | None) – An optional callable for handling streaming responses.
- tools (ToolsType | None) – A list of Tool and/or Toolset objects, or a single Toolset for which the model can prepare calls.
- tool_parsing_function (Callable[[str], list[ToolCall] | None] | None) – A callable that takes a string and returns a list of ToolCall objects or None. If None, the default_tool_parser is used, which extracts tool calls using a predefined pattern.
- async_executor (ThreadPoolExecutor | None) – Optional ThreadPoolExecutor to use for async calls. If not provided, a single-threaded executor is initialized and used.
- enable_thinking (bool) – Whether to enable thinking mode in the chat template for thinking-capable models. When enabled, the model generates intermediate reasoning before the final response. Defaults to False.
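The override rules above can be pictured as simple dictionary merges. This is only a sketch with hypothetical helper names. The precedence of huggingface_pipeline_kwargs over the four init parameters and the default of 512 for max_new_tokens come from the parameter descriptions. Letting run-time generation_kwargs win over init-time ones is an assumption.

```python
def resolve_pipeline_kwargs(model, task, device, token, huggingface_pipeline_kwargs=None):
    """Start from the init parameters, then let the pipeline kwargs win on duplicates."""
    resolved = {"model": model, "task": task, "device": device, "token": token}
    resolved.update(huggingface_pipeline_kwargs or {})
    return resolved


def resolve_generation_kwargs(init_kwargs=None, run_kwargs=None):
    """Default max_new_tokens to 512, then apply init-time and run-time values.

    Assumption: run-time values take precedence over init-time values.
    """
    merged = {"max_new_tokens": 512}
    merged.update(init_kwargs or {})
    merged.update(run_kwargs or {})
    return merged


pipeline_kwargs = resolve_pipeline_kwargs(
    model="Qwen/Qwen3-0.6B",
    task="text-generation",
    device=None,
    token=None,
    huggingface_pipeline_kwargs={"model": "some-org/other-model", "model_kwargs": {}},
)
generation_kwargs = resolve_generation_kwargs(
    init_kwargs={"temperature": 0.7},
    run_kwargs={"max_new_tokens": 64},
)
print(pipeline_kwargs["model"])
print(generation_kwargs)
```

The model name some-org/other-model is a placeholder, not a real checkpoint.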
shutdown
Explicitly shuts down the executor if the component owns it.
warm_up
Initializes the component and warms up tools if provided.
to_dict
Serializes the component to a dictionary.
Returns:
dict[str, Any] – Dictionary with serialized data.
from_dict
Deserializes the component from a dictionary.
Parameters:
- data (dict[str, Any]) – The dictionary to deserialize from.
Returns:
TransformersChatGenerator – The deserialized component.
run
run(
messages: list[ChatMessage] | str,
generation_kwargs: dict[str, Any] | None = None,
streaming_callback: StreamingCallbackT | None = None,
tools: ToolsType | None = None,
) -> dict[str, list[ChatMessage]]
Invoke text generation inference based on the provided messages and generation parameters.
Parameters:
- messages (list[ChatMessage] | str) – A list of ChatMessage objects representing the input messages. If a string is provided, it is converted to a list containing a ChatMessage with user role.
- generation_kwargs (dict[str, Any] | None) – Additional keyword arguments for text generation.
- streaming_callback (StreamingCallbackT | None) – An optional callable for handling streaming responses.
- tools (ToolsType | None) – A list of Tool and/or Toolset objects, or a single Toolset for which the model can prepare calls. If set, it overrides the tools parameter provided during initialization.
Returns:
dict[str, list[ChatMessage]] – A dictionary with the following keys:
- replies: A list containing the generated responses as ChatMessage instances.
create_message
create_message(
text: str,
index: int,
tokenizer: Union[PreTrainedTokenizer, PreTrainedTokenizerFast],
prompt: str,
generation_kwargs: dict[str, Any],
parse_tool_calls: bool = False,
) -> ChatMessage
Create a ChatMessage instance from the provided text, populated with metadata.
Parameters:
- text (str) – The generated text.
- index (int) – The index of the generated text.
- tokenizer (Union[PreTrainedTokenizer, PreTrainedTokenizerFast]) – The tokenizer used for generation.
- prompt (str) – The prompt used for generation.
- generation_kwargs (dict[str, Any]) – The generation parameters.
- parse_tool_calls (bool) – Whether to attempt parsing tool calls from the text.
Returns:
ChatMessage – A ChatMessage instance.
run_async
run_async(
messages: list[ChatMessage] | str,
generation_kwargs: dict[str, Any] | None = None,
streaming_callback: StreamingCallbackT | None = None,
tools: ToolsType | None = None,
) -> dict[str, list[ChatMessage]]
Asynchronously invokes text generation inference based on the provided messages and generation parameters.
This is the asynchronous version of the run method. It has the same parameters
and return values but can be used with await in async code.
Parameters:
- messages (list[ChatMessage] | str) – A list of ChatMessage objects representing the input messages.
- generation_kwargs (dict[str, Any] | None) – Additional keyword arguments for text generation.
- streaming_callback (StreamingCallbackT | None) – An optional callable for handling streaming responses.
- tools (ToolsType | None) – A list of Tool and/or Toolset objects, or a single Toolset for which the model can prepare calls. If set, it overrides the tools parameter provided during initialization.
Returns:
dict[str, list[ChatMessage]] – A dictionary with the following keys:
- replies: A list containing the generated responses as ChatMessage instances.
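The general pattern of awaiting a blocking call through a ThreadPoolExecutor can be sketched with the standard library alone. This illustrates the idea behind the async_executor parameter and is not the component's actual implementation. blocking_generate is a hypothetical stand-in for local model inference.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_generate(prompt):
    """Stand-in for a blocking local model call."""
    time.sleep(0.05)
    return f"reply to: {prompt}"


async def generate_async(prompt, executor):
    """Run the blocking call in the executor so the event loop stays responsive."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, blocking_generate, prompt)


async def main():
    # A single worker mirrors the single-threaded default described for async_executor.
    with ThreadPoolExecutor(max_workers=1) as executor:
        return await asyncio.gather(
            generate_async("first", executor),
            generate_async("second", executor),
        )


replies = asyncio.run(main())
print(replies)
```

With one worker thread the two calls run one after the other, while other coroutines on the event loop can still make progress in the meantime.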
haystack_integrations.components.readers.transformers.extractive_reader
TransformersExtractiveReader
Locates and extracts answers to a given query from Documents.
The TransformersExtractiveReader component performs extractive question answering. It assigns a score to every possible answer span independently of other answer spans. This fixes a common issue of other implementations which make comparisons across documents harder by normalizing each document's answers independently.
Example usage:
from haystack import Document
from haystack_integrations.components.readers.transformers import TransformersExtractiveReader
docs = [
    Document(content="Python is a popular programming language"),
    Document(content="python ist eine beliebte Programmiersprache"),
]
reader = TransformersExtractiveReader()
question = "What is a popular programming language?"
result = reader.run(query=question, documents=docs)
assert "Python" in result["answers"][0].data
init
__init__(
model: Path | str = "deepset/roberta-base-squad2-distilled",
device: ComponentDevice | None = None,
token: Secret | None = Secret.from_env_var(
["HF_API_TOKEN", "HF_TOKEN"], strict=False
),
top_k: int = 20,
score_threshold: float | None = None,
max_seq_length: int = 384,
stride: int = 128,
max_batch_size: int | None = None,
answers_per_seq: int | None = None,
no_answer: bool = True,
calibration_factor: float = 0.1,
overlap_threshold: float | None = 0.01,
model_kwargs: dict[str, Any] | None = None,
) -> None
Creates an instance of TransformersExtractiveReader.
Parameters:
- model (Path | str) – A Hugging Face transformers question answering model. Can either be a path to a folder containing the model files or an identifier for the Hugging Face hub.
- device (ComponentDevice | None) – The device on which the model is loaded. If None, the default device is automatically selected.
- token (Secret | None) – The API token used to download private models from Hugging Face.
- top_k (int) – Number of answers to return per query. It is required even if score_threshold is set. An additional answer with no text is returned if no_answer is set to True (default).
- score_threshold (float | None) – Returns only answers with the probability score above this threshold.
- max_seq_length (int) – Maximum number of tokens. If a sequence exceeds it, the sequence is split.
- stride (int) – Number of tokens that overlap when a sequence is split because it exceeds max_seq_length.
- max_batch_size (int | None) – Maximum number of samples that are fed through the model at the same time.
- answers_per_seq (int | None) – Number of answer candidates to consider per sequence. This is relevant when a Document was split into multiple sequences because of max_seq_length.
- no_answer (bool) – Whether to return an additional no answer with an empty text and a score representing the probability that the other top_k answers are incorrect.
- calibration_factor (float) – Factor used for calibrating probabilities.
- overlap_threshold (float | None) – If set, removes duplicate answers whose overlap is larger than the supplied threshold. For example, for the answers "in the river in Maine" and "the river", one of the answers is removed because the second answer has a 100% (1.0) overlap with the first. However, for the answers "the river in" and "in Maine", the maximum overlap is only 25%, so both answers are kept if this value is set to 0.25 or higher. If None is provided, all answers are kept.
- model_kwargs (dict[str, Any] | None) – Additional keyword arguments passed to AutoModelForQuestionAnswering.from_pretrained when loading the model specified in model. For details on what kwargs you can pass, see the model's documentation.
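One way to read the no_answer score is as the probability that every returned answer is wrong. The sketch below assumes the answer scores are independent probabilities and multiplies their complements. The text above does not spell out this formula, so it is an assumption. The function name is hypothetical.

```python
import math


def no_answer_score(answer_scores):
    """Probability that all answers are incorrect, assuming independent scores."""
    return math.prod(1.0 - score for score in answer_scores)


print(no_answer_score([0.5, 0.5]))
print(no_answer_score([0.9, 0.2, 0.1]))
```

Under this reading, a single very confident answer pushes the no-answer score close to zero, while many weak answers leave it high.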
to_dict
Serializes the component to a dictionary.
Returns:
dict[str, Any] – Dictionary with serialized data.
from_dict
Deserializes the component from a dictionary.
Parameters:
- data (dict[str, Any]) – Dictionary to deserialize from.
Returns:
TransformersExtractiveReader – Deserialized component.
warm_up
Initializes the component.
deduplicate_by_overlap
deduplicate_by_overlap(
answers: list[ExtractedAnswer], overlap_threshold: float | None
) -> list[ExtractedAnswer]
De-duplicates overlapping Extractive Answers.
De-duplicates overlapping Extractive Answers from the same document based on how much the spans of the answers overlap.
Parameters:
- answers (list[ExtractedAnswer]) – List of answers to be deduplicated.
- overlap_threshold (float | None) – If set, removes duplicate answers whose overlap is larger than the supplied threshold. For example, for the answers "in the river in Maine" and "the river", one of the answers is removed because the second answer has a 100% (1.0) overlap with the first. However, for the answers "the river in" and "in Maine", the maximum overlap is only 25%, so both answers are kept if this value is set to 0.25 or higher. If None is provided, all answers are kept.
Returns:
list[ExtractedAnswer] – List of deduplicated answers.
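The overlap figures in the example (100% and 25%) can be reproduced with character spans. This is a sketch under stated assumptions, not the method's code: spans are (start, end) tuples with an exclusive end, the overlap is measured against the shorter span, and a span is dropped only when its overlap is strictly larger than the threshold. The helper names are hypothetical.

```python
def overlap_fraction(span_a, span_b):
    """Largest share of either span that is covered by the other."""
    overlap = min(span_a[1], span_b[1]) - max(span_a[0], span_b[0])
    if overlap <= 0:
        return 0.0
    shorter = min(span_a[1] - span_a[0], span_b[1] - span_b[0])
    return overlap / shorter


def deduplicate(spans, overlap_threshold):
    """Keep spans in order, dropping any that overlaps a kept span too much."""
    if overlap_threshold is None:
        return list(spans)
    kept = []
    for span in spans:
        if all(overlap_fraction(span, other) <= overlap_threshold for other in kept):
            kept.append(span)
    return kept


text = "in the river in Maine"


def span_of(phrase):
    """Locate a phrase in the text and return its character span."""
    start = text.index(phrase)
    return (start, start + len(phrase))


full, river = span_of("in the river in Maine"), span_of("the river")
river_in, in_maine = span_of("the river in"), span_of("in Maine")

print(overlap_fraction(full, river))         # 1.0
print(overlap_fraction(river_in, in_maine))  # 0.25
print(deduplicate([river_in, in_maine], 0.25))
print(deduplicate([river_in, in_maine], 0.24))
```

With the strict comparison assumed here, the pair with 25% overlap survives a threshold of 0.25 and loses its second member at 0.24.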
run
run(
query: str,
documents: list[Document],
top_k: int | None = None,
score_threshold: float | None = None,
max_seq_length: int | None = None,
stride: int | None = None,
max_batch_size: int | None = None,
answers_per_seq: int | None = None,
no_answer: bool | None = None,
overlap_threshold: float | None = None,
) -> dict[str, Any]
Locates and extracts answers from the given Documents using the given query.
Parameters:
- query (str) – Query string.
- documents (list[Document]) – List of Documents in which you want to search for an answer to the query.
- top_k (int | None) – The maximum number of answers to return. An additional answer is returned if no_answer is set to True (default).
- score_threshold (float | None) – Returns only answers with the score above this threshold.
- max_seq_length (int | None) – Maximum number of tokens. If a sequence exceeds it, the sequence is split.
- stride (int | None) – Number of tokens that overlap when a sequence is split because it exceeds max_seq_length.
- max_batch_size (int | None) – Maximum number of samples that are fed through the model at the same time.
- answers_per_seq (int | None) – Number of answer candidates to consider per sequence. This is relevant when a Document was split into multiple sequences because of max_seq_length.
- no_answer (bool | None) – Whether to return no answer scores.
- overlap_threshold (float | None) – If set, removes duplicate answers whose overlap is larger than the supplied threshold. For example, for the answers "in the river in Maine" and "the river", one of the answers is removed because the second answer has a 100% (1.0) overlap with the first. However, for the answers "the river in" and "in Maine", the maximum overlap is only 25%, so both answers are kept if this value is set to 0.25 or higher. If None is provided, all answers are kept.
Returns:
dict[str, Any] – A dictionary with the key answers, holding the list of answers sorted by score in descending order.
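The interaction of max_seq_length and stride can be sketched as a sliding window over a token list. This is a simplification: in practice the query tokens and special tokens also count toward max_seq_length, which is ignored here. The function name is hypothetical.

```python
def split_with_stride(tokens, max_seq_length, stride):
    """Split tokens into windows of at most max_seq_length that overlap by stride tokens."""
    if stride >= max_seq_length:
        raise ValueError("stride must be smaller than max_seq_length")
    if len(tokens) <= max_seq_length:
        return [list(tokens)]
    step = max_seq_length - stride  # how far each new window advances
    windows = []
    start = 0
    while True:
        windows.append(list(tokens[start:start + max_seq_length]))
        if start + max_seq_length >= len(tokens):
            break
        start += step
    return windows


windows = split_with_stride(list(range(10)), max_seq_length=4, stride=2)
print(windows)
```

Each window repeats the last stride tokens of the previous one, so an answer that straddles a window boundary still appears whole in at least one window, provided it is no longer than the stride.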
haystack_integrations.components.routers.transformers.text_router
TransformersTextRouter
Routes the text strings to different connections based on a category label.
The labels are specific to each model and can be found in its description on Hugging Face.
Usage example
from haystack.components.builders import PromptBuilder
from haystack.components.generators import HuggingFaceLocalGenerator
from haystack.core.pipeline import Pipeline
from haystack_integrations.components.routers.transformers import TransformersTextRouter
p = Pipeline()
p.add_component(
    instance=TransformersTextRouter(model="papluca/xlm-roberta-base-language-detection"),
    name="text_router"
)
p.add_component(
    instance=PromptBuilder(template="Answer the question: {{query}}\nAnswer:"),
    name="english_prompt_builder"
)
p.add_component(
    instance=PromptBuilder(template="Beantworte die Frage: {{query}}\nAntwort:"),
    name="german_prompt_builder"
)
p.add_component(
    instance=HuggingFaceLocalGenerator(model="DiscoResearch/Llama3-DiscoLeo-Instruct-8B-v0.1"),
    name="german_llm"
)
p.add_component(
    instance=HuggingFaceLocalGenerator(model="microsoft/Phi-3-mini-4k-instruct"),
    name="english_llm"
)
p.connect("text_router.en", "english_prompt_builder.query")
p.connect("text_router.de", "german_prompt_builder.query")
p.connect("english_prompt_builder.prompt", "english_llm.prompt")
p.connect("german_prompt_builder.prompt", "german_llm.prompt")
# English Example
print(p.run({"text_router": {"text": "What is the capital of Germany?"}}))
# German Example
print(p.run({"text_router": {"text": "Was ist die Hauptstadt von Deutschland?"}}))
init
__init__(
model: str,
labels: list[str] | None = None,
device: ComponentDevice | None = None,
token: Secret | None = Secret.from_env_var(
["HF_API_TOKEN", "HF_TOKEN"], strict=False
),
huggingface_pipeline_kwargs: dict[str, Any] | None = None,
) -> None
Initializes the TransformersTextRouter component.
Parameters:
- model (str) – The name or path of a Hugging Face model for text classification.
- labels (list[str] | None) – The list of labels. If not provided, the component fetches the labels from the model configuration file hosted on the Hugging Face Hub using transformers.AutoConfig.from_pretrained.
- device (ComponentDevice | None) – The device for loading the model. If None, automatically selects the default device. If a device or device map is specified in huggingface_pipeline_kwargs, it overrides this parameter.
- token (Secret | None) – The API token used to download private models from Hugging Face. By default, the token is read from the HF_API_TOKEN or HF_TOKEN environment variable. To generate these tokens, run transformers-cli login.
- huggingface_pipeline_kwargs (dict[str, Any] | None) – A dictionary of keyword arguments for initializing the Hugging Face text classification pipeline.
warm_up
Initializes the component.
to_dict
Serializes the component to a dictionary.
Returns:
dict[str, Any] – Dictionary with serialized data.
from_dict
Deserializes the component from a dictionary.
Parameters:
- data (dict[str, Any]) – Dictionary to deserialize from.
Returns:
TransformersTextRouter – Deserialized component.
run
Routes the text strings to different connections based on a category label.
Parameters:
- text (str) – A string of text to route.
Returns:
dict[str, str] – A dictionary with the label as key and the text as value.
Raises:
TypeError – If the input is not a str.
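The shape of the routing output can be shown with a small stand-in. In this sketch the label scores are invented and passed in directly, where the real component obtains them from a text classification model. Picking the highest-scoring label is an assumption about how the label is chosen. The function name route is hypothetical.

```python
def route(text, label_scores):
    """Return {label: text} for the best-scoring label (hypothetical helper)."""
    if not isinstance(text, str):
        raise TypeError("text must be a str")
    best_label = max(label_scores, key=label_scores.get)
    return {best_label: text}


german = route("Was ist die Hauptstadt von Deutschland?", {"en": 0.02, "de": 0.97})
english = route("What is the capital of Germany?", {"en": 0.98, "de": 0.01})
print(german)
print(english)
```

Because the output dictionary has exactly one key, only the connection named after that label (for example text_router.de) receives the text in a pipeline.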
haystack_integrations.components.routers.transformers.zero_shot_text_router
TransformersZeroShotTextRouter
Routes the text strings to different connections based on a category label.
Specify the set of labels for categorization when initializing the component.
Usage example
from haystack import Document
from haystack.components.embedders import SentenceTransformersTextEmbedder, SentenceTransformersDocumentEmbedder
from haystack.components.retrievers import InMemoryEmbeddingRetriever
from haystack.core.pipeline import Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.routers.transformers import TransformersZeroShotTextRouter
document_store = InMemoryDocumentStore()
doc_embedder = SentenceTransformersDocumentEmbedder(model="intfloat/e5-base-v2")
docs = [
    Document(
        content="Germany, officially the Federal Republic of Germany, is a country in the western region of "
        "Central Europe. The nation's capital and most populous city is Berlin and its main financial centre "
        "is Frankfurt; the largest urban area is the Ruhr."
    ),
    Document(
        content="France, officially the French Republic, is a country located primarily in Western Europe. "
        "France is a unitary semi-presidential republic with its capital in Paris, the country's largest city "
        "and main cultural and commercial centre; other major urban areas include Marseille, Lyon, Toulouse, "
        "Lille, Bordeaux, Strasbourg, Nantes and Nice."
    )
]
docs_with_embeddings = doc_embedder.run(docs)
document_store.write_documents(docs_with_embeddings["documents"])
p = Pipeline()
p.add_component(instance=TransformersZeroShotTextRouter(labels=["passage", "query"]), name="text_router")
p.add_component(
    instance=SentenceTransformersTextEmbedder(model="intfloat/e5-base-v2", prefix="passage: "),
    name="passage_embedder"
)
p.add_component(
    instance=SentenceTransformersTextEmbedder(model="intfloat/e5-base-v2", prefix="query: "),
    name="query_embedder"
)
p.add_component(
    instance=InMemoryEmbeddingRetriever(document_store=document_store),
    name="query_retriever"
)
p.add_component(
    instance=InMemoryEmbeddingRetriever(document_store=document_store),
    name="passage_retriever"
)
p.connect("text_router.passage", "passage_embedder.text")
p.connect("passage_embedder.embedding", "passage_retriever.query_embedding")
p.connect("text_router.query", "query_embedder.text")
p.connect("query_embedder.embedding", "query_retriever.query_embedding")
# Query Example
p.run({"text_router": {"text": "What is the capital of Germany?"}})
# Passage Example
p.run({
    "text_router": {
        "text": "The United Kingdom of Great Britain and Northern Ireland, commonly known as the "
        "United Kingdom (UK) or Britain, is a country in Northwestern Europe, off the north-western coast of "
        "the continental mainland."
    }
})
init
__init__(
labels: list[str],
multi_label: bool = False,
model: str = "MoritzLaurer/deberta-v3-base-zeroshot-v1.1-all-33",
device: ComponentDevice | None = None,
token: Secret | None = Secret.from_env_var(
["HF_API_TOKEN", "HF_TOKEN"], strict=False
),
huggingface_pipeline_kwargs: dict[str, Any] | None = None,
) -> None
Initializes the TransformersZeroShotTextRouter component.
Parameters:
- labels (list[str]) – The set of labels to use for classification. Can be a single label, a string of comma-separated labels, or a list of labels.
- multi_label (bool) – Indicates if multiple labels can be true. If False, label scores are normalized so their sum equals 1 for each sequence. If True, the labels are considered independent and probabilities are normalized for each candidate by doing a softmax of the entailment score vs. the contradiction score.
- model (str) – The name or path of a Hugging Face model for zero-shot text classification.
- device (ComponentDevice | None) – The device for loading the model. If None, automatically selects the default device. If a device or device map is specified in huggingface_pipeline_kwargs, it overrides this parameter.
- token (Secret | None) – The API token used to download private models from Hugging Face. By default, the token is read from the HF_API_TOKEN or HF_TOKEN environment variable. To generate these tokens, run transformers-cli login.
- huggingface_pipeline_kwargs (dict[str, Any] | None) – A dictionary of keyword arguments for initializing the Hugging Face zero-shot text classification pipeline.
warm_up
Initializes the component.
to_dict
Serializes the component to a dictionary.
Returns:
dict[str, Any] – Dictionary with serialized data.
from_dict
Deserializes the component from a dictionary.
Parameters:
- data (dict[str, Any]) – Dictionary to deserialize from.
Returns:
TransformersZeroShotTextRouter – Deserialized component.
run
Routes the text strings to different connections based on a category label.
Parameters:
- text (str) – A string of text to route.
Returns:
dict[str, str] – A dictionary with the label as key and the text as value.
Raises:
TypeError – If the input is not a str.