Version: 2.30

TransformersChatGenerator

Provides an interface for chat completion using a Hugging Face model that runs locally.

Most common position in a pipeline: After a ChatPromptBuilder
Mandatory init variables: None
Mandatory run variables: messages (a list of ChatMessage objects representing the chat, or a plain string)
Output variables: replies (a list of ChatMessage objects generated by the LLM)
API reference: Transformers
GitHub link: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/transformers
Package name: transformers-haystack

Overview

Because the model runs locally, you may need a powerful machine. The hardware requirements depend strongly on the model you select and its parameter count.

If a string is passed to messages, it is converted into a list containing a single ChatMessage with the user role.
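The conversion described above can be pictured with a small sketch. It does not reproduce the component's internals: `SketchMessage` and `normalize_messages` are hypothetical names introduced here for illustration, and `SketchMessage` only stands in for ChatMessage.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class SketchMessage:
    """Hypothetical stand-in for haystack.dataclasses.ChatMessage."""

    role: str
    text: str


def normalize_messages(
    messages: Union[str, list[SketchMessage]],
) -> list[SketchMessage]:
    """Wrap a plain string in a single user message; pass lists through."""
    if isinstance(messages, str):
        return [SketchMessage(role="user", text=messages)]
    return list(messages)
```

Under this sketch, passing "Hello" behaves the same as passing a one-element list holding a user message with the text "Hello".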

Authentication with a Hugging Face API token is only required to access private or gated models. You can pass the token at initialization with token, or set the HF_API_TOKEN or HF_TOKEN environment variable:

python
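from haystack.utils import Secret
from haystack_integrations.components.generators.transformers import TransformersChatGenerator

generator = TransformersChatGenerator(
    token=Secret.from_token("<your-api-key>"),
)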
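When relying on environment variables instead, it can help to check which one is visible to the process before loading a gated model. The helper below is a minimal sketch using only the standard library. The name `find_hf_token_variable` is hypothetical, and the lookup order (HF_API_TOKEN first, then HF_TOKEN) is an assumption rather than documented behavior of the component.

```python
import os
from typing import Optional


def find_hf_token_variable(
    candidates: tuple[str, ...] = ("HF_API_TOKEN", "HF_TOKEN"),
) -> Optional[str]:
    """Return the name of the first candidate variable that is set and non-empty."""
    # Assumption: the order of `candidates` is illustrative only.
    for name in candidates:
        if os.environ.get(name):
            return name
    return None
```

A return value of None means neither variable is set, which is fine for public models but will likely fail for private or gated ones.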

Streaming

This generator supports streaming tokens from the LLM as they are produced. To enable streaming, pass a callback function to the streaming_callback init parameter.
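A streaming setup might look like the sketch below. It assumes each chunk passed to the callback exposes its text as a `content` attribute, as Haystack's StreamingChunk does. The names `collected`, `collect_and_print`, and `build_streaming_generator` are hypothetical and introduced only for this example.

```python
from typing import Any

# Accumulates streamed text so the full reply can be inspected afterwards.
collected: list[str] = []


def collect_and_print(chunk: Any) -> None:
    """Streaming callback: print each chunk as it arrives and keep a copy."""
    # Assumption: the chunk carries its text in `content`.
    text = getattr(chunk, "content", "") or ""
    collected.append(text)
    print(text, end="", flush=True)


def build_streaming_generator() -> Any:
    """Create a generator that reports tokens through the callback above."""
    # Imported lazily so the callback can be exercised without the integration.
    from haystack_integrations.components.generators.transformers import (
        TransformersChatGenerator,
    )

    return TransformersChatGenerator(
        model="Qwen/Qwen3-0.6B",
        streaming_callback=collect_and_print,
    )
```

Calling run on the generator returned by `build_streaming_generator()` should then print text incrementally, while the complete reply is still returned in replies.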

Usage

Install the transformers-haystack package to use the TransformersChatGenerator:

shell
pip install transformers-haystack

On its own

python
from haystack_integrations.components.generators.transformers import (
    TransformersChatGenerator,
)
from haystack.dataclasses import ChatMessage

generator = TransformersChatGenerator(model="Qwen/Qwen3-0.6B")
messages = [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
print(generator.run(messages))
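The printed result is a dictionary whose replies key holds the generated ChatMessage objects. To work with the generated text alone, a small helper along these lines may be convenient. It is a sketch that assumes each reply exposes its text as a `text` attribute, and `reply_texts` is a hypothetical name.

```python
from typing import Any


def reply_texts(result: dict[str, Any]) -> list[str]:
    """Return the text of every reply in a generator result dictionary."""
    # Assumption: each reply object exposes its generated text as `.text`.
    return [getattr(message, "text", None) or "" for message in result.get("replies", [])]
```

For example, `reply_texts(generator.run(messages))` would return a list of strings, one per reply.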

In a Pipeline

python
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack_integrations.components.generators.transformers import (
    TransformersChatGenerator,
)
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

prompt_builder = ChatPromptBuilder()
llm = TransformersChatGenerator(
    model="Qwen/Qwen3-0.6B",
    token=Secret.from_env_var("HF_API_TOKEN"),
)

pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
# Feed the rendered chat messages into the generator
pipe.connect("prompt_builder.prompt", "llm.messages")
location = "Berlin"
messages = [
    ChatMessage.from_system(
        "Always respond in German even if some input data is in other languages.",
    ),
    ChatMessage.from_user("Tell me about {{location}}"),
]
result = pipe.run(
    data={
        "prompt_builder": {
            "template_variables": {"location": location},
            "template": messages,
        },
    },
)
# Replies are returned under the name of the generator component
print(result["llm"]["replies"])