TransformersChatGenerator
Provides an interface for chat completion using a Hugging Face model that runs locally.
| Most common position in a pipeline | After a ChatPromptBuilder |
| Mandatory init variables | None |
| Mandatory run variables | messages: A list of ChatMessage objects representing the chat or a plain string |
| Output variables | replies: A list of ChatMessage objects generated by the LLM |
| API reference | Transformers |
| GitHub link | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/transformers |
| Package name | transformers-haystack |
Overview
Keep in mind that if LLMs run locally, you may need a powerful machine to run them. This depends strongly on the model you select and its parameter count.
If a string is passed to messages, it is converted into a list containing a single ChatMessage with the user role.
Authentication with a Hugging Face API token is only required to access private or gated models. You can pass the token at initialization with token, or set the HF_API_TOKEN or HF_TOKEN environment variable:
generator = TransformersChatGenerator(
token=Secret.from_token("<your-api-key>"),
)
Streaming
This Generator supports streaming the tokens from the LLM directly in output. To do so, pass a function to the streaming_callback init parameter.
Usage
Install the transformers-haystack package to use the TransformersChatGenerator:
On its own
from haystack_integrations.components.generators.transformers import (
TransformersChatGenerator,
)
from haystack.dataclasses import ChatMessage
generator = TransformersChatGenerator(model="Qwen/Qwen3-0.6B")
messages = [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
print(generator.run(messages))
In a Pipeline
from haystack import Pipeline
from haystack.components.builders.prompt_builder import ChatPromptBuilder
from haystack_integrations.components.generators.transformers import (
TransformersChatGenerator,
)
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret
prompt_builder = ChatPromptBuilder()
llm = TransformersChatGenerator(
model="Qwen/Qwen3-0.6B",
token=Secret.from_env_var("HF_API_TOKEN"),
)
pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.connect("prompt_builder.prompt", "llm.messages")
location = "Berlin"
messages = [
ChatMessage.from_system(
"Always respond in German even if some input data is in other languages.",
),
ChatMessage.from_user("Tell me about {{location}}"),
]
pipe.run(
data={
"prompt_builder": {
"template_variables": {"location": location},
"template": messages,
},
},
)