HuggingFaceLocalChatGenerator
Provides an interface for chat completion using a Hugging Face model that runs locally.
Most common position in a pipeline | After a ChatPromptBuilder |
Mandatory init variables | "token": The Hugging Face API token. Can be set with HF_API_TOKEN or HF_TOKEN env var. |
Mandatory run variables | “messages”: A list of ChatMessage objects representing the chat |
Output variables | “replies”: A list of strings with all the replies generated by the LLM |
API reference | Generators |
GitHub link | https://github.com/deepset-ai/haystack/blob/main/haystack/components/generators/chat/hugging_face_local.py |
Overview
Keep in mind that if LLMs run locally, you may need a powerful machine to run them. This depends strongly on the model you select and its parameter count.
This component is designed for chat completion, not for text generation. If you want to use Hugging Face LLMs for text generation, use
HuggingFaceLocalGenerator
instead.
For remote file authorization, this component uses a HF_API_TOKEN
environment variable by default. Otherwise, you can pass a Hugging Face API token at initialization with token
:
local_generator = HuggingFaceLocalChatGenerator(token=Secret.from_token("<your-api-key>"))
Streaming
This Generator supports streaming the tokens from the LLM directly in output. To do so, pass a function to the streaming_callback
init parameter.
Usage
On its own
from haystack.components.generators.chat import HuggingFaceLocalChatGenerator
from haystack.dataclasses import ChatMessage
generator = HuggingFaceLocalChatGenerator(model="HuggingFaceH4/zephyr-7b-beta")
generator.warm_up()
messages = [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
print(generator.run(messages))
In a Pipeline
from haystack import Pipeline
from haystack.components.builders.prompt_builder import ChatPromptBuilder
from haystack.components.generators.chat import HuggingFaceLocalChatGenerator
from haystack.dataclasses import ChatMessage
prompt_builder = ChatPromptBuilder()
llm = HuggingFaceLocalChatGenerator(model="HuggingFaceH4/zephyr-7b-beta", token=Secret.from_env_var("HF_API_TOKEN"))
pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.connect("prompt_builder.prompt", "llm.messages")
location = "Berlin"
messages = [ChatMessage.from_system("Always respond in German even if some input data is in other languages."),
ChatMessage.from_user("Tell me about {{location}}")]
pipe.run(data={"prompt_builder": {"template_variables":{"location": location}, "template": messages}})
Updated 3 months ago