HuggingFaceTGIGenerator
HuggingFaceTGIGenerator enables text generation using Hugging Face Hub-hosted non-chat LLMs.
| Name | HuggingFaceTGIGenerator |
| Folder Path | /generators/hugging_face_tgi |
| Most common Position in a Pipeline | After a PromptBuilder |
| Mandatory Input variables | "prompt": a string containing the prompt for the LLM |
| Output variables | "replies": a list of strings with all the replies generated by the LLM; "meta": a list of dictionaries with the metadata associated with each reply, such as token count, finish reason, and others |
Overview
This component is designed to seamlessly utilize models deployed on the Text Generation Inference (TGI) backend.
For an example of this component being used, check out this 🧑🍳 Cookbook
Using Hugging Face Inference API
The component uses the HF_API_TOKEN environment variable by default. Otherwise, you can pass a Hugging Face API token at initialization with the token parameter – see the code examples below.
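For instance, if HF_API_TOKEN is already set in your environment, you don't need to pass a token explicitly. A minimal sketch, assuming the environment variable is set and mistralai/Mistral-7B-v0.1 is used as the model:
from haystack.components.generators import HuggingFaceTGIGenerator

# Authentication falls back to the HF_API_TOKEN environment variable
client = HuggingFaceTGIGenerator(model="mistralai/Mistral-7B-v0.1")
client.warm_up()
print(client.run("What's Natural Language Processing?")["replies"][0])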
You can use this component for LLMs hosted on Hugging Face Inference endpoints (the rate-limited Inference API tier):
from haystack.components.generators import HuggingFaceTGIGenerator
from haystack.utils import Secret

client = HuggingFaceTGIGenerator(model="mistralai/Mistral-7B-v0.1", token=Secret.from_token("<your-api-key>"))
client.warm_up()
response = client.run("What's Natural Language Processing?")
print(response)
For LLMs hosted on a paid endpoint or your own custom TGI endpoint, you'll need to provide the URL of the endpoint as well as a valid token:
from haystack.components.generators import HuggingFaceTGIGenerator
from haystack.utils import Secret

client = HuggingFaceTGIGenerator(model="mistralai/Mistral-7B-v0.1", url="<your-tgi-endpoint-url>", token=Secret.from_token("<your-api-key>"))
client.warm_up()
response = client.run("What's Natural Language Processing?")
print(response)
Key Features
- Hugging Face Inference Endpoints. Supports usage of TGI LLMs deployed on Hugging Face Inference endpoints.
- Inference API Support. Supports usage of TGI LLMs hosted on the rate-limited Inference API tier. Discover available LLMs using the following command: wget -qO- https://api-inference.huggingface.co/framework/text-generation-inference and simply use the model ID as the model parameter for this component. You'll also need to provide a valid Hugging Face API token as the token parameter.
- Custom TGI Endpoints. Supports usage of LLMs deployed on custom TGI endpoints. Anyone can deploy their own TGI endpoint using the TGI framework; a sketch of connecting to one follows this list.
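If you already have a TGI server running, you can point the component straight at it. A minimal sketch, assuming a self-hosted TGI endpoint is reachable at http://localhost:8080 and does not require authentication (the URL and model name here are placeholders):
from haystack.components.generators import HuggingFaceTGIGenerator

# Connect to a self-hosted TGI endpoint; no Hugging Face token is passed here,
# assuming the endpoint itself does not enforce authentication
client = HuggingFaceTGIGenerator(model="mistralai/Mistral-7B-v0.1", url="http://localhost:8080")
client.warm_up()
print(client.run("What's Natural Language Processing?")["replies"][0])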
For more information on TGI, visit https://github.com/huggingface/text-generation-inference.
Learn more about the Inference API at https://huggingface.co/inference-api.
This component is designed for text generation, not for chat. If you want to use these LLMs for chat, use HuggingFaceTGIChatGenerator instead.
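For reference, a minimal sketch of the chat variant, assuming HuggingFaceTGIChatGenerator is importable from haystack.components.generators.chat and an instruct-tuned model is used (both assumptions, not part of this page):
from haystack.components.generators.chat import HuggingFaceTGIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

# The chat generator takes a list of ChatMessage objects rather than a raw prompt string
client = HuggingFaceTGIChatGenerator(model="mistralai/Mistral-7B-Instruct-v0.1", token=Secret.from_token("<your-api-key>"))
client.warm_up()
response = client.run([ChatMessage.from_user("What's Natural Language Processing?")])
print(response["replies"])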
Usage
On its own
from haystack.components.generators import HuggingFaceTGIGenerator
from haystack.utils import Secret

client = HuggingFaceTGIGenerator(model="mistralai/Mistral-7B-v0.1", token=Secret.from_token("<your-api-key>"))
client.warm_up()
response = client.run("What's Natural Language Processing?")
print(response)
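The run() call returns a dictionary with the "replies" and "meta" keys listed in the table above, so the generated text can be read out directly, for example:
# "replies" holds the generated strings; "meta" holds per-reply metadata
print(response["replies"][0])
print(response["meta"][0])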
In a Pipeline
from haystack import Pipeline
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.generators import HuggingFaceTGIGenerator
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack import Document
from haystack.utils import Secret
docstore = InMemoryDocumentStore()
docstore.write_documents([Document(content="Rome is the capital of Italy"), Document(content="Paris is the capital of France")])
query = "What is the capital of France?"
template = """
Given the following information, answer the question.
Context:
{% for document in documents %}
{{ document.content }}
{% endfor %}
Question: {{ query }}?
"""
pipe = Pipeline()
pipe.add_component("retriever", InMemoryBM25Retriever(document_store=docstore))
pipe.add_component("prompt_builder", PromptBuilder(template=template))
pipe.add_component("llm", HuggingFaceTGIGenerator(model="mistralai/Mistral-7B-v0.1", token=Secret.from_token("<your-api-key>")))
pipe.connect("retriever", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")
res = pipe.run({
    "prompt_builder": {"query": query},
    "retriever": {"query": query}
})
print(res)
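Pipeline results are keyed by component name, so the generator's output appears under "llm". For example, to read the first reply (assuming the run above succeeded):
# The generator's replies are nested under its component name in the pipeline result
print(res["llm"]["replies"][0])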
See parameter details in our API reference.
