HuggingFaceAPITextEmbedder
Use this component to embed strings using various Hugging Face APIs.
Most common position in a pipeline | Before an embedding Retriever in a query/RAG pipeline |
Mandatory init variables | "api_type": The type of Hugging Face API to use "api_params": A dictionary with one of the following keys: - model : Hugging Face model ID. Required when api_type is SERVERLESS_INFERENCE_API .OR - url : URL of the inference endpoint. Required when api_type is INFERENCE_ENDPOINTS or TEXT_EMBEDDINGS_INFERENCE ."token": The Hugging Face API token. Can be set with HF_API_TOKEN or HF_TOKEN env var. |
Mandatory run variables | “text”: A string |
Output variables | “embedding”: A list of float numbers |
API reference | Embedders |
GitHub link | https://github.com/deepset-ai/haystack/blob/main/haystack/components/embedders/hugging_face_api_text_embedder.py |
Overview
HuggingFaceAPITextEmbedder
can be used to embed strings using different Hugging Face APIs:
This component should be used to embed plain text. To embed a list of documents, use
HuggingFaceAPIDocumentEmbedder
.
The component uses a HF_API_TOKEN
environment variable by default. Otherwise, you can pass a Hugging Face API token at initialization with token
– see code examples below.
The token is needed:
- If you use the Serverless Inference API, or
- If you use the Inference Endpoints.
Usage
Similarly to other text Embedders, this component allows adding prefixes (and postfixes) to include instructions.
For more fine-grained details, refer to the component’s API reference.
On its own
Using Free Serverless Inference API
Formerly known as (free) Hugging Face Inference API, this API allows you to quickly experiment with many models hosted on the Hugging Face Hub, offloading the inference to Hugging Face servers. It’s rate-limited and not meant for production.
To use this API, you need a free Hugging Face token.
The Embedder expects the model
in api_params
.
from haystack.components.embedders import HuggingFaceAPITextEmbedder
from haystack.utils import Secret
text_embedder = HuggingFaceAPITextEmbedder(api_type="serverless_inference_api",
api_params={"model": "BAAI/bge-small-en-v1.5"},
token=Secret.from_token("<your-api-key>"))
print(text_embedder.run("I love pizza!"))
# {'embedding': [0.017020374536514282, -0.023255806416273117, ...]}
Using Paid Inference Endpoints
In this case, a private instance of the model is deployed by Hugging Face, and you typically pay per hour.
To understand how to spin up an Inference Endpoint, visit Hugging Face documentation.
Additionally, in this case, you need to provide your Hugging Face token.
The Embedder expects the url
of your endpoint in api_params
.
from haystack.components.embedders import HuggingFaceAPITextEmbedder
from haystack.utils import Secret
text_embedder = HuggingFaceAPITextEmbedder(api_type="inference_endpoints",
api_params={"model": "BAAI/bge-small-en-v1.5"},
token=Secret.from_token("<your-api-key>"))
print(text_embedder.run("I love pizza!"))
# {'embedding': [0.017020374536514282, -0.023255806416273117, ...]}
Using Self-Hosted Text Embeddings Inference (TEI)
Hugging Face Text Embeddings Inference is a toolkit for efficiently deploying and serving text embedding models.
While it powers the most recent versions of Serverless Inference API and Inference Endpoints, it can be used easily on-premise through Docker.
For example, you can run a TEI container as follows:
model=BAAI/bge-large-en-v1.5
revision=refs/pr/5
volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run
docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.2 --model-id $model --revision $revision
For more information, refer to the official TEI repository.
The Embedder expects the url
of your TEI instance in api_params
.
from haystack.components.embedders import HuggingFaceAPITextEmbedder
from haystack.utils import Secret
text_embedder = HuggingFaceAPITextEmbedder(api_type="text_embeddings_inference",
api_params={"url": "http://localhost:8080"})
print(text_embedder.run("I love pizza!"))
# {'embedding': [0.017020374536514282, -0.023255806416273117, ...],
In a pipeline
from haystack import Document
from haystack import Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.embedders import HuggingFaceAPITextEmbedder, HuggingFaceAPIDocumentEmbedder
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
document_store = InMemoryDocumentStore(embedding_similarity_function="cosine")
documents = [Document(content="My name is Wolfgang and I live in Berlin"),
Document(content="I saw a black horse running"),
Document(content="Germany has many big cities")]
document_embedder = HuggingFaceAPIDocumentEmbedder(api_type="serverless_inference_api",
api_params={"model": "BAAI/bge-small-en-v1.5"})
documents_with_embeddings = document_embedder.run(documents)['documents']
document_store.write_documents(documents_with_embeddings)
text_embedder = HuggingFaceAPITextEmbedder(api_type="serverless_inference_api",
api_params={"model": "BAAI/bge-small-en-v1.5"})
query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", text_embedder)
query_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
query = "Who lives in Berlin?"
result = query_pipeline.run({"text_embedder":{"text": query}})
print(result['retriever']['documents'][0])
# Document(id=..., content: 'My name is Wolfgang and I live in Berlin', ...)
Updated 5 months ago