Name	HuggingFaceTEIDocumentEmbedder
Folder Path	/embedders/
Most common Position in a Pipeline	Before a `DocumentWriter` in an indexing Pipeline
Mandatory Input variables	“documents”: a list of Document objects to be embedded
Output variables	“documents”: a list of Document objects (enriched with embeddings)

This component should be used to embed a list of Documents. To embed a string, you should use HuggingFaceTEITextEmbedder.

Overview

This component is designed to compute embeddings using the Text Embeddings Inference (TEI) library. TEI is a toolkit for deploying and serving open source text embedding models with high performance on both GPU and CPU.
TEI has a permissive but not fully open source license.

By default, the component reads the token from the HF_API_TOKEN environment variable. Alternatively, you can pass a Hugging Face API token at initialization with the token parameter (see the code examples below).
The token is needed:

If you use the Inference API
If you use the Inference Endpoints
If you use a self-hosted TEI endpoint with a private/gated model

If you use a self-hosted TEI endpoint with a totally open model, the token is not required.
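The token rules above can be summarized in a few lines of Python. This is only an illustrative sketch, not the component's own code: the deployment labels and the helper names `token_required` and `resolve_token` are hypothetical, and the only detail taken from this page is the HF_API_TOKEN variable name.

```python
import os
from typing import Optional

# Hypothetical labels for the three setups described above.
INFERENCE_API = "inference_api"
INFERENCE_ENDPOINTS = "inference_endpoints"
SELF_HOSTED = "self_hosted"


def token_required(deployment: str, model_is_gated: bool = False) -> bool:
    """Return True when the given setup needs a Hugging Face API token."""
    if deployment in (INFERENCE_API, INFERENCE_ENDPOINTS):
        return True
    if deployment == SELF_HOSTED:
        # A self-hosted endpoint needs a token only for private/gated models.
        return model_is_gated
    raise ValueError(f"Unknown deployment type: {deployment!r}")


def resolve_token(explicit_token: Optional[str] = None) -> Optional[str]:
    """Use an explicitly passed token if given, else fall back to HF_API_TOKEN."""
    if explicit_token:
        return explicit_token
    return os.environ.get("HF_API_TOKEN")
```

A check like this can be useful in deployment scripts to fail early when a token is missing for a setup that needs one.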

Key Features

Hugging Face Inference Endpoints. Supports usage of embedding models deployed on Hugging Face Inference endpoints.
Inference API Support. Supports usage of embedding models hosted on the rate-limited Inference API tier. To discover available embedding models, run `wget -qO- https://api-inference.huggingface.co/framework/sentence-transformers`, then use the model ID as the model parameter for this component. You'll also need to provide a valid Hugging Face API token as the token parameter. This solution is only suitable for experimental purposes.
Custom TEI Endpoints. Supports usage of embedding models deployed on custom TEI endpoints. A custom TEI endpoint can be easily run using Docker (TEI documentation).
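Before pointing the component at a custom TEI endpoint, it can help to confirm that the server responds. The following is a minimal sketch using only the standard library, assuming the server exposes TEI's `/embed` route and accepts a JSON body with an `inputs` field. The function names and the localhost URL in the usage note are placeholders, not part of Haystack.

```python
import json
import urllib.request
from typing import List, Optional


def build_embed_request(
    base_url: str, texts: List[str], token: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a TEI server's /embed route."""
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: protected endpoints expect a standard Bearer token header.
        headers["Authorization"] = f"Bearer {token}"
    payload = json.dumps({"inputs": texts}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/embed",
        data=payload,
        headers=headers,
        method="POST",
    )


def embed(base_url: str, texts: List[str], token: Optional[str] = None):
    """Send the request and return the parsed JSON response."""
    request = build_embed_request(base_url, texts, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, if a TEI container is listening locally on port 8080 (a hypothetical setup), `embed("http://localhost:8080", ["I love pizza!"])` should return one embedding per input text.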

📘
More Information

For more information on TEI, visit https://github.com/huggingface/text-embeddings-inference.

Learn more about the Inference API at https://huggingface.co/inference-api.

Usage

On its own

You can use this component with embedding models hosted on the free, rate-limited tier of the Hugging Face Inference API:

from haystack.dataclasses import Document
from haystack.components.embedders import HuggingFaceTEIDocumentEmbedder
from haystack.utils import Secret

doc = Document(content="I love pizza!")

document_embedder = HuggingFaceTEIDocumentEmbedder(
    model="BAAI/bge-small-en-v1.5", token=Secret.from_token("<your-api-key>")
)

result = document_embedder.run([doc])
print(result["documents"][0].embedding)

# [0.017020374536514282, -0.023255806416273117, ...]

You can also use this component with embedding models hosted on paid Inference Endpoints (https://huggingface.co/inference-endpoints) or on your own custom TEI endpoint. In both cases, you need to provide the URL of the endpoint. If you use Inference Endpoints or a self-hosted endpoint with a private/gated model, you also need to pass a valid token.

from haystack.dataclasses import Document
from haystack.components.embedders import HuggingFaceTEIDocumentEmbedder
from haystack.utils import Secret

doc = Document(content="I love pizza!")

document_embedder = HuggingFaceTEIDocumentEmbedder(
    model="BAAI/bge-small-en-v1.5", url="<your-tei-endpoint-url>", token=Secret.from_token("<your-api-key>")
)

result = document_embedder.run([doc])
print(result["documents"][0].embedding)

# [0.017020374536514282, -0.023255806416273117, ...]

In a Pipeline

from haystack import Document
from haystack import Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.embedders import HuggingFaceTEITextEmbedder, HuggingFaceTEIDocumentEmbedder
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever

document_store = InMemoryDocumentStore(embedding_similarity_function="cosine")

documents = [Document(content="My name is Wolfgang and I live in Berlin"),
             Document(content="I saw a black horse running"),
             Document(content="Germany has many big cities")]

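# Embed the documents, then write them to the document store.
# With no arguments, the embedder reads the token from HF_API_TOKEN.
document_embedder = HuggingFaceTEIDocumentEmbedder()
documents_with_embeddings = document_embedder.run(documents)['documents']
document_store.write_documents(documents_with_embeddings)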

query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", HuggingFaceTEITextEmbedder())
query_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")

query = "Who lives in Berlin?"

result = query_pipeline.run({"text_embedder":{"text": query}})

print(result['retriever']['documents'][0])

# Document(id=..., mimetype: 'text/plain', 
#  text: 'My name is Wolfgang and I live in Berlin')