
OptimumDocumentEmbedder

A component to compute documents’ embeddings using models loaded with the Hugging Face Optimum library.

Name: OptimumDocumentEmbedder
Source: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/optimum
Most common position in a pipeline: Before a DocumentWriter in an indexing pipeline
Mandatory input variables: “documents”: A list of documents
Output variables: “documents”: A list of documents enriched with embeddings

Overview

OptimumDocumentEmbedder embeds the text of documents using models loaded with the Hugging Face Optimum library. It uses ONNX Runtime for high-speed inference.

The default model is sentence-transformers/all-mpnet-base-v2.

Like other Embedders, this component lets you add prefixes (and suffixes) to the text, for example to include instructions. For more details, refer to the component’s API reference.
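As a rough illustration of what a prefix or suffix does, the sketch below wraps each text before it would be passed to a model. `apply_affixes` is a hypothetical helper written for this example and is not part of the integration. The `"passage: "` instruction follows the convention of E5-style models; check the model card for the instruction your model expects.

```python
def apply_affixes(texts, prefix="", suffix=""):
    """Return each text wrapped with the given prefix and suffix.

    Hypothetical helper for illustration only; the component applies
    its own prefix/suffix handling internally.
    """
    return [f"{prefix}{text}{suffix}" for text in texts]


texts = ["I love pizza!", "Germany has many big cities"]
prepared = apply_affixes(texts, prefix="passage: ")
print(prepared)
```

With the component itself, the same effect would come from its prefix and suffix settings rather than from a separate helper.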

There are three useful parameters specific to the Optimum Embedder that you can control with various modes:

  • Pooling: generate a fixed-sized sentence embedding from a variable-length sequence of token embeddings
  • Optimization: apply graph optimization to the model and improve inference speed
  • Quantization: reduce the computational and memory costs

Find all the available mode details in our Optimum API Reference.
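To make the pooling idea concrete, here is a minimal pure-Python sketch of mean pooling, one of the available pooling modes. It averages the token vectors of a sentence while skipping padding positions. The function name `mean_pool` and the toy vectors are assumptions made for this example; the component performs pooling internally on model outputs.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1.

    token_embeddings: list of equal-length float lists, one per token.
    attention_mask: list of 0/1 flags, 0 marking padding tokens.
    """
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # avoid dividing by zero for an all-padding input
    return [value / count for value in summed]


# Three token vectors; the last one is padding and is ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)  # [2.0, 3.0]
```

However many tokens the input has, the result always has the model's embedding dimension, which is what makes the embeddings comparable across documents.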

Authentication

The component uses a HF_API_TOKEN environment variable by default. Otherwise, you can pass a Hugging Face API token at initialization with the token parameter.

The token is needed:

  • If you use the Serverless Inference API, or
  • If you use the Inference Endpoints.
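The lookup order described above can be sketched in plain Python as follows. `resolve_token` is a hypothetical helper for illustration, and the precedence it shows (an explicitly passed token wins over the environment variable) is an assumption; the component handles token resolution itself.

```python
import os


def resolve_token(explicit_token=None, env_var="HF_API_TOKEN"):
    """Return the explicit token if given, else the environment variable's value.

    Hypothetical helper: returns None when neither source provides a token.
    """
    if explicit_token is not None:
        return explicit_token
    return os.environ.get(env_var)


# Placeholder value for the example; never hard-code a real token.
os.environ["HF_API_TOKEN"] = "hf_example_token"
print(resolve_token())             # falls back to the environment variable
print(resolve_token("hf_passed"))  # the explicit token takes precedence
```

Keeping the token in the environment avoids committing it to source control.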

Usage

To start using this integration with Haystack, install it with:

pip install optimum-haystack

On its own

from haystack.dataclasses import Document
from haystack_integrations.components.embedders.optimum import OptimumDocumentEmbedder

doc = Document(content="I love pizza!")

document_embedder = OptimumDocumentEmbedder(model="sentence-transformers/all-mpnet-base-v2")
document_embedder.warm_up()

result = document_embedder.run([doc])
print(result["documents"][0].embedding)

# [0.017020374536514282, -0.023255806416273117, ...]

In a pipeline

from haystack import Pipeline
from haystack import Document
from haystack_integrations.components.embedders.optimum import (
    OptimumDocumentEmbedder,
    OptimumEmbedderPooling,
    OptimumEmbedderOptimizationConfig,
    OptimumEmbedderOptimizationMode,
)

documents = [
    Document(content="My name is Wolfgang and I live in Berlin"),
    Document(content="I saw a black horse running"),
    Document(content="Germany has many big cities"),
]

embedder = OptimumDocumentEmbedder(
    model="intfloat/e5-base-v2",
    normalize_embeddings=True,
    onnx_execution_provider="CUDAExecutionProvider",
    optimizer_settings=OptimumEmbedderOptimizationConfig(
        mode=OptimumEmbedderOptimizationMode.O4,
        for_gpu=True,
    ),
    working_dir="/tmp/optimum",
    pooling_mode=OptimumEmbedderPooling.MEAN,
)

pipeline = Pipeline()
pipeline.add_component("embedder", embedder)

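# Keep the pipeline output so the embedded documents can be inspected.
results = pipeline.run({"embedder": {"documents": documents}})

# The embedder returns the documents enriched with embeddings under "documents".
print(results["embedder"]["documents"][0].embedding)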

Related Links

Check out the API reference in the GitHub repo or in our docs: