
FastEmbed integration for Haystack

Module haystack_integrations.components.embedders.fastembed.fastembed_document_embedder

FastembedDocumentEmbedder

FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models. The embedding of each Document is stored in the embedding field of the Document.

Usage example:

# To use this component, install the "fastembed-haystack" package.
# pip install fastembed-haystack

from haystack_integrations.components.embedders.fastembed import FastembedDocumentEmbedder
from haystack.dataclasses import Document

doc_embedder = FastembedDocumentEmbedder(
    model="BAAI/bge-small-en-v1.5",
    batch_size=256,
)

doc_embedder.warm_up()

# Text taken from PubMed QA Dataset (https://huggingface.co/datasets/pubmed_qa)
document_list = [
    Document(
        content=("Oxidative stress generated within inflammatory joints can produce autoimmune phenomena and joint "
                 "destruction. Radical species with oxidative activity, including reactive nitrogen species, "
                 "represent mediators of inflammation and cartilage damage."),
        meta={
            "pubid": "25,445,628",
            "long_answer": "yes",
        },
    ),
    Document(
        content=("Plasma levels of pancreatic polypeptide (PP) rise upon food intake. Although other pancreatic "
                 "islet hormones, such as insulin and glucagon, have been extensively investigated, PP secretion "
                 "and actions are still poorly understood."),
        meta={
            "pubid": "25,445,712",
            "long_answer": "yes",
        },
    ),
]

result = doc_embedder.run(document_list)
print(f"Document Text: {result['documents'][0].content}")
print(f"Document Embedding: {result['documents'][0].embedding}")
print(f"Embedding Dimension: {len(result['documents'][0].embedding)}")

FastembedDocumentEmbedder.__init__

def __init__(model: str = "BAAI/bge-small-en-v1.5",
             cache_dir: Optional[str] = None,
             threads: Optional[int] = None,
             prefix: str = "",
             suffix: str = "",
             batch_size: int = 256,
             progress_bar: bool = True,
             parallel: Optional[int] = None,
             local_files_only: bool = False,
             meta_fields_to_embed: Optional[List[str]] = None,
             embedding_separator: str = "\n")

Create a FastembedDocumentEmbedder component.

Arguments:

  • model: Local path or name of the model in Hugging Face's model hub, such as BAAI/bge-small-en-v1.5.
  • cache_dir: The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory.
  • threads: The number of threads a single onnxruntime session can use. Defaults to None.
  • prefix: A string to add to the beginning of each text.
  • suffix: A string to add to the end of each text.
  • batch_size: Number of strings to encode at once.
  • progress_bar: If True, displays a progress bar during embedding.
  • parallel: If > 1, data-parallel encoding is used (recommended for offline encoding of large datasets). If 0, use all available cores. If None, don't use data-parallel processing; use the default onnxruntime threading instead.
  • local_files_only: If True, only use the model files in the cache_dir.
  • meta_fields_to_embed: List of meta fields that should be embedded along with the Document content.
  • embedding_separator: Separator used to concatenate the meta fields to the Document content (see the sketch after this list).
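
For instance, meta fields can be prepended to the Document content before embedding. A minimal sketch, assuming a Document with a "title" meta field (the field name is illustrative):

from haystack.dataclasses import Document
from haystack_integrations.components.embedders.fastembed import FastembedDocumentEmbedder

# "title" is an illustrative meta key; any string-valued meta field works.
doc = Document(
    content="Radical species with oxidative activity represent mediators of inflammation.",
    meta={"title": "Oxidative stress in inflammatory joints"},
)

embedder = FastembedDocumentEmbedder(
    model="BAAI/bge-small-en-v1.5",
    meta_fields_to_embed=["title"],
    embedding_separator="\n",
)
embedder.warm_up()

# The text actually embedded is:
# "Oxidative stress in inflammatory joints\nRadical species with oxidative activity represent mediators of inflammation."
result = embedder.run([doc])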

FastembedDocumentEmbedder.to_dict

def to_dict() -> Dict[str, Any]

Serializes the component to a dictionary.

Returns:

Dictionary with serialized data.
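
The resulting dictionary follows Haystack's standard component serialization format; a minimal sketch (the exact type path and parameter set may vary between versions):

doc_embedder = FastembedDocumentEmbedder(model="BAAI/bge-small-en-v1.5")
data = doc_embedder.to_dict()
# data has the shape {"type": "<import path of the component>", "init_parameters": {...}}
print(data["init_parameters"]["model"])  # BAAI/bge-small-en-v1.5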

FastembedDocumentEmbedder.warm_up

def warm_up()

Initializes the component.

FastembedDocumentEmbedder.run

@component.output_types(documents=List[Document])
def run(documents: List[Document])

Embeds a list of Documents.

Arguments:

  • documents: List of Documents to embed.

Returns:

A dictionary with the following keys:

  • documents: List of Documents with each Document's embedding field set to the computed embeddings.
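
In a typical indexing pipeline, this component sits just before a DocumentWriter. A minimal sketch, assuming the in-memory document store that ships with Haystack:

from haystack import Document, Pipeline
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.embedders.fastembed import FastembedDocumentEmbedder

document_store = InMemoryDocumentStore()

indexing = Pipeline()
indexing.add_component("embedder", FastembedDocumentEmbedder(model="BAAI/bge-small-en-v1.5"))
indexing.add_component("writer", DocumentWriter(document_store=document_store))
indexing.connect("embedder.documents", "writer.documents")

# The pipeline calls warm_up() on the embedder before running it.
indexing.run({"embedder": {"documents": [Document(content="FastEmbed is a lightweight embedding library.")]}})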

Module haystack_integrations.components.embedders.fastembed.fastembed_text_embedder

FastembedTextEmbedder

FastembedTextEmbedder computes string embeddings using Fastembed embedding models.

Usage example:

from haystack_integrations.components.embedders.fastembed import FastembedTextEmbedder

text = ("It clearly says online this will work on a Mac OS system. "
        "The disk comes and it does not, only Windows. Do Not order this if you have a Mac!!")

text_embedder = FastembedTextEmbedder(
    model="BAAI/bge-small-en-v1.5"
)
text_embedder.warm_up()

embedding = text_embedder.run(text)["embedding"]

FastembedTextEmbedder.__init__

def __init__(model: str = "BAAI/bge-small-en-v1.5",
             cache_dir: Optional[str] = None,
             threads: Optional[int] = None,
             prefix: str = "",
             suffix: str = "",
             progress_bar: bool = True,
             parallel: Optional[int] = None,
             local_files_only: bool = False)

Create a FastembedTextEmbedder component.

Arguments:

  • model: Local path or name of the model in Fastembed's model hub, such as BAAI/bge-small-en-v1.5.
  • cache_dir: The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory.
  • threads: The number of threads a single onnxruntime session can use. Defaults to None.
  • prefix: A string to add to the beginning of each text (see the sketch after this list).
  • suffix: A string to add to the end of each text.
  • progress_bar: If True, displays a progress bar during embedding.
  • parallel: If > 1, data-parallel encoding is used (recommended for offline encoding of large datasets). If 0, use all available cores. If None, don't use data-parallel processing; use the default onnxruntime threading instead.
  • local_files_only: If True, only use the model files in the cache_dir.
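
Some embedding models expect an instruction prefix on queries. A minimal sketch, assuming an E5-family model, which is conventionally queried with a "query: " prefix (check your model's card and Fastembed's supported-model list):

from haystack_integrations.components.embedders.fastembed import FastembedTextEmbedder

query_embedder = FastembedTextEmbedder(
    model="intfloat/multilingual-e5-large",  # illustrative model choice
    prefix="query: ",
)
query_embedder.warm_up()
embedding = query_embedder.run("How does oxidative stress affect joints?")["embedding"]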

FastembedTextEmbedder.to_dict

def to_dict() -> Dict[str, Any]

Serializes the component to a dictionary.

Returns:

Dictionary with serialized data.

FastembedTextEmbedder.warm_up

def warm_up()

Initializes the component.

FastembedTextEmbedder.run

@component.output_types(embedding=List[float])
def run(text: str)

Embeds text using the Fastembed model.

Arguments:

  • text: A string to embed.

Raises:

  • TypeError: If the input is not a string.
  • RuntimeError: If the embedding model has not been loaded.

Returns:

A dictionary with the following keys:

  • embedding: A list of floats representing the embedding of the input text.
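
On the query side, this embedder is typically paired with an embedding retriever, using the same model that embedded the documents. A minimal sketch, assuming a Haystack in-memory store already populated by FastembedDocumentEmbedder:

from haystack import Pipeline
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.embedders.fastembed import FastembedTextEmbedder

document_store = InMemoryDocumentStore()  # assumed to already contain embedded Documents

query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", FastembedTextEmbedder(model="BAAI/bge-small-en-v1.5"))
query_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")

results = query_pipeline.run({"text_embedder": {"text": "What is pancreatic polypeptide?"}})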

Module haystack_integrations.components.embedders.fastembed.fastembed_sparse_document_embedder

FastembedSparseDocumentEmbedder

FastembedSparseDocumentEmbedder computes Document embeddings using Fastembed sparse models.

Usage example:

from haystack_integrations.components.embedders.fastembed import FastembedSparseDocumentEmbedder
from haystack.dataclasses import Document

sparse_doc_embedder = FastembedSparseDocumentEmbedder(
    model="prithivida/Splade_PP_en_v1",
    batch_size=32,
)

sparse_doc_embedder.warm_up()

# Text taken from PubMed QA Dataset (https://huggingface.co/datasets/pubmed_qa)
document_list = [
    Document(
        content=("Oxidative stress generated within inflammatory joints can produce autoimmune phenomena and joint "
                 "destruction. Radical species with oxidative activity, including reactive nitrogen species, "
                 "represent mediators of inflammation and cartilage damage."),
        meta={
            "pubid": "25,445,628",
            "long_answer": "yes",
        },
    ),
    Document(
        content=("Plasma levels of pancreatic polypeptide (PP) rise upon food intake. Although other pancreatic "
                 "islet hormones, such as insulin and glucagon, have been extensively investigated, PP secretion "
                 "and actions are still poorly understood."),
        meta={
            "pubid": "25,445,712",
            "long_answer": "yes",
        },
    ),
]

result = sparse_doc_embedder.run(document_list)
print(f"Document Text: {result['documents'][0].content}")
print(f"Document Sparse Embedding: {result['documents'][0].sparse_embedding}")
print(f"Sparse Embedding Dimension: {len(result['documents'][0].sparse_embedding)}")

FastembedSparseDocumentEmbedder.__init__

def __init__(model: str = "prithivida/Splade_PP_en_v1",
             cache_dir: Optional[str] = None,
             threads: Optional[int] = None,
             batch_size: int = 32,
             progress_bar: bool = True,
             parallel: Optional[int] = None,
             local_files_only: bool = False,
             meta_fields_to_embed: Optional[List[str]] = None,
             embedding_separator: str = "\n",
             model_kwargs: Optional[Dict[str, Any]] = None)

Create a FastembedSparseDocumentEmbedder component.

Arguments:

  • model: Local path or name of the model in Hugging Face's model hub, such as prithivida/Splade_PP_en_v1.
  • cache_dir: The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory.
  • threads: The number of threads a single onnxruntime session can use.
  • batch_size: Number of strings to encode at once.
  • progress_bar: If True, displays a progress bar during embedding.
  • parallel: If > 1, data-parallel encoding is used (recommended for offline encoding of large datasets). If 0, use all available cores. If None, don't use data-parallel processing; use the default onnxruntime threading instead.
  • local_files_only: If True, only use the model files in the cache_dir.
  • meta_fields_to_embed: List of meta fields that should be embedded along with the Document content.
  • embedding_separator: Separator used to concatenate the meta fields to the Document content.
  • model_kwargs: Dictionary containing model parameters such as k, b, avg_len, language (see the sketch after this list).
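
The k, b, avg_len, and language parameters apply to BM25-style sparse models rather than SPLADE models. A minimal sketch, assuming the Qdrant/bm25 model available in Fastembed (the values shown are the usual BM25 defaults):

from haystack_integrations.components.embedders.fastembed import FastembedSparseDocumentEmbedder

bm25_embedder = FastembedSparseDocumentEmbedder(
    model="Qdrant/bm25",
    model_kwargs={"k": 1.2, "b": 0.75, "avg_len": 256.0, "language": "english"},
)
bm25_embedder.warm_up()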

FastembedSparseDocumentEmbedder.to_dict

def to_dict() -> Dict[str, Any]

Serializes the component to a dictionary.

Returns:

Dictionary with serialized data.

FastembedSparseDocumentEmbedder.warm_up

def warm_up()

Initializes the component.

FastembedSparseDocumentEmbedder.run

@component.output_types(documents=List[Document])
def run(documents: List[Document])

Embeds a list of Documents.

Arguments:

  • documents: List of Documents to embed.

Returns:

A dictionary with the following keys:

  • documents: List of Documents with each Document's sparse_embedding field set to the computed embeddings.
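
Unlike a dense embedding, a sparse embedding is not a plain list of floats: Haystack represents it as a SparseEmbedding object holding parallel lists of indices and values. A minimal sketch of inspecting one, reusing the objects from the usage example above:

result = sparse_doc_embedder.run(document_list)
sparse = result["documents"][0].sparse_embedding
print(sparse.indices[:5])  # token ids with non-zero weight
print(sparse.values[:5])   # the corresponding weights
print(f"Non-zero dimensions: {len(sparse.indices)}")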

Module haystack_integrations.components.embedders.fastembed.fastembed_sparse_text_embedder

FastembedSparseTextEmbedder

FastembedSparseTextEmbedder computes string embeddings using Fastembed sparse models.

Usage example:

from haystack_integrations.components.embedders.fastembed import FastembedSparseTextEmbedder

text = ("It clearly says online this will work on a Mac OS system. "
        "The disk comes and it does not, only Windows. Do Not order this if you have a Mac!!")

sparse_text_embedder = FastembedSparseTextEmbedder(
    model="prithivida/Splade_PP_en_v1"
)
sparse_text_embedder.warm_up()

sparse_embedding = sparse_text_embedder.run(text)["sparse_embedding"]

FastembedSparseTextEmbedder.__init__

def __init__(model: str = "prithivida/Splade_PP_en_v1",
             cache_dir: Optional[str] = None,
             threads: Optional[int] = None,
             progress_bar: bool = True,
             parallel: Optional[int] = None,
             local_files_only: bool = False,
             model_kwargs: Optional[Dict[str, Any]] = None)

Create a FastembedSparseTextEmbedder component.

Arguments:

  • model: Local path or name of the model in Fastembed's model hub, such as prithivida/Splade_PP_en_v1.
  • cache_dir: The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory.
  • threads: The number of threads a single onnxruntime session can use. Defaults to None.
  • progress_bar: If True, displays a progress bar during embedding.
  • parallel: If > 1, data-parallel encoding is used (recommended for offline encoding of large datasets). If 0, use all available cores. If None, don't use data-parallel processing; use the default onnxruntime threading instead.
  • local_files_only: If True, only use the model files in the cache_dir.
  • model_kwargs: Dictionary containing model parameters such as k, b, avg_len, language.

FastembedSparseTextEmbedder.to_dict

def to_dict() -> Dict[str, Any]

Serializes the component to a dictionary.

Returns:

Dictionary with serialized data.

FastembedSparseTextEmbedder.warm_up

def warm_up()

Initializes the component.

FastembedSparseTextEmbedder.run

@component.output_types(sparse_embedding=SparseEmbedding)
def run(text: str)

Embeds text using the Fastembed model.

Arguments:

  • text: A string to embed.

Raises:

  • TypeError: If the input is not a string.
  • RuntimeError: If the embedding model has not been loaded.

Returns:

A dictionary with the following keys:

  • sparse_embedding: A SparseEmbedding object representing the sparse embedding of the input text.

Module haystack_integrations.components.rankers.fastembed.ranker

FastembedRanker

Ranks Documents based on their similarity to the query using Fastembed models.

Documents are ranked from most to least semantically relevant to the query.

Usage example:

from haystack import Document
from haystack_integrations.components.rankers.fastembed import FastembedRanker

ranker = FastembedRanker(model_name="Xenova/ms-marco-MiniLM-L-6-v2", top_k=2)

docs = [Document(content="Paris"), Document(content="Berlin")]
query = "What is the capital of germany?"
output = ranker.run(query=query, documents=docs)
print(output["documents"][0].content)

# Berlin

FastembedRanker.__init__

def __init__(model_name: str = "Xenova/ms-marco-MiniLM-L-6-v2",
             top_k: int = 10,
             cache_dir: Optional[str] = None,
             threads: Optional[int] = None,
             batch_size: int = 64,
             parallel: Optional[int] = None,
             local_files_only: bool = False,
             meta_fields_to_embed: Optional[List[str]] = None,
             meta_data_separator: str = "\n")

Creates an instance of FastembedRanker.

Arguments:

  • model_name: Fastembed model name. Check the list of supported models in the Fastembed documentation.
  • top_k: The maximum number of documents to return.
  • cache_dir: The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory.
  • threads: The number of threads a single onnxruntime session can use. Defaults to None.
  • batch_size: Number of strings to encode at once.
  • parallel: If > 1, data-parallel encoding is used (recommended for offline encoding of large datasets). If 0, use all available cores. If None, don't use data-parallel processing; use the default onnxruntime threading instead.
  • local_files_only: If True, only use the model files in the cache_dir.
  • meta_fields_to_embed: List of meta fields that should be concatenated with the document content for reranking.
  • meta_data_separator: Separator used to concatenate the meta fields to the Document content.

FastembedRanker.to_dict

def to_dict() -> Dict[str, Any]

Serializes the component to a dictionary.

Returns:

Dictionary with serialized data.

FastembedRanker.from_dict

@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "FastembedRanker"

Deserializes the component from a dictionary.

Arguments:

  • data: The dictionary to deserialize from.

Returns:

The deserialized component.

FastembedRanker.warm_up

def warm_up()

Initializes the component.

FastembedRanker.run

@component.output_types(documents=List[Document])
def run(query: str, documents: List[Document], top_k: Optional[int] = None)

Returns a list of documents ranked by their similarity to the given query, using FastEmbed.

Arguments:

  • query: The input query to compare the documents to.
  • documents: A list of documents to be ranked.
  • top_k: The maximum number of documents to return.

Raises:

  • ValueError: If top_k is not > 0.

Returns:

A dictionary with the following keys:

  • documents: A list of documents closest to the query, sorted from most similar to least similar.
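
A ranker like this usually sits after a retriever, reranking a larger candidate set down to top_k. A minimal sketch, assuming Haystack's in-memory BM25 retriever as the first stage:

from haystack import Pipeline
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.rankers.fastembed import FastembedRanker

document_store = InMemoryDocumentStore()  # assumed to be already populated

pipeline = Pipeline()
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=document_store, top_k=20))
pipeline.add_component("ranker", FastembedRanker(model_name="Xenova/ms-marco-MiniLM-L-6-v2", top_k=5))
pipeline.connect("retriever.documents", "ranker.documents")

query = "What is the capital of Germany?"
results = pipeline.run({"retriever": {"query": query}, "ranker": {"query": query}})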