FastEmbed integration for Haystack
Module haystack_integrations.components.embedders.fastembed.fastembed_document_embedder
FastembedDocumentEmbedder
FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
The embedding of each Document is stored in the embedding field of the Document.
Usage example:
# To use this component, install the "fastembed-haystack" package.
# pip install fastembed-haystack
from haystack_integrations.components.embedders.fastembed import FastembedDocumentEmbedder
from haystack.dataclasses import Document
doc_embedder = FastembedDocumentEmbedder(
    model="BAAI/bge-small-en-v1.5",
    batch_size=256,
)
doc_embedder.warm_up()
# Text taken from PubMed QA Dataset (https://huggingface.co/datasets/pubmed_qa)
document_list = [
    Document(
        content=("Oxidative stress generated within inflammatory joints can produce autoimmune phenomena and joint "
                 "destruction. Radical species with oxidative activity, including reactive nitrogen species, "
                 "represent mediators of inflammation and cartilage damage."),
        meta={
            "pubid": "25,445,628",
            "long_answer": "yes",
        },
    ),
    Document(
        content=("Plasma levels of pancreatic polypeptide (PP) rise upon food intake. Although other pancreatic "
                 "islet hormones, such as insulin and glucagon, have been extensively investigated, PP secretion "
                 "and actions are still poorly understood."),
        meta={
            "pubid": "25,445,712",
            "long_answer": "yes",
        },
    ),
]
result = doc_embedder.run(document_list)
print(f"Document Text: {result['documents'][0].content}")
print(f"Document Embedding: {result['documents'][0].embedding}")
print(f"Embedding Dimension: {len(result['documents'][0].embedding)}")
FastembedDocumentEmbedder.__init__
def __init__(model: str = "BAAI/bge-small-en-v1.5",
             cache_dir: Optional[str] = None,
             threads: Optional[int] = None,
             prefix: str = "",
             suffix: str = "",
             batch_size: int = 256,
             progress_bar: bool = True,
             parallel: Optional[int] = None,
             local_files_only: bool = False,
             meta_fields_to_embed: Optional[List[str]] = None,
             embedding_separator: str = "\n")
Create a FastembedDocumentEmbedder component.
Arguments:
- model: Local path or name of the model in Hugging Face's model hub, such as BAAI/bge-small-en-v1.5.
- cache_dir: The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory.
- threads: The number of threads a single onnxruntime session can use. Defaults to None.
- prefix: A string to add to the beginning of each text.
- suffix: A string to add to the end of each text.
- batch_size: Number of strings to encode at once.
- progress_bar: If True, displays a progress bar during embedding.
- parallel: If greater than 1, data-parallel encoding is used, which is recommended for offline encoding of large datasets. If 0, all available cores are used. If None, data-parallel processing is not used and the default onnxruntime threading applies instead.
- local_files_only: If True, only use the model files in the cache_dir.
- meta_fields_to_embed: List of meta fields that should be embedded along with the Document content.
- embedding_separator: Separator used to concatenate the meta fields to the Document content.
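The interplay of prefix, suffix, meta_fields_to_embed, and embedding_separator can be easier to see in code. The following is a minimal sketch under stated assumptions, not the component's actual implementation: Doc is a hypothetical stand-in for the Haystack Document, build_text_to_embed is a hypothetical helper, and the ordering (prefix, then meta values, then content, then suffix) is an assumption about how the text is composed.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for haystack.dataclasses.Document."""
    content: Optional[str] = None
    meta: Dict[str, Any] = field(default_factory=dict)


def build_text_to_embed(
    doc: Doc,
    prefix: str = "",
    suffix: str = "",
    meta_fields_to_embed: Optional[List[str]] = None,
    embedding_separator: str = "\n",
) -> str:
    """Assumed composition: prefix + meta values + content + suffix."""
    # Keep only the requested meta fields that are present and not None.
    meta_values = [
        str(doc.meta[key])
        for key in (meta_fields_to_embed or [])
        if doc.meta.get(key) is not None
    ]
    # Join the meta values and the content with the separator.
    joined = embedding_separator.join([*meta_values, doc.content or ""])
    return prefix + joined + suffix


doc = Doc(
    content="Plasma levels of pancreatic polypeptide (PP) rise upon food intake.",
    meta={"pubid": "25,445,712", "long_answer": "yes"},
)
text = build_text_to_embed(doc, prefix="passage: ", meta_fields_to_embed=["long_answer"])
print(repr(text))
```

Only the fields named in meta_fields_to_embed contribute to the text; here pubid is left out because it was not requested.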
FastembedDocumentEmbedder.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
FastembedDocumentEmbedder.warm_up
def warm_up()
Initializes the component.
FastembedDocumentEmbedder.run
@component.output_types(documents=List[Document])
def run(documents: List[Document]) -> Dict[str, List[Document]]
Embeds a list of Documents.
Arguments:
- documents: List of Documents to embed.
Returns:
A dictionary with the following keys:
- documents: List of Documents with each Document's embedding field set to the computed embeddings.
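To illustrate what batch_size controls, here is a small sketch of splitting a list of texts into batches. The helper batched is hypothetical and only shows the general idea of encoding a fixed number of strings at once; it makes no claim about how Fastembed schedules batches internally.

```python
from typing import Iterator, List


def batched(texts: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size texts."""
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


texts = [f"document {i}" for i in range(5)]
batches = list(batched(texts, batch_size=2))
print([len(batch) for batch in batches])
```

A larger batch_size means fewer, larger calls to the model, at the cost of more memory per call.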
Module haystack_integrations.components.embedders.fastembed.fastembed_text_embedder
FastembedTextEmbedder
FastembedTextEmbedder computes the embedding of a string using Fastembed embedding models.
Usage example:
from haystack_integrations.components.embedders.fastembed import FastembedTextEmbedder
text = ("It clearly says online this will work on a Mac OS system. "
        "The disk comes and it does not, only Windows. Do Not order this if you have a Mac!!")
text_embedder = FastembedTextEmbedder(
    model="BAAI/bge-small-en-v1.5"
)
text_embedder.warm_up()
embedding = text_embedder.run(text)["embedding"]
FastembedTextEmbedder.__init__
def __init__(model: str = "BAAI/bge-small-en-v1.5",
             cache_dir: Optional[str] = None,
             threads: Optional[int] = None,
             prefix: str = "",
             suffix: str = "",
             progress_bar: bool = True,
             parallel: Optional[int] = None,
             local_files_only: bool = False)
Create a FastembedTextEmbedder component.
Arguments:
- model: Local path or name of the model in Fastembed's model hub, such as BAAI/bge-small-en-v1.5.
- cache_dir: The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory.
- threads: The number of threads a single onnxruntime session can use. Defaults to None.
- prefix: A string to add to the beginning of each text.
- suffix: A string to add to the end of each text.
- progress_bar: If True, displays a progress bar during embedding.
- parallel: If greater than 1, data-parallel encoding is used, which is recommended for offline encoding of large datasets. If 0, all available cores are used. If None, data-parallel processing is not used and the default onnxruntime threading applies instead.
- local_files_only: If True, only use the model files in the cache_dir.
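The three documented cases for parallel can be summarized as a small decision function. This is only an illustrative sketch: describe_parallel is a hypothetical helper that restates the description above, and values not covered by that description (such as 1) are reported as unspecified rather than guessed.

```python
import os
from typing import Optional


def describe_parallel(parallel: Optional[int]) -> str:
    """Restate the documented meaning of the parallel argument."""
    if parallel is None:
        return "default onnxruntime threading"
    if parallel == 0:
        # os.cpu_count() may return None on some platforms.
        return f"data-parallel encoding on all available cores ({os.cpu_count()})"
    if parallel > 1:
        return f"data-parallel encoding with {parallel} workers"
    return "not specified by the description above"


for value in (None, 0, 4):
    print(value, "->", describe_parallel(value))
```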
FastembedTextEmbedder.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
FastembedTextEmbedder.warm_up
def warm_up()
Initializes the component.
FastembedTextEmbedder.run
@component.output_types(embedding=List[float])
def run(text: str) -> Dict[str, List[float]]
Embeds text using the Fastembed model.
Arguments:
- text: A string to embed.
Raises:
- TypeError: If the input is not a string.
- RuntimeError: If the embedding model has not been loaded.
Returns:
A dictionary with the following keys:
- embedding: A list of floats representing the embedding of the input text.
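The two documented error conditions can be sketched with a stand-in class. SketchTextEmbedder below is hypothetical and is not the real FastembedTextEmbedder: it loads no model, returns a placeholder vector, and the order of the two checks is an assumption. It only shows why warm_up() should be called before run() and why run() expects a plain string.

```python
from typing import Dict, List, Optional


class SketchTextEmbedder:
    """Hypothetical stand-in that mimics the documented error behavior."""

    def __init__(self) -> None:
        self._backend: Optional[object] = None  # set by warm_up()

    def warm_up(self) -> None:
        # A real component would load the embedding model here.
        if self._backend is None:
            self._backend = object()

    def run(self, text: str) -> Dict[str, List[float]]:
        if not isinstance(text, str):
            raise TypeError("run() expects a string as input.")
        if self._backend is None:
            raise RuntimeError("The model has not been loaded; call warm_up() first.")
        return {"embedding": [0.0, 0.0, 0.0]}  # placeholder, not a real embedding


embedder = SketchTextEmbedder()

try:
    embedder.run("hello")
except RuntimeError as err:
    print(f"RuntimeError: {err}")

embedder.warm_up()

try:
    embedder.run(["hello"])  # a list is not a string
except TypeError as err:
    print(f"TypeError: {err}")

result = embedder.run("hello")
print(result["embedding"])
```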
Module haystack_integrations.components.embedders.fastembed.fastembed_sparse_document_embedder
FastembedSparseDocumentEmbedder
FastembedSparseDocumentEmbedder computes Document embeddings using Fastembed sparse models.
Usage example:
from haystack_integrations.components.embedders.fastembed import FastembedSparseDocumentEmbedder
from haystack.dataclasses import Document
sparse_doc_embedder = FastembedSparseDocumentEmbedder(
    model="prithivida/Splade_PP_en_v1",
    batch_size=32,
)
sparse_doc_embedder.warm_up()
# Text taken from PubMed QA Dataset (https://huggingface.co/datasets/pubmed_qa)
document_list = [
    Document(
        content=("Oxidative stress generated within inflammatory joints can produce autoimmune phenomena and joint "
                 "destruction. Radical species with oxidative activity, including reactive nitrogen species, "
                 "represent mediators of inflammation and cartilage damage."),
        meta={
            "pubid": "25,445,628",
            "long_answer": "yes",
        },
    ),
    Document(
        content=("Plasma levels of pancreatic polypeptide (PP) rise upon food intake. Although other pancreatic "
                 "islet hormones, such as insulin and glucagon, have been extensively investigated, PP secretion "
                 "and actions are still poorly understood."),
        meta={
            "pubid": "25,445,712",
            "long_answer": "yes",
        },
    ),
]
result = sparse_doc_embedder.run(document_list)
print(f"Document Text: {result['documents'][0].content}")
print(f"Document Sparse Embedding: {result['documents'][0].sparse_embedding}")
print(f"Sparse Embedding Non-Zero Entries: {len(result['documents'][0].sparse_embedding.indices)}")
FastembedSparseDocumentEmbedder.__init__
def __init__(model: str = "prithivida/Splade_PP_en_v1",
             cache_dir: Optional[str] = None,
             threads: Optional[int] = None,
             batch_size: int = 32,
             progress_bar: bool = True,
             parallel: Optional[int] = None,
             local_files_only: bool = False,
             meta_fields_to_embed: Optional[List[str]] = None,
             embedding_separator: str = "\n",
             model_kwargs: Optional[Dict[str, Any]] = None)
Create a FastembedSparseDocumentEmbedder component.
Arguments:
- model: Local path or name of the model in Hugging Face's model hub, such as prithivida/Splade_PP_en_v1.
- cache_dir: The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory.
- threads: The number of threads a single onnxruntime session can use.
- batch_size: Number of strings to encode at once.
- progress_bar: If True, displays a progress bar during embedding.
- parallel: If greater than 1, data-parallel encoding is used, which is recommended for offline encoding of large datasets. If 0, all available cores are used. If None, data-parallel processing is not used and the default onnxruntime threading applies instead.
- local_files_only: If True, only use the model files in the cache_dir.
- meta_fields_to_embed: List of meta fields that should be embedded along with the Document content.
- embedding_separator: Separator used to concatenate the meta fields to the Document content.
- model_kwargs: Dictionary containing model parameters such as k, b, avg_len, language.
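The cache_dir description above mentions an environment variable and a default in the system's temp directory. A minimal sketch of that lookup follows. The helper resolve_cache_dir is hypothetical, and the precedence shown (explicit argument first, then FASTEMBED_CACHE_PATH, then the temp-directory default) is an assumption rather than confirmed library behavior.

```python
import os
import tempfile
from typing import Optional


def resolve_cache_dir(cache_dir: Optional[str] = None) -> str:
    """Pick a cache directory: argument, then env variable, then default."""
    if cache_dir is not None:
        return cache_dir
    default = os.path.join(tempfile.gettempdir(), "fastembed_cache")
    return os.environ.get("FASTEMBED_CACHE_PATH", default)


print(resolve_cache_dir())
print(resolve_cache_dir("/data/models"))
```

Pointing cache_dir at a persistent location avoids repeated model downloads when the temp directory is cleared.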
FastembedSparseDocumentEmbedder.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
FastembedSparseDocumentEmbedder.warm_up
def warm_up()
Initializes the component.
FastembedSparseDocumentEmbedder.run
@component.output_types(documents=List[Document])
def run(documents: List[Document]) -> Dict[str, List[Document]]
Embeds a list of Documents.
Arguments:
- documents: List of Documents to embed.
Returns:
A dictionary with the following keys:
- documents: List of Documents with each Document's sparse_embedding field set to the computed embeddings.
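A sparse embedding stores only the non-zero dimensions of a vector. The sketch below uses a hypothetical SparseVec stand-in with parallel indices and values lists, which is assumed here to mirror the layout of the sparse_embedding field, and shows how two such vectors could be compared with a dot product. The numbers are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SparseVec:
    """Hypothetical stand-in: positions of non-zero entries and their values."""
    indices: List[int]
    values: List[float]


def sparse_dot(a: SparseVec, b: SparseVec) -> float:
    """Dot product over the indices the two vectors have in common."""
    b_lookup = dict(zip(b.indices, b.values))
    return sum(value * b_lookup.get(index, 0.0) for index, value in zip(a.indices, a.values))


doc_vec = SparseVec(indices=[3, 17, 42], values=[0.5, 1.0, 2.0])
query_vec = SparseVec(indices=[17, 42, 99], values=[2.0, 0.5, 4.0])

score = sparse_dot(doc_vec, query_vec)
print(score)  # 1.0 * 2.0 + 2.0 * 0.5 = 3.0
```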
Module haystack_integrations.components.embedders.fastembed.fastembed_sparse_text_embedder
FastembedSparseTextEmbedder
FastembedSparseTextEmbedder computes the sparse embedding of a string using Fastembed sparse models.
Usage example:
from haystack_integrations.components.embedders.fastembed import FastembedSparseTextEmbedder
text = ("It clearly says online this will work on a Mac OS system. "
        "The disk comes and it does not, only Windows. Do Not order this if you have a Mac!!")
sparse_text_embedder = FastembedSparseTextEmbedder(
    model="prithivida/Splade_PP_en_v1"
)
sparse_text_embedder.warm_up()
sparse_embedding = sparse_text_embedder.run(text)["sparse_embedding"]
FastembedSparseTextEmbedder.__init__
def __init__(model: str = "prithivida/Splade_PP_en_v1",
             cache_dir: Optional[str] = None,
             threads: Optional[int] = None,
             progress_bar: bool = True,
             parallel: Optional[int] = None,
             local_files_only: bool = False,
             model_kwargs: Optional[Dict[str, Any]] = None)
Create a FastembedSparseTextEmbedder component.
Arguments:
- model: Local path or name of the model in Fastembed's model hub, such as prithivida/Splade_PP_en_v1.
- cache_dir: The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory.
- threads: The number of threads a single onnxruntime session can use. Defaults to None.
- progress_bar: If True, displays a progress bar during embedding.
- parallel: If greater than 1, data-parallel encoding is used, which is recommended for offline encoding of large datasets. If 0, all available cores are used. If None, data-parallel processing is not used and the default onnxruntime threading applies instead.
- local_files_only: If True, only use the model files in the cache_dir.
- model_kwargs: Dictionary containing model parameters such as k, b, avg_len, language.
FastembedSparseTextEmbedder.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
FastembedSparseTextEmbedder.warm_up
def warm_up()
Initializes the component.
FastembedSparseTextEmbedder.run
@component.output_types(sparse_embedding=SparseEmbedding)
def run(text: str) -> Dict[str, SparseEmbedding]
Embeds text using the Fastembed model.
Arguments:
- text: A string to embed.
Raises:
- TypeError: If the input is not a string.
- RuntimeError: If the embedding model has not been loaded.
Returns:
A dictionary with the following keys:
- sparse_embedding: A SparseEmbedding object representing the sparse embedding of the input text.
Module haystack_integrations.components.rankers.fastembed.ranker
FastembedRanker
Ranks Documents based on their similarity to the query using Fastembed models.
Documents are ordered from most to least semantically relevant to the query.
Usage example:
from haystack import Document
from haystack_integrations.components.rankers.fastembed import FastembedRanker
ranker = FastembedRanker(model_name="Xenova/ms-marco-MiniLM-L-6-v2", top_k=2)
ranker.warm_up()
docs = [Document(content="Paris"), Document(content="Berlin")]
query = "What is the capital of Germany?"
output = ranker.run(query=query, documents=docs)
print(output["documents"][0].content)
# Berlin
FastembedRanker.__init__
def __init__(model_name: str = "Xenova/ms-marco-MiniLM-L-6-v2",
             top_k: int = 10,
             cache_dir: Optional[str] = None,
             threads: Optional[int] = None,
             batch_size: int = 64,
             parallel: Optional[int] = None,
             local_files_only: bool = False,
             meta_fields_to_embed: Optional[List[str]] = None,
             meta_data_separator: str = "\n")
Creates an instance of the 'FastembedRanker'.
Arguments:
- model_name: Fastembed model name. Check the list of supported models in the Fastembed documentation.
- top_k: The maximum number of documents to return.
- cache_dir: The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory.
- threads: The number of threads a single onnxruntime session can use. Defaults to None.
- batch_size: Number of strings to encode at once.
- parallel: If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets. If 0, use all available cores. If None, don't use data-parallel processing, use default onnxruntime threading instead.
- local_files_only: If True, only use the model files in the cache_dir.
- meta_fields_to_embed: List of meta fields that should be concatenated with the document content for reranking.
- meta_data_separator: Separator used to concatenate the meta fields to the Document content.
FastembedRanker.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
FastembedRanker.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "FastembedRanker"
Deserializes the component from a dictionary.
Arguments:
- data: The dictionary to deserialize from.
Returns:
The deserialized component.
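A serialization round trip with to_dict and from_dict might look like the sketch below. SketchRanker is a hypothetical stand-in, not the real FastembedRanker, and the dictionary layout with type and init_parameters keys is an assumption modeled on the usual Haystack component convention. Only two of the init parameters are carried to keep the example short.

```python
from typing import Any, Dict


class SketchRanker:
    """Hypothetical stand-in showing a to_dict / from_dict round trip."""

    def __init__(self, model_name: str = "Xenova/ms-marco-MiniLM-L-6-v2", top_k: int = 10) -> None:
        self.model_name = model_name
        self.top_k = top_k

    def to_dict(self) -> Dict[str, Any]:
        # Assumed layout: a type path plus the init parameters.
        return {
            "type": f"{type(self).__module__}.{type(self).__name__}",
            "init_parameters": {"model_name": self.model_name, "top_k": self.top_k},
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "SketchRanker":
        return cls(**data["init_parameters"])


original = SketchRanker(top_k=2)
serialized = original.to_dict()
restored = SketchRanker.from_dict(serialized)
print(serialized["init_parameters"])
print(restored.top_k)
```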
FastembedRanker.warm_up
def warm_up()
Initializes the component.
FastembedRanker.run
@component.output_types(documents=List[Document])
def run(query: str,
        documents: List[Document],
        top_k: Optional[int] = None) -> Dict[str, List[Document]]
Returns a list of documents ranked by their similarity to the given query, using FastEmbed.
Arguments:
- query: The input query to compare the documents to.
- documents: A list of documents to be ranked.
- top_k: The maximum number of documents to return.
Raises:
- ValueError: If top_k is not > 0.
Returns:
A dictionary with the following keys:
- documents: A list of documents closest to the query, sorted from most similar to least similar.
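The effect of top_k on the returned list can be sketched without a model. In the example below, the function rank and the scores are hypothetical placeholders (a real ranker would compute the scores from the query and document texts), and falling back to the init-time top_k when the run-time value is None is an assumption. The sketch shows the documented validation, the most-to-least-similar ordering, and the truncation.

```python
from typing import List, Optional, Tuple


def rank(
    scored: List[Tuple[str, float]],
    top_k: Optional[int] = None,
    default_top_k: int = 10,
) -> List[str]:
    """Sort (content, score) pairs by descending score and keep top_k."""
    k = default_top_k if top_k is None else top_k
    if k <= 0:
        raise ValueError(f"top_k must be > 0, but got {k}")
    ordered = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [content for content, _ in ordered[:k]]


# Placeholder scores, not produced by any model.
scored_docs = [("Paris", 0.1), ("Berlin", 0.9), ("Rome", 0.3)]
top_two = rank(scored_docs, top_k=2)
print(top_two)  # ['Berlin', 'Rome']
```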
