FastEmbed integration for Haystack
Module haystack_integrations.components.embedders.fastembed.fastembed_document_embedder
FastembedDocumentEmbedder
FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
The embedding of each Document is stored in the `embedding` field of the Document.
Usage example:
# To use this component, install the "fastembed-haystack" package.
# pip install fastembed-haystack
from haystack_integrations.components.embedders.fastembed import FastembedDocumentEmbedder
from haystack.dataclasses import Document
doc_embedder = FastembedDocumentEmbedder(
model="BAAI/bge-small-en-v1.5",
batch_size=256,
)
doc_embedder.warm_up()
# Text taken from PubMed QA Dataset (https://huggingface.co/datasets/pubmed_qa)
document_list = [
Document(
content=("Oxidative stress generated within inflammatory joints can produce autoimmune phenomena and joint "
"destruction. Radical species with oxidative activity, including reactive nitrogen species, "
"represent mediators of inflammation and cartilage damage."),
meta={
"pubid": "25,445,628",
"long_answer": "yes",
},
),
Document(
content=("Plasma levels of pancreatic polypeptide (PP) rise upon food intake. Although other pancreatic "
"islet hormones, such as insulin and glucagon, have been extensively investigated, PP secretion "
"and actions are still poorly understood."),
meta={
"pubid": "25,445,712",
"long_answer": "yes",
},
),
]
result = doc_embedder.run(document_list)
print(f"Document Text: {result['documents'][0].content}")
print(f"Document Embedding: {result['documents'][0].embedding}")
print(f"Embedding Dimension: {len(result['documents'][0].embedding)}")
FastembedDocumentEmbedder.__init__
def __init__(model: str = "BAAI/bge-small-en-v1.5",
cache_dir: Optional[str] = None,
threads: Optional[int] = None,
prefix: str = "",
suffix: str = "",
batch_size: int = 256,
progress_bar: bool = True,
parallel: Optional[int] = None,
local_files_only: bool = False,
meta_fields_to_embed: Optional[List[str]] = None,
embedding_separator: str = "\n")
Create a FastembedDocumentEmbedder component.
Arguments:
- `model`: Local path or name of the model in Hugging Face's model hub, such as `BAAI/bge-small-en-v1.5`.
- `cache_dir`: The path to the cache directory. Can be set using the `FASTEMBED_CACHE_PATH` env variable. Defaults to `fastembed_cache` in the system's temp directory.
- `threads`: The number of threads a single onnxruntime session can use. Defaults to None.
- `prefix`: A string to add to the beginning of each text.
- `suffix`: A string to add to the end of each text.
- `batch_size`: Number of strings to encode at once.
- `progress_bar`: If `True`, displays a progress bar during embedding.
- `parallel`: If > 1, data-parallel encoding will be used; recommended for offline encoding of large datasets. If 0, use all available cores. If None, don't use data-parallel processing; use the default onnxruntime threading instead.
- `local_files_only`: If `True`, only use the model files in `cache_dir`.
- `meta_fields_to_embed`: List of meta fields that should be embedded along with the Document content (see the sketch after this list).
- `embedding_separator`: Separator used to concatenate the meta fields to the Document content.
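To illustrate `meta_fields_to_embed` and `embedding_separator`, here is a short sketch; the "title" field and its value are made up for the example:
# The "title" meta value is prepended to the content, joined by
# embedding_separator, so the embedding reflects both.
embedder = FastembedDocumentEmbedder(
    model="BAAI/bge-small-en-v1.5",
    meta_fields_to_embed=["title"],
    embedding_separator="\n",
)
embedder.warm_up()
doc = Document(content="Insulin regulates blood glucose.", meta={"title": "Endocrinology"})
result = embedder.run([doc])
# The text actually embedded is roughly "Endocrinology\nInsulin regulates blood glucose."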
FastembedDocumentEmbedder.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
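For illustration, the dictionary follows Haystack's standard component-serialization layout (a sketch, assuming the default format):
config = doc_embedder.to_dict()
# "type" holds the import path; "init_parameters" holds the constructor arguments.
print(config["type"])
print(config["init_parameters"]["model"])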
FastembedDocumentEmbedder.warm_up
def warm_up()
Initializes the component.
FastembedDocumentEmbedder.run
@component.output_types(documents=List[Document])
def run(documents: List[Document])
Embeds a list of Documents.
Arguments:
- `documents`: List of Documents to embed.
Returns:
A dictionary with the following keys:
- `documents`: List of Documents, each with its `embedding` field set to the computed embedding.
Module haystack_integrations.components.embedders.fastembed.fastembed_text_embedder
FastembedTextEmbedder
FastembedTextEmbedder computes string embeddings using Fastembed embedding models.
Usage example:
from haystack_integrations.components.embedders.fastembed import FastembedTextEmbedder
text = ("It clearly says online this will work on a Mac OS system. "
"The disk comes and it does not, only Windows. Do Not order this if you have a Mac!!")
text_embedder = FastembedTextEmbedder(
model="BAAI/bge-small-en-v1.5"
)
text_embedder.warm_up()
embedding = text_embedder.run(text)["embedding"]
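At query time, the text embedder is usually paired with an embedding retriever. A minimal sketch, assuming documents were previously embedded with FastembedDocumentEmbedder and written to an InMemoryDocumentStore named document_store (as in the indexing sketch above):
from haystack import Pipeline
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever

query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", FastembedTextEmbedder(model="BAAI/bge-small-en-v1.5"))
query_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
# The query embedding feeds the retriever's query_embedding input.
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
results = query_pipeline.run({"text_embedder": {"text": "Does this work on a Mac?"}})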
FastembedTextEmbedder.__init__
def __init__(model: str = "BAAI/bge-small-en-v1.5",
cache_dir: Optional[str] = None,
threads: Optional[int] = None,
prefix: str = "",
suffix: str = "",
progress_bar: bool = True,
parallel: Optional[int] = None,
local_files_only: bool = False)
Create a FastembedTextEmbedder component.
Arguments:
- `model`: Local path or name of the model in Fastembed's model hub, such as `BAAI/bge-small-en-v1.5`.
- `cache_dir`: The path to the cache directory. Can be set using the `FASTEMBED_CACHE_PATH` env variable. Defaults to `fastembed_cache` in the system's temp directory.
- `threads`: The number of threads a single onnxruntime session can use. Defaults to None.
- `prefix`: A string to add to the beginning of each text (see the sketch after this list).
- `suffix`: A string to add to the end of each text.
- `progress_bar`: If `True`, displays a progress bar during embedding.
- `parallel`: If > 1, data-parallel encoding will be used; recommended for offline encoding of large datasets. If 0, use all available cores. If None, don't use data-parallel processing; use the default onnxruntime threading instead.
- `local_files_only`: If `True`, only use the model files in `cache_dir`.
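Some embedding models are trained with instruction prefixes, which is what the `prefix` argument is for. A hypothetical sketch, assuming an E5-style model that expects "query: " in front of each query (check Fastembed's supported-model list before relying on a specific model name):
# Hypothetical: E5-style models expect a "query: " prefix at query time.
query_embedder = FastembedTextEmbedder(
    model="intfloat/multilingual-e5-large",
    prefix="query: ",
)
query_embedder.warm_up()
embedding = query_embedder.run("How does insulin work?")["embedding"]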
FastembedTextEmbedder.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
FastembedTextEmbedder.warm_up
def warm_up()
Initializes the component.
FastembedTextEmbedder.run
@component.output_types(embedding=List[float])
def run(text: str)
Embeds text using the Fastembed model.
Arguments:
- `text`: A string to embed.
Raises:
- `TypeError`: If the input is not a string.
- `RuntimeError`: If the embedding model has not been loaded.
Returns:
A dictionary with the following keys:
- `embedding`: A list of floats representing the embedding of the input text.
Module haystack_integrations.components.embedders.fastembed.fastembed_sparse_document_embedder
FastembedSparseDocumentEmbedder
FastembedSparseDocumentEmbedder computes Document embeddings using Fastembed sparse models.
Usage example:
from haystack_integrations.components.embedders.fastembed import FastembedSparseDocumentEmbedder
from haystack.dataclasses import Document
sparse_doc_embedder = FastembedSparseDocumentEmbedder(
model="prithvida/Splade_PP_en_v1",
batch_size=32,
)
sparse_doc_embedder.warm_up()
# Text taken from PubMed QA Dataset (https://huggingface.co/datasets/pubmed_qa)
document_list = [
Document(
content=("Oxidative stress generated within inflammatory joints can produce autoimmune phenomena and joint "
"destruction. Radical species with oxidative activity, including reactive nitrogen species, "
"represent mediators of inflammation and cartilage damage."),
meta={
"pubid": "25,445,628",
"long_answer": "yes",
},
),
Document(
content=("Plasma levels of pancreatic polypeptide (PP) rise upon food intake. Although other pancreatic "
"islet hormones, such as insulin and glucagon, have been extensively investigated, PP secretion "
"and actions are still poorly understood."),
meta={
"pubid": "25,445,712",
"long_answer": "yes",
},
),
]
result = sparse_doc_embedder.run(document_list)
print(f"Document Text: {result['documents'][0].content}")
print(f"Document Sparse Embedding: {result['documents'][0].sparse_embedding}")
print(f"Sparse Embedding Dimension: {len(result['documents'][0].sparse_embedding)}")
FastembedSparseDocumentEmbedder.__init__
def __init__(model: str = "prithvida/Splade_PP_en_v1",
cache_dir: Optional[str] = None,
threads: Optional[int] = None,
batch_size: int = 32,
progress_bar: bool = True,
parallel: Optional[int] = None,
local_files_only: bool = False,
meta_fields_to_embed: Optional[List[str]] = None,
embedding_separator: str = "\n",
model_kwargs: Optional[Dict[str, Any]] = None)
Create a FastembedSparseDocumentEmbedder component.
Arguments:
- `model`: Local path or name of the model in Hugging Face's model hub, such as `prithvida/Splade_PP_en_v1`.
- `cache_dir`: The path to the cache directory. Can be set using the `FASTEMBED_CACHE_PATH` env variable. Defaults to `fastembed_cache` in the system's temp directory.
- `threads`: The number of threads a single onnxruntime session can use.
- `batch_size`: Number of strings to encode at once.
- `progress_bar`: If `True`, displays a progress bar during embedding.
- `parallel`: If > 1, data-parallel encoding will be used; recommended for offline encoding of large datasets. If 0, use all available cores. If None, don't use data-parallel processing; use the default onnxruntime threading instead.
- `local_files_only`: If `True`, only use the model files in `cache_dir`.
- `meta_fields_to_embed`: List of meta fields that should be embedded along with the Document content.
- `embedding_separator`: Separator used to concatenate the meta fields to the Document content.
- `model_kwargs`: Dictionary containing model parameters such as `k`, `b`, `avg_len`, `language`.
FastembedSparseDocumentEmbedder.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
FastembedSparseDocumentEmbedder.warm_up
def warm_up()
Initializes the component.
FastembedSparseDocumentEmbedder.run
@component.output_types(documents=List[Document])
def run(documents: List[Document])
Embeds a list of Documents.
Arguments:
- `documents`: List of Documents to embed.
Returns:
A dictionary with the following keys:
- `documents`: List of Documents, each with its `sparse_embedding` field set to the computed embedding.
Module haystack_integrations.components.embedders.fastembed.fastembed_sparse_text_embedder
FastembedSparseTextEmbedder
FastembedSparseTextEmbedder computes string embeddings using Fastembed sparse models.
Usage example:
from haystack_integrations.components.embedders.fastembed import FastembedSparseTextEmbedder
text = ("It clearly says online this will work on a Mac OS system. "
"The disk comes and it does not, only Windows. Do Not order this if you have a Mac!!")
sparse_text_embedder = FastembedSparseTextEmbedder(
model="prithvida/Splade_PP_en_v1"
)
sparse_text_embedder.warm_up()
sparse_embedding = sparse_text_embedder.run(text)["sparse_embedding"]
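Sparse embeddings are consumed by document stores that support them. As one example, a sketch assuming the separate qdrant-haystack package and a store created with sparse embeddings enabled (these names come from that integration, not this page):
from haystack_integrations.components.retrievers.qdrant import QdrantSparseEmbeddingRetriever
from haystack_integrations.document_stores.qdrant import QdrantDocumentStore

# In-memory Qdrant instance with sparse vectors enabled; assumes documents with
# sparse embeddings were previously written to the store.
store = QdrantDocumentStore(":memory:", use_sparse_embeddings=True)
retriever = QdrantSparseEmbeddingRetriever(document_store=store)
docs = retriever.run(query_sparse_embedding=sparse_embedding)["documents"]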
FastembedSparseTextEmbedder.__init__
def __init__(model: str = "prithvida/Splade_PP_en_v1",
cache_dir: Optional[str] = None,
threads: Optional[int] = None,
progress_bar: bool = True,
parallel: Optional[int] = None,
local_files_only: bool = False,
model_kwargs: Optional[Dict[str, Any]] = None)
Create a FastembedSparseTextEmbedder component.
Arguments:
- `model`: Local path or name of the model in Fastembed's model hub, such as `prithvida/Splade_PP_en_v1`.
- `cache_dir`: The path to the cache directory. Can be set using the `FASTEMBED_CACHE_PATH` env variable. Defaults to `fastembed_cache` in the system's temp directory.
- `threads`: The number of threads a single onnxruntime session can use. Defaults to None.
- `progress_bar`: If `True`, displays a progress bar during embedding.
- `parallel`: If > 1, data-parallel encoding will be used; recommended for offline encoding of large datasets. If 0, use all available cores. If None, don't use data-parallel processing; use the default onnxruntime threading instead.
- `local_files_only`: If `True`, only use the model files in `cache_dir`.
- `model_kwargs`: Dictionary containing model parameters such as `k`, `b`, `avg_len`, `language`.
FastembedSparseTextEmbedder.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
FastembedSparseTextEmbedder.warm_up
def warm_up()
Initializes the component.
FastembedSparseTextEmbedder.run
@component.output_types(sparse_embedding=SparseEmbedding)
def run(text: str)
Embeds text using the Fastembed model.
Arguments:
- `text`: A string to embed.
Raises:
- `TypeError`: If the input is not a string.
- `RuntimeError`: If the embedding model has not been loaded.
Returns:
A dictionary with the following keys:
- `sparse_embedding`: The sparse embedding of the input text, as a `SparseEmbedding` object.
Module haystack_integrations.components.rankers.fastembed.ranker
FastembedRanker
Ranks Documents based on their similarity to the query using Fastembed models.
Documents are returned sorted from most to least semantically relevant to the query.
Usage example:
from haystack import Document
from haystack_integrations.components.rankers.fastembed import FastembedRanker
ranker = FastembedRanker(model_name="Xenova/ms-marco-MiniLM-L-6-v2", top_k=2)
docs = [Document(content="Paris"), Document(content="Berlin")]
query = "What is the capital of germany?"
output = ranker.run(query=query, documents=docs)
print(output["documents"][0].content)
# Berlin
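The ranker usually reorders candidates from a first-stage retriever. A minimal two-stage sketch, assuming Haystack's InMemoryBM25Retriever and InMemoryDocumentStore:
from haystack import Pipeline
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore

store = InMemoryDocumentStore()
store.write_documents(docs)
pipeline = Pipeline()
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=store, top_k=20))
pipeline.add_component("ranker", FastembedRanker(model_name="Xenova/ms-marco-MiniLM-L-6-v2", top_k=2))
pipeline.connect("retriever.documents", "ranker.documents")
# The query goes to both the retriever and the ranker.
output = pipeline.run({"retriever": {"query": query}, "ranker": {"query": query}})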
FastembedRanker.__init__
def __init__(model_name: str = "Xenova/ms-marco-MiniLM-L-6-v2",
top_k: int = 10,
cache_dir: Optional[str] = None,
threads: Optional[int] = None,
batch_size: int = 64,
parallel: Optional[int] = None,
local_files_only: bool = False,
meta_fields_to_embed: Optional[List[str]] = None,
meta_data_separator: str = "\n")
Creates an instance of `FastembedRanker`.
Arguments:
- `model_name`: Fastembed model name. Check the list of supported models in the Fastembed documentation.
- `top_k`: The maximum number of documents to return.
- `cache_dir`: The path to the cache directory. Can be set using the `FASTEMBED_CACHE_PATH` env variable. Defaults to `fastembed_cache` in the system's temp directory.
- `threads`: The number of threads a single onnxruntime session can use. Defaults to None.
- `batch_size`: Number of strings to encode at once.
- `parallel`: If > 1, data-parallel encoding will be used; recommended for offline encoding of large datasets. If 0, use all available cores. If None, don't use data-parallel processing; use the default onnxruntime threading instead.
- `local_files_only`: If `True`, only use the model files in `cache_dir`.
- `meta_fields_to_embed`: List of meta fields that should be concatenated with the document content for reranking.
- `meta_data_separator`: Separator used to concatenate the meta fields to the Document content.
FastembedRanker.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
FastembedRanker.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "FastembedRanker"
Deserializes the component from a dictionary.
Arguments:
- `data`: The dictionary to deserialize from.
Returns:
The deserialized component.
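A serialization round trip, for illustration:
# to_dict produces a plain dictionary; from_dict rebuilds an equivalent component.
config = ranker.to_dict()
restored = FastembedRanker.from_dict(config)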
FastembedRanker.warm_up
def warm_up()
Initializes the component.
FastembedRanker.run
@component.output_types(documents=List[Document])
def run(query: str, documents: List[Document], top_k: Optional[int] = None)
Returns a list of documents ranked by their similarity to the given query, using FastEmbed.
Arguments:
- `query`: The input query to compare the documents to.
- `documents`: A list of documents to be ranked.
- `top_k`: The maximum number of documents to return.
Raises:
- `ValueError`: If `top_k` is not > 0.
Returns:
A dictionary with the following keys:
- `documents`: A list of documents closest to the query, sorted from most similar to least similar.
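The per-call `top_k` overrides the value set in the constructor; a quick sketch:
# Return only the single best match for this call.
output = ranker.run(query=query, documents=docs, top_k=1)
assert len(output["documents"]) <= 1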