FastEmbed
haystack_integrations.components.embedders.fastembed.fastembed_document_embedder
FastembedDocumentEmbedder
FastembedDocumentEmbedder computes Document embeddings using Fastembed embedding models.
The embedding of each Document is stored in the embedding field of the Document.
Usage example:
# To use this component, install the "fastembed-haystack" package.
# pip install fastembed-haystack
from haystack_integrations.components.embedders.fastembed import FastembedDocumentEmbedder
from haystack.dataclasses import Document
doc_embedder = FastembedDocumentEmbedder(
model="BAAI/bge-small-en-v1.5",
batch_size=256,
)
# Text taken from PubMed QA Dataset (https://huggingface.co/datasets/pubmed_qa)
document_list = [
Document(
content=("Oxidative stress generated within inflammatory joints can produce autoimmune phenomena and joint "
"destruction. Radical species with oxidative activity, including reactive nitrogen species, "
"represent mediators of inflammation and cartilage damage."),
meta={
"pubid": "25,445,628",
"long_answer": "yes",
},
),
Document(
content=("Plasma levels of pancreatic polypeptide (PP) rise upon food intake. Although other pancreatic "
"islet hormones, such as insulin and glucagon, have been extensively investigated, PP secretion "
"and actions are still poorly understood."),
meta={
"pubid": "25,445,712",
"long_answer": "yes",
},
),
]
result = doc_embedder.run(document_list)
print(f"Document Text: {result['documents'][0].content}")
print(f"Document Embedding: {result['documents'][0].embedding}")
print(f"Embedding Dimension: {len(result['documents'][0].embedding)}")
init
__init__(
model: str = "BAAI/bge-small-en-v1.5",
cache_dir: str | None = None,
threads: int | None = None,
prefix: str = "",
suffix: str = "",
batch_size: int = 256,
progress_bar: bool = True,
parallel: int | None = None,
local_files_only: bool = False,
meta_fields_to_embed: list[str] | None = None,
embedding_separator: str = "\n",
) -> None
Create a FastembedDocumentEmbedder component.
Parameters:
- model (str) – Local path or name of the model in Hugging Face's model hub, such as BAAI/bge-small-en-v1.5.
- cache_dir (str | None) – The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory.
- threads (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
- prefix (str) – A string to add to the beginning of each text.
- suffix (str) – A string to add to the end of each text.
- batch_size (int) – Number of strings to encode at once.
- progress_bar (bool) – If True, displays a progress bar during embedding.
- parallel (int | None) – If > 1, data-parallel encoding will be used; recommended for offline encoding of large datasets. If 0, use all available cores. If None, don't use data-parallel processing; use the default onnxruntime threading instead.
- local_files_only (bool) – If True, only use the model files in the cache_dir.
- meta_fields_to_embed (list[str] | None) – List of meta fields that should be embedded along with the Document content.
- embedding_separator (str) – Separator used to concatenate the meta fields to the Document content.
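To illustrate how meta_fields_to_embed and embedding_separator interact, here is a minimal pure-Python sketch of the text-preparation step. This approximates the component's internal behavior (the prepare_text helper is hypothetical, not part of the library): the selected meta fields are joined with the separator and prepended to the Document content before embedding.

```python
# Sketch (assumption): selected meta fields, joined by embedding_separator,
# are prepended to the Document content before it is embedded.
def prepare_text(
    content: str,
    meta: dict,
    meta_fields_to_embed: list[str],
    separator: str = "\n",
) -> str:
    # Keep only the requested meta fields that are present and non-empty.
    meta_values = [str(meta[key]) for key in meta_fields_to_embed if meta.get(key)]
    return separator.join(meta_values + [content])

text = prepare_text(
    content="PP secretion and actions are still poorly understood.",
    meta={"pubid": "25445712", "long_answer": "yes"},
    meta_fields_to_embed=["long_answer"],
)
print(text)
# yes
# PP secretion and actions are still poorly understood.
```

With the default separator, each embedded meta value appears on its own line above the content.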
to_dict
Serializes the component to a dictionary.
Returns:
dict[str, Any] – Dictionary with serialized data.
warm_up
Initializes the component.
run
Embeds a list of Documents.
Parameters:
- documents (
list[Document]) – List of Documents to embed.
Returns:
dict[str, list[Document]] – A dictionary with the following keys: documents: List of Documents with each Document's embedding field set to the computed embedding.
Raises:
TypeError – If the input is not a list of Documents.
haystack_integrations.components.embedders.fastembed.fastembed_sparse_document_embedder
FastembedSparseDocumentEmbedder
FastembedSparseDocumentEmbedder computes Document embeddings using Fastembed sparse models.
The sparse embedding of each Document is stored in the sparse_embedding field of the Document.
Usage example:
from haystack_integrations.components.embedders.fastembed import FastembedSparseDocumentEmbedder
from haystack.dataclasses import Document
sparse_doc_embedder = FastembedSparseDocumentEmbedder(
model="prithivida/Splade_PP_en_v1",
batch_size=32,
)
# Text taken from PubMed QA Dataset (https://huggingface.co/datasets/pubmed_qa)
document_list = [
Document(
content=("Oxidative stress generated within inflammatory joints can produce autoimmune phenomena and joint "
"destruction. Radical species with oxidative activity, including reactive nitrogen species, "
"represent mediators of inflammation and cartilage damage."),
meta={
"pubid": "25,445,628",
"long_answer": "yes",
},
),
Document(
content=("Plasma levels of pancreatic polypeptide (PP) rise upon food intake. Although other pancreatic "
"islet hormones, such as insulin and glucagon, have been extensively investigated, PP secretion "
"and actions are still poorly understood."),
meta={
"pubid": "25,445,712",
"long_answer": "yes",
},
),
]
result = sparse_doc_embedder.run(document_list)
print(f"Document Text: {result['documents'][0].content}")
print(f"Document Sparse Embedding: {result['documents'][0].sparse_embedding}")
print(f"Non-zero Entries: {len(result['documents'][0].sparse_embedding.indices)}")
init
__init__(
model: str = "prithivida/Splade_PP_en_v1",
cache_dir: str | None = None,
threads: int | None = None,
batch_size: int = 32,
progress_bar: bool = True,
parallel: int | None = None,
local_files_only: bool = False,
meta_fields_to_embed: list[str] | None = None,
embedding_separator: str = "\n",
model_kwargs: dict[str, Any] | None = None,
) -> None
Create a FastembedSparseDocumentEmbedder component.
Parameters:
- model (str) – Local path or name of the model in Hugging Face's model hub, such as prithivida/Splade_PP_en_v1.
- cache_dir (str | None) – The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory.
- threads (int | None) – The number of threads a single onnxruntime session can use.
- batch_size (int) – Number of strings to encode at once.
- progress_bar (bool) – If True, displays a progress bar during embedding.
- parallel (int | None) – If > 1, data-parallel encoding will be used; recommended for offline encoding of large datasets. If 0, use all available cores. If None, don't use data-parallel processing; use the default onnxruntime threading instead.
- local_files_only (bool) – If True, only use the model files in the cache_dir.
- meta_fields_to_embed (list[str] | None) – List of meta fields that should be embedded along with the Document content.
- embedding_separator (str) – Separator used to concatenate the meta fields to the Document content.
- model_kwargs (dict[str, Any] | None) – Dictionary containing model parameters such as k, b, avg_len, language.
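The sparse embeddings produced here are stored as a Haystack SparseEmbedding, which holds a list of active vocabulary indices and a parallel list of weights. The sketch below uses a simplified stand-in dataclass (not the real haystack.dataclasses.SparseEmbedding, whose full API may differ) to show how such a structure can be inspected:

```python
from dataclasses import dataclass, field

# Simplified stand-in for a sparse embedding: parallel lists of
# vocabulary indices and their non-zero weights.
@dataclass
class SparseEmbedding:
    indices: list[int] = field(default_factory=list)
    values: list[float] = field(default_factory=list)

    def to_dict(self) -> dict[int, float]:
        # Map each active index to its weight for easy inspection.
        return dict(zip(self.indices, self.values))

emb = SparseEmbedding(indices=[12, 403, 9051], values=[0.8, 0.2, 0.5])
print(emb.to_dict())     # {12: 0.8, 403: 0.2, 9051: 0.5}
print(len(emb.indices))  # number of non-zero entries: 3
```

Unlike a dense embedding, the "dimension" that matters in practice is the number of non-zero entries, not the (very large) vocabulary size.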
to_dict
Serializes the component to a dictionary.
Returns:
dict[str, Any] – Dictionary with serialized data.
warm_up
Initializes the component.
run
Embeds a list of Documents.
Parameters:
- documents (
list[Document]) – List of Documents to embed.
Returns:
dict[str, list[Document]] – A dictionary with the following keys: documents: List of Documents with each Document's sparse_embedding field set to the computed embedding.
Raises:
TypeError – If the input is not a list of Documents.
haystack_integrations.components.embedders.fastembed.fastembed_sparse_text_embedder
FastembedSparseTextEmbedder
FastembedSparseTextEmbedder computes a sparse embedding for a string using Fastembed sparse models.
Usage example:
from haystack_integrations.components.embedders.fastembed import FastembedSparseTextEmbedder
text = ("It clearly says online this will work on a Mac OS system. "
"The disk comes and it does not, only Windows. Do Not order this if you have a Mac!!")
sparse_text_embedder = FastembedSparseTextEmbedder(
model="prithivida/Splade_PP_en_v1"
)
sparse_embedding = sparse_text_embedder.run(text)["sparse_embedding"]
init
__init__(
model: str = "prithivida/Splade_PP_en_v1",
cache_dir: str | None = None,
threads: int | None = None,
progress_bar: bool = True,
parallel: int | None = None,
local_files_only: bool = False,
model_kwargs: dict[str, Any] | None = None,
) -> None
Create a FastembedSparseTextEmbedder component.
Parameters:
- model (str) – Local path or name of the model in Fastembed's model hub, such as prithivida/Splade_PP_en_v1.
- cache_dir (str | None) – The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory.
- threads (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
- progress_bar (bool) – If True, displays a progress bar during embedding.
- parallel (int | None) – If > 1, data-parallel encoding will be used; recommended for offline encoding of large datasets. If 0, use all available cores. If None, don't use data-parallel processing; use the default onnxruntime threading instead.
- local_files_only (bool) – If True, only use the model files in the cache_dir.
- model_kwargs (dict[str, Any] | None) – Dictionary containing model parameters such as k, b, avg_len, language.
to_dict
Serializes the component to a dictionary.
Returns:
dict[str, Any] – Dictionary with serialized data.
warm_up
Initializes the component.
run
Embeds text using the Fastembed model.
Parameters:
- text (
str) – A string to embed.
Returns:
dict[str, SparseEmbedding] – A dictionary with the following keys: sparse_embedding: A SparseEmbedding object representing the embedding of the input text.
Raises:
TypeError – If the input is not a string.
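A query's sparse embedding is typically compared against document sparse embeddings with a dot product over the indices they share. Here is a minimal sketch of that scoring step, representing each embedding as an {index: weight} dict (an illustrative representation, not the library's internal format):

```python
# Sketch: similarity between two sparse embeddings as a dot product
# over the vocabulary indices they have in common.
def sparse_dot(query: dict[int, float], doc: dict[int, float]) -> float:
    return sum(weight * doc[idx] for idx, weight in query.items() if idx in doc)

query_emb = {12: 0.9, 77: 0.4}
doc_emb = {12: 0.5, 301: 0.7}
# Only index 12 overlaps, so the score is 0.9 * 0.5 = 0.45.
print(sparse_dot(query_emb, doc_emb))
```

Indices that appear in only one of the two embeddings contribute nothing, which is what makes sparse retrieval cheap at scale.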
haystack_integrations.components.embedders.fastembed.fastembed_text_embedder
FastembedTextEmbedder
FastembedTextEmbedder computes a string embedding using Fastembed embedding models.
Usage example:
from haystack_integrations.components.embedders.fastembed import FastembedTextEmbedder
text = ("It clearly says online this will work on a Mac OS system. "
"The disk comes and it does not, only Windows. Do Not order this if you have a Mac!!")
text_embedder = FastembedTextEmbedder(
model="BAAI/bge-small-en-v1.5"
)
embedding = text_embedder.run(text)["embedding"]
init
__init__(
model: str = "BAAI/bge-small-en-v1.5",
cache_dir: str | None = None,
threads: int | None = None,
prefix: str = "",
suffix: str = "",
progress_bar: bool = True,
parallel: int | None = None,
local_files_only: bool = False,
) -> None
Create a FastembedTextEmbedder component.
Parameters:
- model (str) – Local path or name of the model in Fastembed's model hub, such as BAAI/bge-small-en-v1.5.
- cache_dir (str | None) – The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory.
- threads (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
- prefix (str) – A string to add to the beginning of each text.
- suffix (str) – A string to add to the end of each text.
- progress_bar (bool) – If True, displays a progress bar during embedding.
- parallel (int | None) – If > 1, data-parallel encoding will be used; recommended for offline encoding of large datasets. If 0, use all available cores. If None, don't use data-parallel processing; use the default onnxruntime threading instead.
- local_files_only (bool) – If True, only use the model files in the cache_dir.
to_dict
Serializes the component to a dictionary.
Returns:
dict[str, Any] – Dictionary with serialized data.
warm_up
Initializes the component.
run
Embeds text using the Fastembed model.
Parameters:
- text (
str) – A string to embed.
Returns:
dict[str, list[float]] – A dictionary with the following keys: embedding: A list of floats representing the embedding of the input text.
Raises:
TypeError – If the input is not a string.
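Since run returns the embedding as a plain list of floats, a common next step is comparing two embeddings with cosine similarity. A self-contained sketch using only the standard library (the embeddings below are toy values, not real model output):

```python
import math

# Cosine similarity between two dense embeddings returned as lists of
# floats (e.g. the "embedding" value from a text embedder's run method).
def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction: 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal: 0.0
```

In practice this comparison is usually delegated to a document store's retriever rather than computed by hand.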
haystack_integrations.components.rankers.fastembed.ranker
FastembedRanker
Ranks Documents based on their similarity to the query using Fastembed models.
Documents are ordered from most to least semantically relevant to the query.
Usage example:
from haystack import Document
from haystack_integrations.components.rankers.fastembed import FastembedRanker
ranker = FastembedRanker(model_name="Xenova/ms-marco-MiniLM-L-6-v2", top_k=2)
docs = [Document(content="Paris"), Document(content="Berlin")]
query = "What is the capital of Germany?"
output = ranker.run(query=query, documents=docs)
print(output["documents"][0].content)
# Berlin
init
__init__(
model_name: str = "Xenova/ms-marco-MiniLM-L-6-v2",
top_k: int = 10,
cache_dir: str | None = None,
threads: int | None = None,
batch_size: int = 64,
parallel: int | None = None,
local_files_only: bool = False,
meta_fields_to_embed: list[str] | None = None,
meta_data_separator: str = "\n",
)
Creates an instance of FastembedRanker.
Parameters:
- model_name (str) – Fastembed model name. Check the list of supported models in the Fastembed documentation.
- top_k (int) – The maximum number of documents to return.
- cache_dir (str | None) – The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory.
- threads (int | None) – The number of threads a single onnxruntime session can use. Defaults to None.
- batch_size (int) – Number of strings to encode at once.
- parallel (int | None) – If > 1, data-parallel encoding will be used; recommended for offline encoding of large datasets. If 0, use all available cores. If None, don't use data-parallel processing; use the default onnxruntime threading instead.
- local_files_only (bool) – If True, only use the model files in the cache_dir.
- meta_fields_to_embed (list[str] | None) – List of meta fields that should be concatenated with the document content for reranking.
- meta_data_separator (str) – Separator used to concatenate the meta fields to the Document content.
to_dict
Serializes the component to a dictionary.
Returns:
dict[str, Any] – Dictionary with serialized data.
from_dict
Deserializes the component from a dictionary.
Parameters:
- data (dict[str, Any]) – The dictionary to deserialize from.
Returns:
FastembedRanker – The deserialized component.
warm_up
Initializes the component.
run
run(
query: str, documents: list[Document], top_k: int | None = None
) -> dict[str, list[Document]]
Returns a list of documents ranked by their similarity to the given query, using FastEmbed.
Parameters:
- query (str) – The input query to compare the documents to.
- documents (list[Document]) – A list of documents to be ranked.
- top_k (int | None) – The maximum number of documents to return.
Returns:
dict[str, list[Document]] – A dictionary with the following keys: documents: A list of documents closest to the query, sorted from most similar to least similar.
Raises:
ValueError – If top_k is not > 0.
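Conceptually, the ranker scores every document against the query with a cross-encoder and keeps the top_k highest-scoring ones. The sketch below shows only that final sort-and-truncate step, with the per-document relevance scores assumed to be already computed (the rank helper is illustrative, not the component's actual implementation):

```python
# Sketch: the final ranking step, assuming a cross-encoder has already
# produced one relevance score per document.
def rank(documents: list[str], scores: list[float], top_k: int) -> list[str]:
    if top_k <= 0:
        # Mirrors the documented ValueError for a non-positive top_k.
        raise ValueError("top_k must be > 0")
    # Sort documents by score, most relevant first, and keep the top_k.
    ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

docs = ["Paris", "Berlin", "Madrid"]
scores = [0.1, 0.9, 0.3]
print(rank(docs, scores, top_k=2))  # ['Berlin', 'Madrid']
```

This matches the usage example above, where the Berlin document is returned first for a query about the capital of Germany.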