
ValkeyEmbeddingRetriever

This is an embedding Retriever compatible with the Valkey Document Store.

Most common position in a pipeline:
1. After a Text Embedder and before a ChatPromptBuilder or PromptBuilder in a RAG pipeline
2. The last component in a semantic search pipeline
3. After a Text Embedder and before an ExtractiveReader in an extractive QA pipeline

Mandatory init variables: document_store (an instance of a ValkeyDocumentStore)
Mandatory run variables: query_embedding (a list of floats)
Output variables: documents (a list of documents)
API reference: Valkey
GitHub link: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/valkey

Overview

The ValkeyEmbeddingRetriever is an embedding-based Retriever compatible with the ValkeyDocumentStore. It compares the query and Document embeddings and fetches the Documents most relevant to the query from the ValkeyDocumentStore based on vector similarity.

Parameters

When using the ValkeyEmbeddingRetriever in your system, ensure the query and Document embeddings are available. You can do so by adding a Document embedder to your indexing pipeline and a text embedder to your query pipeline.

In addition to the query_embedding, the ValkeyEmbeddingRetriever accepts other optional parameters, including top_k (the maximum number of Documents to retrieve) and filters to narrow down the search space.
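
For example, a retriever limited to five results and restricted by a metadata filter could look like the sketch below. It assumes the standard Haystack 2.x filter syntax and a hypothetical meta.category field, and that top_k can be set at init time while filters can be passed at run time; check the API reference for the exact signatures ValkeyEmbeddingRetriever supports.

python
from haystack_integrations.components.retrievers.valkey import ValkeyEmbeddingRetriever

# `document_store` is a ValkeyDocumentStore, created as in the examples below.
retriever = ValkeyEmbeddingRetriever(
    document_store=document_store,
    top_k=5,  # return at most five Documents per query
)

# Filters narrow down the search space at query time.
result = retriever.run(
    query_embedding=[0.1] * 768,  # placeholder embedding
    filters={"field": "meta.category", "operator": "==", "value": "news"},
)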

Usage

Installation

To start using Valkey with Haystack, install the package with:

shell
pip install valkey-haystack

On its own

This Retriever needs an instance of ValkeyDocumentStore and indexed Documents to run.

python
from haystack_integrations.document_stores.valkey import ValkeyDocumentStore
from haystack_integrations.components.retrievers.valkey import ValkeyEmbeddingRetriever

document_store = ValkeyDocumentStore(
    nodes_list=[("localhost", 6379)],
    index_name="my_documents",
    embedding_dim=768,
    distance_metric="cosine",
)

retriever = ValkeyEmbeddingRetriever(document_store=document_store)

# Using a fake vector to keep the example simple
retriever.run(query_embedding=[0.1] * 768)
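
# The results come back in a dictionary under the "documents" key
# (see the output variables above):
result = retriever.run(query_embedding=[0.1] * 768)
print(result["documents"])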

In a Pipeline
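
Note that the Sentence Transformers embedders used below require the sentence-transformers package to be installed (for example, pip install sentence-transformers).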

python
from haystack import Document, Pipeline
from haystack.components.embedders import (
SentenceTransformersDocumentEmbedder,
SentenceTransformersTextEmbedder,
)
from haystack.components.writers import DocumentWriter
from haystack_integrations.document_stores.valkey import ValkeyDocumentStore
from haystack_integrations.components.retrievers.valkey import ValkeyEmbeddingRetriever

document_store = ValkeyDocumentStore(
    nodes_list=[("localhost", 6379)],
    index_name="my_documents",
    embedding_dim=768,
    distance_metric="cosine",
)

documents = [
    Document(content="There are over 7,000 languages spoken around the world today."),
    Document(
        content="Elephants have been observed to behave in a way that indicates a high level of self-awareness, such as recognizing themselves in mirrors.",
    ),
    Document(
        content="In certain parts of the world, like the Maldives, Puerto Rico, and San Diego, you can witness the phenomenon of bioluminescent waves.",
    ),
]

indexing = Pipeline()
indexing.add_component("embedder", SentenceTransformersDocumentEmbedder())
indexing.add_component("writer", DocumentWriter(document_store))
indexing.connect("embedder.documents", "writer.documents")
indexing.run({"embedder": {"documents": documents}})

query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", SentenceTransformersTextEmbedder())
query_pipeline.add_component(
    "retriever",
    ValkeyEmbeddingRetriever(document_store=document_store),
)
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")

query = "How many languages are there?"
result = query_pipeline.run({"text_embedder": {"text": query}})

print(result["retriever"]["documents"][0])

For a full RAG example with ValkeyEmbeddingRetriever, see the ValkeyDocumentStore documentation.