VespaEmbeddingRetriever
An embedding-based Retriever compatible with the Vespa Document Store.
| Most common position in a pipeline | 1. After a Text Embedder and before a PromptBuilder in a RAG pipeline 2. The last component in a semantic search pipeline 3. After a Text Embedder and before an ExtractiveReader in an extractive QA pipeline |
| Mandatory init variables | document_store: An instance of a VespaDocumentStore |
| Mandatory run variables | query_embedding: A vector representing the query (a list of floats) |
| Output variables | documents: A list of documents |
| API reference | Vespa |
| GitHub link | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/vespa |
| Package name | vespa-haystack |
Overview
The VespaEmbeddingRetriever is a dense embedding-based Retriever compatible with the VespaDocumentStore. It uses Vespa's nearest-neighbor search to find Documents whose embedding is closest to the query embedding and applies a configurable rank profile to score them.
When using the VespaEmbeddingRetriever in your Pipeline, make sure it has the query and Document embeddings available. You can do so by adding a Document Embedder to your indexing Pipeline and a Text Embedder to your query Pipeline.
In addition to the query_embedding, the VespaEmbeddingRetriever accepts other optional parameters, including top_k (the maximum number of Documents to retrieve) and filters to narrow down the search space.
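As a rough mental model of how top_k and filters interact, the sketch below applies a metadata filter first and then keeps the top_k highest-scoring candidates. It is plain Python, not the integration's code: the retrieve helper, the dictionary layout, and the scores are hypothetical, and Vespa evaluates filters as part of the query rather than in client code.

```python
def retrieve(candidates, top_k, keep=None):
    """Hypothetical helper: filter candidates, then return the top_k best by score."""
    if keep is not None:
        candidates = [c for c in candidates if keep(c)]
    return sorted(candidates, key=lambda c: c["score"], reverse=True)[:top_k]


# Made-up candidates with made-up similarity scores.
candidates = [
    {"id": "a", "category": "docs", "score": 0.91},
    {"id": "b", "category": "animals", "score": 0.95},
    {"id": "c", "category": "docs", "score": 0.40},
]

# Without a filter, the two best-scoring candidates win.
unfiltered = retrieve(candidates, top_k=2)

# With a filter, only matching candidates compete for the top_k slots.
docs_only = retrieve(candidates, top_k=2, keep=lambda c: c["category"] == "docs")

print([c["id"] for c in unfiltered])
print([c["id"] for c in docs_only])
```

The filter changes which candidates compete, so a lower-scoring Document can enter the result once higher-scoring ones are excluded.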
The retriever expects the underlying Vespa application to expose:
- A tensor field for embeddings (named embedding by default, configurable on the Document Store via embedding_field).
- A rank profile that scores nearest-neighbor candidates (named semantic by default, configurable via the ranking parameter). The profile typically uses closeness(field, embedding) and takes a query input tensor (named query_embedding by default, configurable via query_tensor_name).
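To make the role of the rank profile more concrete, here is a small sketch of a closeness-style score. Vespa's closeness rank feature is defined as 1 / (1 + distance); the sketch assumes a Euclidean distance metric on the embedding field, which may differ from your schema, and the vectors are toy values. In a real deployment this computation runs inside Vespa, not in Python.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closeness(query, doc):
    # Assumption: Euclidean distance metric. Maps a distance in [0, inf)
    # to a score in (0, 1], where 1.0 means an identical vector.
    return 1.0 / (1.0 + euclidean_distance(query, doc))


# Toy two-dimensional embeddings, for illustration only.
query_embedding = [1.0, 0.0]
doc_embeddings = {"near": [1.0, 0.0], "far": [4.0, 4.0]}

scores = {name: closeness(query_embedding, emb) for name, emb in doc_embeddings.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Because the score grows as distance shrinks, sorting by it in descending order puts the nearest Documents first.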
You can additionally tune retrieval with target_hits, which sets how many neighbors each Vespa content node considers per query before first-phase ranking.
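The effect of target_hits can be illustrated with a deliberately simplified simulation: each content node exposes only its target_hits closest candidates, and the merged list is then cut to top_k. The node_candidates and search helpers and the distances are hypothetical, and the sketch ignores approximate indexing and the rank profile entirely.

```python
import heapq


def node_candidates(distances, target_hits):
    """Hypothetical: the target_hits closest (distance, doc_id) pairs on one node."""
    return heapq.nsmallest(target_hits, distances)


def search(nodes, target_hits, top_k):
    """Merge per-node candidates and keep the top_k closest overall."""
    merged = []
    for distances in nodes:
        merged.extend(node_candidates(distances, target_hits))
    return [doc_id for _, doc_id in sorted(merged)[:top_k]]


# Two simulated content nodes holding (distance, doc_id) pairs.
nodes = [
    [(0.1, "a"), (0.2, "b"), (0.9, "c")],
    [(0.15, "d"), (0.8, "e")],
]

narrow = search(nodes, target_hits=1, top_k=3)
wide = search(nodes, target_hits=3, top_k=3)
print(narrow)
print(wide)
```

In this toy setup, a target_hits value smaller than top_k leaves too few candidates to fill the result, and a good match on the first node ("b") is never considered. This suggests keeping target_hits at least as large as top_k.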
Installation
Install the vespa-haystack integration:
pip install vespa-haystack
To run Vespa locally, see the Vespa quick start.
Usage
On its own
This Retriever needs the VespaDocumentStore and indexed Documents to run. Set the VESPA_URL environment variable (or pass url=... to the Document Store) to connect to your Vespa application.
from haystack_integrations.document_stores.vespa import VespaDocumentStore
from haystack_integrations.components.retrievers.vespa import (
    VespaEmbeddingRetriever,
)

document_store = VespaDocumentStore(schema="doc", namespace="doc")
retriever = VespaEmbeddingRetriever(document_store=document_store)

# using a fake vector to keep the example simple
retriever.run(query_embedding=[0.1] * 768)
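The connection setting described above (the VESPA_URL environment variable or an explicit url=... argument) can be resolved before the Document Store is constructed. The sketch below is one way to do that in your own code. The resolve_vespa_url helper is hypothetical and not part of the integration, and the precedence it applies (explicit argument first, then environment) is an assumption rather than documented behavior.

```python
import os


def resolve_vespa_url(url=None, env=None):
    """Hypothetical helper: prefer an explicit url, else fall back to VESPA_URL."""
    env = os.environ if env is None else env
    resolved = url or env.get("VESPA_URL")
    if not resolved:
        raise ValueError("Set VESPA_URL or pass url=... to reach your Vespa application.")
    return resolved


# Example values only; replace them with the endpoint of your own application.
explicit = resolve_vespa_url("http://localhost:8080", env={})
from_env = resolve_vespa_url(env={"VESPA_URL": "http://vespa.example:8080"})
print(explicit)
print(from_env)
```

Failing early with a clear message avoids a harder-to-read connection error on the first query.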
In a Pipeline
from haystack import Document, Pipeline
from haystack.components.embedders import (
    SentenceTransformersDocumentEmbedder,
    SentenceTransformersTextEmbedder,
)
from haystack.components.writers import DocumentWriter
from haystack_integrations.document_stores.vespa import VespaDocumentStore
from haystack_integrations.components.retrievers.vespa import (
    VespaEmbeddingRetriever,
)

document_store = VespaDocumentStore(
    schema="doc",
    namespace="doc",
    content_field="content",
    embedding_field="embedding",
    metadata_fields=["category"],
)

documents = [
    Document(
        content="Haystack integrates with Vespa for search.",
        meta={"category": "docs"},
    ),
    Document(
        content="Vespa supports lexical and vector retrieval.",
        meta={"category": "docs"},
    ),
    Document(content="Cats sleep most of the day.", meta={"category": "animals"}),
]

# Indexing pipeline: embed the Documents and write them to Vespa.
indexing = Pipeline()
indexing.add_component("embedder", SentenceTransformersDocumentEmbedder())
indexing.add_component("writer", DocumentWriter(document_store=document_store))
indexing.connect("embedder", "writer")
indexing.run({"embedder": {"documents": documents}})

# Query pipeline: embed the query text and retrieve the closest Documents.
query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", SentenceTransformersTextEmbedder())
query_pipeline.add_component(
    "retriever",
    VespaEmbeddingRetriever(
        document_store=document_store,
        top_k=2,
        query_tensor_name="query_embedding",
    ),
)
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")

query = "semantic vector search"
result = query_pipeline.run({"text_embedder": {"text": query}})
print(result["retriever"]["documents"][0])