
Qdrant integration for Haystack

Module haystack_integrations.components.retrievers.qdrant.retriever

QdrantEmbeddingRetriever

A component for retrieving documents from a QdrantDocumentStore using dense vectors.

Usage example:

from haystack import Document
from haystack_integrations.components.retrievers.qdrant import QdrantEmbeddingRetriever
from haystack_integrations.document_stores.qdrant import QdrantDocumentStore

document_store = QdrantDocumentStore(
    ":memory:",
    recreate_index=True,
    return_embedding=True,
)

document_store.write_documents([Document(content="test", embedding=[0.5]*768)])

retriever = QdrantEmbeddingRetriever(document_store=document_store)

# using a fake vector to keep the example simple
retriever.run(query_embedding=[0.1]*768)

QdrantEmbeddingRetriever.__init__

def __init__(document_store: QdrantDocumentStore,
             filters: Optional[Dict[str, Any]] = None,
             top_k: int = 10,
             scale_score: bool = True,
             return_embedding: bool = False)

Create a QdrantEmbeddingRetriever component.

Arguments:

  • document_store: An instance of QdrantDocumentStore.
  • filters: A dictionary with filters to narrow down the search space. Default is None.
  • top_k: The maximum number of documents to retrieve. Default is 10.
  • scale_score: Whether to scale the scores of the retrieved documents or not. Default is True.
  • return_embedding: Whether to return the embedding of the retrieved Documents. Default is False.

Raises:

  • ValueError: If 'document_store' is not an instance of QdrantDocumentStore.

QdrantEmbeddingRetriever.to_dict

def to_dict() -> Dict[str, Any]

Serializes the component to a dictionary.

Returns:

Dictionary with serialized data.

QdrantEmbeddingRetriever.from_dict

@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "QdrantEmbeddingRetriever"

Deserializes the component from a dictionary.

Arguments:

  • data: Dictionary to deserialize from.

Returns:

Deserialized component.
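
The to_dict and from_dict methods are meant to round-trip a component's configuration. The sketch below illustrates that pattern with a hypothetical stand-in class (ToyRetriever is not part of the integration). It assumes the common Haystack layout of a "type" string plus an "init_parameters" mapping. The dictionary produced by the real retriever may contain additional or nested fields, such as the serialized document store.

```python
from typing import Any, Dict, Optional


class ToyRetriever:
    """Hypothetical stand-in used only to illustrate the to_dict/from_dict pattern."""

    def __init__(
        self,
        filters: Optional[Dict[str, Any]] = None,
        top_k: int = 10,
        scale_score: bool = True,
        return_embedding: bool = False,
    ):
        self.filters = filters
        self.top_k = top_k
        self.scale_score = scale_score
        self.return_embedding = return_embedding

    def to_dict(self) -> Dict[str, Any]:
        # Assumed layout: a type identifier plus the constructor arguments.
        return {
            "type": f"{type(self).__module__}.{type(self).__name__}",
            "init_parameters": {
                "filters": self.filters,
                "top_k": self.top_k,
                "scale_score": self.scale_score,
                "return_embedding": self.return_embedding,
            },
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "ToyRetriever":
        # Rebuild the component from the stored constructor arguments.
        return cls(**data["init_parameters"])


original = ToyRetriever(top_k=5, scale_score=False)
restored = ToyRetriever.from_dict(original.to_dict())
```

With the real component, the equivalent round trip would be calling to_dict() on a retriever and passing the result to QdrantEmbeddingRetriever.from_dict.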

QdrantEmbeddingRetriever.run

@component.output_types(documents=List[Document])
def run(query_embedding: List[float],
        filters: Optional[Dict[str, Any]] = None,
        top_k: Optional[int] = None,
        scale_score: Optional[bool] = None,
        return_embedding: Optional[bool] = None)

Run the Embedding Retriever on the given input data.

Arguments:

  • query_embedding: Embedding of the query.
  • filters: A dictionary with filters to narrow down the search space.
  • top_k: The maximum number of documents to return.
  • scale_score: Whether to scale the scores of the retrieved documents or not.
  • return_embedding: Whether to return the embedding of the retrieved Documents.

Returns:

The retrieved documents.
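
This reference does not say how scores are scaled when scale_score is enabled. As a rough illustration only, the sketch below assumes cosine similarity and a linear mapping from [-1, 1] to [0, 1]. Both function names are hypothetical, and the formula actually applied by the document store may differ, for example for other similarity metrics.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def scale_cosine_score(score: float) -> float:
    """Assumed convention: shift a cosine score from [-1, 1] into [0, 1]."""
    return (score + 1) / 2


# Same fake vectors as in the usage example: they point in the same direction.
raw = cosine_similarity([0.5] * 768, [0.1] * 768)
scaled = scale_cosine_score(raw)
```

Under this assumption, an unrelated (orthogonal) document would land at 0.5 and an opposite one at 0.0, which makes scores easier to compare against a fixed threshold.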

QdrantSparseEmbeddingRetriever

A component for retrieving documents from a QdrantDocumentStore using sparse vectors.

Usage example:

from haystack import Document
from haystack_integrations.components.retrievers.qdrant import QdrantSparseEmbeddingRetriever
from haystack_integrations.document_stores.qdrant import QdrantDocumentStore
from haystack.dataclasses.sparse_embedding import SparseEmbedding

document_store = QdrantDocumentStore(
    ":memory:",
    use_sparse_embeddings=True,
    recreate_index=True,
    return_embedding=True,
)

doc = Document(content="test", sparse_embedding=SparseEmbedding(indices=[0, 3, 5], values=[0.1, 0.5, 0.12]))
document_store.write_documents([doc])

retriever = QdrantSparseEmbeddingRetriever(document_store=document_store)
sparse_embedding = SparseEmbedding(indices=[0, 1, 2, 3], values=[0.1, 0.8, 0.05, 0.33])
retriever.run(query_sparse_embedding=sparse_embedding)
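
To see why the query in the example matches the stored document, it helps to compare the two sparse vectors by hand. The sketch below assumes dot-product scoring over the shared indices, which is a common way to compare sparse vectors. This reference does not state the scoring function, and sparse_dot is a hypothetical helper, not part of the integration.

```python
from typing import List


def sparse_dot(
    indices_a: List[int],
    values_a: List[float],
    indices_b: List[int],
    values_b: List[float],
) -> float:
    """Dot product of two sparse vectors given as parallel index/value lists."""
    lookup = dict(zip(indices_b, values_b))
    # Only indices present in both vectors contribute to the score.
    return sum(v * lookup[i] for i, v in zip(indices_a, values_a) if i in lookup)


# Values copied from the usage example above.
doc_indices, doc_values = [0, 3, 5], [0.1, 0.5, 0.12]
query_indices, query_values = [0, 1, 2, 3], [0.1, 0.8, 0.05, 0.33]

# Indices 0 and 3 overlap, so the score is 0.1 * 0.1 + 0.5 * 0.33.
score = sparse_dot(doc_indices, doc_values, query_indices, query_values)
```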

QdrantSparseEmbeddingRetriever.__init__

def __init__(document_store: QdrantDocumentStore,
             filters: Optional[Dict[str, Any]] = None,
             top_k: int = 10,
             scale_score: bool = True,
             return_embedding: bool = False)

Create a QdrantSparseEmbeddingRetriever component.

Arguments:

  • document_store: An instance of QdrantDocumentStore.
  • filters: A dictionary with filters to narrow down the search space. Default is None.
  • top_k: The maximum number of documents to retrieve. Default is 10.
  • scale_score: Whether to scale the scores of the retrieved documents or not. Default is True.
  • return_embedding: Whether to return the sparse embedding of the retrieved Documents. Default is False.

Raises:

  • ValueError: If 'document_store' is not an instance of QdrantDocumentStore.

QdrantSparseEmbeddingRetriever.to_dict

def to_dict() -> Dict[str, Any]

Serializes the component to a dictionary.

Returns:

Dictionary with serialized data.

QdrantSparseEmbeddingRetriever.from_dict

@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "QdrantSparseEmbeddingRetriever"

Deserializes the component from a dictionary.

Arguments:

  • data: Dictionary to deserialize from.

Returns:

Deserialized component.

QdrantSparseEmbeddingRetriever.run

@component.output_types(documents=List[Document])
def run(query_sparse_embedding: SparseEmbedding,
        filters: Optional[Dict[str, Any]] = None,
        top_k: Optional[int] = None,
        scale_score: Optional[bool] = None,
        return_embedding: Optional[bool] = None)

Run the Sparse Embedding Retriever on the given input data.

Arguments:

  • query_sparse_embedding: Sparse Embedding of the query.
  • filters: A dictionary with filters to narrow down the search space.
  • top_k: The maximum number of documents to return.
  • scale_score: Whether to scale the scores of the retrieved documents or not.
  • return_embedding: Whether to return the sparse embedding of the retrieved Documents.

Returns:

The retrieved documents.

QdrantHybridRetriever

A component for retrieving documents from a QdrantDocumentStore using both dense and sparse vectors and fusing the results using Reciprocal Rank Fusion.

Usage example:

from haystack import Document
from haystack_integrations.components.retrievers.qdrant import QdrantHybridRetriever
from haystack_integrations.document_stores.qdrant import QdrantDocumentStore
from haystack.dataclasses.sparse_embedding import SparseEmbedding

document_store = QdrantDocumentStore(
    ":memory:",
    use_sparse_embeddings=True,
    recreate_index=True,
    return_embedding=True,
    wait_result_from_api=True,
)

doc = Document(content="test",
               embedding=[0.5]*768,
               sparse_embedding=SparseEmbedding(indices=[0, 3, 5], values=[0.1, 0.5, 0.12]))

document_store.write_documents([doc])

retriever = QdrantHybridRetriever(document_store=document_store)
embedding = [0.1]*768
sparse_embedding = SparseEmbedding(indices=[0, 1, 2, 3], values=[0.1, 0.8, 0.05, 0.33])
retriever.run(query_embedding=embedding, query_sparse_embedding=sparse_embedding)
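
The hybrid retriever is described as fusing dense and sparse results with Reciprocal Rank Fusion. The sketch below shows the general idea in plain Python: each document earns 1 / (k + rank) from every ranking it appears in, and the sums decide the final order. The function name, the document IDs, and the constant k = 60 (a commonly cited default) are assumptions for illustration; the fusion performed by Qdrant may use different constants or rank offsets.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into a single ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top hit of each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical result lists from a dense and a sparse search.
dense_ranking = ["doc_a", "doc_b", "doc_c"]
sparse_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
```

Because only ranks are used, the dense and sparse scores never need to be on the same scale, which may explain why the hybrid retriever has no scale_score parameter.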

QdrantHybridRetriever.__init__

def __init__(document_store: QdrantDocumentStore,
             filters: Optional[Dict[str, Any]] = None,
             top_k: int = 10,
             return_embedding: bool = False)

Create a QdrantHybridRetriever component.

Arguments:

  • document_store: An instance of QdrantDocumentStore.
  • filters: A dictionary with filters to narrow down the search space. Default is None.
  • top_k: The maximum number of documents to retrieve. Default is 10.
  • return_embedding: Whether to return the embeddings of the retrieved Documents. Default is False.

Raises:

  • ValueError: If 'document_store' is not an instance of QdrantDocumentStore.

QdrantHybridRetriever.to_dict

def to_dict() -> Dict[str, Any]

Serializes the component to a dictionary.

Returns:

Dictionary with serialized data.

QdrantHybridRetriever.from_dict

@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "QdrantHybridRetriever"

Deserializes the component from a dictionary.

Arguments:

  • data: Dictionary to deserialize from.

Returns:

Deserialized component.

QdrantHybridRetriever.run

@component.output_types(documents=List[Document])
def run(query_embedding: List[float],
        query_sparse_embedding: SparseEmbedding,
        filters: Optional[Dict[str, Any]] = None,
        top_k: Optional[int] = None,
        return_embedding: Optional[bool] = None)

Run the Hybrid Retriever on the given input data.

Arguments:

  • query_embedding: Dense embedding of the query.
  • query_sparse_embedding: Sparse embedding of the query.
  • filters: A dictionary with filters to narrow down the search space.
  • top_k: The maximum number of documents to return.
  • return_embedding: Whether to return the embedding of the retrieved Documents.

Returns:

The retrieved documents.

Module haystack_integrations.document_stores.qdrant.document_store

get_batches_from_generator

def get_batches_from_generator(iterable, n)

Batch elements of an iterable into fixed-length chunks or blocks.
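
A minimal sketch of this kind of batching, written with itertools.islice, is shown below. The name batched_chunks is hypothetical and the code is an illustration of the technique, not necessarily the library's implementation. Note that the final batch is shorter when the input length is not a multiple of n.

```python
from itertools import islice
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def batched_chunks(iterable: Iterable[T], n: int) -> Iterator[Tuple[T, ...]]:
    """Yield successive tuples of up to n items from any iterable."""
    it = iter(iterable)
    batch = tuple(islice(it, n))
    # An empty tuple means the underlying iterator is exhausted.
    while batch:
        yield batch
        batch = tuple(islice(it, n))


batches = list(batched_chunks(range(7), 3))
```

Working on an iterator rather than a list keeps memory use bounded, which matters when writing large document collections in batches.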

Module haystack_integrations.document_stores.qdrant.migrate_to_sparse

migrate_to_sparse_embeddings_support

def migrate_to_sparse_embeddings_support(
        old_document_store: QdrantDocumentStore, new_index: str)

Utility function to migrate an existing QdrantDocumentStore to a new one with support for sparse embeddings.

Support for sparse embeddings was added to QdrantDocumentStore in qdrant-haystack v3.3.0. The feature is disabled by default and can be enabled by setting use_sparse_embeddings=True in the init parameters. To store sparse embeddings, document stores/collections created with this feature disabled must be migrated to a new collection with the feature enabled.

This utility function applies to on-premise and cloud instances of Qdrant. It does not work for local in-memory/disk-persisted instances.

The utility function only migrates the existing documents so that they are ready to store sparse embeddings. It does not compute sparse embeddings; to compute them, use a Sparse Embedder component.

Example usage:

from haystack_integrations.document_stores.qdrant import QdrantDocumentStore
from haystack_integrations.document_stores.qdrant import migrate_to_sparse_embeddings_support

old_document_store = QdrantDocumentStore(url="http://localhost:6333",
                                         index="Document",
                                         use_sparse_embeddings=False)
new_index = "Document_sparse"

migrate_to_sparse_embeddings_support(old_document_store, new_index)

# now you can use the new document store with sparse embeddings support
new_document_store = QdrantDocumentStore(url="http://localhost:6333",
                                         index=new_index,
                                         use_sparse_embeddings=True)

Arguments:

  • old_document_store: The existing QdrantDocumentStore instance to migrate from.
  • new_index: The name of the new index/collection to create with sparse embeddings support.