API Reference

Azure AI Search integration for Haystack

Module haystack_integrations.components.retrievers.azure_ai_search.embedding_retriever

AzureAISearchEmbeddingRetriever

Retrieves documents from the AzureAISearchDocumentStore using a vector similarity metric. Must be connected to the AzureAISearchDocumentStore to run.

AzureAISearchEmbeddingRetriever.__init__

def __init__(*,
             document_store: AzureAISearchDocumentStore,
             filters: Optional[Dict[str, Any]] = None,
             top_k: int = 10,
             filter_policy: Union[str, FilterPolicy] = FilterPolicy.REPLACE,
             **kwargs)

Create the AzureAISearchEmbeddingRetriever component.

Arguments:

  • document_store: An instance of AzureAISearchDocumentStore to use with the Retriever.
  • filters: Filters applied when fetching documents from the Document Store.
  • top_k: Maximum number of documents to return.
  • filter_policy: Policy to determine how filters are applied.
  • kwargs: Additional keyword arguments to pass to the Azure AI Search endpoint. Some of the supported parameters:
      - query_type: A string indicating the type of query to perform. Possible values are 'simple', 'full', and 'semantic'.
      - semantic_configuration_name: The name of the semantic configuration to use when processing semantic queries.
    For more information on parameters, see the official Azure AI Search documentation.
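As a rough illustration of the arguments above, the following sketch collects the keyword arguments in a dictionary and passes them to the constructor. The import path is the module named in this reference. The helper `build_retriever` and the configuration name "my-semantic-config" are hypothetical and only serve the example; the document store is assumed to be created elsewhere.

```python
from typing import Any, Dict

# Keyword arguments for the retriever. "my-semantic-config" is a hypothetical
# name; it would have to match a semantic configuration defined on your index.
RETRIEVER_KWARGS: Dict[str, Any] = {
    "top_k": 5,
    "query_type": "semantic",
    "semantic_configuration_name": "my-semantic-config",
}


def build_retriever(document_store: Any) -> Any:
    """Create a retriever bound to an existing AzureAISearchDocumentStore."""
    # Imported lazily so this snippet can be loaded even where the
    # integration package is not installed.
    from haystack_integrations.components.retrievers.azure_ai_search.embedding_retriever import (
        AzureAISearchEmbeddingRetriever,
    )

    # query_type and semantic_configuration_name travel through **kwargs.
    return AzureAISearchEmbeddingRetriever(
        document_store=document_store, **RETRIEVER_KWARGS
    )
```

Keeping the arguments in one dictionary makes it easy to switch `query_type` between 'simple', 'full', and 'semantic' without touching the construction code.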

AzureAISearchEmbeddingRetriever.to_dict

def to_dict() -> Dict[str, Any]

Serializes the component to a dictionary.

Returns:

Dictionary with serialized data.

AzureAISearchEmbeddingRetriever.from_dict

@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "AzureAISearchEmbeddingRetriever"

Deserializes the component from a dictionary.

Arguments:

  • data: Dictionary to deserialize from.

Returns:

Deserialized component.

AzureAISearchEmbeddingRetriever.run

@component.output_types(documents=List[Document])
def run(query_embedding: List[float],
        filters: Optional[Dict[str, Any]] = None,
        top_k: Optional[int] = None)

Retrieve documents from the AzureAISearchDocumentStore.

Arguments:

  • query_embedding: A list of floats representing the query embedding.
  • filters: Filters applied to the retrieved Documents. The way runtime filters are applied depends on the filter_policy chosen at retriever initialization. See __init__ method docstring for more details.
  • top_k: The maximum number of documents to retrieve.

Returns:

Dictionary with the following keys:

  • documents: A list of documents retrieved from the AzureAISearchDocumentStore.
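A minimal sketch of calling `run` and unpacking its result, based only on the signature and return value documented above. The wrapper `retrieve` is a hypothetical helper, and the query embedding is assumed to come from an embedder whose dimension matches the index.

```python
from typing import Any, Dict, List, Optional


def retrieve(
    retriever: Any,
    query_embedding: List[float],
    filters: Optional[Dict[str, Any]] = None,
    top_k: Optional[int] = None,
) -> List[Any]:
    """Run the retriever and return only the retrieved documents."""
    result = retriever.run(
        query_embedding=query_embedding, filters=filters, top_k=top_k
    )
    # run() returns a dictionary; the documents sit under the "documents" key.
    return result["documents"]
```

Any object that exposes a compatible `run` method works with this helper, which makes it straightforward to substitute a stub in unit tests.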

Module haystack_integrations.document_stores.azure_ai_search.document_store

AzureAISearchDocumentStore

AzureAISearchDocumentStore.__init__

def __init__(*,
             api_key: Secret = Secret.from_env_var("AZURE_SEARCH_API_KEY",
                                                   strict=False),
             azure_endpoint: Secret = Secret.from_env_var(
                 "AZURE_SEARCH_SERVICE_ENDPOINT", strict=True),
             index_name: str = "default",
             embedding_dimension: int = 768,
             metadata_fields: Optional[Dict[str, type]] = None,
             vector_search_configuration: VectorSearch = None,
             **index_creation_kwargs)

A document store that uses Azure AI Search as the backend.

Arguments:

  • azure_endpoint: The URL endpoint of an Azure AI Search service.
  • api_key: The API key to use for authentication.
  • index_name: Name of the index in Azure AI Search. If the index doesn't exist, it is created.
  • embedding_dimension: Dimension of the embeddings.
  • metadata_fields: A dictionary of metadata keys and their types to create additional fields in index schema. As fields in Azure SearchIndex cannot be dynamic, it is necessary to specify the metadata fields in advance. (e.g. metadata_fields = {"author": str, "date": datetime})
  • vector_search_configuration: Configuration option related to vector search. Default configuration uses the HNSW algorithm with cosine similarity to handle vector searches.
  • index_creation_kwargs: Optional keyword parameters to be passed to the SearchIndex class during index creation. Some of the supported parameters:
      - semantic_search: Defines the semantic configuration of the search index. This parameter is needed to enable semantic search capabilities in the index.
      - similarity: The type of similarity algorithm to be used when scoring and ranking the documents matching a search query. The similarity algorithm can only be defined at index creation time and cannot be modified on existing indexes.

For more information on parameters, see the official Azure AI Search documentation.
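The sketch below shows one way the constructor arguments might be combined, reusing the `metadata_fields` example from the argument list. It assumes the `AZURE_SEARCH_SERVICE_ENDPOINT` and `AZURE_SEARCH_API_KEY` environment variables are set, so the default `Secret` values from the signature apply. The helper `build_document_store` and the index name "articles" are hypothetical.

```python
from datetime import datetime
from typing import Any, Dict

# Metadata fields must be declared up front because index fields in
# Azure AI Search cannot be dynamic.
METADATA_FIELDS: Dict[str, type] = {"author": str, "date": datetime}


def build_document_store(index_name: str = "articles") -> Any:
    """Create a document store; credentials come from environment variables."""
    # Imported lazily so this snippet can be loaded even where the
    # integration package is not installed.
    from haystack_integrations.document_stores.azure_ai_search.document_store import (
        AzureAISearchDocumentStore,
    )

    return AzureAISearchDocumentStore(
        index_name=index_name,
        embedding_dimension=768,  # should match the embedding model in use
        metadata_fields=METADATA_FIELDS,
    )
```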

AzureAISearchDocumentStore.to_dict

def to_dict() -> Dict[str, Any]

Serializes the component to a dictionary.

Returns:

Dictionary with serialized data.

AzureAISearchDocumentStore.from_dict

@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "AzureAISearchDocumentStore"

Deserializes the component from a dictionary.

Arguments:

  • data: Dictionary to deserialize from.

Returns:

Deserialized component.

AzureAISearchDocumentStore.count_documents

def count_documents() -> int

Returns how many documents are present in the search index.

Returns:

The number of documents in the search index.

AzureAISearchDocumentStore.write_documents

def write_documents(documents: List[Document],
                    policy: DuplicatePolicy = DuplicatePolicy.NONE) -> int

Writes the provided documents to the search index.

Arguments:

  • documents: Documents to write to the index.
  • policy: Policy to determine how duplicates are handled.

Raises:

  • ValueError: If the documents are not of type Document.
  • TypeError: If the document ids are not strings.

Returns:

The number of documents added to the index.
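As a small sketch of how `write_documents` and `count_documents` fit together, the hypothetical helper below writes a batch and then reads back the index size. It relies only on the two method signatures documented here and uses the default duplicate policy. The exceptions listed above are left to propagate to the caller.

```python
from typing import Any, List, Tuple


def write_and_count(store: Any, documents: List[Any]) -> Tuple[int, int]:
    """Write documents and return (number written, total now in the index)."""
    # May raise ValueError or TypeError as described under "Raises".
    written = store.write_documents(documents)
    total = store.count_documents()
    return written, total
```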

AzureAISearchDocumentStore.delete_documents

def delete_documents(document_ids: List[str]) -> None

Deletes all documents whose ids appear in document_ids from the search index.

Arguments:

  • document_ids: Ids of the documents to be deleted.

AzureAISearchDocumentStore.search_documents

def search_documents(search_text: str = "*",
                     top_k: int = 10) -> List[Document]

Returns all documents that match the provided search_text.

With the default search_text of "*", all documents are matched, and at most top_k of them are returned.

Arguments:

  • search_text: The text to search for in the Document list.
  • top_k: Maximum number of documents to return.

Returns:

A list of Documents that match the given search_text.

AzureAISearchDocumentStore.filter_documents

def filter_documents(
        filters: Optional[Dict[str, Any]] = None) -> List[Document]

Returns the documents that match the provided filters.

Filters should be given as a dictionary supporting filtering by metadata. For details on filters, see the metadata filtering documentation.

Arguments:

  • filters: The filters to apply to the document list.

Returns:

A list of Documents that match the given filters.
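For illustration, a filter dictionary in Haystack's general comparison format (`field`, `operator`, `value`) might be built and passed as follows. The helpers `author_filter` and `documents_by_author` are hypothetical, and the `meta.author` field name is an assumption: it presumes an `author` metadata field was declared through `metadata_fields` and that the store accepts the `meta.` prefix. Check the metadata filtering documentation for the exact syntax this store expects.

```python
from typing import Any, Dict, List


def author_filter(author: str) -> Dict[str, Any]:
    """Build a single comparison filter on the (assumed) author field."""
    return {"field": "meta.author", "operator": "==", "value": author}


def documents_by_author(store: Any, author: str) -> List[Any]:
    """Return the documents whose author metadata equals the given value."""
    return store.filter_documents(filters=author_filter(author))
```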