
Reorders a set of Documents based on their relevance to the Query.

Module sentence_transformers

SentenceTransformersRanker

class SentenceTransformersRanker(BaseRanker)

Sentence Transformers based pre-trained Cross-Encoder model for document re-ranking (https://huggingface.co/cross-encoder). Re-ranking can be used on top of a retriever to boost the performance for document search. This is particularly useful if the retriever has a high recall but is poor at sorting the documents by relevance.

SentenceTransformersRanker handles Cross-Encoder models that either use a single logit as the similarity score (for example, cross-encoder/ms-marco-MiniLM-L-12-v2) or two output logits (no_answer, has_answer), for example deepset/gbert-base-germandpr-reranking. See https://www.sbert.net/docs/pretrained-models/ce-msmarco.html#usage-with-transformers for details.
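
For illustration, the underlying cross-encoder can also be called directly with the sentence-transformers library, outside of Haystack. A minimal sketch (the query and passages are made up):

from sentence_transformers import CrossEncoder

# Single-logit cross-encoder: one relevance score per (query, passage) pair.
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-12-v2")
scores = model.predict([
    ("What is the capital of France?", "Paris is the capital of France."),
    ("What is the capital of France?", "Berlin is the capital of Germany."),
])
print(scores)  # higher score = more relevant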

With a SentenceTransformersRanker, you can:

  • directly get predictions via predict()

Usage example:

from haystack import Pipeline
from haystack.nodes import BM25Retriever, SentenceTransformersRanker

retriever = BM25Retriever(document_store=document_store)
ranker = SentenceTransformersRanker(model_name_or_path="cross-encoder/ms-marco-MiniLM-L-12-v2")
p = Pipeline()
p.add_node(component=retriever, name="Retriever", inputs=["Query"])
p.add_node(component=ranker, name="Ranker", inputs=["Retriever"])
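
You can also call predict() directly, without building a pipeline. A minimal sketch (the query and documents are illustrative):

from haystack import Document
from haystack.nodes import SentenceTransformersRanker

ranker = SentenceTransformersRanker(model_name_or_path="cross-encoder/ms-marco-MiniLM-L-12-v2")
docs = [
    Document(content="Paris is the capital of France."),
    Document(content="Berlin is the capital of Germany."),
]
# Returns the documents re-ranked by descending relevance to the query.
ranked_docs = ranker.predict(query="What is the capital of France?", documents=docs, top_k=2)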

SentenceTransformersRanker.__init__

def __init__(model_name_or_path: Union[str, Path],
             model_version: Optional[str] = None,
             top_k: int = 10,
             use_gpu: bool = True,
             devices: Optional[List[Union[str, "torch.device"]]] = None,
             batch_size: int = 16,
             scale_score: bool = True,
             progress_bar: bool = True,
             use_auth_token: Optional[Union[str, bool]] = None,
             embed_meta_fields: Optional[List[str]] = None,
             model_kwargs: Optional[dict] = None)

Arguments:

  • model_name_or_path: Directory of a saved model or the name of a public model, e.g. 'cross-encoder/ms-marco-MiniLM-L-12-v2'. See https://huggingface.co/cross-encoder for a full list of available models.
  • model_version: The version of model to use from the HuggingFace model hub. Can be tag name, branch name, or commit hash.
  • top_k: The maximum number of documents to return
  • use_gpu: Whether to use all available GPUs or the CPU. Falls back on CPU if no GPU is available.
  • batch_size: Number of documents to process at a time.
  • scale_score: The raw predictions will be transformed using a Sigmoid activation function in case the model only predicts a single label. For multi-label predictions, no scaling is applied. Set this to False if you do not want any scaling of the raw predictions.
  • progress_bar: Whether to show a progress bar while processing the documents.
  • use_auth_token: The API token used to download private models from Huggingface. If this parameter is set to True, then the token generated when running transformers-cli login (stored in ~/.huggingface) will be used. Additional information can be found here https://huggingface.co/transformers/main_classes/model.html#transformers.PreTrainedModel.from_pretrained
  • devices: List of torch devices (e.g. cuda, cpu, mps) to limit inference to specific devices. A list containing torch device objects and/or strings is supported (For example [torch.device('cuda:0'), "mps", "cuda:1"]). When specifying use_gpu=False the devices parameter is not used and a single cpu device is used for inference.
  • embed_meta_fields: Concatenate the provided meta fields into the text passage that is then used in reranking. The original documents are returned, so the concatenated metadata is not included in the returned documents.
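
As an illustration of the devices and embed_meta_fields options above, a hedged initialization sketch (the meta field name 'title' is just an example):

from haystack.nodes import SentenceTransformersRanker

ranker = SentenceTransformersRanker(
    model_name_or_path="cross-encoder/ms-marco-MiniLM-L-12-v2",
    top_k=5,
    batch_size=32,
    devices=["cuda:0"],            # limit inference to the first GPU
    embed_meta_fields=["title"],   # fold the 'title' meta field into the text used for scoring
)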

SentenceTransformersRanker.predict

def predict(query: str,
            documents: List[Document],
            top_k: Optional[int] = None) -> List[Document]

Use the loaded ranker model to re-rank the supplied list of Documents.

Returns a list of Documents sorted by descending similarity to the query.

Arguments:

  • query: Query string
  • documents: List of Documents to be re-ranked
  • top_k: The maximum number of documents to return

Returns:

List of Documents

SentenceTransformersRanker.predict_batch

def predict_batch(
    queries: List[str],
    documents: Union[List[Document], List[List[Document]]],
    top_k: Optional[int] = None,
    batch_size: Optional[int] = None
) -> Union[List[Document], List[List[Document]]]

Use the loaded ranker model to re-rank the supplied lists of Documents.

Returns lists of Documents sorted by descending similarity to the corresponding queries.

  • If you provide a list containing a single query...

    • ... and a single list of Documents, the single list of Documents will be re-ranked based on the supplied query.
    • ... and a list of lists of Documents, each list of Documents will be re-ranked individually based on the supplied query.
  • If you provide a list of multiple queries...

    • ... you need to provide a list of lists of Documents. Each list of Documents will be re-ranked based on its corresponding query.

Arguments:

  • queries: List of query strings.
  • documents: Single list of Documents or list of lists of Documents to be reranked.
  • top_k: The maximum number of documents to return per Document list.
  • batch_size: Number of Documents to process at a time.
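
A short sketch of the multi-query case described above (queries and documents are illustrative):

from haystack import Document
from haystack.nodes import SentenceTransformersRanker

ranker = SentenceTransformersRanker(model_name_or_path="cross-encoder/ms-marco-MiniLM-L-12-v2")

# One list of Documents per query; each list is re-ranked against its own query.
results = ranker.predict_batch(
    queries=["capital of France", "highest mountain"],
    documents=[
        [Document(content="Paris is the capital of France."), Document(content="Lyon is in France.")],
        [Document(content="Mount Everest is the highest mountain."), Document(content="K2 is in the Karakoram.")],
    ],
    top_k=1,
)
# results is a list of lists of Documents, one inner list per query.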

SentenceTransformersRanker.run

def run(query: str, documents: List[Document], top_k: Optional[int] = None)

Arguments:

  • query: Query string.
  • documents: List of Documents to process.
  • top_k: The maximum number of Documents to return.

SentenceTransformersRanker.run_batch

def run_batch(queries: List[str],
              documents: Union[List[Document], List[List[Document]]],
              top_k: Optional[int] = None,
              batch_size: Optional[int] = None)

Arguments:

  • queries: List of query strings.
  • documents: List of lists of Documents to process.
  • top_k: The maximum number of Documents to return.
  • batch_size: Number of Documents to process at a time.

SentenceTransformersRanker.timing

def timing(fn, attr_name)

Wrapper method used to time functions.

SentenceTransformersRanker.eval

def eval(label_index: str = "label",
         doc_index: str = "eval_document",
         label_origin: str = "gold_label",
         top_k: int = 10,
         open_domain: bool = False,
         return_preds: bool = False) -> dict

Performs evaluation of the Ranker.

The Ranker is evaluated in the same way as a Retriever: based on whether it finds the correct document for a given query string, and on the position of the correct document in the ranking.

Returns a dict containing the following metrics:

- "recall": Proportion of questions for which correct document is among retrieved documents
- "mrr": Mean of reciprocal rank. Rewards retrievers that give relevant documents a higher rank.
  Only considers the highest ranked relevant document.
- "map": Mean of average precision for each question. Rewards retrievers that give relevant
  documents a higher rank. Considers all retrieved relevant documents. If ``open_domain=True``,
  average precision is normalized by the number of retrieved relevant documents per query.
  If ``open_domain=False``, average precision is normalized by the number of all relevant documents
  per query.

Arguments:

  • label_index: Index/Table in DocumentStore where labeled questions are stored
  • doc_index: Index/Table in DocumentStore where documents that are used for evaluation are stored
  • top_k: How many documents to return per query
  • open_domain: If True, retrieval will be evaluated by checking if the answer string to a question is contained in the retrieved docs (common approach in open-domain QA). If False, retrieval uses a stricter evaluation that checks if the retrieved document ids are within ids explicitly stated in the labels.
  • return_preds: Whether to add predictions in the returned dictionary. If True, the returned dictionary contains the keys "predictions" and "metrics".

Module recentness_ranker

RecentnessRanker

class RecentnessRanker(BaseRanker)

RecentnessRanker.__init__

def __init__(date_meta_field: str,
             weight: float = 0.5,
             top_k: Optional[int] = None,
             ranking_mode: Literal["reciprocal_rank_fusion",
                                   "score"] = "reciprocal_rank_fusion")

This Node is used to rerank retrieved documents based on their age. Newer documents will rank higher.

The importance of recentness is parametrized through the weight parameter.

Arguments:

  • date_meta_field: Identifier pointing to the date field in the metadata. This is a required parameter, since we need dates for sorting.
  • weight: A value in the range [0,1]. 0 disables sorting by age. 0.5 means content and age have the same impact. 1 means sorting only by age, with the most recent documents coming first.
  • top_k: (optional) How many documents to return. If not provided, all documents will be returned. It can make sense to have large top-k values from the initial retrievers and filter docs down in the RecentnessRanker with this top_k parameter.
  • ranking_mode: The mode used to combine retriever and recentness. Possible values are 'reciprocal_rank_fusion' (default) and 'score'. Make sure to use 'score' mode only with retrievers or rankers that return a score in the [0,1] range.
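
A minimal usage sketch, assuming each Document carries its date in a meta field named 'updated_at' (the field name and dates are illustrative):

from haystack import Document
from haystack.nodes import RecentnessRanker

ranker = RecentnessRanker(
    date_meta_field="updated_at",  # metadata key that holds the document date
    weight=0.5,                    # balance content relevance and recentness equally
    ranking_mode="reciprocal_rank_fusion",
)
docs = [
    Document(content="Older release notes.", meta={"updated_at": "2021-03-01"}),
    Document(content="Latest release notes.", meta={"updated_at": "2023-09-15"}),
]
# The query is not used for sorting, so it can be left blank.
ranked = ranker.predict(query="", documents=docs, top_k=2)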

RecentnessRanker.predict

def predict(query: str,
            documents: List[Document],
            top_k: Optional[int] = None) -> List[Document]

This method is used to rank a list of documents based on their age and relevance by:

  1. Adjusting the relevance score from the previous node (or, for reciprocal rank fusion, calculating it from scratch and then adjusting) based on the weight chosen at initialization.
  2. Sorting the documents by the age given in the metadata, calculating the recentness score, and adjusting it by the weight as well.
  3. Returning the top-k documents (or all documents, if top-k is not provided) sorted by the final score (relevance score + recentness score).

Arguments:

  • query: Not used in practice (so can be left blank), as this ranker does not perform sorting based on semantic closeness of documents to the query.
  • documents: Documents provided for ranking.
  • top_k: (optional) How many documents to return at the end. If not provided, all documents will be returned, sorted by relevance and recentness (adjusted by weight).

RecentnessRanker.predict_batch

def predict_batch(
    queries: List[str],
    documents: Union[List[Document], List[List[Document]]],
    top_k: Optional[int] = None,
    batch_size: Optional[int] = None
) -> Union[List[Document], List[List[Document]]]

This method is used to rank either A) a list of documents or B) a list of lists of documents (when the previous node is JoinDocuments) based on their age and relevance.

In case A, the predict method defined above is applied to the provided list. In case B, the predict method is applied to each individual list in the provided list of lists, and the results are returned as a list of lists.

Arguments:

  • queries: Not used in practice (so can be left blank), as this ranker does not perform sorting based on semantic closeness of documents to the query.
  • documents: Documents provided for ranking in a list or a list of lists.
  • top_k: (optional) How many documents to return at the end (per list). If not provided, all documents will be returned, sorted by relevance and recentness (adjusted by weight).
  • batch_size: Not used in practice, so can be left blank.

RecentnessRanker.run

def run(query: str, documents: List[Document], top_k: Optional[int] = None)

Arguments:

  • query: Query string.
  • documents: List of Documents to process.
  • top_k: The maximum number of Documents to return.

RecentnessRanker.run_batch

def run_batch(queries: List[str],
              documents: Union[List[Document], List[List[Document]]],
              top_k: Optional[int] = None,
              batch_size: Optional[int] = None)

Arguments:

  • queries: List of query strings.
  • documents: List of lists of Documents to process.
  • top_k: The maximum number of Documents to return.
  • batch_size: Number of Documents to process at a time.

RecentnessRanker.timing

def timing(fn, attr_name)

Wrapper method used to time functions.

RecentnessRanker.eval

def eval(label_index: str = "label",
         doc_index: str = "eval_document",
         label_origin: str = "gold_label",
         top_k: int = 10,
         open_domain: bool = False,
         return_preds: bool = False) -> dict

Performs evaluation of the Ranker.

The Ranker is evaluated in the same way as a Retriever: based on whether it finds the correct document for a given query string, and on the position of the correct document in the ranking.

Returns a dict containing the following metrics:

- "recall": Proportion of questions for which correct document is among retrieved documents
- "mrr": Mean of reciprocal rank. Rewards retrievers that give relevant documents a higher rank.
  Only considers the highest ranked relevant document.
- "map": Mean of average precision for each question. Rewards retrievers that give relevant
  documents a higher rank. Considers all retrieved relevant documents. If ``open_domain=True``,
  average precision is normalized by the number of retrieved relevant documents per query.
  If ``open_domain=False``, average precision is normalized by the number of all relevant documents
  per query.

Arguments:

  • label_index: Index/Table in DocumentStore where labeled questions are stored
  • doc_index: Index/Table in DocumentStore where documents that are used for evaluation are stored
  • top_k: How many documents to return per query
  • open_domain: If True, retrieval will be evaluated by checking if the answer string to a question is contained in the retrieved docs (common approach in open-domain QA). If False, retrieval uses a stricter evaluation that checks if the retrieved document ids are within ids explicitly stated in the labels.
  • return_preds: Whether to add predictions in the returned dictionary. If True, the returned dictionary contains the keys "predictions" and "metrics".

Module cohere

CohereRanker

class CohereRanker(BaseRanker)

You can use re-ranking on top of a Retriever to boost the performance for document search. This is particularly useful if the Retriever has a high recall but is poor at sorting the documents by relevance.

Cohere models are trained with a context length of 512 tokens - the model takes into account both the input from the query and the document. If your query is larger than 256 tokens, it is truncated to the first 256 tokens.

Cohere breaks down a query-document pair into 512 token chunks. For example, if your query is 50 tokens and your document is 1024 tokens, your document will be broken into the following chunks:

relevance_score_1 = <query[0,50], document[0,460]>
relevance_score_2 = <query[0,50], document[460,920]>
relevance_score_3 = <query[0,50], document[920,1024]>
relevance_score = max(relevance_score_1, relevance_score_2, relevance_score_3)

Find more best practices for reranking in the Cohere documentation.

CohereRanker.__init__

def __init__(api_key: str,
             model_name_or_path: str,
             top_k: int = 10,
             max_chunks_per_doc: Optional[int] = None,
             embed_meta_fields: Optional[List[str]] = None)

Creates an instance of CohereRanker for the specified Cohere model.

Arguments:

  • api_key: Cohere API key.
  • model_name_or_path: Cohere model name. Check the list of supported models in the Cohere documentation.
  • top_k: The maximum number of documents to return.
  • max_chunks_per_doc: If your document exceeds 512 tokens, this determines the maximum number of chunks a document can be split into. If None, the default of 10 is used. For example, if your document is 6000 tokens, with the default of 10, the document will be split into 10 chunks each of 512 tokens and the last 880 tokens will be disregarded.
  • embed_meta_fields: Concatenate the provided meta fields into the text passage that is then used in reranking. The original documents are returned, so the concatenated metadata is not included in the returned documents.
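
A minimal usage sketch (the API key is a placeholder and the model name is one example from Cohere's rerank family; check the Cohere documentation for the current model list):

from haystack import Document
from haystack.nodes import CohereRanker

ranker = CohereRanker(
    api_key="YOUR_COHERE_API_KEY",          # placeholder
    model_name_or_path="rerank-english-v2.0",
    top_k=3,
)
docs = [
    Document(content="Paris is the capital of France."),
    Document(content="Berlin is the capital of Germany."),
]
ranked = ranker.predict(query="What is the capital of France?", documents=docs)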

CohereRanker.predict

def predict(query: str,
            documents: List[Document],
            top_k: Optional[int] = None) -> List[Document]

Use the Cohere Reranker to re-rank the supplied list of documents based on the query.

Arguments:

  • query: The query string.
  • documents: List of Documents to be re-ranked.
  • top_k: The maximum number of documents to return.

CohereRanker.predict_batch

def predict_batch(
    queries: List[str],
    documents: Union[List[Document], List[List[Document]]],
    top_k: Optional[int] = None,
    batch_size: Optional[int] = None
) -> Union[List[Document], List[List[Document]]]

Use Cohere Reranking endpoint to re-rank the supplied lists of Documents.

Returns lists of Documents sorted by descending similarity to the corresponding queries.

  • If you provide a list containing a single query...

    • ... and a single list of Documents, the single list of Documents will be re-ranked based on the supplied query.
    • ... and a list of lists of Documents, each list of Documents will be re-ranked individually based on the supplied query.
  • If you provide a list of multiple queries...

    • ... you need to provide a list of lists of Documents. Each list of Documents will be re-ranked based on its corresponding query.

Arguments:

  • queries: List of queries.
  • documents: Single list of Documents or list of lists of Documents to be reranked.
  • top_k: The maximum number of documents to return per Document list.
  • batch_size: Not relevant.

CohereRanker.run

def run(query: str, documents: List[Document], top_k: Optional[int] = None)

Arguments:

  • query: Query string.
  • documents: List of Documents to process.
  • top_k: The maximum number of Documents to return.

CohereRanker.run_batch

def run_batch(queries: List[str],
              documents: Union[List[Document], List[List[Document]]],
              top_k: Optional[int] = None,
              batch_size: Optional[int] = None)

Arguments:

  • queries: List of query strings.
  • documents: List of lists of Documents to process.
  • top_k: The maximum number of Documents to return.
  • batch_size: Number of Documents to process at a time.

CohereRanker.timing

def timing(fn, attr_name)

Wrapper method used to time functions.

CohereRanker.eval

def eval(label_index: str = "label",
         doc_index: str = "eval_document",
         label_origin: str = "gold_label",
         top_k: int = 10,
         open_domain: bool = False,
         return_preds: bool = False) -> dict

Performs evaluation of the Ranker.

The Ranker is evaluated in the same way as a Retriever: based on whether it finds the correct document for a given query string, and on the position of the correct document in the ranking.

Returns a dict containing the following metrics:

- "recall": Proportion of questions for which correct document is among retrieved documents
- "mrr": Mean of reciprocal rank. Rewards retrievers that give relevant documents a higher rank.
  Only considers the highest ranked relevant document.
- "map": Mean of average precision for each question. Rewards retrievers that give relevant
  documents a higher rank. Considers all retrieved relevant documents. If ``open_domain=True``,
  average precision is normalized by the number of retrieved relevant documents per query.
  If ``open_domain=False``, average precision is normalized by the number of all relevant documents
  per query.

Arguments:

  • label_index: Index/Table in DocumentStore where labeled questions are stored
  • doc_index: Index/Table in DocumentStore where documents that are used for evaluation are stored
  • top_k: How many documents to return per query
  • open_domain: If True, retrieval will be evaluated by checking if the answer string to a question is contained in the retrieved docs (common approach in open-domain QA). If False, retrieval uses a stricter evaluation that checks if the retrieved document ids are within ids explicitly stated in the labels.
  • return_preds: Whether to add predictions in the returned dictionary. If True, the returned dictionary contains the keys "predictions" and "metrics".

Module diversity

DiversityRanker

class DiversityRanker(BaseRanker)

Implements a document ranking algorithm that orders documents so as to maximize the overall diversity of the returned documents.

DiversityRanker.__init__

def __init__(model_name_or_path: Union[str, Path] = "all-MiniLM-L6-v2",
             top_k: Optional[int] = None,
             use_gpu: Optional[bool] = True,
             devices: Optional[List[Union[str, "torch.device"]]] = None,
             similarity: Literal["dot_product", "cosine"] = "dot_product")

Initialize a DiversityRanker.

Arguments:

  • model_name_or_path: Path to a pretrained sentence-transformers model.
  • top_k: The maximum number of documents to return.
  • use_gpu: Whether to use GPU (if available). If no GPUs are available, it falls back on a CPU.
  • devices: List of torch devices (for example, cuda:0, cpu, mps) to limit inference to specific devices.
  • similarity: Whether to use dot product or cosine similarity. Can be set to "dot_product" (default) or "cosine".
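
A brief initialization sketch (the model shown is the default from the signature above):

from haystack.nodes import DiversityRanker

ranker = DiversityRanker(
    model_name_or_path="all-MiniLM-L6-v2",  # sentence-transformers model used for embeddings
    top_k=5,
    similarity="cosine",                    # or "dot_product" (default)
)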

DiversityRanker.predict

def predict(query: str,
            documents: List[Document],
            top_k: Optional[int] = None) -> List[Document]

Rank the documents based on their diversity and return the top_k documents.

Arguments:

  • query: The query.
  • documents: A list of Document objects that should be ranked.
  • top_k: The maximum number of documents to return.

Returns:

A list of top_k documents ranked based on diversity.

DiversityRanker.greedy_diversity_order

def greedy_diversity_order(query: str,
                           documents: List[Document]) -> List[Document]

Orders the given list of documents to maximize diversity. The algorithm first calculates embeddings for each document and the query. It starts by selecting the document that is semantically closest to the query. Then, for each remaining document, it selects the one that, on average, is least similar to the already selected documents. This process continues until all documents are selected, resulting in a list where each subsequent document contributes the most to the overall diversity of the selected set.

Arguments:

  • query: The search query.
  • documents: The list of Document objects to be ranked.

Returns:

A list of documents ordered to maximize diversity.
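
To make the greedy selection concrete, here is an illustrative standalone re-implementation of the idea (not the library's internal code), using sentence-transformers and normalized embeddings so that the dot product equals cosine similarity:

import numpy as np
from sentence_transformers import SentenceTransformer

def greedy_diversity_order_sketch(query: str, texts: list) -> list:
    """Sketch of the greedy diversity ordering described above."""
    model = SentenceTransformer("all-MiniLM-L6-v2")
    doc_emb = model.encode(texts, normalize_embeddings=True)
    query_emb = model.encode([query], normalize_embeddings=True)[0]

    remaining = list(range(len(texts)))
    # Start with the document semantically closest to the query.
    first = max(remaining, key=lambda i: float(doc_emb[i] @ query_emb))
    selected = [first]
    remaining.remove(first)

    # Repeatedly pick the document least similar, on average, to those already selected.
    while remaining:
        selected_emb = doc_emb[selected]
        next_idx = min(remaining, key=lambda i: float(np.mean(selected_emb @ doc_emb[i])))
        selected.append(next_idx)
        remaining.remove(next_idx)

    return [texts[i] for i in selected]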

DiversityRanker.predict_batch

def predict_batch(
    queries: List[str],
    documents: Union[List[Document], List[List[Document]]],
    top_k: Optional[float] = None,
    batch_size: Optional[int] = None
) -> Union[List[Document], List[List[Document]]]

Rank the documents based on their diversity and return the top_k documents.

Arguments:

  • queries: The queries.
  • documents: A list (or a list of lists) of Document objects that should be ranked.
  • top_k: The maximum number of documents to return.
  • batch_size: The number of documents to process in one batch.

Returns:

A list (or a list of lists) of top_k documents ranked based on diversity.

DiversityRanker.run

def run(query: str, documents: List[Document], top_k: Optional[int] = None)

Arguments:

  • query: Query string.
  • documents: List of Documents to process.
  • top_k: The maximum number of Documents to return.

DiversityRanker.run_batch

def run_batch(queries: List[str],
              documents: Union[List[Document], List[List[Document]]],
              top_k: Optional[int] = None,
              batch_size: Optional[int] = None)

Arguments:

  • queries: List of query strings.
  • documents: List of lists of Documents to process.
  • top_k: The maximum number of Documents to return.
  • batch_size: Number of Documents to process at a time.

DiversityRanker.timing

def timing(fn, attr_name)

Wrapper method used to time functions.

DiversityRanker.eval

def eval(label_index: str = "label",
         doc_index: str = "eval_document",
         label_origin: str = "gold_label",
         top_k: int = 10,
         open_domain: bool = False,
         return_preds: bool = False) -> dict

Performs evaluation of the Ranker.

The Ranker is evaluated in the same way as a Retriever: based on whether it finds the correct document for a given query string, and on the position of the correct document in the ranking.

Returns a dict containing the following metrics:

- "recall": Proportion of questions for which correct document is among retrieved documents
- "mrr": Mean of reciprocal rank. Rewards retrievers that give relevant documents a higher rank.
  Only considers the highest ranked relevant document.
- "map": Mean of average precision for each question. Rewards retrievers that give relevant
  documents a higher rank. Considers all retrieved relevant documents. If ``open_domain=True``,
  average precision is normalized by the number of retrieved relevant documents per query.
  If ``open_domain=False``, average precision is normalized by the number of all relevant documents
  per query.

Arguments:

  • label_index: Index/Table in DocumentStore where labeled questions are stored
  • doc_index: Index/Table in DocumentStore where documents that are used for evaluation are stored
  • top_k: How many documents to return per query
  • open_domain: If True, retrieval will be evaluated by checking if the answer string to a question is contained in the retrieved docs (common approach in open-domain QA). If False, retrieval uses a stricter evaluation that checks if the retrieved document ids are within ids explicitly stated in the labels.
  • return_preds: Whether to add predictions in the returned dictionary. If True, the returned dictionary contains the keys "predictions" and "metrics".

Module lost_in_the_middle

LostInTheMiddleRanker

class LostInTheMiddleRanker(BaseRanker)

The LostInTheMiddleRanker implements a ranker that reorders documents based on the "lost in the middle" order described in the paper "Lost in the Middle: How Language Models Use Long Contexts" by Liu et al. It lays out paragraphs in the LLM context so that the most relevant paragraphs are at the beginning or end of the input context, while the least relevant information sits in the middle of the context.

See https://arxiv.org/abs/2307.03172 for more details.

LostInTheMiddleRanker.__init__

def __init__(word_count_threshold: Optional[int] = None,
             top_k: Optional[int] = None)

Creates an instance of LostInTheMiddleRanker.

If 'word_count_threshold' is specified, this ranker includes all documents up until the point where adding another document would exceed the 'word_count_threshold'. The last document that causes the threshold to be breached will be included in the resulting list of documents, but all subsequent documents will be discarded.

Arguments:

  • word_count_threshold: The maximum total number of words across all documents selected by the ranker.
  • top_k: The maximum number of documents to return.
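
A brief usage sketch, assuming the incoming documents are already sorted by descending relevance (for example, by a preceding ranker); the contents are illustrative:

from haystack import Document
from haystack.nodes import LostInTheMiddleRanker

docs = [Document(content=f"passage {i}") for i in range(1, 6)]  # assumed pre-sorted by relevance

ranker = LostInTheMiddleRanker(word_count_threshold=1024, top_k=5)
reordered = ranker.predict(query="", documents=docs)  # the query is ignored by this ranker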

LostInTheMiddleRanker.reorder_documents

def reorder_documents(documents: List[Document]) -> List[Document]

Ranks documents based on the "lost in the middle" order. Assumes that all documents are ordered by relevance.

Arguments:

  • documents: List of Documents to merge.

Returns:

Documents in the "lost in the middle" order.
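
The reordering itself can be pictured with a small standalone sketch (not the library's internal code): given items sorted by descending relevance, the most relevant ones end up at the edges and the least relevant in the middle.

def lost_in_the_middle_order_sketch(docs_by_relevance: list) -> list:
    """Most relevant items go to the edges, least relevant to the middle."""
    front, back = [], []
    for i, doc in enumerate(docs_by_relevance):
        if i % 2 == 0:
            front.append(doc)   # 1st, 3rd, 5th, ... most relevant fill the front half
        else:
            back.append(doc)    # 2nd, 4th, ... fill the back half
    return front + back[::-1]

# [1, 2, 3, 4, 5] (sorted by relevance) -> [1, 3, 5, 4, 2]
print(lost_in_the_middle_order_sketch([1, 2, 3, 4, 5]))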

LostInTheMiddleRanker.predict

def predict(query: str,
            documents: List[Document],
            top_k: Optional[int] = None) -> List[Document]

Reranks documents based on the "lost in the middle" order.

Arguments:

  • query: The query to reorder documents for (ignored).
  • documents: List of Documents to reorder.
  • top_k: The number of documents to return.

Returns:

The reordered documents.

LostInTheMiddleRanker.predict_batch

def predict_batch(
    queries: List[str],
    documents: Union[List[Document], List[List[Document]]],
    top_k: Optional[int] = None,
    batch_size: Optional[int] = None
) -> Union[List[Document], List[List[Document]]]

Reranks a batch of documents based on the "lost in the middle" order.

Arguments:

  • queries: The queries to reorder documents for (ignored).
  • documents: List of Documents to reorder.
  • top_k: The number of documents to return.
  • batch_size: The number of queries to process in one batch (ignored).

Returns:

The reordered documents.

LostInTheMiddleRanker.run

def run(query: str, documents: List[Document], top_k: Optional[int] = None)

Arguments:

  • query: Query string.
  • documents: List of Documents to process.
  • top_k: The maximum number of Documents to return.

LostInTheMiddleRanker.run_batch

def run_batch(queries: List[str],
              documents: Union[List[Document], List[List[Document]]],
              top_k: Optional[int] = None,
              batch_size: Optional[int] = None)

Arguments:

  • queries: List of query strings.
  • documents: List of lists of Documents to process.
  • top_k: The maximum number of Documents to return.
  • batch_size: Number of Documents to process at a time.

LostInTheMiddleRanker.timing

def timing(fn, attr_name)

Wrapper method used to time functions.

LostInTheMiddleRanker.eval

def eval(label_index: str = "label",
         doc_index: str = "eval_document",
         label_origin: str = "gold_label",
         top_k: int = 10,
         open_domain: bool = False,
         return_preds: bool = False) -> dict

Performs evaluation of the Ranker.

The Ranker is evaluated in the same way as a Retriever: based on whether it finds the correct document for a given query string, and on the position of the correct document in the ranking.

Returns a dict containing the following metrics:

- "recall": Proportion of questions for which correct document is among retrieved documents
- "mrr": Mean of reciprocal rank. Rewards retrievers that give relevant documents a higher rank.
  Only considers the highest ranked relevant document.
- "map": Mean of average precision for each question. Rewards retrievers that give relevant
  documents a higher rank. Considers all retrieved relevant documents. If ``open_domain=True``,
  average precision is normalized by the number of retrieved relevant documents per query.
  If ``open_domain=False``, average precision is normalized by the number of all relevant documents
  per query.

Arguments:

  • label_index: Index/Table in DocumentStore where labeled questions are stored
  • doc_index: Index/Table in DocumentStore where documents that are used for evaluation are stored
  • top_k: How many documents to return per query
  • open_domain: If True, retrieval will be evaluated by checking if the answer string to a question is contained in the retrieved docs (common approach in open-domain QA). If False, retrieval uses a stricter evaluation that checks if the retrieved document ids are within ids explicitly stated in the labels.
  • return_preds: Whether to add predictions in the returned dictionary. If True, the returned dictionary contains the keys "predictions" and "metrics".