
Abstract class for Rankers.

Module base

BaseRanker

class BaseRanker(BaseComponent)

BaseRanker.run

def run(query: str, documents: List[Document], top_k: Optional[int] = None)

Arguments:

  • query: Query string.
  • documents: List of Documents to process.
  • top_k: The maximum number of Documents to return.
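Below is a minimal usage sketch. Because BaseRanker is abstract, the sketch assumes the SentenceTransformersRanker subclass; the model name and example Documents are illustrative.

from haystack.nodes import SentenceTransformersRanker
from haystack.schema import Document

# Illustrative concrete subclass; any Ranker is called the same way.
ranker = SentenceTransformersRanker(model_name_or_path="cross-encoder/ms-marco-MiniLM-L-12-v2")

docs = [
    Document(content="Paris is the capital of France."),
    Document(content="Berlin is the capital of Germany."),
]

# Like other Haystack nodes, run() returns a tuple of (output dict, stream name).
output, _ = ranker.run(query="What is the capital of Germany?", documents=docs, top_k=1)
print(output["documents"][0].content)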

BaseRanker.run_batch

def run_batch(queries: List[str],
              documents: Union[List[Document], List[List[Document]]],
              top_k: Optional[int] = None,
              batch_size: Optional[int] = None)

Arguments:

  • queries: List of query strings.
  • documents: A single list of Documents to process, or a list of lists of Documents (one list per query).
  • top_k: The maximum number of Documents to return per query.
  • batch_size: Number of Documents to process at a time.
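A sketch of batch ranking, reusing the hypothetical ranker and docs from the example above. Passing a single list of Documents ranks that list against every query; a list of lists pairs each query with its own Documents.

queries = ["What is the capital of Germany?", "What is the capital of France?"]

# One shared Document list for all queries; output["documents"] is assumed to
# hold one ranked list per query, in query order.
output, _ = ranker.run_batch(queries=queries, documents=docs, top_k=1)
for query, ranked in zip(queries, output["documents"]):
    print(query, "->", ranked[0].content)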

BaseRanker.timing

def timing(fn, attr_name)

Wrapper method used to time functions.
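An illustrative sketch of what such a wrapper might look like, not the library's actual code: it accumulates the wrapped function's runtime on the component under attr_name.

import time
from functools import wraps

# Sketch only: wrap fn so that its cumulative runtime is stored on `obj`
# under attr_name (for example, a counter attribute such as "query_time").
def timing(obj, fn, attr_name):
    @wraps(fn)
    def wrapper(*args, **kwargs):
        if attr_name not in obj.__dict__:
            obj.__dict__[attr_name] = 0.0
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        obj.__dict__[attr_name] += time.perf_counter() - start
        return result
    return wrapper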

BaseRanker.eval

def eval(label_index: str = "label",
         doc_index: str = "eval_document",
         label_origin: str = "gold_label",
         top_k: int = 10,
         open_domain: bool = False,
         return_preds: bool = False) -> dict

Performs evaluation of the Ranker.

The Ranker is evaluated in the same way as a Retriever: based on whether it finds the correct document for a given query, and on the position of the correct document within the ranking.

Returns a dict containing the following metrics:

- "recall": Proportion of questions for which correct document is among retrieved documents
- "mrr": Mean of reciprocal rank. Rewards retrievers that give relevant documents a higher rank.
  Only considers the highest ranked relevant document.
- "map": Mean of average precision for each question. Rewards retrievers that give relevant
  documents a higher rank. Considers all retrieved relevant documents. If ``open_domain=True``,
  average precision is normalized by the number of retrieved relevant documents per query.
  If ``open_domain=False``, average precision is normalized by the number of all relevant documents
  per query.
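To make the two rank-based metrics concrete, here is the arithmetic for a single hypothetical query whose relevant documents sit at ranks 2 and 4:

# Hypothetical example data: which ranked positions hold a relevant document.
relevant = [False, True, False, True]

# Reciprocal rank: 1 / rank of the first relevant document -> 1/2 = 0.5.
rr = next((1.0 / (i + 1) for i, rel in enumerate(relevant) if rel), 0.0)

# Average precision, normalized by the number of retrieved relevant documents
# (the open_domain=True variant): mean of precision@k over the relevant ranks
# -> (1/2 + 2/4) / 2 = 0.5.
precisions = [sum(relevant[: i + 1]) / (i + 1) for i, rel in enumerate(relevant) if rel]
ap = sum(precisions) / len(precisions) if precisions else 0.0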

Arguments:

  • label_index: Index/Table in the DocumentStore where the labeled questions are stored.
  • doc_index: Index/Table in the DocumentStore where the documents used for evaluation are stored.
  • label_origin: Field name where the gold labels are stored.
  • top_k: How many documents to return per query.
  • open_domain: If True, retrieval is evaluated by checking whether the answer string to a question is contained in the retrieved documents (a common approach in open-domain QA). If False, a stricter evaluation is used that checks whether the retrieved document ids are among the ids explicitly stated in the labels.
  • return_preds: Whether to add predictions to the returned dictionary. If True, the returned dictionary contains the keys "predictions" and "metrics".
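A minimal sketch of an eval call, reusing the hypothetical ranker from above and assuming labels and evaluation documents were previously written to the DocumentStore under the default index names:

metrics = ranker.eval(label_index="label", doc_index="eval_document", top_k=10, open_domain=True)
print(metrics["recall"], metrics["mrr"], metrics["map"])

# With return_preds=True, the returned dict instead has "metrics" and "predictions" keys.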