Evaluators

Evaluator	Description
AnswerExactMatchEvaluator	Evaluates answers predicted by Haystack pipelines using ground truth labels. It checks character by character whether a predicted answer exactly matches the ground truth answer.
ContextRelevanceEvaluator	Uses an LLM to evaluate whether the provided contexts are relevant to a question. Does not require ground truth labels.
DeepEvalEvaluator	Uses the DeepEval framework to evaluate generative pipelines.
DocumentMAPEvaluator	Evaluates documents retrieved by Haystack pipelines using ground truth labels. It checks to what extent the list of retrieved documents contains only relevant documents, as specified in the ground truth labels, rather than also non-relevant ones. This metric is called mean average precision (MAP).
DocumentMRREvaluator	Evaluates documents retrieved by Haystack pipelines using ground truth labels. It checks at what rank the first ground truth document appears in the list of retrieved documents. This metric is called mean reciprocal rank (MRR).
DocumentNDCGEvaluator	Evaluates documents retrieved by Haystack pipelines using ground truth labels. It checks at what rank ground truth documents appear in the list of retrieved documents, giving less credit to relevant documents that appear lower in the list. This metric is called normalized discounted cumulative gain (NDCG).
DocumentRecallEvaluator	Evaluates documents retrieved by Haystack pipelines using ground truth labels. It checks how many of the ground truth documents were retrieved.
FaithfulnessEvaluator	Uses an LLM to evaluate whether a generated answer can be inferred from the provided contexts. Does not require ground truth labels.
LLMEvaluator	Uses an LLM to evaluate inputs based on a prompt containing user-defined instructions and examples.
RagasEvaluator	Uses the Ragas framework to evaluate a retrieval-augmented generative pipeline.
SASEvaluator	Evaluates answers predicted by Haystack pipelines using ground truth labels. It checks the semantic similarity of a predicted answer and the ground truth answer using a fine-tuned language model.
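
To make the label-based metrics in the table concrete, here is a minimal plain-Python sketch of exact match, recall, reciprocal rank, and average precision for a single query. It is an illustration only, not Haystack's implementation: the function names and the document IDs (`d1`, `d2`, ...) are hypothetical, documents are compared by ID, and average precision is normalized by the total number of ground truth documents, which is the textbook convention and may differ from what a given library does.

```python
from typing import List


def exact_match(predicted: str, truth: str) -> float:
    """Return 1.0 if both strings are identical character by character."""
    return 1.0 if predicted == truth else 0.0


def recall(retrieved: List[str], relevant: List[str]) -> float:
    """Fraction of ground truth documents found among the retrieved ones."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    return len(relevant_set & set(retrieved)) / len(relevant_set)


def reciprocal_rank(retrieved: List[str], relevant: List[str]) -> float:
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    relevant_set = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant_set:
            return 1.0 / rank
    return 0.0


def average_precision(retrieved: List[str], relevant: List[str]) -> float:
    """Mean of precision@k over the ranks k that hold a relevant document.

    Assumes `retrieved` contains no duplicate IDs.
    """
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant_set:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant_set)


# Hypothetical single-query example: relevant documents sit at ranks 2 and 4.
retrieved_ids = ["d3", "d1", "d7", "d2"]
relevant_ids = ["d1", "d2"]

print(exact_match("Berlin", "berlin"))                 # 0.0 (case differs)
print(recall(retrieved_ids, relevant_ids))             # 1.0
print(reciprocal_rank(retrieved_ids, relevant_ids))    # 0.5
print(average_precision(retrieved_ids, relevant_ids))  # 0.5
```

Averaging the per-query reciprocal rank or average precision over all queries gives MRR and MAP respectively.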

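NDCG differs from the other ranking metrics in that every relevant document contributes, with a weight that shrinks as its rank grows. The sketch below assumes binary relevance (a document is either in the ground truth or not), a log2 discount, and an ideal ranking that places all ground truth documents first. These are common conventions, not a statement of how any particular evaluator is implemented, and the function names and document IDs are hypothetical.

```python
import math
from typing import List


def dcg(gains: List[float]) -> float:
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(gain / math.log2(rank + 1) for rank, gain in enumerate(gains, start=1))


def ndcg(retrieved: List[str], relevant: List[str]) -> float:
    """DCG of the actual ranking divided by the DCG of an ideal ranking."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    # Binary gain: 1.0 for a ground truth document, 0.0 otherwise.
    gains = [1.0 if doc_id in relevant_set else 0.0 for doc_id in retrieved]
    # Ideal ranking (assumption): all ground truth documents come first.
    ideal_gains = [1.0] * len(relevant_set)
    return dcg(gains) / dcg(ideal_gains)


# Relevant documents at ranks 2 and 4 score lower than a perfect ranking.
print(ndcg(["d3", "d1", "d7", "d2"], ["d1", "d2"]))  # about 0.651
print(ndcg(["d1", "d2", "d3"], ["d1", "d2"]))        # 1.0
```
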