Evaluators
Evaluator | Description |
---|---|
AnswerExactMatchEvaluator | Evaluates answers predicted by Haystack pipelines using ground truth labels. It checks character by character whether a predicted answer exactly matches the ground truth answer. |
ContextRelevanceEvaluator | Uses an LLM to evaluate whether the provided contexts are relevant to the question. Does not require ground truth labels. |
DeepEvalEvaluator | Uses the DeepEval framework to evaluate generative pipelines. |
DocumentMAPEvaluator | Evaluates documents retrieved by Haystack pipelines using ground truth labels. It checks to what extent the list of retrieved documents contains only relevant documents, as specified in the ground truth labels, rather than a mix of relevant and non-relevant documents. This metric is called mean average precision (MAP). |
DocumentMRREvaluator | Evaluates documents retrieved by Haystack pipelines using ground truth labels. It checks at what rank ground truth documents appear in the list of retrieved documents. This metric is called mean reciprocal rank (MRR). |
DocumentNDCGEvaluator | Evaluates documents retrieved by Haystack pipelines using ground truth labels. It checks at what rank ground truth documents appear in the list of retrieved documents, discounting documents that appear lower in the list and normalizing against the ideal ordering. This metric is called normalized discounted cumulative gain (NDCG). |
DocumentRecallEvaluator | Evaluates documents retrieved by Haystack pipelines using ground truth labels. It checks how many of the ground truth documents were retrieved. |
FaithfulnessEvaluator | Uses an LLM to evaluate whether a generated answer can be inferred from the provided contexts. Does not require ground truth labels. |
LLMEvaluator | Uses an LLM to evaluate inputs based on a prompt containing user-defined instructions and examples. |
RagasEvaluator | Uses the Ragas framework to evaluate a retrieval-augmented generation pipeline. |
SASEvaluator | Evaluates answers predicted by Haystack pipelines using ground truth labels. It checks the semantic similarity of a predicted answer and the ground truth answer using a fine-tuned language model. |
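
To make the ground-truth-based metrics in the table more concrete, the following is a minimal plain-Python sketch of the quantities they describe for a single query. It does not use or reproduce the Haystack components: the function names, the use of document IDs as strings, and the binary-relevance assumption are all illustrative choices, and the average precision here divides by the number of relevant documents that were retrieved (other definitions divide by the total number of relevant documents).

```python
import math


def exact_match(predicted: str, truth: str) -> float:
    # 1.0 only if the two strings are identical character by character.
    return 1.0 if predicted == truth else 0.0


def recall(retrieved: list[str], relevant: list[str]) -> float:
    # Fraction of ground truth documents that appear in the retrieved list.
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    return len(relevant_set & set(retrieved)) / len(relevant_set)


def reciprocal_rank(retrieved: list[str], relevant: list[str]) -> float:
    # 1 / rank of the first ground truth document, or 0.0 if none was retrieved.
    relevant_set = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant_set:
            return 1.0 / rank
    return 0.0


def average_precision(retrieved: list[str], relevant: list[str]) -> float:
    # Mean of precision@rank taken at each rank where a relevant document appears.
    # Assumption: normalized by the number of relevant documents retrieved.
    relevant_set = set(relevant)
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant_set:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def ndcg(retrieved: list[str], relevant: list[str]) -> float:
    # Binary-relevance NDCG: discounted gain of the actual ranking divided by
    # the discounted gain of an ideal ranking (all relevant documents first).
    relevant_set = set(relevant)
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(retrieved, start=1)
        if doc_id in relevant_set
    )
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, len(relevant_set) + 1))
    return dcg / ideal if ideal else 0.0


# Hypothetical example: two ground truth documents, found at ranks 2 and 4.
retrieved_ids = ["d3", "d1", "d7", "d2"]
relevant_ids = ["d1", "d2"]

print(recall(retrieved_ids, relevant_ids))             # 1.0
print(reciprocal_rank(retrieved_ids, relevant_ids))    # 0.5
print(average_precision(retrieved_ids, relevant_ids))  # 0.5
print(ndcg(retrieved_ids, relevant_ids))
```

The "mean" in MAP and MRR comes from averaging these per-query values over all queries in the evaluation set. Recall alone ignores ordering, which is why the example scores 1.0 on recall but only 0.5 on the rank-aware metrics.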