API Reference

Ragas integration for Haystack

Module haystack_integrations.components.evaluators.ragas.evaluator

RagasEvaluator

A component that uses the Ragas framework to evaluate inputs against specified Ragas metrics.

Usage example:

from haystack_integrations.components.evaluators.ragas import RagasEvaluator
from ragas.metrics import ContextPrecision
from ragas.llms import LangchainLLMWrapper
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")
evaluator_llm = LangchainLLMWrapper(llm)

evaluator = RagasEvaluator(
    ragas_metrics=[ContextPrecision()],
    evaluator_llm=evaluator_llm
)
output = evaluator.run(
    query="Which is the most popular global sport?",
    documents=[
        "Football is undoubtedly the world's most popular sport with"
        " major events like the FIFA World Cup and sports personalities"
        " like Ronaldo and Messi, drawing a followership of more than 4"
        " billion people."
    ],
    reference="Football is the most popular sport with around 4 billion"
              " followers worldwide",
)

output['result']

RagasEvaluator.__init__

def __init__(ragas_metrics: List[Metric],
             evaluator_llm: Optional[Union[BaseRagasLLM, LangchainLLM]] = None,
             evaluator_embedding: Optional[Union[BaseRagasEmbeddings,
                                                 LangchainEmbeddings]] = None)

Constructs a new Ragas evaluator.

Arguments:

  • ragas_metrics: A list of evaluation metrics from the Ragas library.
  • evaluator_llm: A language model used by metrics that require LLMs for evaluation.
  • evaluator_embedding: An embedding model used by metrics that require embeddings for evaluation, as shown in the sketch below.
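
For example, a metric such as AnswerRelevancy needs both an LLM and an embedding model, while ContextPrecision only needs an LLM. A minimal configuration sketch, assuming OpenAI models wrapped for Ragas via the LangChain wrappers (the metric and model choices here are illustrative):

from haystack_integrations.components.evaluators.ragas import RagasEvaluator
from ragas.metrics import AnswerRelevancy, ContextPrecision
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Wrap the LangChain models so Ragas metrics can call them.
evaluator_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o-mini"))
evaluator_embedding = LangchainEmbeddingsWrapper(OpenAIEmbeddings())

evaluator = RagasEvaluator(
    ragas_metrics=[ContextPrecision(), AnswerRelevancy()],
    evaluator_llm=evaluator_llm,
    evaluator_embedding=evaluator_embedding,
)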

RagasEvaluator.run

@component.output_types(result=EvaluationResult)
def run(query: Optional[str] = None,
        response: Optional[Union[List[ChatMessage], str]] = None,
        documents: Optional[List[Union[Document, str]]] = None,
        reference_contexts: Optional[List[str]] = None,
        multi_responses: Optional[List[str]] = None,
        reference: Optional[str] = None,
        rubrics: Optional[Dict[str, str]] = None) -> Dict[str, Any]

Evaluates the provided inputs against the configured Ragas metrics and returns the evaluation result. A sample call is sketched after the argument list below.

Arguments:

  • query: The input query from the user.
  • response: The response generated for the query, either as a list of ChatMessage objects (typically from a language model or agent) or as a plain string.
  • documents: A list of Haystack Document objects or strings that were retrieved for the query.
  • reference_contexts: A list of reference contexts that should have been retrieved for the query.
  • multi_responses: A list of multiple responses generated for the query.
  • reference: A string reference answer for the query.
  • rubrics: A dictionary of evaluation rubrics, where keys represent the score and values describe the corresponding evaluation criteria.
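
For instance, response accepts the chat messages produced by a Haystack generator, and documents accepts the retrieved Document objects directly. A minimal sketch of such a call, assuming the evaluator was initialized with metrics that consume these fields (for example Faithfulness, which uses the query, documents, and response):

from haystack import Document
from haystack.dataclasses import ChatMessage

output = evaluator.run(
    query="Which is the most popular global sport?",
    documents=[
        Document(content="Football is the world's most popular sport, "
                         "followed by roughly 4 billion people.")
    ],
    response=[
        ChatMessage.from_assistant("Football is the most popular sport in the world.")
    ],
    reference="Football is the most popular sport with around 4 billion"
              " followers worldwide.",
)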

Returns:

A dictionary with a single result key containing the Ragas EvaluationResult.
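
The returned EvaluationResult can be inspected directly or converted to a DataFrame of per-sample scores; a short sketch, assuming the run call above:

result = output["result"]

# Aggregate scores per metric, then one row per evaluated sample.
print(result)
print(result.to_pandas())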