Module haystack_integrations.components.evaluators.uptrain.evaluator

UpTrainEvaluator

@component
class UpTrainEvaluator()

A component that uses the UpTrain framework to evaluate inputs against a specific metric. Supported metrics are defined by UpTrainMetric.

Usage example:

from haystack_integrations.components.evaluators.uptrain import UpTrainEvaluator, UpTrainMetric
from haystack.utils import Secret

evaluator = UpTrainEvaluator(
    metric=UpTrainMetric.FACTUAL_ACCURACY,
    api="openai",
    api_key=Secret.from_env_var("OPENAI_API_KEY"),
)
output = evaluator.run(
    questions=["Which is the most popular global sport?"],
    contexts=[
        [
            "Football is undoubtedly the world's most popular sport with "
            "major events like the FIFA World Cup and sports personalities "
            "like Ronaldo and Messi, drawing a followership of more than 4 "
            "billion people."
        ]
    ],
    responses=["Football is the most popular sport with around 4 billion followers worldwide"],
)
print(output["results"])

UpTrainEvaluator.__init__

def __init__(metric: Union[str, UpTrainMetric],
             metric_params: Optional[Dict[str, Any]] = None,
             *,
             api: str = "openai",
             api_key: Secret = Secret.from_env_var("OPENAI_API_KEY"),
             api_params: Optional[Dict[str, Any]] = None)

Construct a new UpTrain evaluator.

Arguments:

metric: The metric to use for evaluation.
metric_params: Parameters to pass to the metric's constructor. Refer to the UpTrainMetric class for more details on required parameters.
api: The API to use for evaluation. Supported APIs: openai, uptrain.
api_key: The API key to use.
api_params: Additional parameters to pass to the API client. Required parameters for the UpTrain API: project_name.
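The relationship between api and api_params can be checked before constructing the evaluator. The sketch below is a standalone illustration built only from the argument descriptions above: the check_api_params helper, the REQUIRED_API_PARAMS table, and the project name are hypothetical and are not part of the library.

```python
from typing import Any, Dict, Optional, Set

# Required api_params per supported API, transcribed from the argument list above.
# This table is a local convenience for the example, not a library constant.
REQUIRED_API_PARAMS: Dict[str, Set[str]] = {
    "openai": set(),
    "uptrain": {"project_name"},
}


def check_api_params(api: str, api_params: Optional[Dict[str, Any]] = None) -> None:
    """Raise ValueError for an unsupported API or a missing required parameter."""
    if api not in REQUIRED_API_PARAMS:
        supported = ", ".join(sorted(REQUIRED_API_PARAMS))
        raise ValueError(f"Unsupported API '{api}'. Supported APIs: {supported}")
    missing = REQUIRED_API_PARAMS[api] - set(api_params or {})
    if missing:
        raise ValueError(f"API '{api}' requires api_params: {sorted(missing)}")


# The OpenAI API needs no extra parameters; the UpTrain API needs project_name.
check_api_params("openai")
check_api_params("uptrain", {"project_name": "my-eval-project"})  # hypothetical name
```

A dictionary that passes this check could then be given as api_params to the constructor alongside api="uptrain".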

UpTrainEvaluator.run

@component.output_types(results=List[List[Dict[str, Any]]])
def run(**inputs) -> Dict[str, Any]

Run the UpTrain evaluator on the provided inputs.

Arguments:

inputs: The inputs to evaluate. These are determined by the metric being calculated. See UpTrainMetric for more information.

Returns:

A dictionary with a single results entry that contains a nested list of metric results. Each input can have one or more results, depending on the metric. Each result is a dictionary containing the following keys and values:

name - The name of the metric.
score - The score of the metric.
explanation - An optional explanation of the score.
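To make the nested return shape concrete, the sketch below walks a hand-written results value instead of calling the evaluator. The sample names, scores, and explanations, as well as the summarize helper, are made up for illustration; real values depend on the metric and the API in use.

```python
from typing import Any, Dict, List

# Hand-written sample shaped like the documented return value: one inner list
# per input, one dictionary per metric result. All values are illustrative.
sample_output: Dict[str, Any] = {
    "results": [
        [{"name": "factual_accuracy", "score": 1.0, "explanation": "Supported by the context."}],
        [{"name": "factual_accuracy", "score": 0.5, "explanation": None}],
    ]
}


def summarize(output: Dict[str, Any]) -> List[str]:
    """Flatten the nested results into one readable line per metric result."""
    lines: List[str] = []
    for index, per_input in enumerate(output["results"]):
        for result in per_input:
            line = f"input {index}: {result['name']} = {result['score']}"
            # The explanation is optional, so skip it when missing or empty.
            if result.get("explanation"):
                line += f" ({result['explanation']})"
            lines.append(line)
    return lines


for line in summarize(sample_output):
    print(line)
```

The outer loop runs once per evaluated input and the inner loop once per result for that input, which matches the "one or more results" wording above.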

UpTrainEvaluator.to_dict

def to_dict() -> Dict[str, Any]

Serializes the component to a dictionary.

Raises:

DeserializationError: If the component cannot be serialized.

Returns:

Dictionary with serialized data.

UpTrainEvaluator.from_dict

@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "UpTrainEvaluator"

Deserializes the component from a dictionary.

Arguments:

data: Dictionary to deserialize from.

Returns:

Deserialized component.

Module haystack_integrations.components.evaluators.uptrain.metrics

UpTrainMetric

class UpTrainMetric(Enum)

Metrics supported by UpTrain.

CONTEXT_RELEVANCE

Context relevance.
Inputs - questions: List[str], contexts: List[List[str]]

FACTUAL_ACCURACY

Factual accuracy.
Inputs - questions: List[str], contexts: List[List[str]], responses: List[str]

RESPONSE_RELEVANCE

Response relevance.
Inputs - questions: List[str], responses: List[str]

RESPONSE_COMPLETENESS

Response completeness.
Inputs - questions: List[str], responses: List[str]

RESPONSE_COMPLETENESS_WRT_CONTEXT

Response completeness with respect to context.
Inputs - questions: List[str], contexts: List[List[str]], responses: List[str]

RESPONSE_CONSISTENCY

Response consistency.
Inputs - questions: List[str], contexts: List[List[str]], responses: List[str]

RESPONSE_CONCISENESS

Response conciseness.
Inputs - questions: List[str], responses: List[str]

CRITIQUE_LANGUAGE

Language critique.
Inputs - responses: List[str]

CRITIQUE_TONE

Tone critique.
Inputs - responses: List[str]
Parameters - llm_persona: str

GUIDELINE_ADHERENCE

Guideline adherence.
Inputs - questions: List[str], responses: List[str]
Parameters - guideline: str, guideline_name: str, response_schema: Optional[str]

RESPONSE_MATCHING

Response matching.
Inputs - responses: List[str], ground_truths: List[str]
Parameters - method: str
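The input requirements listed above can be collected into a lookup table and checked before calling run. This is a minimal sketch, not library code: REQUIRED_INPUTS and check_inputs are hypothetical names, and the equal-length check is an assumption based on each input list holding one entry per evaluated example.

```python
from typing import Any, Dict, List

# Required inputs per metric, transcribed from the list above.
# This table is a local convenience for the example, not part of the library.
REQUIRED_INPUTS: Dict[str, List[str]] = {
    "CONTEXT_RELEVANCE": ["questions", "contexts"],
    "FACTUAL_ACCURACY": ["questions", "contexts", "responses"],
    "RESPONSE_RELEVANCE": ["questions", "responses"],
    "RESPONSE_COMPLETENESS": ["questions", "responses"],
    "RESPONSE_COMPLETENESS_WRT_CONTEXT": ["questions", "contexts", "responses"],
    "RESPONSE_CONSISTENCY": ["questions", "contexts", "responses"],
    "RESPONSE_CONCISENESS": ["questions", "responses"],
    "CRITIQUE_LANGUAGE": ["responses"],
    "CRITIQUE_TONE": ["responses"],
    "GUIDELINE_ADHERENCE": ["questions", "responses"],
    "RESPONSE_MATCHING": ["responses", "ground_truths"],
}


def check_inputs(metric_name: str, inputs: Dict[str, List[Any]]) -> None:
    """Raise ValueError if inputs are missing, unexpected, or of unequal length."""
    expected = REQUIRED_INPUTS[metric_name]
    missing = [name for name in expected if name not in inputs]
    extra = [name for name in inputs if name not in expected]
    if missing or extra:
        raise ValueError(f"{metric_name}: missing={missing}, unexpected={extra}")
    # Assumption: every input list has one entry per evaluated example.
    if len({len(values) for values in inputs.values()}) > 1:
        raise ValueError(f"{metric_name}: all inputs must have the same length")


check_inputs(
    "FACTUAL_ACCURACY",
    {"questions": ["q1"], "contexts": [["c1", "c2"]], "responses": ["r1"]},
)
```

Metrics that also list Parameters (CRITIQUE_TONE, GUIDELINE_ADHERENCE, RESPONSE_MATCHING) take those through metric_params on the constructor rather than through run.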

UpTrainMetric.from_str

@classmethod
def from_str(cls, string: str) -> "UpTrainMetric"

Create a metric type from a string.

Arguments:

string: The string to convert.

Returns:

The metric.
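One common way to implement a string-to-enum conversion like this is a lookup over member values. The sketch below uses a hypothetical stand-in enum, DemoMetric, with two members; the lowercase string values and the error message are assumptions for the example and may differ from what UpTrainMetric actually uses.

```python
from enum import Enum


class DemoMetric(Enum):
    """Hypothetical stand-in for UpTrainMetric with only two members."""

    # Assumption: each member's value is the lowercase form of its name.
    CONTEXT_RELEVANCE = "context_relevance"
    FACTUAL_ACCURACY = "factual_accuracy"

    @classmethod
    def from_str(cls, string: str) -> "DemoMetric":
        """Return the member whose value equals the given string."""
        by_value = {member.value: member for member in cls}
        try:
            return by_value[string]
        except KeyError:
            # List the valid values so a typo is easy to spot.
            valid = ", ".join(sorted(by_value))
            raise ValueError(f"Unknown metric '{string}'. Valid values: {valid}") from None


metric = DemoMetric.from_str("factual_accuracy")
print(metric)
```

A conversion of this kind is what allows the metric argument of UpTrainEvaluator to be typed as Union[str, UpTrainMetric].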
