API Reference

Validators validate LLM outputs

Module json_schema

is_valid_json

def is_valid_json(s: str) -> bool

Check whether the provided string is valid JSON.

Arguments:

  • s: The string to be checked.

Returns:

True if the string is valid JSON; otherwise, False.
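
A check of this kind can be sketched with the standard library alone. The snippet below is a minimal illustration, not the module's actual source: the name is_valid_json_sketch is hypothetical, and it assumes that "valid JSON" means "parses with json.loads".

```python
import json


def is_valid_json_sketch(s: str) -> bool:
    """Return True if `s` parses as JSON, False otherwise."""
    try:
        json.loads(s)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


print(is_valid_json_sketch('{"name": "John", "age": 30}'))
print(is_valid_json_sketch("{name: John}"))
```

Under this assumption, bare scalars such as "30" also count as valid JSON, because json.loads accepts any JSON value, not only objects.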

JsonSchemaValidator

Validates JSON content of ChatMessage against a specified JSON Schema.

If JSON content of a message conforms to the provided schema, the message is passed along the "validated" output. If the JSON content does not conform to the schema, the message is passed along the "validation_error" output. In the latter case, the error message is constructed using the provided error_template or a default template. These error ChatMessages can be used by LLMs in Haystack 2.x recovery loops.
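
The routing described above can be pictured with a small standard-library sketch. Everything in it is an assumption made for illustration: check_types is a toy stand-in for a real JSON Schema validator (it only checks top-level property types), route and ERROR_TEMPLATE are hypothetical names, messages are plain strings rather than ChatMessage objects, and the content is assumed to parse as JSON.

```python
import json
from typing import Any, Dict, List, Optional

# Hypothetical template; the {error} placeholder is illustrative only.
ERROR_TEMPLATE = "The JSON does not match the schema: {error}. Reply with corrected JSON only."

# The few JSON Schema type names this toy checker understands.
TYPE_MAP = {"string": str, "integer": int, "object": dict}


def check_types(instance: Any, schema: Dict[str, Any]) -> Optional[str]:
    """Toy stand-in for a JSON Schema validator: checks top-level property types only."""
    if not isinstance(instance, dict):
        return "expected a JSON object"
    for key, rule in schema.get("properties", {}).items():
        if key in instance and not isinstance(instance[key], TYPE_MAP[rule["type"]]):
            return f"'{key}' should be of type {rule['type']}"
    return None


def route(content: str, schema: Dict[str, Any]) -> Dict[str, List[str]]:
    """Send conforming content to "validated", otherwise an error text to "validation_error"."""
    error = check_types(json.loads(content), schema)
    if error is None:
        return {"validated": [content]}
    return {"validation_error": [ERROR_TEMPLATE.format(error=error)]}


schema = {"type": "object",
          "properties": {"name": {"type": "string"}, "age": {"type": "integer"}}}
print(route('{"name": "John", "age": 30}', schema))
print(route('{"name": "John", "age": "thirty"}', schema))
```

Only one of the two output keys is populated per call, which is what lets a pipeline connect the error branch back to the generator.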

Usage example:

from typing import List

from haystack import Pipeline
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.others import Multiplexer
from haystack.components.validators import JsonSchemaValidator
from haystack import component
from haystack.dataclasses import ChatMessage


@component
class MessageProducer:

    @component.output_types(messages=List[ChatMessage])
    def run(self, messages: List[ChatMessage]) -> dict:
        return {"messages": messages}


p = Pipeline()
p.add_component("llm", OpenAIChatGenerator(model="gpt-4-1106-preview",
                                           generation_kwargs={"response_format": {"type": "json_object"}}))
p.add_component("schema_validator", JsonSchemaValidator())
p.add_component("mx_for_llm", Multiplexer(List[ChatMessage]))
p.add_component("message_producer", MessageProducer())

p.connect("message_producer.messages", "mx_for_llm")
p.connect("mx_for_llm", "llm")
p.connect("llm.replies", "schema_validator.messages")
p.connect("schema_validator.validation_error", "mx_for_llm")

result = p.run(data={
    "message_producer": {
        "messages": [ChatMessage.from_user("Generate JSON for person with name 'John' and age 30")]
    },
    "schema_validator": {
        "json_schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"}
            }
        }
    }
})
print(result)
# Example output:
# {'schema_validator': {'validated': [ChatMessage(content='\n{\n  "name": "John",\n  "age": 30\n}',
# role=<ChatRole.ASSISTANT: 'assistant'>, name=None, meta={'model': 'gpt-4-1106-preview', 'index': 0,
# 'finish_reason': 'stop', 'usage': {'completion_tokens': 17, 'prompt_tokens': 20, 'total_tokens': 37}})]}}
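
The loop that this pipeline forms (generate, validate, feed the error back to the generator) can also be read as plain Python. The sketch below is only an analogy under stated assumptions: recovery_loop, fake_llm, and age_is_int are hypothetical names, the generator is a stub that returns canned replies, and the way messages are accumulated is an illustrative choice rather than what the pipeline does internally.

```python
import json
from typing import Callable, List


def recovery_loop(generate: Callable[[List[str]], str],
                  prompt: str,
                  is_ok: Callable[[str], bool],
                  max_attempts: int = 3) -> str:
    """Ask for a reply, and re-ask with feedback until it passes `is_ok`."""
    messages = [prompt]
    for _ in range(max_attempts):
        reply = generate(messages)
        if is_ok(reply):
            return reply
        # Feed the rejected reply and a correction request back to the generator.
        messages = messages + [reply, "The reply was rejected; return corrected JSON only."]
    raise RuntimeError("no valid reply within max_attempts")


# Stub generator: the first reply has a string age, the second an integer age.
replies = iter(['{"name": "John", "age": "30"}', '{"name": "John", "age": 30}'])


def fake_llm(messages: List[str]) -> str:
    return next(replies)


def age_is_int(reply: str) -> bool:
    return isinstance(json.loads(reply).get("age"), int)


result = recovery_loop(fake_llm, "Generate JSON for person with name 'John' and age 30", age_is_int)
print(result)
```

The max_attempts bound is an addition of this sketch; it simply keeps the illustration from looping forever if the stub never produces a valid reply.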

JsonSchemaValidator.__init__

def __init__(json_schema: Optional[Dict[str, Any]] = None,
             error_template: Optional[str] = None)

Initialize the JsonSchemaValidator component.

Arguments:

  • json_schema: A dictionary representing the JSON schema against which the messages' content is validated.
  • error_template: A custom template string for formatting the error message in case of validation failure.

JsonSchemaValidator.run

@component.output_types(validated=List[ChatMessage],
                        validation_error=List[ChatMessage])
def run(messages: List[ChatMessage],
        json_schema: Optional[Dict[str, Any]] = None,
        error_template: Optional[str] = None) -> Dict[str, List[ChatMessage]]

Validates the last of the provided messages against the specified JSON schema.

If the message conforms to the schema, it is passed along the "validated" output. If it does not conform, the message is passed along the "validation_error" output.

Arguments:

  • messages: A list of ChatMessage instances to be validated. The last message in this list is the one that is validated.
  • json_schema: A dictionary representing the JSON schema against which the messages' content is validated. If not provided, the schema from the component init is used.
  • error_template: A custom template string for formatting the error message in case of validation failure. If not provided, the error_template from the component init is used.

Raises:

  • ValueError: If no JSON schema is provided or if the message content is not a dictionary or a list of dictionaries.

Returns:

A dictionary with the following keys:

  • "validated": A list of messages if the last message is valid.
  • "validation_error": A list of messages if the last message is invalid.
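
The argument handling documented for run (run-time values first, init values as fallback, ValueError when no schema is available, only the last message validated) can be sketched as follows. This is a hypothetical stand-in, not the component's code: ArgumentResolutionSketch, resolve, and DEFAULT_TEMPLATE are names invented for the illustration, and messages are plain strings.

```python
from typing import Any, Dict, List, Optional, Tuple

DEFAULT_TEMPLATE = "Validation failed: {error}"  # hypothetical default, for illustration


class ArgumentResolutionSketch:
    """Hypothetical stand-in that shows only how init and run arguments could combine."""

    def __init__(self, json_schema: Optional[Dict[str, Any]] = None,
                 error_template: Optional[str] = None):
        self.json_schema = json_schema
        self.error_template = error_template

    def resolve(self, messages: List[str],
                json_schema: Optional[Dict[str, Any]] = None,
                error_template: Optional[str] = None) -> Tuple[str, Dict[str, Any], str]:
        # Run-time values take precedence; init values are the fallback.
        schema = json_schema or self.json_schema
        if not schema:
            raise ValueError("No JSON schema provided at init or at run time.")
        template = error_template or self.error_template or DEFAULT_TEMPLATE
        # Only the last message is the validation target.
        return messages[-1], schema, template


sketch = ArgumentResolutionSketch(json_schema={"type": "object"})
target, schema, template = sketch.resolve(["first", "last"], json_schema={"type": "array"})
print(target, schema, template)
```

Resolving the schema on each call, rather than once at construction, is what allows one component instance to validate against different schemas from run to run.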