Core classes that carry data through the system.
Module answer
ExtractedAnswer
ExtractedAnswer.to_dict
def to_dict() -> Dict[str, Any]
Serialize the object to a dictionary.
Returns:
Serialized dictionary representation of the object.
ExtractedAnswer.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "ExtractedAnswer"
Deserialize the object from a dictionary.
Arguments:
data
: Dictionary representation of the object.
Returns:
Deserialized object.
GeneratedAnswer
GeneratedAnswer.to_dict
def to_dict() -> Dict[str, Any]
Serialize the object to a dictionary.
Returns:
Serialized dictionary representation of the object.
GeneratedAnswer.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "GeneratedAnswer"
Deserialize the object from a dictionary.
Arguments:
data
: Dictionary representation of the object.
Returns:
Deserialized object.
Module byte_stream
ByteStream
Base data class representing a binary object in the Haystack API.
Arguments:
data
: The binary data stored in Bytestream.meta
: Additional metadata to be stored with the ByteStream.mime_type
: The mime type of the binary data.
ByteStream.to_file
def to_file(destination_path: Path) -> None
Write the ByteStream to a file. Note: the metadata will be lost.
Arguments:
destination_path
: The path to write the ByteStream to.
ByteStream.from_file_path
@classmethod
def from_file_path(cls,
filepath: Path,
mime_type: Optional[str] = None,
meta: Optional[Dict[str, Any]] = None,
guess_mime_type: bool = False) -> "ByteStream"
Create a ByteStream from the contents read from a file.
Arguments:
filepath
: A valid path to a file.mime_type
: The mime type of the file.meta
: Additional metadata to be stored with the ByteStream.guess_mime_type
: Whether to guess the mime type from the file.
ByteStream.from_string
@classmethod
def from_string(cls,
text: str,
encoding: str = "utf-8",
mime_type: Optional[str] = None,
meta: Optional[Dict[str, Any]] = None) -> "ByteStream"
Create a ByteStream encoding a string.
Arguments:
text
: The string to encodeencoding
: The encoding used to convert the string into bytesmime_type
: The mime type of the file.meta
: Additional metadata to be stored with the ByteStream.
ByteStream.to_string
def to_string(encoding: str = "utf-8") -> str
Convert the ByteStream to a string, metadata will not be included.
Arguments:
encoding
: The encoding used to convert the bytes to a string. Defaults to "utf-8".
Raises:
None
: UnicodeDecodeError: If the ByteStream data cannot be decoded with the specified encoding.
Returns:
The string representation of the ByteStream.
ByteStream.__repr__
def __repr__() -> str
Return a string representation of the ByteStream, truncating the data to 100 bytes.
ByteStream.to_dict
def to_dict() -> Dict[str, Any]
Convert the ByteStream to a dictionary representation.
Returns:
A dictionary with keys 'data', 'meta', and 'mime_type'.
ByteStream.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "ByteStream"
Create a ByteStream from a dictionary representation.
Arguments:
data
: A dictionary with keys 'data', 'meta', and 'mime_type'.
Returns:
A ByteStream instance.
Module chat_message
ChatRole
Enumeration representing the roles within a chat.
USER
The user role. A message from the user contains only text.
SYSTEM
The system role. A message from the system contains only text.
ASSISTANT
The assistant role. A message from the assistant can contain text and Tool calls. It can also store metadata.
TOOL
The tool role. A message from a tool contains the result of a Tool invocation.
ChatRole.from_str
@staticmethod
def from_str(string: str) -> "ChatRole"
Convert a string to a ChatRole enum.
ToolCall
Represents a Tool call prepared by the model, usually contained in an assistant message.
Arguments:
id
: The ID of the Tool call.tool_name
: The name of the Tool to call.arguments
: The arguments to call the Tool with.
id
noqa: A003
ToolCall.to_dict
def to_dict() -> Dict[str, Any]
Convert ToolCall into a dictionary.
Returns:
A dictionary with keys 'tool_name', 'arguments', and 'id'.
ToolCall.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "ToolCall"
Creates a new ToolCall object from a dictionary.
Arguments:
data
: The dictionary to build the ToolCall object.
Returns:
The created object.
ToolCallResult
Represents the result of a Tool invocation.
Arguments:
result
: The result of the Tool invocation.origin
: The Tool call that produced this result.error
: Whether the Tool invocation resulted in an error.
ToolCallResult.to_dict
def to_dict() -> Dict[str, Any]
Converts ToolCallResult into a dictionary.
Returns:
A dictionary with keys 'result', 'origin', and 'error'.
ToolCallResult.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "ToolCallResult"
Creates a ToolCallResult from a dictionary.
Arguments:
data
: The dictionary to build the ToolCallResult object.
Returns:
The created object.
TextContent
The textual content of a chat message.
Arguments:
text
: The text content of the message.
ChatMessage
Represents a message in a LLM chat conversation.
Use the from_assistant
, from_user
, from_system
, and from_tool
class methods to create a ChatMessage.
ChatMessage.__new__
def __new__(cls, *args, **kwargs)
This method is reimplemented to make the changes to the ChatMessage
dataclass more visible.
ChatMessage.__getattribute__
def __getattribute__(name)
This method is reimplemented to make the content
attribute removal more visible.
ChatMessage.role
@property
def role() -> ChatRole
Returns the role of the entity sending the message.
ChatMessage.meta
@property
def meta() -> Dict[str, Any]
Returns the metadata associated with the message.
ChatMessage.name
@property
def name() -> Optional[str]
Returns the name associated with the message.
ChatMessage.texts
@property
def texts() -> List[str]
Returns the list of all texts contained in the message.
ChatMessage.text
@property
def text() -> Optional[str]
Returns the first text contained in the message.
ChatMessage.tool_calls
@property
def tool_calls() -> List[ToolCall]
Returns the list of all Tool calls contained in the message.
ChatMessage.tool_call
@property
def tool_call() -> Optional[ToolCall]
Returns the first Tool call contained in the message.
ChatMessage.tool_call_results
@property
def tool_call_results() -> List[ToolCallResult]
Returns the list of all Tool call results contained in the message.
ChatMessage.tool_call_result
@property
def tool_call_result() -> Optional[ToolCallResult]
Returns the first Tool call result contained in the message.
ChatMessage.images
@property
def images() -> List[ImageContent]
Returns the list of all images contained in the message.
ChatMessage.image
@property
def image() -> Optional[ImageContent]
Returns the first image contained in the message.
ChatMessage.is_from
def is_from(role: Union[ChatRole, str]) -> bool
Check if the message is from a specific role.
Arguments:
role
: The role to check against.
Returns:
True if the message is from the specified role, False otherwise.
ChatMessage.from_user
@classmethod
def from_user(
cls,
text: Optional[str] = None,
meta: Optional[Dict[str, Any]] = None,
name: Optional[str] = None,
*,
content_parts: Optional[Sequence[Union[TextContent, str,
ImageContent]]] = None
) -> "ChatMessage"
Create a message from the user.
Arguments:
text
: The text content of the message. Specify this or content_parts.meta
: Additional metadata associated with the message.name
: An optional name for the participant. This field is only supported by OpenAI.content_parts
: A list of content parts to include in the message. Specify this or text.
Returns:
A new ChatMessage instance.
ChatMessage.from_system
@classmethod
def from_system(cls,
text: str,
meta: Optional[Dict[str, Any]] = None,
name: Optional[str] = None) -> "ChatMessage"
Create a message from the system.
Arguments:
text
: The text content of the message.meta
: Additional metadata associated with the message.name
: An optional name for the participant. This field is only supported by OpenAI.
Returns:
A new ChatMessage instance.
ChatMessage.from_assistant
@classmethod
def from_assistant(
cls,
text: Optional[str] = None,
meta: Optional[Dict[str, Any]] = None,
name: Optional[str] = None,
tool_calls: Optional[List[ToolCall]] = None) -> "ChatMessage"
Create a message from the assistant.
Arguments:
text
: The text content of the message.meta
: Additional metadata associated with the message.tool_calls
: The Tool calls to include in the message.name
: An optional name for the participant. This field is only supported by OpenAI.
Returns:
A new ChatMessage instance.
ChatMessage.from_tool
@classmethod
def from_tool(cls,
tool_result: str,
origin: ToolCall,
error: bool = False,
meta: Optional[Dict[str, Any]] = None) -> "ChatMessage"
Create a message from a Tool.
Arguments:
tool_result
: The result of the Tool invocation.origin
: The Tool call that produced this result.error
: Whether the Tool invocation resulted in an error.meta
: Additional metadata associated with the message.
Returns:
A new ChatMessage instance.
ChatMessage.to_dict
def to_dict() -> Dict[str, Any]
Converts ChatMessage into a dictionary.
Returns:
Serialized version of the object.
ChatMessage.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "ChatMessage"
Creates a new ChatMessage object from a dictionary.
Arguments:
data
: The dictionary to build the ChatMessage object.
Returns:
The created object.
ChatMessage.to_openai_dict_format
def to_openai_dict_format(
require_tool_call_ids: bool = True) -> Dict[str, Any]
Convert a ChatMessage to the dictionary format expected by OpenAI's Chat API.
Arguments:
require_tool_call_ids
: If True (default), enforces that each Tool Call includes a non-nullid
attribute. Set to False to allow Tool Calls withoutid
, which may be suitable for shallow OpenAI-compatible APIs.
Raises:
ValueError
: If the message format is invalid, or ifrequire_tool_call_ids
is True and any Tool Call is missing anid
attribute.
Returns:
The ChatMessage in the format expected by OpenAI's Chat API.
ChatMessage.from_openai_dict_format
@classmethod
def from_openai_dict_format(cls, message: Dict[str, Any]) -> "ChatMessage"
Create a ChatMessage from a dictionary in the format expected by OpenAI's Chat API.
NOTE: While OpenAI's API requires tool_call_id
in both tool calls and tool messages, this method
accepts messages without it to support shallow OpenAI-compatible APIs.
If you plan to use the resulting ChatMessage with OpenAI, you must include tool_call_id
or you'll
encounter validation errors.
Arguments:
message
: The OpenAI dictionary to build the ChatMessage object.
Raises:
ValueError
: If the message dictionary is missing required fields.
Returns:
The created ChatMessage object.
Module document
_BackwardCompatible
Metaclass that handles Document backward compatibility.
_BackwardCompatible.__call__
def __call__(cls, *args, **kwargs)
Called before Document.init, handles legacy fields.
Embedding was stored as NumPy arrays in 1.x, so we convert it to a list of floats. Other legacy fields are removed.
Document
Base data class containing some data to be queried.
Can contain text snippets and file paths to images or audios. Documents can be sorted by score and saved to/from dictionary and JSON.
Arguments:
id
: Unique identifier for the document. When not set, it's generated based on the Document fields' values.content
: Text of the document, if the document contains text.blob
: Binary data associated with the document, if the document has any binary data associated with it.meta
: Additional custom metadata for the document. Must be JSON-serializable.score
: Score of the document. Used for ranking, usually assigned by retrievers.embedding
: dense vector representation of the document.sparse_embedding
: sparse vector representation of the document.
Document.__eq__
def __eq__(other)
Compares Documents for equality.
Two Documents are considered equals if their dictionary representation is identical.
Document.__post_init__
def __post_init__()
Generate the ID based on the init parameters.
Document.to_dict
def to_dict(flatten: bool = True) -> Dict[str, Any]
Converts Document into a dictionary.
blob
field is converted to a JSON-serializable type.
Arguments:
flatten
: Whether to flattenmeta
field or not. Defaults toTrue
to be backward-compatible with Haystack 1.x.
Document.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "Document"
Creates a new Document object from a dictionary.
The blob
field is converted to its original type.
Document.content_type
@property
def content_type()
Returns the type of the content for the document.
This is necessary to keep backward compatibility with 1.x.
Module image_content
ImageContent
The image content of a chat message.
Arguments:
base64_image
: A base64 string representing the image.mime_type
: The MIME type of the image (e.g. "image/png", "image/jpeg"). Providing this value is recommended, as most LLM providers require it. If not provided, the MIME type is guessed from the base64 string, which can be slow and not always reliable.detail
: Optional detail level of the image (only supported by OpenAI). One of "auto", "high", or "low".meta
: Optional metadata for the image.validation
: If True (default), a validation process is performed:- Check whether the base64 string is valid;
- Guess the MIME type if not provided;
- Check if the MIME type is a valid image MIME type. Set to False to skip validation and speed up initialization.
ImageContent.__repr__
def __repr__() -> str
Return a string representation of the ImageContent, truncating the base64_image to 100 bytes.
ImageContent.show
def show() -> None
Shows the image.
ImageContent.from_file_path
@classmethod
def from_file_path(cls,
file_path: Union[str, Path],
*,
size: Optional[Tuple[int, int]] = None,
detail: Optional[Literal["auto", "high", "low"]] = None,
meta: Optional[Dict[str, Any]] = None) -> "ImageContent"
Create an ImageContent object from a file path.
It exposes similar functionality as the ImageFileToImageContent
component. For PDF to ImageContent conversion,
use the PDFToImageContent
component.
Arguments:
file_path
: The path to the image file. PDF files are not supported. For PDF to ImageContent conversion, use thePDFToImageContent
component.size
: If provided, resizes the image to fit within the specified dimensions (width, height) while maintaining aspect ratio. This reduces file size, memory usage, and processing time, which is beneficial when working with models that have resolution constraints or when transmitting images to remote services.detail
: Optional detail level of the image (only supported by OpenAI). One of "auto", "high", or "low".meta
: Additional metadata for the image.
Returns:
An ImageContent object.
ImageContent.from_url
@classmethod
def from_url(cls,
url: str,
*,
retry_attempts: int = 2,
timeout: int = 10,
size: Optional[Tuple[int, int]] = None,
detail: Optional[Literal["auto", "high", "low"]] = None,
meta: Optional[Dict[str, Any]] = None) -> "ImageContent"
Create an ImageContent object from a URL. The image is downloaded and converted to a base64 string.
For PDF to ImageContent conversion, use the PDFToImageContent
component.
Arguments:
url
: The URL of the image. PDF files are not supported. For PDF to ImageContent conversion, use thePDFToImageContent
component.retry_attempts
: The number of times to retry to fetch the URL's content.timeout
: Timeout in seconds for the request.size
: If provided, resizes the image to fit within the specified dimensions (width, height) while maintaining aspect ratio. This reduces file size, memory usage, and processing time, which is beneficial when working with models that have resolution constraints or when transmitting images to remote services.detail
: Optional detail level of the image (only supported by OpenAI). One of "auto", "high", or "low".meta
: Additional metadata for the image.
Raises:
ValueError
: If the URL does not point to an image or if it points to a PDF file.
Returns:
An ImageContent object.
Module sparse_embedding
SparseEmbedding
Class representing a sparse embedding.
Arguments:
indices
: List of indices of non-zero elements in the embedding.values
: List of values of non-zero elements in the embedding.
SparseEmbedding.__post_init__
def __post_init__()
Checks if the indices and values lists are of the same length.
Raises a ValueError if they are not.
SparseEmbedding.to_dict
def to_dict() -> Dict[str, Any]
Convert the SparseEmbedding object to a dictionary.
Returns:
Serialized sparse embedding.
SparseEmbedding.from_dict
@classmethod
def from_dict(cls, sparse_embedding_dict: Dict[str, Any]) -> "SparseEmbedding"
Deserializes the sparse embedding from a dictionary.
Arguments:
sparse_embedding_dict
: Dictionary to deserialize from.
Returns:
Deserialized sparse embedding.
Module streaming_chunk
ToolCallDelta
Represents a Tool call prepared by the model, usually contained in an assistant message.
Arguments:
index
: The index of the Tool call in the list of Tool calls.tool_name
: The name of the Tool to call.arguments
: Either the full arguments in JSON format or a delta of the arguments.id
: The ID of the Tool call.
id
noqa: A003
ToolCallDelta.to_dict
def to_dict() -> Dict[str, Any]
Returns a dictionary representation of the ToolCallDelta.
Returns:
A dictionary with keys 'index', 'tool_name', 'arguments', and 'id'.
ToolCallDelta.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "ToolCallDelta"
Creates a ToolCallDelta from a serialized representation.
Arguments:
data
: Dictionary containing ToolCallDelta's attributes.
Returns:
A ToolCallDelta instance.
ComponentInfo
The ComponentInfo
class encapsulates information about a component.
Arguments:
type
: The type of the component.name
: The name of the component assigned when adding it to a pipeline.
ComponentInfo.from_component
@classmethod
def from_component(cls, component: Component) -> "ComponentInfo"
Create a ComponentInfo
object from a Component
instance.
Arguments:
component
: TheComponent
instance.
Returns:
The ComponentInfo
object with the type and name of the given component.
ComponentInfo.to_dict
def to_dict() -> Dict[str, Any]
Returns a dictionary representation of ComponentInfo.
Returns:
A dictionary with keys 'type' and 'name'.
ComponentInfo.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "ComponentInfo"
Creates a ComponentInfo from a serialized representation.
Arguments:
data
: Dictionary containing ComponentInfo's attributes.
Returns:
A ComponentInfo instance.
StreamingChunk
The StreamingChunk
class encapsulates a segment of streamed content along with associated metadata.
This structure facilitates the handling and processing of streamed data in a systematic manner.
Arguments:
content
: The content of the message chunk as a string.meta
: A dictionary containing metadata related to the message chunk.component_info
: AComponentInfo
object containing information about the component that generated the chunk, such as the component name and type.index
: An optional integer index representing which content block this chunk belongs to.tool_calls
: An optional list of ToolCallDelta object representing a tool call associated with the message chunk.tool_call_result
: An optional ToolCallResult object representing the result of a tool call.start
: A boolean indicating whether this chunk marks the start of a content block.finish_reason
: An optional value indicating the reason the generation finished. Standard values follow OpenAI's convention: "stop", "length", "tool_calls", "content_filter", plus Haystack-specific value "tool_call_results".
StreamingChunk.to_dict
def to_dict() -> Dict[str, Any]
Returns a dictionary representation of the StreamingChunk.
Returns:
Serialized dictionary representation of the calling object.
StreamingChunk.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "StreamingChunk"
Creates a deserialized StreamingChunk instance from a serialized representation.
Arguments:
data
: Dictionary containing the StreamingChunk's attributes.
Returns:
A StreamingChunk instance.
select_streaming_callback
def select_streaming_callback(
init_callback: Optional[StreamingCallbackT],
runtime_callback: Optional[StreamingCallbackT],
requires_async: bool) -> Optional[StreamingCallbackT]
Picks the correct streaming callback given an optional initial and runtime callback.
The runtime callback takes precedence over the initial callback.
Arguments:
init_callback
: The initial callback.runtime_callback
: The runtime callback.requires_async
: Whether the selected callback must be async compatible.
Returns:
The selected callback.