API Reference

Core classes that carry data through the system.

Module byte_stream

ByteStream

@dataclass
class ByteStream()

Base data class representing a binary object in the Haystack API.

ByteStream.to_file

def to_file(destination_path: Path)

Write the ByteStream to a file. Note: the metadata will be lost.

Arguments:

  • destination_path: The path to write the ByteStream to.

ByteStream.from_file_path

@classmethod
def from_file_path(cls,
                   filepath: Path,
                   mime_type: Optional[str] = None,
                   meta: Optional[Dict[str, Any]] = None) -> "ByteStream"

Create a ByteStream from the contents read from a file.

Arguments:

  • filepath: A valid path to a file.
  • mime_type: The mime type of the file.
  • meta: Additional metadata to be stored with the ByteStream.
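The file round trip implied by `to_file` and `from_file_path` can be sketched with a small stand-in class. This is not the Haystack implementation: `ByteStreamSketch`, the `data` field name, and the method bodies are assumptions made for illustration. Only the method names and signatures follow the reference above.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, Optional


@dataclass
class ByteStreamSketch:
    """Hypothetical stand-in mirroring the documented ByteStream signatures."""

    data: bytes  # assumed field name for the raw payload
    meta: Dict[str, Any] = field(default_factory=dict)
    mime_type: Optional[str] = None

    def to_file(self, destination_path: Path) -> None:
        # Only the raw bytes are written; meta and mime_type are not persisted.
        destination_path.write_bytes(self.data)

    @classmethod
    def from_file_path(
        cls,
        filepath: Path,
        mime_type: Optional[str] = None,
        meta: Optional[Dict[str, Any]] = None,
    ) -> "ByteStreamSketch":
        return cls(data=filepath.read_bytes(), mime_type=mime_type, meta=meta or {})


original = ByteStreamSketch(data=b"hello", meta={"source": "demo"})
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "payload.bin"
    original.to_file(path)
    restored = ByteStreamSketch.from_file_path(path)

# The bytes survive the round trip; the metadata does not.
print(restored.data)
print(restored.meta)
```

To keep metadata across a round trip in this sketch, pass it again through the `meta` argument of `from_file_path`.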

ByteStream.from_string

@classmethod
def from_string(cls,
                text: str,
                encoding: str = "utf-8",
                mime_type: Optional[str] = None,
                meta: Optional[Dict[str, Any]] = None) -> "ByteStream"

Create a ByteStream encoding a string.

Arguments:

  • text: The string to encode.
  • encoding: The encoding used to convert the string into bytes.
  • mime_type: The MIME type of the encoded data.
  • meta: Additional metadata to be stored with the ByteStream.

ByteStream.to_string

def to_string(encoding: str = "utf-8") -> str

Convert the ByteStream to a string. Metadata is not included.

Arguments:

  • encoding: The encoding used to convert the bytes to a string. Defaults to "utf-8".

Raises:

  • UnicodeDecodeError: If the ByteStream data cannot be decoded with the specified encoding.

Returns:

The string representation of the ByteStream.
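The string round trip and the decoding failure can be illustrated with a minimal stand-in. `ByteStreamSketch` and its `data` field are hypothetical names used for this sketch; the real class may store and validate its payload differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ByteStreamSketch:
    """Hypothetical stand-in mirroring from_string and to_string."""

    data: bytes  # assumed field name for the raw payload
    meta: Dict[str, Any] = field(default_factory=dict)
    mime_type: Optional[str] = None

    @classmethod
    def from_string(
        cls,
        text: str,
        encoding: str = "utf-8",
        mime_type: Optional[str] = None,
        meta: Optional[Dict[str, Any]] = None,
    ) -> "ByteStreamSketch":
        return cls(data=text.encode(encoding), mime_type=mime_type, meta=meta or {})

    def to_string(self, encoding: str = "utf-8") -> str:
        # Only the bytes are decoded; meta is not part of the result.
        return self.data.decode(encoding)


stream = ByteStreamSketch.from_string("café", mime_type="text/plain", meta={"lang": "fr"})
text = stream.to_string()

# Decoding with an encoding that cannot represent the bytes raises UnicodeDecodeError.
try:
    stream.to_string(encoding="ascii")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
```

Using the same encoding for both calls avoids the error.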

Module chat_message

ChatRole

class ChatRole(str, Enum)

Enumeration representing the roles within a chat.

ChatMessage

@dataclass
class ChatMessage()

Represents a message in an LLM chat conversation.

Arguments:

  • content: The text content of the message.
  • role: The role of the entity sending the message.
  • name: The name of the function being called (only applicable for role FUNCTION).
  • meta: Additional metadata associated with the message.

ChatMessage.is_from

def is_from(role: ChatRole) -> bool

Check if the message is from a specific role.

Arguments:

  • role: The role to check against.

Returns:

True if the message is from the specified role, False otherwise.

ChatMessage.from_assistant

@classmethod
def from_assistant(cls,
                   content: str,
                   meta: Optional[Dict[str, Any]] = None) -> "ChatMessage"

Create a message from the assistant.

Arguments:

  • content: The text content of the message.
  • meta: Additional metadata associated with the message.

Returns:

A new ChatMessage instance.

ChatMessage.from_user

@classmethod
def from_user(cls, content: str) -> "ChatMessage"

Create a message from the user.

Arguments:

  • content: The text content of the message.

Returns:

A new ChatMessage instance.

ChatMessage.from_system

@classmethod
def from_system(cls, content: str) -> "ChatMessage"

Create a message from the system.

Arguments:

  • content: The text content of the message.

Returns:

A new ChatMessage instance.

ChatMessage.from_function

@classmethod
def from_function(cls, content: str, name: str) -> "ChatMessage"

Create a message from a function call.

Arguments:

  • content: The text content of the message.
  • name: The name of the function being called.

Returns:

A new ChatMessage instance.
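The factory methods and `is_from` can be sketched as follows. `ChatMessageSketch` is a hypothetical stand-in, and the enum member names and values are assumptions (the reference above names only the FUNCTION role explicitly). The method signatures follow the reference.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional


class ChatRole(str, Enum):
    # Member names and values are assumed for this sketch.
    ASSISTANT = "assistant"
    USER = "user"
    SYSTEM = "system"
    FUNCTION = "function"


@dataclass
class ChatMessageSketch:
    """Hypothetical stand-in mirroring the documented ChatMessage API."""

    content: str
    role: ChatRole
    name: Optional[str] = None
    meta: Dict[str, Any] = field(default_factory=dict)

    def is_from(self, role: ChatRole) -> bool:
        return self.role == role

    @classmethod
    def from_assistant(cls, content: str, meta: Optional[Dict[str, Any]] = None) -> "ChatMessageSketch":
        return cls(content, ChatRole.ASSISTANT, None, meta or {})

    @classmethod
    def from_user(cls, content: str) -> "ChatMessageSketch":
        return cls(content, ChatRole.USER)

    @classmethod
    def from_system(cls, content: str) -> "ChatMessageSketch":
        return cls(content, ChatRole.SYSTEM)

    @classmethod
    def from_function(cls, content: str, name: str) -> "ChatMessageSketch":
        # name is only meaningful for the FUNCTION role.
        return cls(content, ChatRole.FUNCTION, name)


conversation = [
    ChatMessageSketch.from_system("You are a helpful assistant."),
    ChatMessageSketch.from_user("What is the capital of France?"),
    ChatMessageSketch.from_assistant("Paris.", meta={"model": "example"}),
    ChatMessageSketch.from_function('{"temp_c": 21}', name="get_weather"),
]

# Select only the messages written by the user.
user_turns = [m.content for m in conversation if m.is_from(ChatRole.USER)]
```

The factory methods keep role assignment in one place, so calling code does not need to construct `ChatRole` values directly.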

Module document

_BackwardCompatible

class _BackwardCompatible(type)

Metaclass that handles Document backward compatibility.

_BackwardCompatible.__call__

def __call__(cls, *args, **kwargs)

Called before Document.__init__. Remaps legacy fields to new ones and also handles building a Document from a flattened dictionary.

Document

@dataclass
class Document(metaclass=_BackwardCompatible)

Base data class containing some data to be queried.

Can contain text snippets, tables, and file paths to images or audio files. Documents can be sorted by score and converted to and from dictionaries and JSON.

Arguments:

  • id: Unique identifier for the document. When not set, it's generated based on the Document fields' values.
  • content: Text of the document, if the document contains text.
  • dataframe: Pandas dataframe with the document's content, if the document contains tabular data.
  • blob: Binary data associated with the document, if the document has any binary data associated with it.
  • meta: Additional custom metadata for the document. Must be JSON-serializable.
  • score: Score of the document. Used for ranking, usually assigned by retrievers.
  • embedding: Vector representation of the document.

Document.__eq__

def __eq__(other)

Compares Documents for equality.

Two Documents are considered equal if their dictionary representations are identical.

Document.__post_init__

def __post_init__()

Generate the ID based on the init parameters.

Document.to_dict

def to_dict(flatten=True) -> Dict[str, Any]

Converts Document into a dictionary.

dataframe and blob fields are converted to JSON-serializable types.

Arguments:

  • flatten: Whether to flatten the meta field. Defaults to True to be backward-compatible with Haystack 1.x.

Document.from_dict

@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "Document"

Creates a new Document object from a dictionary.

The dataframe and blob fields are converted to their original types.
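The effect of `flatten` and the dictionary round trip can be sketched with a reduced stand-in. `DocumentSketch` is hypothetical: it omits the dataframe, blob, and embedding fields, and both the ID hashing scheme and the way unknown keys are folded back into `meta` are assumptions, not the library's actual logic.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DocumentSketch:
    """Hypothetical, reduced stand-in for Document."""

    id: str = ""
    content: Optional[str] = None
    meta: Dict[str, Any] = field(default_factory=dict)
    score: Optional[float] = None

    def __post_init__(self) -> None:
        # Assumed scheme: derive the id from content and meta when none is given.
        if not self.id:
            payload = json.dumps({"content": self.content, "meta": self.meta}, sort_keys=True)
            self.id = hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def to_dict(self, flatten: bool = True) -> Dict[str, Any]:
        data = {"id": self.id, "content": self.content, "score": self.score}
        if flatten:
            # Meta keys are lifted to the top level of the dictionary.
            return {**data, **self.meta}
        return {**data, "meta": dict(self.meta)}

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "DocumentSketch":
        known = {"id", "content", "meta", "score"}
        meta = dict(data.get("meta") or {})
        # Assumption: unknown top-level keys are treated as flattened metadata.
        meta.update({k: v for k, v in data.items() if k not in known})
        return cls(id=data.get("id", ""), content=data.get("content"), meta=meta, score=data.get("score"))


doc = DocumentSketch(content="Paris is the capital of France.", meta={"lang": "en"})
flat = doc.to_dict()                  # "lang" appears at the top level
nested = doc.to_dict(flatten=False)   # "lang" stays under "meta"
restored = DocumentSketch.from_dict(flat)
```

In this sketch a flattened dictionary cannot distinguish a metadata key from a field of the same name, which is one reason to prefer `flatten=False` when metadata keys are not under your control.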

Document.content_type

@property
def content_type()

Returns the type of the content for the document.

This is necessary to keep backward compatibility with 1.x.

Raises:

  • ValueError: If both the content and dataframe fields are set, or if both are missing.
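The error condition above can be sketched as an exactly-one-of check. `DocumentSketch` is a hypothetical stand-in, the `dataframe` field accepts any object here instead of a pandas DataFrame, and the returned labels "text" and "table" are assumptions, since this reference does not list the possible return values.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class DocumentSketch:
    """Hypothetical stand-in with only the two fields the property inspects."""

    content: Optional[str] = None
    dataframe: Optional[Any] = None  # stands in for a pandas DataFrame

    @property
    def content_type(self) -> str:
        has_text = self.content is not None
        has_table = self.dataframe is not None
        # Both set or both missing: the type is ambiguous or undefined.
        if has_text == has_table:
            raise ValueError("Exactly one of content and dataframe must be set.")
        # The labels below are assumed for this sketch.
        return "text" if has_text else "table"


text_type = DocumentSketch(content="hello").content_type
table_type = DocumentSketch(dataframe=[[1, 2], [3, 4]]).content_type

errors = []
for candidate in (DocumentSketch(), DocumentSketch(content="hello", dataframe=[[1]])):
    try:
        candidate.content_type
    except ValueError as exc:
        errors.append(str(exc))
```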

Module streaming_chunk

StreamingChunk

@dataclass
class StreamingChunk()

The StreamingChunk class encapsulates a segment of streamed content along with associated metadata.

This structure facilitates the handling and processing of streamed data in a systematic manner.

Arguments:

  • content: The content of the message chunk as a string.
  • meta: A dictionary containing metadata related to the message chunk.
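A typical use of such a chunk is to pass each one to a callback as it arrives. The sketch below assumes a hypothetical `StreamingChunkSketch` stand-in and a hypothetical `on_chunk` callback; the simulated token list and the `index` metadata key are illustrative only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class StreamingChunkSketch:
    """Hypothetical stand-in with the two documented fields."""

    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


received: List[StreamingChunkSketch] = []


def on_chunk(chunk: StreamingChunkSketch) -> None:
    # A real callback might print or forward the chunk; here it is collected.
    received.append(chunk)


# Simulate a stream that delivers the response in three pieces.
for index, piece in enumerate(["Hay", "stack", "!"]):
    on_chunk(StreamingChunkSketch(content=piece, meta={"index": index}))

full_text = "".join(chunk.content for chunk in received)
```

Keeping per-chunk metadata separate from the content lets the consumer reassemble the text without parsing it.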