Core classes that carry data through the system.
Module byte_stream
ByteStream
@dataclass
class ByteStream()
Base data class representing a binary object in the Haystack API.
ByteStream.to_file
def to_file(destination_path: Path)
Write the ByteStream to a file. Note: the metadata will be lost.
Arguments:
destination_path
: The path to write the ByteStream to.
ByteStream.from_file_path
@classmethod
def from_file_path(cls,
                   filepath: Path,
                   mime_type: Optional[str] = None,
                   meta: Optional[Dict[str, Any]] = None) -> "ByteStream"
Create a ByteStream from the contents read from a file.
Arguments:
filepath
: A valid path to a file.
mime_type
: The mime type of the file.
meta
: Additional metadata to be stored with the ByteStream.
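The file round trip described above can be sketched with a small stand-in class. This is not the Haystack implementation: `ByteStreamSketch` and its `data`, `meta`, and `mime_type` fields are assumptions made for illustration, and only the method signatures follow the reference. The sketch shows why metadata is lost: only the raw bytes reach the file.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, Optional


@dataclass
class ByteStreamSketch:
    """Hypothetical stand-in; field names are assumptions, not the real class."""

    data: bytes
    meta: Dict[str, Any] = field(default_factory=dict)
    mime_type: Optional[str] = None

    def to_file(self, destination_path: Path) -> None:
        # Only the raw bytes are written; meta and mime_type are not persisted.
        destination_path.write_bytes(self.data)

    @classmethod
    def from_file_path(
        cls,
        filepath: Path,
        mime_type: Optional[str] = None,
        meta: Optional[Dict[str, Any]] = None,
    ) -> "ByteStreamSketch":
        # Read the whole file and attach whatever metadata the caller supplies.
        return cls(data=filepath.read_bytes(), mime_type=mime_type, meta=meta or {})


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "payload.bin"
    original = ByteStreamSketch(data=b"\x00\x01hello", meta={"source": "demo"})
    original.to_file(path)
    restored = ByteStreamSketch.from_file_path(path, mime_type="application/octet-stream")

round_trip_ok = restored.data == original.data   # the bytes survive
meta_survived = restored.meta == original.meta   # the metadata does not
```

To keep metadata across a round trip in a design like this, the caller has to pass it again through the `meta` argument when reading the file back.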
ByteStream.from_string
@classmethod
def from_string(cls,
                text: str,
                encoding: str = "utf-8",
                mime_type: Optional[str] = None,
                meta: Optional[Dict[str, Any]] = None) -> "ByteStream"
Create a ByteStream by encoding a string.
Arguments:
text
: The string to encode.
encoding
: The encoding used to convert the string into bytes.
mime_type
: The mime type of the file.
meta
: Additional metadata to be stored with the ByteStream.
ByteStream.to_string
def to_string(encoding: str = "utf-8") -> str
Convert the ByteStream to a string. The metadata is not included.
Arguments:
encoding
: The encoding used to convert the bytes to a string. Defaults to "utf-8".
Raises:
UnicodeDecodeError
: If the ByteStream data cannot be decoded with the specified encoding.
Returns:
The string representation of the ByteStream.
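A minimal sketch of the string conversion pair, assuming the bytes are produced and consumed with Python's built-in `str.encode` and `bytes.decode`. `ByteStreamSketch` and its field names are hypothetical stand-ins, not the Haystack class.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ByteStreamSketch:
    """Hypothetical stand-in; field names are assumptions, not the real class."""

    data: bytes
    meta: Dict[str, Any] = field(default_factory=dict)
    mime_type: Optional[str] = None

    @classmethod
    def from_string(
        cls,
        text: str,
        encoding: str = "utf-8",
        mime_type: Optional[str] = None,
        meta: Optional[Dict[str, Any]] = None,
    ) -> "ByteStreamSketch":
        return cls(data=text.encode(encoding), mime_type=mime_type, meta=meta or {})

    def to_string(self, encoding: str = "utf-8") -> str:
        # bytes.decode raises UnicodeDecodeError when the data is not valid
        # in the requested encoding.
        return self.data.decode(encoding)


stream = ByteStreamSketch.from_string("caf\u00e9", mime_type="text/plain")
decoded = stream.to_string()

try:
    stream.to_string(encoding="ascii")  # the accented character is not ASCII
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
```

Decoding with the same encoding used for encoding returns the original text; decoding with an incompatible one raises the documented error.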
Module chat_message
ChatRole
class ChatRole(str, Enum)
Enumeration representing the roles within a chat.
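Because the enumeration derives from both `str` and `Enum`, its members behave like plain strings. The sketch below illustrates that pattern; the member names and values are assumptions inferred from the factory methods in this module, since the reference does not list them.

```python
import json
from enum import Enum


class ChatRole(str, Enum):
    # Assumed members and values, for illustration only.
    ASSISTANT = "assistant"
    USER = "user"
    SYSTEM = "system"
    FUNCTION = "function"


# Members compare equal to their string values.
matches_plain_string = ChatRole.USER == "user"

# A member can be looked up from its string value.
parsed = ChatRole("system")

# Members serialize as ordinary JSON strings.
payload = json.dumps({"role": ChatRole.ASSISTANT})
```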
ChatMessage
@dataclass
class ChatMessage()
Represents a message in an LLM chat conversation.
Arguments:
content
: The text content of the message.
role
: The role of the entity sending the message.
name
: The name of the function being called (only applicable for role FUNCTION).
meta
: Additional metadata associated with the message.
ChatMessage.is_from
def is_from(role: ChatRole) -> bool
Check if the message is from a specific role.
Arguments:
role
: The role to check against.
Returns:
True if the message is from the specified role, False otherwise.
ChatMessage.from_assistant
@classmethod
def from_assistant(cls,
                   content: str,
                   meta: Optional[Dict[str, Any]] = None) -> "ChatMessage"
Create a message from the assistant.
Arguments:
content
: The text content of the message.
meta
: Additional metadata associated with the message.
Returns:
A new ChatMessage instance.
ChatMessage.from_user
@classmethod
def from_user(cls, content: str) -> "ChatMessage"
Create a message from the user.
Arguments:
content
: The text content of the message.
Returns:
A new ChatMessage instance.
ChatMessage.from_system
@classmethod
def from_system(cls, content: str) -> "ChatMessage"
Create a message from the system.
Arguments:
content
: The text content of the message.
Returns:
A new ChatMessage instance.
ChatMessage.from_function
@classmethod
def from_function(cls, content: str, name: str) -> "ChatMessage"
Create a message from a function call.
Arguments:
content
: The text content of the message.
name
: The name of the function being called.
Returns:
A new ChatMessage instance.
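The factory methods and `is_from` can be sketched together as follows. `ChatMessageSketch` is a hypothetical stand-in that follows the documented signatures; the `ChatRole` member values and the example conversation are assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional


class ChatRole(str, Enum):
    # Assumed members and values, for illustration only.
    ASSISTANT = "assistant"
    USER = "user"
    SYSTEM = "system"
    FUNCTION = "function"


@dataclass
class ChatMessageSketch:
    content: str
    role: ChatRole
    name: Optional[str] = None
    meta: Dict[str, Any] = field(default_factory=dict)

    def is_from(self, role: ChatRole) -> bool:
        return self.role == role

    @classmethod
    def from_assistant(cls, content: str, meta: Optional[Dict[str, Any]] = None):
        return cls(content, ChatRole.ASSISTANT, None, meta or {})

    @classmethod
    def from_user(cls, content: str):
        return cls(content, ChatRole.USER)

    @classmethod
    def from_system(cls, content: str):
        return cls(content, ChatRole.SYSTEM)

    @classmethod
    def from_function(cls, content: str, name: str):
        # Only function messages carry a name.
        return cls(content, ChatRole.FUNCTION, name)


conversation = [
    ChatMessageSketch.from_system("You are a helpful assistant."),
    ChatMessageSketch.from_user("What is the weather in Berlin?"),
    ChatMessageSketch.from_function('{"temp_c": 21}', name="get_weather"),
    ChatMessageSketch.from_assistant("It is 21 degrees Celsius.", meta={"model": "demo"}),
]

# Filter the conversation by role.
user_turns = [m.content for m in conversation if m.is_from(ChatRole.USER)]
named = [m.name for m in conversation if m.name is not None]
```

Each factory fixes the role, so callers never pass `role` directly.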
Module document
_BackwardCompatible
class _BackwardCompatible(type)
Metaclass that handles Document backward compatibility.
_BackwardCompatible.__call__
def __call__(cls, *args, **kwargs)
Called before Document.__init__. Remaps legacy fields to new ones and also handles building a Document from a flattened dictionary.
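The general technique, a metaclass `__call__` that rewrites keyword arguments before the instance is constructed, can be sketched as below. The names `RemapLegacyFields` and `DocSketch`, the `text` to `content` mapping, and the rule that unknown keys are folded into `meta` are all assumptions for illustration; the reference does not state which legacy fields are remapped or how.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, Optional

# Hypothetical legacy-to-current field mapping, for illustration only.
LEGACY_FIELDS = {"text": "content"}


class RemapLegacyFields(type):
    """Metaclass whose __call__ runs before the class's __init__."""

    def __call__(cls, *args, **kwargs):
        # Rename legacy keyword arguments to their current names.
        for old, new in LEGACY_FIELDS.items():
            if old in kwargs and new not in kwargs:
                kwargs[new] = kwargs.pop(old)
        # Treat remaining unknown keys as a flattened dictionary and
        # fold them into `meta` (an assumed rule).
        known = {f.name for f in fields(cls)}
        extra = {key: kwargs.pop(key) for key in list(kwargs) if key not in known}
        if extra:
            kwargs["meta"] = {**extra, **(kwargs.get("meta") or {})}
        return super().__call__(*args, **kwargs)


@dataclass
class DocSketch(metaclass=RemapLegacyFields):
    content: Optional[str] = None
    meta: Dict[str, Any] = field(default_factory=dict)


doc = DocSketch(text="hello", author="Ada")
```

Doing the remapping in the metaclass keeps the dataclass-generated `__init__` unchanged while still accepting older call styles.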
Document
@dataclass
class Document(metaclass=_BackwardCompatible)
Base data class containing some data to be queried.
Can contain text snippets, tables, and file paths to images or audio files. Documents can be sorted by score and converted to and from dictionaries and JSON.
Arguments:
id
: Unique identifier for the document. When not set, it's generated based on the Document fields' values.
content
: Text of the document, if the document contains text.
dataframe
: Pandas dataframe with the document's content, if the document contains tabular data.
blob
: Binary data associated with the document, if the document has any binary data associated with it.
meta
: Additional custom metadata for the document. Must be JSON-serializable.
score
: Score of the document. Used for ranking, usually assigned by retrievers.
embedding
: Vector representation of the document.
Document.__eq__
def __eq__(other)
Compares Documents for equality.
Two Documents are considered equal if their dictionary representations are identical.
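Equality through the dictionary representation might look like the following sketch. `DocSketch` is a hypothetical stand-in with only two fields, and the rule that objects of a different type never compare equal is an assumption.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DocSketch:
    content: Optional[str] = None
    meta: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)

    def __eq__(self, other: object) -> bool:
        # Assumption: objects of a different type never compare equal.
        if type(self) is not type(other):
            return False
        # Compare the dictionary representations, not object identity.
        return self.to_dict() == other.to_dict()


a = DocSketch(content="hello", meta={"lang": "en"})
b = DocSketch(content="hello", meta={"lang": "en"})
c = DocSketch(content="hello", meta={"lang": "de"})

same_fields_equal = a == b
different_meta_equal = a == c
equal_to_plain_dict = a == a.to_dict()
```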
Document.__post_init__
def __post_init__()
Generate the ID based on the init parameters.
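One way to derive a deterministic ID from field values is to hash a canonical serialization of them in `__post_init__`. The sketch below assumes SHA-256 over a JSON payload; the hash function, the payload format, and the `DocSketch` class are illustrative choices, not the documented algorithm.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DocSketch:
    id: str = ""
    content: Optional[str] = None
    meta: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Keep an explicitly provided ID; otherwise derive one from the fields.
        if not self.id:
            payload = json.dumps(
                {"content": self.content, "meta": self.meta}, sort_keys=True
            )
            self.id = hashlib.sha256(payload.encode("utf-8")).hexdigest()


a = DocSketch(content="hello", meta={"lang": "en"})
b = DocSketch(content="hello", meta={"lang": "en"})
c = DocSketch(content="goodbye", meta={"lang": "en"})
custom = DocSketch(id="my-id", content="hello")
```

With this approach, two documents built from the same values receive the same ID, which makes duplicates easy to detect.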
Document.to_dict
def to_dict(flatten=True) -> Dict[str, Any]
Converts Document into a dictionary.
The dataframe and blob fields are converted to JSON-serializable types.
Arguments:
flatten
: Whether to flatten the meta field or not. Defaults to True to be backward-compatible with Haystack 1.x.
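The effect of `flatten` can be illustrated with a reduced stand-in. `DocSketch` below has only three fields, and the merge order (meta keys placed after the regular fields) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DocSketch:
    """Hypothetical stand-in with only the fields needed for this example."""

    id: str
    content: Optional[str] = None
    meta: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self, flatten: bool = True) -> Dict[str, Any]:
        data = {"id": self.id, "content": self.content}
        if flatten:
            # Meta keys are lifted to the top level of the dictionary.
            return {**data, **self.meta}
        # Meta stays nested under its own key.
        return {**data, "meta": dict(self.meta)}


doc = DocSketch(id="doc-1", content="hello", meta={"lang": "en", "page": 3})
flat = doc.to_dict()
nested = doc.to_dict(flatten=False)
```

The flat form is convenient for stores that index top-level keys; the nested form keeps custom metadata clearly separated from the built-in fields.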
Document.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "Document"
Creates a new Document object from a dictionary.
The dataframe and blob fields are converted to their original types.
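A sketch of the type conversion for the binary field, assuming the bytes are stored as a list of integers in the dictionary form. The classes `BlobSketch` and `DocSketch` and the dictionary layout are hypothetical; the dataframe conversion is left out to avoid a pandas dependency.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class BlobSketch:
    data: bytes
    mime_type: Optional[str] = None


@dataclass
class DocSketch:
    content: Optional[str] = None
    blob: Optional[BlobSketch] = None

    def to_dict(self) -> Dict[str, Any]:
        blob = None
        if self.blob is not None:
            # bytes are not JSON-serializable, so store them as a list of ints.
            blob = {"data": list(self.blob.data), "mime_type": self.blob.mime_type}
        return {"content": self.content, "blob": blob}

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "DocSketch":
        blob = data.get("blob")
        if blob is not None:
            # Rebuild the original bytes object from the list of ints.
            blob = BlobSketch(data=bytes(blob["data"]), mime_type=blob["mime_type"])
        return cls(content=data.get("content"), blob=blob)


doc = DocSketch(content="logo", blob=BlobSketch(b"\x89PNG", "image/png"))
as_json = json.dumps(doc.to_dict())
restored = DocSketch.from_dict(json.loads(as_json))
```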
Document.content_type
@property
def content_type()
Returns the type of the content for the document.
This is necessary to keep backward compatibility with 1.x.
Raises:
ValueError
: If both the content and dataframe fields are set, or if both are missing.
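The validation rule can be sketched as an exclusive check on the two fields. The return values `"text"` and `"table"` are assumptions (the reference does not list them), the text field is called `content` as in the Document argument list, and a plain list stands in for a pandas dataframe.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class DocSketch:
    content: Optional[str] = None
    dataframe: Optional[Any] = None  # stand-in for a pandas DataFrame

    @property
    def content_type(self) -> str:
        has_text = self.content is not None
        has_table = self.dataframe is not None
        if has_text == has_table:
            # Either both fields are set or both are missing.
            raise ValueError("Exactly one of content and dataframe must be set.")
        return "text" if has_text else "table"


text_type = DocSketch(content="hello").content_type
table_type = DocSketch(dataframe=[[1, 2], [3, 4]]).content_type

# Both invalid combinations raise ValueError.
errors = []
for doc in (DocSketch(), DocSketch(content="hello", dataframe=[[1]])):
    try:
        doc.content_type
    except ValueError as exc:
        errors.append(str(exc))
```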
Module streaming_chunk
StreamingChunk
@dataclass
class StreamingChunk()
The StreamingChunk class encapsulates a segment of streamed content along with associated metadata.
This structure facilitates the handling and processing of streamed data in a systematic manner.
Arguments:
content
: The content of the message chunk as a string.
meta
: A dictionary containing metadata related to the message chunk.
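A minimal sketch of how chunks might be consumed: a callback receives each chunk as it arrives and the caller joins the contents afterwards. `StreamingChunkSketch`, the `on_chunk` callback, and the `index` metadata key are hypothetical names for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class StreamingChunkSketch:
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


received: List[StreamingChunkSketch] = []


def on_chunk(chunk: StreamingChunkSketch) -> None:
    # Hypothetical callback: collect each chunk as it arrives.
    received.append(chunk)


# Simulate a stream that delivers the text in three pieces.
for index, piece in enumerate(["Hel", "lo, ", "world"]):
    on_chunk(StreamingChunkSketch(content=piece, meta={"index": index}))

# Reassemble the full message from the chunk contents.
full_text = "".join(chunk.content for chunk in received)
```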