Utility functions and classes used across the library.

Module filters

document_matches_filter

def document_matches_filter(filters: Dict[str, Any],
                            document: Document) -> bool

Return whether filters match the Document.

For a detailed specification of the filters, refer to the DocumentStore.filter_documents() protocol documentation.

Module requests_utils

request_with_retry

def request_with_retry(attempts: int = 3,
                       status_codes_to_retry: Optional[List[int]] = None,
                       **kwargs) -> requests.Response

Executes an HTTP request with a configurable exponential backoff retry on failures.

Usage example:

import requests
from haystack.utils import request_with_retry

# Sending an HTTP request with default retry configs
res = request_with_retry(method="GET", url="https://example.com")

# Sending an HTTP request with custom number of attempts
res = request_with_retry(method="GET", url="https://example.com", attempts=10)

# Sending an HTTP request with custom HTTP codes to retry
res = request_with_retry(method="GET", url="https://example.com", status_codes_to_retry=[408, 503])

# Sending an HTTP request with custom timeout in seconds
res = request_with_retry(method="GET", url="https://example.com", timeout=5)

# Sending an HTTP request with custom authorization handling
class CustomAuth(requests.auth.AuthBase):
    def __call__(self, r):
        r.headers["authorization"] = "Basic <my_token_here>"
        return r

res = request_with_retry(method="GET", url="https://example.com", auth=CustomAuth())

# All of the above combined
res = request_with_retry(
    method="GET",
    url="https://example.com",
    auth=CustomAuth(),
    attempts=10,
    status_codes_to_retry=[408, 503],
    timeout=5
)

# Sending a POST request
res = request_with_retry(method="POST", url="https://example.com", data={"key": "value"}, attempts=10)

# Retry all 5xx status codes
res = request_with_retry(method="GET", url="https://example.com", status_codes_to_retry=list(range(500, 600)))

Arguments:

  • attempts: Maximum number of attempts to retry the request.
  • status_codes_to_retry: List of HTTP status codes that trigger a retry. When None, HTTP 408, 418, 429 and 503 are retried.
  • kwargs: Optional keyword arguments that request accepts, such as method, url, timeout and auth.

Returns:

The Response object.
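To make the retry behavior more concrete, the following is a minimal standard-library sketch of retrying with exponential backoff. It is not the code behind request_with_retry: the helper name call_with_backoff, the send callable, the base_delay parameter, and the doubling schedule are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional, Sequence, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    attempts: int = 3,
    status_codes_to_retry: Optional[Sequence[int]] = None,
    base_delay: float = 0.0,
) -> Tuple[int, str]:
    """Call `send` until it returns a non-retryable status or attempts run out.

    `send` is a hypothetical stand-in for an HTTP call; it returns
    a (status_code, body) tuple.
    """
    if status_codes_to_retry is None:
        # Defaults taken from the argument description above.
        status_codes_to_retry = [408, 418, 429, 503]

    result = send()
    for attempt in range(1, attempts):
        if result[0] not in status_codes_to_retry:
            break
        # Assumed schedule: wait base_delay, then 2x, 4x, ... between tries.
        time.sleep(base_delay * 2 ** (attempt - 1))
        result = send()
    # This sketch returns the last result; a real implementation may raise instead.
    return result


# Two retryable responses followed by a success.
responses = iter([(503, "busy"), (429, "slow down"), (200, "ok")])
status, body = call_with_backoff(lambda: next(responses), attempts=5)
```

With base_delay left at 0.0 the sketch retries immediately, which keeps the example fast; a positive value spaces the attempts out.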

Module callable_serialization

serialize_callable

def serialize_callable(callable_handle: Callable) -> str

Serializes a callable to its full path.

Arguments:

  • callable_handle: The callable to serialize.

Returns:

The full path of the callable.

deserialize_callable

def deserialize_callable(callable_handle: str) -> Optional[Callable]

Deserializes a callable given its full import path as a string.

Arguments:

  • callable_handle: The full import path of the callable.

Raises:

  • DeserializationError: If the callable cannot be found.

Returns:

The callable
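The idea of a "full path" round trip can be sketched with the standard library alone. The helpers callable_to_path and path_to_callable below are hypothetical names, not the Haystack functions, and the sketch assumes a module-level callable; methods and nested functions would need extra handling.

```python
import importlib
import json
from typing import Callable


def callable_to_path(func: Callable) -> str:
    """Build a dotted path from the callable's module and qualified name."""
    return f"{func.__module__}.{func.__qualname__}"


def path_to_callable(path: str) -> Callable:
    """Import the module part of the path, then look up the attribute on it."""
    module_name, _, attr_name = path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Round trip using a well-known standard-library function.
path = callable_to_path(json.dumps)
restored = path_to_callable(path)
```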

Module auth

Secret

class Secret(ABC)

Encapsulates a secret used for authentication.

Usage example:

from haystack.components.generators import OpenAIGenerator
from haystack.utils import Secret

generator = OpenAIGenerator(api_key=Secret.from_token("<here_goes_your_token>"))

Secret.from_token

@staticmethod
def from_token(token: str) -> "Secret"

Create a token-based secret. Cannot be serialized.

Arguments:

  • token: The token to use for authentication.

Secret.from_env_var

@staticmethod
def from_env_var(env_vars: Union[str, List[str]],
                 *,
                 strict: bool = True) -> "Secret"

Create an environment variable-based secret. Accepts one or more environment variables. Upon resolution, it returns a string token from the first environment variable that is set.

Arguments:

  • env_vars: A single environment variable or an ordered list of candidate environment variables.
  • strict: Whether to raise an exception if none of the environment variables are set.
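The "first variable that is set" rule and the effect of strict can be illustrated with a small standard-library sketch. The function first_set_env_var and the SKETCH_* variable names are hypothetical, and raising ValueError in strict mode is an assumption; the exception Haystack raises is not stated here.

```python
import os
from typing import List, Optional, Union


def first_set_env_var(
    env_vars: Union[str, List[str]], *, strict: bool = True
) -> Optional[str]:
    """Return the value of the first environment variable that is set."""
    names = [env_vars] if isinstance(env_vars, str) else list(env_vars)
    for name in names:
        value = os.environ.get(name)
        if value is not None:
            return value
    if strict:
        # Assumed error type, for illustration only.
        raise ValueError(f"None of these environment variables are set: {names}")
    return None


# The primary variable is unset, so the fallback is used.
os.environ.pop("SKETCH_PRIMARY_KEY", None)
os.environ["SKETCH_FALLBACK_KEY"] = "token-from-fallback"
token = first_set_env_var(["SKETCH_PRIMARY_KEY", "SKETCH_FALLBACK_KEY"])
```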

Secret.to_dict

def to_dict() -> Dict[str, Any]

Convert the secret to a JSON-serializable dictionary.

Some secrets may not be serializable.

Returns:

The serialized secret.

Secret.from_dict

@staticmethod
def from_dict(dict: Dict[str, Any]) -> "Secret"

Create a secret from a JSON-serializable dictionary.

Arguments:

  • dict: The dictionary with the serialized data.

Returns:

The deserialized secret.

Secret.resolve_value

@abstractmethod
def resolve_value() -> Optional[Any]

Resolve the secret to an atomic value. The semantics of the value are secret-dependent.

Returns:

The value of the secret, if any.

Secret.type

@property
@abstractmethod
def type() -> SecretType

The type of the secret.

deserialize_secrets_inplace

def deserialize_secrets_inplace(data: Dict[str, Any],
                                keys: Iterable[str],
                                *,
                                recursive: bool = False)

Deserialize secrets in a dictionary in place.

Arguments:

  • data: The dictionary with the serialized data.
  • keys: The keys of the secrets to deserialize.
  • recursive: Whether to recursively deserialize nested dictionaries.
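As a rough illustration of what "in place" and recursive mean here, the sketch below replaces the values stored under selected keys of a dictionary, optionally descending into nested dictionaries. The helper decode_keys_inplace, its decode parameter, and the {"env": ...} payload shape are hypothetical and do not reflect Haystack's serialized secret format.

```python
from typing import Any, Callable, Dict, Iterable


def decode_keys_inplace(
    data: Dict[str, Any],
    keys: Iterable[str],
    decode: Callable[[Any], Any],
    *,
    recursive: bool = False,
) -> None:
    """Replace data[key] with decode(data[key]) for each selected key."""
    wanted = set(keys)
    for key, value in data.items():
        if key in wanted and value is not None:
            # Reassigning an existing key does not resize the dict,
            # so it is safe while iterating.
            data[key] = decode(value)
        elif recursive and isinstance(value, dict):
            decode_keys_inplace(value, wanted, decode, recursive=True)


# Hypothetical serialized payloads; the same dict object is mutated.
config = {
    "api_key": {"env": "KEY"},
    "model": "some-model",
    "nested": {"api_key": {"env": "OTHER"}},
}
decode_keys_inplace(
    config, ["api_key"], lambda v: f"<secret {v['env']}>", recursive=True
)
```

The function returns None on purpose: callers keep using the dictionary they passed in.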

Module jupyter

is_in_jupyter

def is_in_jupyter() -> bool

Returns True if in Jupyter or Google Colab, False otherwise.
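One common heuristic for this kind of check is to probe for the get_ipython function that IPython-based shells make available. The sketch below shows that heuristic under the hypothetical name running_in_ipython; whether is_in_jupyter works exactly this way is an assumption, and the heuristic also reports True in a plain IPython terminal.

```python
def running_in_ipython() -> bool:
    """Return True when the IPython-provided get_ipython() is available."""
    try:
        get_ipython()  # noqa: F821 - only defined inside IPython-based shells
        return True
    except NameError:
        # Plain Python interpreters do not define get_ipython.
        return False


in_notebook_like_shell = running_in_ipython()
```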

Module type_serialization

serialize_type

def serialize_type(target: Any) -> str

Serializes a type or an instance to its string representation, including the module name.

This function handles types, instances of types, and special typing objects. It assumes that non-typing objects have a '__name__' attribute and raises an error if a type cannot be serialized.

Arguments:

  • target: The object to serialize, can be an instance or a type.

Raises:

  • ValueError: If the type cannot be serialized.

Returns:

The string representation of the type.

deserialize_type

def deserialize_type(type_str: str) -> Any

Deserializes a type given its full import path as a string, including nested generic types.

This function will dynamically import the module if it's not already imported and then retrieve the type object from it. It also handles nested generic types like typing.List[typing.Dict[int, str]].

Arguments:

  • type_str: The string representation of the type's full import path.

Raises:

  • DeserializationError: If the type cannot be deserialized due to missing module or type.

Returns:

The deserialized type object.
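Nested generic types can be taken apart with typing.get_origin and typing.get_args, which is the kind of traversal that handling a type such as typing.List[typing.Dict[int, str]] requires. The sketch below is a simplified illustration with hypothetical helper names; the strings it produces are not claimed to match the output of serialize_type.

```python
import typing
from typing import Any, Dict, List


def _qualified_name(obj: Any) -> str:
    """Return 'module.name', omitting the module for builtins."""
    module = getattr(obj, "__module__", "builtins")
    name = obj.__name__
    return name if module == "builtins" else f"{module}.{name}"


def render_type(target: Any) -> str:
    """Recursively render a (possibly generic) type as a string."""
    origin = typing.get_origin(target)
    if origin is None:
        # Not a parameterized generic: a plain class such as int or str.
        return _qualified_name(target)
    args = ", ".join(render_type(arg) for arg in typing.get_args(target))
    return f"{_qualified_name(origin)}[{args}]"


rendered = render_type(List[Dict[int, str]])
```

Because typing.get_origin maps typing.List to the builtin list, this sketch prints builtin container names; a serializer that wants to preserve the typing spelling would need additional bookkeeping.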

Module device

DeviceType

class DeviceType(Enum)

Represents device types supported by Haystack. This also includes devices that are not directly used by models - for example, the disk device is exclusively used in device maps for frameworks that support offloading model weights to disk.

DeviceType.from_str

@staticmethod
def from_str(string: str) -> "DeviceType"

Create a device type from a string.

Arguments:

  • string: The string to convert.

Returns:

The device type.

Device

@dataclass
class Device()

A generic representation of a device.

Arguments:

  • type: The device type.
  • id: The optional device id.

Device.__init__

def __init__(type: DeviceType, id: Optional[int] = None)

Create a generic device.

Arguments:

  • type: The device type.
  • id: The device id.

Device.cpu

@staticmethod
def cpu() -> "Device"

Create a generic CPU device.

Returns:

The CPU device.

Device.gpu

@staticmethod
def gpu(id: int = 0) -> "Device"

Create a generic GPU device.

Arguments:

  • id: The GPU id.

Returns:

The GPU device.

Device.disk

@staticmethod
def disk() -> "Device"

Create a generic disk device.

Returns:

The disk device.

Device.mps

@staticmethod
def mps() -> "Device"

Create a generic Apple Metal Performance Shader device.

Returns:

The MPS device.

DeviceMap

@dataclass
class DeviceMap()

A generic mapping from strings to devices. The semantics of the strings depend on the target framework. Primarily used to deploy HuggingFace models to multiple devices.

Arguments:

  • mapping: Dictionary mapping strings to devices.

DeviceMap.to_dict

def to_dict() -> Dict[str, str]

Serialize the mapping to a JSON-serializable dictionary.

Returns:

The serialized mapping.

DeviceMap.first_device

@property
def first_device() -> Optional[Device]

Return the first device in the mapping, if any.

Returns:

The first device.

DeviceMap.from_dict

@staticmethod
def from_dict(dict: Dict[str, str]) -> "DeviceMap"

Create a generic device map from a JSON-serialized dictionary.

Arguments:

  • dict: The serialized mapping.

Returns:

The generic device map.

DeviceMap.from_hf

@staticmethod
def from_hf(
        hf_device_map: Dict[str, Union[int, str,
                                       "torch.device"]]) -> "DeviceMap"

Create a generic device map from a HuggingFace device map.

Arguments:

  • hf_device_map: The HuggingFace device map.

Returns:

The generic device map.

ComponentDevice

@dataclass(frozen=True)
class ComponentDevice()

A representation of a device for a component. This can be either a single device or a device map.

ComponentDevice.from_str

@classmethod
def from_str(cls, device_str: str) -> "ComponentDevice"

Create a component device representation from a device string.

The device string can only represent a single device.

Arguments:

  • device_str: The device string.

Returns:

The component device representation.
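A single-device string is often written in the PyTorch style, for example "cpu" or "cuda:0". The sketch below splits such a string into a type and an optional index. The function parse_device_string is hypothetical, and the exact strings accepted by ComponentDevice.from_str are an assumption rather than something stated above.

```python
from typing import Optional, Tuple


def parse_device_string(device_str: str) -> Tuple[str, Optional[int]]:
    """Split 'type' or 'type:index' into (type, index)."""
    kind, separator, index = device_str.partition(":")
    if not kind:
        raise ValueError(f"Missing device type in {device_str!r}")
    if not separator:
        # No index given, e.g. "cpu".
        return kind, None
    # int() raises ValueError for a malformed index such as "cuda:x".
    return kind, int(index)


gpu = parse_device_string("cuda:0")
cpu = parse_device_string("cpu")
```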

ComponentDevice.from_single

@classmethod
def from_single(cls, device: Device) -> "ComponentDevice"

Create a component device representation from a single device.

Disks cannot be used as single devices.

Arguments:

  • device: The device.

Returns:

The component device representation.

ComponentDevice.from_multiple

@classmethod
def from_multiple(cls, device_map: DeviceMap) -> "ComponentDevice"

Create a component device representation from a device map.

Arguments:

  • device_map: The device map.

Returns:

The component device representation.

ComponentDevice.to_torch

def to_torch() -> "torch.device"

Convert the component device representation to PyTorch format.

Device maps are not supported.

Returns:

The PyTorch device representation.

ComponentDevice.to_torch_str

def to_torch_str() -> str

Convert the component device representation to PyTorch string format.

Device maps are not supported.

Returns:

The PyTorch device string representation.

ComponentDevice.to_spacy

def to_spacy() -> int

Convert the component device representation to spaCy format.

Device maps are not supported.

Returns:

The spaCy device representation.

ComponentDevice.to_hf

def to_hf() -> Union[Union[int, str], Dict[str, Union[int, str]]]

Convert the component device representation to HuggingFace format.

Returns:

The HuggingFace device representation.

ComponentDevice.update_hf_kwargs

def update_hf_kwargs(hf_kwargs: Dict[str, Any], *,
                     overwrite: bool) -> Dict[str, Any]

Convert the component device representation to HuggingFace format and add it as canonical keyword arguments to the keyword arguments dictionary.

Arguments:

  • hf_kwargs: The HuggingFace keyword arguments dictionary.
  • overwrite: Whether to overwrite existing device arguments.

Returns:

The HuggingFace keyword arguments dictionary.

ComponentDevice.has_multiple_devices

@property
def has_multiple_devices() -> bool

Whether this component device representation contains multiple devices.

ComponentDevice.first_device

@property
def first_device() -> Optional["ComponentDevice"]

Return either the single device or the first device in the device map, if any.

Returns:

The first device.

ComponentDevice.resolve_device

@staticmethod
def resolve_device(
        device: Optional["ComponentDevice"] = None) -> "ComponentDevice"

Select a device for a component. If a device is specified, it's used. Otherwise, the default device is used.

Arguments:

  • device: The provided device, if any.

Returns:

The resolved device.

ComponentDevice.to_dict

def to_dict() -> Dict[str, Any]

Convert the component device representation to a JSON-serializable dictionary.

Returns:

The dictionary representation.

ComponentDevice.from_dict

@classmethod
def from_dict(cls, dict: Dict[str, Any]) -> "ComponentDevice"

Create a component device representation from a JSON-serialized dictionary.

Arguments:

  • dict: The serialized representation.

Returns:

The deserialized component device.