Skip to main content
Version: 2.30

FileContent

FileContent represents a file payload that can be attached to a ChatMessage. Use it when a chat model accepts file inputs, such as PDFs or other documents, together with the user's text prompt.

If you need the full list of parameters and methods, see the FileContent API reference.

Attributes

python
@dataclass
class FileContent:
base64_data: str
mime_type: str | None = None
filename: str | None = None
extra: dict[str, Any] = field(default_factory=dict)
validation: bool = True
  • base64_data stores the file content as a base64-encoded string.
  • mime_type identifies the file type, for example application/pdf. Providing it explicitly is recommended because many model providers require it.
  • filename is optional, but some providers use it when processing uploaded files.
  • extra can store provider-specific metadata. Values should be JSON serializable.
  • validation checks that base64_data is valid and tries to infer the MIME type when one is not provided.

Create from a file path

Use from_file_path to read a local file, base64-encode it, infer the MIME type from the path, and populate the filename.

python
from haystack.dataclasses import ChatMessage, FileContent

file_content = FileContent.from_file_path("data/attention-is-all-you-need.pdf")

message = ChatMessage.from_user(
content_parts=[
file_content,
"Summarize the key ideas in this paper.",
]
)

Pass filename or extra when a provider expects a specific filename or provider-specific options:

python
file_content = FileContent.from_file_path(
"data/report.pdf",
filename="quarterly-report.pdf",
extra={"source": "finance"},
)

Create from a URL

Use from_url to download a file and convert it into a FileContent instance.

python
from haystack.dataclasses import FileContent

file_content = FileContent.from_url(
"https://example.com/reports/quarterly-report.pdf",
timeout=30,
)

If no filename is provided, Haystack uses the final path segment of the URL.

Create from base64 data

If you already have file bytes, encode them and pass the MIME type explicitly.

python
import base64
from pathlib import Path

from haystack.dataclasses import FileContent

data = Path("data/manual.pdf").read_bytes()
file_content = FileContent(
base64_data=base64.b64encode(data).decode("utf-8"),
mime_type="application/pdf",
filename="manual.pdf",
)

Set validation=False only when the base64 data and MIME type are already trusted and you want to skip validation.

Inspect files in a ChatMessage

After adding FileContent to a ChatMessage, use the file and files properties to access file payloads.

python
from haystack.dataclasses import ChatMessage, FileContent

file_content = FileContent.from_file_path("data/invoice.pdf")
message = ChatMessage.from_user(content_parts=[file_content, "Extract the invoice total."])

print(message.file)
print(message.files)

message.file returns the first file payload, or None if there are no files. message.files returns all file payloads.

Serialization

Use to_dict and from_dict to serialize and restore file content.

python
payload = file_content.to_dict()
restored = FileContent.from_dict(payload)

For tracing, Haystack replaces the full base64 payload with a placeholder so large files are not sent to the tracing backend.