AmazonBedrockChatGenerator

This component enables chat completion using models served through the Amazon Bedrock service.

Most common position in a pipeline: After a ChatPromptBuilder

Mandatory init variables:

  • model: The model to use
  • aws_access_key_id: AWS access key ID. Can be set with the AWS_ACCESS_KEY_ID env var.
  • aws_secret_access_key: AWS secret access key. Can be set with the AWS_SECRET_ACCESS_KEY env var.
  • aws_region_name: AWS region name. Can be set with the AWS_DEFAULT_REGION env var.

Mandatory run variables:

  • messages: A list of ChatMessage instances

Output variables:

  • replies: A list of ChatMessage objects
  • meta: A list of dictionaries with the metadata associated with each reply, such as token count, finish reason, and so on

API reference: Amazon Bedrock
GitHub link: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/amazon_bedrock

Amazon Bedrock is a fully managed service that makes high-performing foundation models from leading AI startups and Amazon available through a unified API. You can choose from various foundation models to find the one best suited for your use case.

AmazonBedrockChatGenerator enables chat completion using chat models from Amazon, Anthropic, Cohere, Meta, Mistral, and more with a single component.

Overview

This component uses AWS for authentication. You can use the AWS CLI to authenticate through IAM. For more information on setting up an IAM identity-based policy, see the official documentation.

Using AWS CLI

Consider using the AWS CLI as a more straightforward tool to manage your AWS services. With the AWS CLI, you can quickly configure your boto3 credentials, so you won't need to provide detailed authentication parameters when initializing the Amazon Bedrock generator in Haystack.

To use this component for chat completion, initialize an AmazonBedrockChatGenerator with the model name and AWS credentials. The credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION) can be set as environment variables, configured as described above, or passed as Secret arguments. Make sure the region you set supports Amazon Bedrock.
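
For example, here is a minimal initialization sketch that passes the credentials explicitly as Secret arguments instead of relying on implicit environment variable resolution; the model ID is only an illustration:

python
from haystack.utils import Secret
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

# Secrets are resolved from the named environment variables at runtime
generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-3-5-sonnet-20240620-v1:0",
    aws_access_key_id=Secret.from_env_var("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=Secret.from_env_var("AWS_SECRET_ACCESS_KEY"),
    aws_region_name=Secret.from_env_var("AWS_DEFAULT_REGION"),
)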

Tool Support

AmazonBedrockChatGenerator supports function calling through the tools parameter, which accepts flexible tool configurations:

  • A list of Tool objects: Pass individual tools as a list
  • A single Toolset: Pass an entire Toolset directly
  • Mixed Tools and Toolsets: Combine multiple Toolsets with standalone tools in a single list

This allows you to organize related tools into logical groups while also including standalone tools as needed.

python
from haystack.tools import Tool, Toolset
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

# Create individual tools
weather_tool = Tool(name="weather", description="Get weather info", ...)
news_tool = Tool(name="news", description="Get latest news", ...)

# Group related tools into a toolset
# (add_tool, subtract_tool, and multiply_tool are Tool objects defined like the ones above)
math_toolset = Toolset([add_tool, subtract_tool, multiply_tool])

# Pass mixed tools and toolsets to the generator
generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-3-5-sonnet-20240620-v1:0",
    tools=[math_toolset, weather_tool, news_tool],  # Mix of Toolset and Tool objects
)

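The Tool objects above elide their parameters and function arguments. For reference, a complete definition might look like this sketch, where get_weather is a hypothetical stand-in function:

python
from haystack.tools import Tool

def get_weather(city: str) -> str:
    # Hypothetical function used only for illustration
    return f"Sunny in {city}"

weather_tool = Tool(
    name="weather",
    description="Get weather info",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
    function=get_weather,
)
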
For more details on working with tools, see the Tool and Toolset documentation.

Streaming

This generator supports streaming tokens from the LLM directly in the output. To enable streaming, pass a callable to the streaming_callback init parameter.
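
For instance, you can use Haystack's built-in print_streaming_chunk callback to print tokens to stdout as they arrive; the model ID here is just an illustration:

python
from haystack.components.generators.utils import print_streaming_chunk
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-3-5-sonnet-20240620-v1:0",
    streaming_callback=print_streaming_chunk,  # invoked once per streamed chunk
)

generator.run(messages=[ChatMessage.from_user("Summarize NLP in one sentence.")])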

Prompt Caching

AmazonBedrockChatGenerator supports prompt caching to reduce inference latency and input token costs.

Prompt caching on Bedrock is available for selected models. It allows you to define cache points within a request, as long as the input meets a model-specific minimum token threshold.

Each request can contain up to four cache points.

Caching messages

This generator allows you to control cache points at the ChatMessage level via the meta field.

For example, to cache a long user message to be reused across multiple requests:

python
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

msg = ChatMessage.from_user(
    "long message...",
    meta={"cachePoint": {"type": "default", "ttl": "5m"}},
)

generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-sonnet-4-5-20250929-v1:0"
)

result = generator.run(messages=[msg])

If the cache point is successfully written, the number of cached input tokens is available at:

python
result["replies"][0].meta["usage"]["cache_write_input_tokens"]

Caching tools

You can also cache tool definitions using the tools_cachepoint_config initialization parameter. When provided, all tools sent to the model are cached, provided the selected model supports prompt caching and the tool definitions meet the minimum token threshold.

python
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

# define or load your tools

generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-sonnet-4-5-20250929-v1:0",
    tools=my_tools,
    tools_cachepoint_config={"type": "default", "ttl": "5m"},
)

# send a request to the Language Model

For more details on how prompt caching works in Amazon Bedrock, see the official documentation.

Usage

To start using Amazon Bedrock with Haystack, install the amazon-bedrock-haystack package:

shell
pip install amazon-bedrock-haystack

On its own

Basic usage:

python
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator
from haystack.dataclasses import ChatMessage

generator = AmazonBedrockChatGenerator(model="meta.llama2-70b-chat-v1")
messages = [
    ChatMessage.from_system("You are a helpful assistant that answers questions in Spanish only"),
    ChatMessage.from_user("What's Natural Language Processing? Be brief."),
]

response = generator.run(messages)
print(response)

With multimodal inputs:

python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

llm = AmazonBedrockChatGenerator(model="anthropic.claude-3-5-sonnet-20240620-v1:0")

image = ImageContent.from_file_path("apple.jpg")
user_message = ChatMessage.from_user(content_parts=[
    "What does the image show? Max 5 words.",
    image,
])

response = llm.run([user_message])["replies"][0].text
print(response)

# Red apple on straw mat.

In a pipeline

In a RAG pipeline:

python
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

pipe = Pipeline()
pipe.add_component("prompt_builder", ChatPromptBuilder())
pipe.add_component("llm", AmazonBedrockChatGenerator(model="meta.llama2-70b-chat-v1"))
pipe.connect("prompt_builder", "llm")

country = "Germany"
system_message = ChatMessage.from_system("You are an assistant giving out valuable information to language learners.")
messages = [system_message, ChatMessage.from_user("What's the official language of {{ country }}?")]

res = pipe.run(data={"prompt_builder": {"template_variables": {"country": country}, "template": messages}})
print(res)