AmazonBedrockChatGenerator

This component enables chat completion using models served through the Amazon Bedrock service.

Most common position in a pipeline: After a ChatPromptBuilder

Mandatory init variables:

  • model: The model to use
  • aws_access_key_id: AWS access key ID. Can be set with the AWS_ACCESS_KEY_ID env var.
  • aws_secret_access_key: AWS secret access key. Can be set with the AWS_SECRET_ACCESS_KEY env var.
  • aws_region_name: AWS region name. Can be set with the AWS_DEFAULT_REGION env var.

Mandatory run variables:

  • messages: A list of ChatMessage instances

Output variables:

  • replies: A list of ChatMessage objects
  • meta: A list of dictionaries with the metadata associated with each reply, such as token count, finish reason, and so on

API reference: Amazon Bedrock
GitHub link: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/amazon_bedrock

Amazon Bedrock is a fully managed service that makes high-performing foundation models from leading AI startups and Amazon available through a unified API. You can choose from various foundation models to find the one best suited for your use case.

AmazonBedrockChatGenerator enables chat completion using chat models from Amazon, Anthropic, Cohere, Meta, Mistral, and more with a single component.

Overview

This component uses AWS for authentication. You can use the AWS CLI to authenticate through IAM. For more information on setting up an IAM identity-based policy, see the official documentation.

Using AWS CLI

Consider using the AWS CLI as a more straightforward tool to manage your AWS services. With the AWS CLI, you can quickly configure your boto3 credentials, so you won't need to provide detailed authentication parameters when initializing the Amazon Bedrock generator in Haystack.

To use this component for chat completion, initialize an AmazonBedrockChatGenerator with the model name and AWS credentials. The credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION) can be set as environment variables, configured as described above, or passed as Secret arguments. Make sure the region you set supports Amazon Bedrock.
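
For example, here is a minimal initialization sketch that passes the credentials explicitly as Secret arguments instead of relying on implicit environment variable resolution; the model ID is only an illustration:

python
from haystack.utils import Secret
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

# Secrets are resolved from the named environment variables at runtime
generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-3-5-sonnet-20240620-v1:0",
    aws_access_key_id=Secret.from_env_var("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=Secret.from_env_var("AWS_SECRET_ACCESS_KEY"),
    aws_region_name=Secret.from_env_var("AWS_DEFAULT_REGION"),
)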

Tool Support

AmazonBedrockChatGenerator supports function calling through the tools parameter, which accepts flexible tool configurations:

  • A list of Tool objects: Pass individual tools as a list
  • A single Toolset: Pass an entire Toolset directly
  • Mixed Tools and Toolsets: Combine multiple Toolsets with standalone tools in a single list

This allows you to organize related tools into logical groups while also including standalone tools as needed.

python
from haystack.tools import Tool, Toolset
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

# Create individual tools
weather_tool = Tool(name="weather", description="Get weather info", ...)
news_tool = Tool(name="news", description="Get latest news", ...)

# Group related tools into a toolset
# (add_tool, subtract_tool, and multiply_tool are Tool objects defined like the ones above)
math_toolset = Toolset([add_tool, subtract_tool, multiply_tool])

# Pass mixed tools and toolsets to the generator
generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-3-5-sonnet-20240620-v1:0",
    tools=[math_toolset, weather_tool, news_tool],  # Mix of Toolset and Tool objects
)

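The Tool objects above elide their parameters and function arguments. For reference, a complete definition might look like this sketch, where get_weather is a hypothetical stand-in function:

python
from haystack.tools import Tool

def get_weather(city: str) -> str:
    # Hypothetical function used only for illustration
    return f"Sunny in {city}"

weather_tool = Tool(
    name="weather",
    description="Get weather info",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
    function=get_weather,
)
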
For more details on working with tools, see the Tool and Toolset documentation.

Streaming

This generator supports streaming tokens from the LLM directly in the output. To enable streaming, pass a callable to the streaming_callback init parameter.
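
For instance, you can use Haystack's built-in print_streaming_chunk callback to print tokens to stdout as they arrive; the model ID here is just an illustration:

python
from haystack.components.generators.utils import print_streaming_chunk
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-3-5-sonnet-20240620-v1:0",
    streaming_callback=print_streaming_chunk,  # invoked once per streamed chunk
)

generator.run(messages=[ChatMessage.from_user("Summarize NLP in one sentence.")])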

Prompt Caching

AmazonBedrockChatGenerator supports prompt caching to reduce inference latency and input token costs.

Prompt caching on Bedrock is available for selected models. It allows you to define cache points within a request, as long as the input meets a model-specific minimum token threshold.

Each request can contain up to four cache points.

Caching messages

This generator allows you to control cache points at the ChatMessage level via the meta field.

For example, to cache a long user message to be reused across multiple requests:

python
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

msg = ChatMessage.from_user(
    "long message...",
    meta={"cachePoint": {"type": "default", "ttl": "5m"}},
)

generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-sonnet-4-5-20250929-v1:0"
)

result = generator.run(messages=[msg])

If the cache point is successfully written, the number of cached input tokens is available at:

python
result["replies"][0].meta["usage"]["cache_write_input_tokens"]

Caching tools

You can also cache tool definitions using the tools_cachepoint_config initialization parameter. When provided, all tools sent to the model are cached, provided the selected model supports prompt caching and the tool definitions meet the minimum token threshold.

python
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

# define or load your tools

generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-sonnet-4-5-20250929-v1:0",
    tools=my_tools,
    tools_cachepoint_config={"type": "default", "ttl": "5m"},
)

# send a request to the Language Model

For more details on how prompt caching works in Amazon Bedrock, see the official documentation.

Usage

To start using Amazon Bedrock with Haystack, install the amazon-bedrock-haystack package:

shell
pip install amazon-bedrock-haystack

On its own

Basic usage:

python
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator
from haystack.dataclasses import ChatMessage

generator = AmazonBedrockChatGenerator(model="meta.llama2-70b-chat-v1")
messages = [
    ChatMessage.from_system("You are a helpful assistant that answers questions in Spanish only"),
    ChatMessage.from_user("What's Natural Language Processing? Be brief."),
]

response = generator.run(messages)
print(response)

With multimodal inputs:

python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

llm = AmazonBedrockChatGenerator(model="anthropic.claude-3-5-sonnet-20240620-v1:0")

image = ImageContent.from_file_path("apple.jpg")
user_message = ChatMessage.from_user(content_parts=[
    "What does the image show? Max 5 words.",
    image,
])

response = llm.run([user_message])["replies"][0].text
print(response)

# Red apple on straw mat.

In a pipeline

In a RAG pipeline:

python
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

pipe = Pipeline()
pipe.add_component("prompt_builder", ChatPromptBuilder())
pipe.add_component("llm", AmazonBedrockChatGenerator(model="meta.llama2-70b-chat-v1"))
pipe.connect("prompt_builder", "llm")

country = "Germany"
system_message = ChatMessage.from_system("You are an assistant giving out valuable information to language learners.")
messages = [system_message, ChatMessage.from_user("What's the official language of {{ country }}?")]

res = pipe.run(data={"prompt_builder": {"template_variables": {"country": country}, "template": messages}})
print(res)