AmazonBedrockChatGenerator
This component enables chat completion using models available through the Amazon Bedrock service.
| Most common position in a pipeline | After a ChatPromptBuilder |
| Mandatory init variables | model: The model to use<br>aws_access_key_id: AWS access key ID. Can be set with the AWS_ACCESS_KEY_ID env var.<br>aws_secret_access_key: AWS secret access key. Can be set with the AWS_SECRET_ACCESS_KEY env var.<br>aws_region_name: AWS region name. Can be set with the AWS_DEFAULT_REGION env var. |
| Mandatory run variables | messages: A list of ChatMessage instances |
| Output variables | replies: A list of ChatMessage objects<br>meta: A list of dictionaries with the metadata associated with each reply, such as token count, finish reason, and so on |
| API reference | Amazon Bedrock |
| GitHub link | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/amazon_bedrock |
Amazon Bedrock is a fully managed service that makes high-performing foundation models from leading AI startups and Amazon available through a unified API. You can choose from various foundation models to find the one best suited for your use case.
AmazonBedrockChatGenerator enables chat completion using chat models from Amazon, Anthropic, Cohere, Meta, Mistral, and more with a single component.
Overview
This component uses AWS for authentication. You can use the AWS CLI to authenticate through your IAM identity. For more information on setting up an IAM identity-based policy, see the official documentation.
Consider using the AWS CLI as a more straightforward way to manage your AWS services. With the AWS CLI, you can quickly configure your boto3 credentials, so you won't need to provide detailed authentication parameters when initializing the Amazon Bedrock generator in Haystack.
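For example, a one-time aws configure run stores credentials locally that boto3 picks up automatically:
aws configure
# Prompts for: AWS Access Key ID, AWS Secret Access Key,
# Default region name, and output format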
To use this component, initialize an AmazonBedrockChatGenerator with the model name and AWS credentials. The credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION) can be set as environment variables, configured as described above, or passed as Secret arguments. Note that the region you set must support Amazon Bedrock.
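For example, assuming the credentials are exported as environment variables, you can pass them explicitly as Secret arguments:
from haystack.utils import Secret
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-3-5-sonnet-20240620-v1:0",
    aws_access_key_id=Secret.from_env_var("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=Secret.from_env_var("AWS_SECRET_ACCESS_KEY"),
    aws_region_name=Secret.from_env_var("AWS_DEFAULT_REGION"),
)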
Tool Support
AmazonBedrockChatGenerator supports function calling through the tools parameter, which accepts flexible tool configurations:
- A list of Tool objects: Pass individual tools as a list
- A single Toolset: Pass an entire Toolset directly
- Mixed Tools and Toolsets: Combine multiple Toolsets with standalone tools in a single list
This allows you to organize related tools into logical groups while also including standalone tools as needed.
from haystack.tools import Tool, Toolset
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

# Create individual tools (remaining Tool arguments omitted for brevity)
weather_tool = Tool(name="weather", description="Get weather info", ...)
news_tool = Tool(name="news", description="Get latest news", ...)

# Group related tools into a toolset (add_tool, subtract_tool, and
# multiply_tool are assumed to be defined elsewhere)
math_toolset = Toolset([add_tool, subtract_tool, multiply_tool])

# Pass mixed tools and toolsets to the generator
generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-3-5-sonnet-20240620-v1:0",
    tools=[math_toolset, weather_tool, news_tool],  # mix of Toolset and Tool objects
)
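Once initialized, you can run the generator with a user message and inspect any tool calls the model requests. A minimal sketch, assuming the tools above are fully defined:
from haystack.dataclasses import ChatMessage

result = generator.run(messages=[ChatMessage.from_user("What is 7 + 5?")])
reply = result["replies"][0]

# If the model decided to call a tool, the requested calls are available on
# the reply; otherwise reply.text holds a regular answer.
print(reply.tool_calls)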
For more details on working with tools, see the Tool and Toolset documentation.
Streaming
This generator supports streaming tokens from the LLM directly in the output. To enable it, pass a callable to the streaming_callback init parameter.
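For example, you can use Haystack's built-in print_streaming_chunk utility to print tokens to standard output as they arrive:
from haystack.components.generators.utils import print_streaming_chunk
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-3-5-sonnet-20240620-v1:0",
    streaming_callback=print_streaming_chunk,  # called once per streamed chunk
)
generator.run(messages=[ChatMessage.from_user("Write a haiku about rivers.")])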
Prompt Caching
AmazonBedrockChatGenerator supports prompt caching to reduce inference response latency and input token costs.
Prompt caching on Bedrock is available for selected models. It allows you to define cache points within a request, as long as the input meets a model-specific minimum token threshold.
Each request can contain up to four cache points.
Caching messages
This generator allows you to control cache points at the ChatMessage level via the meta field.
For example, to cache a long user message to be reused across multiple requests:
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

msg = ChatMessage.from_user(
    "long message...",
    meta={"cachePoint": {"type": "default", "ttl": "5m"}},
)

generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-sonnet-4-5-20250929-v1:0"
)
result = generator.run(messages=[msg])
If the cache point is successfully written, the number of cached input tokens is reported in the reply's usage metadata.
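A quick way to check is to inspect the usage entry in the reply metadata; the exact key names for cache token counts depend on the model and integration version, so treat this as a sketch:
usage = result["replies"][0].meta.get("usage", {})
print(usage)  # look for cache-related token counts alongside prompt/completion tokens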
Caching tools
You can also cache tool definitions using the tools_cachepoint_config initialization parameter. When provided, all tools sent to the model are cached, as long as they exceed the minimum token threshold and the selected model supports prompt caching.
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

# define or load your tools
generator = AmazonBedrockChatGenerator(
    model="anthropic.claude-sonnet-4-5-20250929-v1:0",
    tools=my_tools,  # my_tools: your list of Tool objects or a Toolset
    tools_cachepoint_config={"type": "default", "ttl": "5m"}
)

# send a request to the Language Model
result = generator.run(messages=[ChatMessage.from_user("What's the weather in Berlin?")])
For more details on how prompt caching works in Amazon Bedrock, see the official documentation.
Usage
To start using Amazon Bedrock with Haystack, install the amazon-bedrock-haystack package:
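pip install amazon-bedrock-haystack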
On its own
Basic usage:
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator
from haystack.dataclasses import ChatMessage
generator = AmazonBedrockChatGenerator(model="meta.llama2-70b-chat-v1")
messages = [
    ChatMessage.from_system("You are a helpful assistant that answers questions in Spanish only."),
    ChatMessage.from_user("What's Natural Language Processing? Be brief."),
]
response = generator.run(messages)
print(response)
With multimodal inputs:
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator
llm = AmazonBedrockChatGenerator(model="anthropic.claude-3-5-sonnet-20240620-v1:0")
image = ImageContent.from_file_path("apple.jpg")
user_message = ChatMessage.from_user(content_parts=[
"What does the image show? Max 5 words.",
image
])
response = llm.run([user_message])["replies"][0].text
print(response)
# Red apple on straw mat.
In a pipeline
In a pipeline with a ChatPromptBuilder:
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator
pipe = Pipeline()
pipe.add_component("prompt_builder", ChatPromptBuilder())
pipe.add_component("llm", AmazonBedrockChatGenerator(model="meta.llama2-70b-chat-v1"))
pipe.connect("prompt_builder", "llm")
country = "Germany"
system_message = ChatMessage.from_system(
    "You are an assistant giving out valuable information to language learners."
)
messages = [system_message, ChatMessage.from_user("What's the official language of {{ country }}?")]

res = pipe.run(
    data={
        "prompt_builder": {
            "template_variables": {"country": country},
            "template": messages,
        }
    }
)
print(res)