AzureOpenAIChatGenerator
This component enables chat completion using OpenAI's large language models (LLMs) through Azure services.
Most common position in a pipeline | After a ChatPromptBuilder |
Mandatory init variables | "api_key": The Azure OpenAI API key. Can be set with AZURE_OPENAI_API_KEY env var. "azure_ad_token": Microsoft Entra ID token. Can be set with AZURE_OPENAI_AD_TOKEN env var. |
Mandatory run variables | "messages": A list of ChatMessage objects representing the chat |
Output variables | "replies": A list of alternative replies of the LLM to the input chat |
API reference | Generators |
GitHub link | https://github.com/deepset-ai/haystack/blob/main/haystack/components/generators/chat/azure.py |
Overview
AzureOpenAIChatGenerator supports OpenAI models deployed through Azure services. To see the list of supported models, head over to Azure documentation. The default model used with the component is gpt-4o-mini.
To work with Azure components, you will need an Azure OpenAI API key, as well as an Azure OpenAI Endpoint. You can learn more about them in Azure documentation.
The component uses the AZURE_OPENAI_API_KEY and AZURE_OPENAI_AD_TOKEN environment variables by default. Otherwise, you can pass api_key and azure_ad_token at initialization:
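from haystack.components.generators.chat import AzureOpenAIChatGenerator
from haystack.utils import Secret
client = AzureOpenAIChatGenerator(azure_endpoint="<Your Azure endpoint, e.g. https://your-company.azure.openai.com/>",
                                  api_key=Secret.from_token("<your-api-key>"),
                                  azure_deployment="<a model name>")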
We recommend using environment variables instead of initialization parameters.
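As a minimal sketch of the environment-variable approach, the stdlib-only helper below checks which of the two credentials is configured before the component is created without arguments. The helper name find_azure_credential is hypothetical and not part of Haystack; only the two variable names come from this page.

```python
import os

# Environment variable names documented for AzureOpenAIChatGenerator.
CREDENTIAL_ENV_VARS = ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_AD_TOKEN")


def find_azure_credential(env=None):
    """Return the name of the first credential variable that is set, or None.

    Hypothetical helper for illustration; it only inspects the environment
    and never returns the secret value itself.
    """
    env = os.environ if env is None else env
    for name in CREDENTIAL_ENV_VARS:
        if env.get(name):  # empty strings count as "not set"
            return name
    return None


if __name__ == "__main__":
    found = find_azure_credential()
    if found is None:
        print("Set AZURE_OPENAI_API_KEY or AZURE_OPENAI_AD_TOKEN first.")
    else:
        print(f"Using credential from {found}")
```

Running a check like this at startup can surface a missing credential before the first request is sent.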
Then, the component needs a list of ChatMessage objects to operate. ChatMessage is a data class that contains a message, a role (who generated the message, such as user, assistant, system, function), and optional metadata. See the usage section for an example.
You can pass any chat completion parameters that are valid for the openai.ChatCompletion.create method directly to AzureOpenAIChatGenerator using the generation_kwargs parameter, both at initialization and in the run() method. For more details on the supported parameters, refer to the Azure documentation.
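The sketch below illustrates how parameters given in both places might combine. It assumes that values passed to run() take precedence over those set at initialization, which this page does not state. The function merge_generation_kwargs is hypothetical and works on plain dictionaries rather than on the component itself.

```python
def merge_generation_kwargs(init_kwargs=None, run_kwargs=None):
    """Combine init-time and run-time generation parameters.

    Assumption: run-time values override init-time values for the same key.
    Neither input dictionary is modified.
    """
    return {**(init_kwargs or {}), **(run_kwargs or {})}


# Parameters as they might be given at initialization ...
init_kwargs = {"temperature": 0.2, "max_tokens": 256}
# ... and a per-call override as it might be passed to run().
run_kwargs = {"temperature": 0.9}

effective = merge_generation_kwargs(init_kwargs, run_kwargs)
print(effective)  # {'temperature': 0.9, 'max_tokens': 256}
```

Under that assumption, defaults belong at initialization and per-request adjustments in the run() call.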
You can also specify a model for this component through the azure_deployment init parameter.
Streaming
You can stream output as it's generated. Pass a callback to streaming_callback. Use the built-in print_streaming_chunk to print text tokens and tool events (tool calls and tool results).
from haystack.components.generators.utils import print_streaming_chunk
# Configure any `Generator` or `ChatGenerator` with a streaming callback
component = SomeGeneratorOrChatGenerator(streaming_callback=print_streaming_chunk)
# If this is a `ChatGenerator`, pass a list of messages:
# from haystack.dataclasses import ChatMessage
# component.run([ChatMessage.from_user("Your question here")])
# If this is a (non-chat) `Generator`, pass a prompt:
# component.run(prompt="Your prompt here")
Streaming works only with a single response. If a provider supports multiple candidates, set n=1.
See our Streaming Support docs to learn more about how StreamingChunk works and how to write a custom callback.
Prefer print_streaming_chunk by default. Write a custom callback only if you need a specific transport (for example, SSE/WebSocket) or custom UI formatting.
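If a custom callback is needed, it could look roughly like the sketch below, which accumulates streamed text in memory instead of printing it. It assumes each chunk exposes its text through a content attribute, as in the lambda-based streaming example on this page. ChunkCollector is a hypothetical name, and the stand-in chunks replace a real model call so the sketch runs on its own.

```python
from types import SimpleNamespace


class ChunkCollector:
    """Callable that accumulates streamed text instead of printing it.

    Assumption: each chunk exposes its text as a `content` attribute.
    """

    def __init__(self):
        self.parts = []

    def __call__(self, chunk):
        # Chunks without text (for example, tool events) are skipped.
        content = getattr(chunk, "content", None)
        if content:
            self.parts.append(content)

    @property
    def text(self):
        """All text received so far, joined in arrival order."""
        return "".join(self.parts)


# Stand-in chunks so the sketch runs without calling a model.
collector = ChunkCollector()
for piece in ("Natural ", "", "Language ", "Processing"):
    collector(SimpleNamespace(content=piece))

print(collector.text)  # Natural Language Processing
```

An instance should then be usable as streaming_callback=collector when creating the generator. The same shape could forward each piece to an SSE or WebSocket connection instead of a list.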
Usage
On its own
Basic usage:
from haystack.dataclasses import ChatMessage
from haystack.components.generators.chat import AzureOpenAIChatGenerator
client = AzureOpenAIChatGenerator()
response = client.run(
    [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
)
print(response)
With streaming:
from haystack.dataclasses import ChatMessage
from haystack.components.generators.chat import AzureOpenAIChatGenerator
client = AzureOpenAIChatGenerator(streaming_callback=lambda chunk: print(chunk.content, end="", flush=True))
response = client.run(
    [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
)
print(response)
In a pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import AzureOpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack import Pipeline
# no parameter init, we don't use any runtime template variables
prompt_builder = ChatPromptBuilder()
llm = AzureOpenAIChatGenerator()
pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.connect("prompt_builder.prompt", "llm.messages")
location = "Berlin"
messages = [ChatMessage.from_system("Always respond in German even if some input data is in other languages."),
            ChatMessage.from_user("Tell me about {{location}}")]
pipe.run(data={"prompt_builder": {"template_variables": {"location": location}, "template": messages}})
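To read the answer out of the pipeline result, a small helper along these lines may be convenient. It assumes the result is a dictionary keyed by component name, with the generator's output under replies as listed in the table at the top of this page, and that each reply exposes its text as a text attribute (the attribute name may differ between Haystack versions). first_reply_text is a hypothetical name, and the stand-in result below replaces a real model call.

```python
from types import SimpleNamespace


def first_reply_text(result, component="llm"):
    """Return the text of the first reply produced by `component`, or None."""
    replies = result.get(component, {}).get("replies", [])
    if not replies:
        return None
    # Assumption: replies carry their text in a `text` attribute.
    return getattr(replies[0], "text", None)


# Stand-in for the dictionary returned by pipe.run(...).
fake_result = {
    "llm": {"replies": [SimpleNamespace(text="Berlin ist die Hauptstadt Deutschlands.")]}
}

print(first_reply_text(fake_result))
```

With a real pipeline, the return value of pipe.run(...) would take the place of fake_result.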