NvidiaChatGenerator
This Generator enables chat completion using NVIDIA-hosted models.
| | |
| --- | --- |
| Most common position in a pipeline | After a ChatPromptBuilder |
| Mandatory init variables | "api_key": API key for the NVIDIA NIM. Can be set with the `NVIDIA_API_KEY` env var. |
| Mandatory run variables | "messages": A list of ChatMessage objects |
| Output variables | "replies": A list of ChatMessage objects |
| API reference | NVIDIA API |
| GitHub link | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/nvidia |
Overview
`NvidiaChatGenerator` enables chat completions using NVIDIA's generative models via the NVIDIA API. It is compatible with the ChatMessage format for both input and output, ensuring seamless integration in chat-based pipelines.

You can use LLMs self-hosted with NVIDIA NIM or models hosted on the NVIDIA API catalog. The default model for this component is `meta/llama-3.1-8b-instruct`.
To use this integration, you must have an NVIDIA API key. You can provide it with the `NVIDIA_API_KEY` environment variable or by using a Secret.
This generator supports streaming responses from the LLM. To enable streaming, pass a callable to the `streaming_callback` parameter during initialization.
Usage
To start using `NvidiaChatGenerator`, first install the `nvidia-haystack` package:

```shell
pip install nvidia-haystack
```
You can use the `NvidiaChatGenerator` with all the LLMs available in the NVIDIA API catalog or with a model deployed with NVIDIA NIM. Follow the NVIDIA NIM for LLMs Playbook to learn how to deploy your desired model on your infrastructure.
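For a self-hosted NIM deployment, point the component at your own endpoint via `api_url`. This is a sketch: the URL below is a placeholder for wherever your NIM is running, not a real address.

```python
from haystack.utils import Secret
from haystack_integrations.components.generators.nvidia import NvidiaChatGenerator

# Point the generator at a self-hosted NIM endpoint.
# "http://localhost:8000/v1" is a placeholder for your own deployment URL.
generator = NvidiaChatGenerator(
    model="meta/llama-3.1-8b-instruct",
    api_url="http://localhost:8000/v1",
    api_key=Secret.from_env_var("NVIDIA_API_KEY"),
)
```

Aside from the `api_url`, the component is used exactly as with the hosted API catalog.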
On its own
To use LLMs from the NVIDIA API catalog, specify your API key and, if needed, the correct `api_url` (the default is `https://integrate.api.nvidia.com/v1`). You can get your API key directly from the catalog website.
```python
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret
from haystack_integrations.components.generators.nvidia import NvidiaChatGenerator

generator = NvidiaChatGenerator(
    model="meta/llama-3.1-8b-instruct",  # or any supported NVIDIA model
    api_key=Secret.from_env_var("NVIDIA_API_KEY")
)

messages = [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
result = generator.run(messages)
print(result["replies"])
print(result["replies"][0].meta)
```
In a Pipeline
```python
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret
from haystack_integrations.components.generators.nvidia import NvidiaChatGenerator

pipe = Pipeline()
pipe.add_component("prompt_builder", ChatPromptBuilder())
pipe.add_component("llm", NvidiaChatGenerator(
    model="meta/llama-3.1-8b-instruct",
    api_key=Secret.from_env_var("NVIDIA_API_KEY")
))
pipe.connect("prompt_builder", "llm")

country = "Germany"
system_message = ChatMessage.from_system("You are an assistant giving out valuable information to language learners.")
messages = [system_message, ChatMessage.from_user("What's the official language of {{ country }}?")]

res = pipe.run(data={"prompt_builder": {"template_variables": {"country": country}, "template": messages}})
print(res)
```