TogetherAIGenerator
This component enables text generation using models hosted on Together AI.
| | |
| --- | --- |
| Most common position in a pipeline | After a PromptBuilder |
| Mandatory init variables | "api_key": A Together AI API key. Can be set with the TOGETHER_API_KEY env var. |
| Mandatory run variables | "prompt": A string containing the prompt for the LLM |
| Output variables | "replies": A list of strings with all the replies generated by the LLM<br><br>"meta": A list of dictionaries with the metadata associated with each reply, such as token count, finish reason, and so on |
| API reference | TogetherAI |
| GitHub link | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/togetherai |
Overview
TogetherAIGenerator supports models hosted on Together AI, such as meta-llama/Llama-3.3-70B-Instruct-Turbo. For the full list of supported models, see Together AI documentation.
This component needs a prompt string to operate. You can pass any text generation parameters valid for the Together AI chat completion API directly to this component through the generation_kwargs parameter, both in __init__ and in the run method. For more details on the parameters supported by the Together AI API, see the Together AI API documentation.
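For example, you can set sampling parameters at initialization and adjust them per call. This is a minimal sketch; max_tokens and temperature are standard OpenAI-compatible parameters, but check the Together AI docs for the full list:

from haystack_integrations.components.generators.togetherai import TogetherAIGenerator

client = TogetherAIGenerator(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    generation_kwargs={"max_tokens": 256, "temperature": 0.2},
)

# Parameters passed at run time take precedence over the init-time values for this call
response = client.run(
    "Summarize Natural Language Processing in one sentence.",
    generation_kwargs={"temperature": 0.7},
)
print(response["replies"][0])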
You can also provide an optional system_prompt to set context or instructions for text generation. If you don't set it, no system prompt is sent, and the model falls back to its default behavior.
To use this integration, you need an active Together AI subscription with sufficient credits and an API key. You can provide it with:
- The TOGETHER_API_KEY environment variable (recommended)
- The api_key init parameter and Haystack Secret API: Secret.from_token("your-api-key-here")
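For example, here's a minimal sketch of passing the key explicitly. The token is a placeholder; in production, prefer the environment variable or Secret.from_env_var("TOGETHER_API_KEY"):

from haystack.utils import Secret
from haystack_integrations.components.generators.togetherai import TogetherAIGenerator

generator = TogetherAIGenerator(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    api_key=Secret.from_token("your-api-key-here"),  # placeholder token
)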
By default, the component uses Together AI's OpenAI-compatible base URL https://api.together.xyz/v1, which you can override with api_base_url if needed.
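For instance, if you route requests through a proxy or another OpenAI-compatible endpoint, you can point the component at it. A sketch; the URL below is a hypothetical placeholder:

from haystack_integrations.components.generators.togetherai import TogetherAIGenerator

client = TogetherAIGenerator(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    api_base_url="https://my-gateway.example.com/v1",  # hypothetical proxy URL
)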
Streaming
TogetherAIGenerator supports streaming responses from the LLM, allowing tokens to be emitted as they are generated. To enable streaming, pass a callable to the streaming_callback parameter during initialization.
This component is designed for text generation, not for chat. If you want to use Together AI LLMs for chat, use TogetherAIChatGenerator instead.
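For a conversational use case, the chat variant takes a list of ChatMessage objects instead of a prompt string. A minimal sketch, assuming TogetherAIChatGenerator is importable from the same package:

from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.togetherai import TogetherAIChatGenerator

chat_client = TogetherAIChatGenerator(model="meta-llama/Llama-3.3-70B-Instruct-Turbo")
response = chat_client.run(messages=[ChatMessage.from_user("What's Natural Language Processing?")])
print(response["replies"][0].text)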
Usage
Install the togetherai-haystack package to use the TogetherAIGenerator:
pip install togetherai-haystack
On its own
Basic usage:
from haystack_integrations.components.generators.togetherai import TogetherAIGenerator
client = TogetherAIGenerator(model="meta-llama/Llama-3.3-70B-Instruct-Turbo")
response = client.run("What's Natural Language Processing? Be brief.")
print(response)
>> {'replies': ['Natural Language Processing (NLP) is a branch of artificial intelligence
>> that focuses on enabling computers to understand, interpret, and generate human language
>> in a way that is meaningful and useful.'],
>> 'meta': [{'model': 'meta-llama/Llama-3.3-70B-Instruct-Turbo', 'index': 0,
>> 'finish_reason': 'stop', 'usage': {'prompt_tokens': 15, 'completion_tokens': 36,
>> 'total_tokens': 51}}]}
With streaming:
from haystack_integrations.components.generators.togetherai import TogetherAIGenerator
client = TogetherAIGenerator(
model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
streaming_callback=lambda chunk: print(chunk.content, end="", flush=True),
)
response = client.run("What's Natural Language Processing? Be brief.")
print(response)
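Instead of a lambda, you can also use the ready-made callback Haystack provides for printing streamed chunks to stdout:

from haystack.components.generators.utils import print_streaming_callback
from haystack_integrations.components.generators.togetherai import TogetherAIGenerator

client = TogetherAIGenerator(
    model="meta llama/Llama-3.3-70B-Instruct-Turbo".replace(" ", "-"),  # same model as above
    streaming_callback=print_streaming_callback,
)
response = client.run("What's Natural Language Processing? Be brief.")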
With system prompt:
from haystack_integrations.components.generators.togetherai import TogetherAIGenerator
client = TogetherAIGenerator(
model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
system_prompt="You are a helpful assistant that provides concise answers."
)
response = client.run("What's Natural Language Processing?")
print(response["replies"][0])
In a Pipeline
from haystack import Pipeline, Document
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.generators.togetherai import TogetherAIGenerator
docstore = InMemoryDocumentStore()
docstore.write_documents([
Document(content="Rome is the capital of Italy"),
Document(content="Paris is the capital of France")
])
query = "What is the capital of France?"
template = """
Given the following information, answer the question.
Context:
{% for document in documents %}
{{ document.content }}
{% endfor %}
Question: {{ query }}
"""
pipe = Pipeline()
pipe.add_component("retriever", InMemoryBM25Retriever(document_store=docstore))
pipe.add_component("prompt_builder", PromptBuilder(template=template))
pipe.add_component("llm", TogetherAIGenerator(model="meta-llama/Llama-3.3-70B-Instruct-Turbo"))
pipe.connect("retriever", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")
result = pipe.run({
"prompt_builder": {"query": query},
"retriever": {"query": query}
})
print(result)
>> {'llm': {'replies': ['The capital of France is Paris.'],
>> 'meta': [{'model': 'meta-llama/Llama-3.3-70B-Instruct-Turbo', ...}]}}
