OllamaGenerator
A component that provides an interface to generate text using an LLM running on Ollama.
Name | OllamaGenerator |
Source | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/ollama |
Most common position in a pipeline | After a PromptBuilder |
Mandatory input variables | “prompt”: A string containing the prompt for the LLM |
Output variables | “replies”: A list of strings with all the replies generated by the LLM. “meta”: A list of dictionaries with the metadata associated with each reply, such as token count |
Overview
OllamaGenerator provides an interface to generate text using an LLM running on Ollama. OllamaGenerator needs a model name and a URL to work. By default, it uses the "orca-mini" model and the URL "http://localhost:11434/api/generate".
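To make these two settings concrete, here is a minimal sketch of the kind of HTTP request that ends up at the default URL. It uses only the Python standard library and follows Ollama's REST API for /api/generate (a JSON body with model, prompt, stream, and options fields). It is not the component's actual implementation, and the helper name build_generate_request is hypothetical.

```python
import json
import urllib.request

DEFAULT_URL = "http://localhost:11434/api/generate"  # default URL named in the text
DEFAULT_MODEL = "orca-mini"                           # default model named in the text


def build_generate_request(prompt, model=DEFAULT_MODEL, url=DEFAULT_URL, options=None):
    """Hypothetical helper: build (but do not send) a POST request for Ollama."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,           # ask for one JSON response instead of a stream
        "options": options or {},  # e.g. {"temperature": 0.9, "num_predict": 100}
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_generate_request("Why is the sky blue?", options={"temperature": 0.9})
print(request.full_url)
print(json.loads(request.data))

# Sending it requires a running Ollama instance with the model pulled:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read())["response"])
```

The options field is where sampling settings such as temperature and num_predict travel in the Ollama API.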
Ollama is a project focused on running LLMs locally. Internally, it uses the quantized GGUF format by default. This means it is possible to run LLMs on standard machines (even without GPUs) without having to go through complex installation procedures.
Streaming
This Generator supports streaming tokens from the LLM as they are generated. To do so, pass a function to the streaming_callback init parameter.
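A minimal sketch of such a callback follows. Haystack generators call the streaming function once per chunk, passing an object whose content attribute holds the newly generated text (StreamingChunk in Haystack 2.x). The FakeChunk class below is a hypothetical stand-in that exists only so the snippet runs without Ollama; the callback itself relies on nothing but chunk.content.

```python
from dataclasses import dataclass, field


@dataclass
class FakeChunk:
    """Hypothetical stand-in for the chunk object passed to the callback."""
    content: str
    meta: dict = field(default_factory=dict)


collected = []


def print_and_collect(chunk):
    """Streaming callback: print each piece of text as it arrives and keep a copy."""
    print(chunk.content, end="", flush=True)
    collected.append(chunk.content)


# Simulate the generator emitting three chunks.
for piece in ["Ollama ", "runs ", "locally."]:
    print_and_collect(FakeChunk(content=piece))
print()

full_text = "".join(collected)

# With a running Ollama instance, the same function would be passed at init time:
# generator = OllamaGenerator(model="zephyr", streaming_callback=print_and_collect)
```

Keeping a copy of each chunk is optional; it is shown here because a callback that only prints leaves nothing to inspect afterwards.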
Usage
- You need a running instance of Ollama. Installation instructions are available in the Ollama documentation.
A fast way to run Ollama is using Docker:
docker run -d -p 11434:11434 --name ollama ollama/ollama:latest
- You need to download or pull the desired LLM. The model library is available on the Ollama website.
If you are using Docker, you can, for example, pull the Zephyr model:
docker exec ollama ollama pull zephyr
If you have already installed Ollama in your system, you can execute:
ollama pull zephyr
Choose a specific version of a model
You can also specify a tag to choose a specific (quantized) version of your model. The available tags are shown in the model card of the Ollama models library. This is an example for Zephyr.
In this case, simply run:
# ollama pull model:tag
ollama pull zephyr:7b-alpha-q3_K_S
- You also need to install the ollama-haystack package:
pip install ollama-haystack
On its own
Here's how OllamaGenerator works on its own:
from haystack_integrations.components.generators.ollama import OllamaGenerator

generator = OllamaGenerator(model="zephyr",
                            url="http://localhost:11434/api/generate",
                            generation_kwargs={
                                "num_predict": 100,
                                "temperature": 0.9,
                            })

print(generator.run("Who is the best American actor?"))

# {'replies': ['I do not have the ability to form opinions or preferences.
# However, some of the most acclaimed american actors in recent years include
# denzel washington, tom hanks, leonardo dicaprio, matthew mcconaughey...'],
# 'meta': [{'model': 'zephyr', ...}]}
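The returned dictionary can be unpacked as sketched below. The snippet runs on a hand-written sample shaped like the output above (the reply text is a placeholder, not real model output), so it works without Ollama. Which metadata keys are actually present depends on the Ollama response, hence the defensive .get call.

```python
# Sample result shaped like the output shown above (placeholder values).
result = {
    "replies": ["Some of the most acclaimed American actors include..."],
    "meta": [{"model": "zephyr"}],
}

# Replies and metadata are parallel lists: meta[i] describes replies[i].
pairs = list(zip(result["replies"], result["meta"]))

for reply, meta in pairs:
    print(f"[{meta.get('model', 'unknown')}] {reply}")

first_reply = result["replies"][0]
```

When the generator runs inside a pipeline, the same dictionary sits under the component's name, for example result["llm"]["replies"][0] for a component added as "llm".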
In a Pipeline
from haystack_integrations.components.generators.ollama import OllamaGenerator

from haystack import Pipeline, Document
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.document_stores.in_memory import InMemoryDocumentStore

template = """
Given the following information, answer the question.
Context:
{% for document in documents %}
{{ document.content }}
{% endfor %}
Question: {{ query }}?
"""

# Index a few example documents for the retriever to search.
docstore = InMemoryDocumentStore()
docstore.write_documents([Document(content="I really like summer"),
                          Document(content="My favorite sport is soccer"),
                          Document(content="I don't like reading sci-fi books"),
                          Document(content="I don't like crowded places"),])

generator = OllamaGenerator(model="zephyr",
                            url="http://localhost:11434/api/generate",
                            generation_kwargs={
                                "num_predict": 100,
                                "temperature": 0.9,
                            })

pipe = Pipeline()
pipe.add_component("retriever", InMemoryBM25Retriever(document_store=docstore))
pipe.add_component("prompt_builder", PromptBuilder(template=template))
pipe.add_component("llm", generator)
pipe.connect("retriever", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")

# Example question (an assumption: the original snippet leaves `query` undefined).
# The template appends the question mark.
query = "What do I like"

result = pipe.run({"prompt_builder": {"query": query},
                   "retriever": {"query": query}})

print(result)

# {'llm': {'replies': ['Based on the provided context, it seems that you enjoy
# soccer and summer. Unfortunately, there is no direct information given about
# what else you enjoy...'],
# 'meta': [{'model': 'zephyr', ...}]}}