Version: 2.27-unstable

TavilyWebSearch

Search the web using the Tavily AI-powered search API, optimized for LLM applications.

| | |
|---|---|
| **Most common position in a pipeline** | Before a `ChatPromptBuilder`, or right at the beginning of an indexing pipeline |
| **Mandatory init variables** | `api_key`: The Tavily API key. Can be set with the `TAVILY_API_KEY` env var. |
| **Mandatory run variables** | `query`: A string with your search query. |
| **Output variables** | `documents`: A list of Haystack `Document` objects containing search result content and metadata.<br/>`links`: A list of strings with the resulting URLs. |
| **API reference** | Tavily Search API |
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/blob/main/integrations/tavily/src/haystack_integrations/components/websearch/tavily/tavily_websearch.py |

Overview

When you give TavilyWebSearch a query, it uses the Tavily Search API to search the web and return relevant content as Haystack Document objects. It also returns a list of the source URLs.

Tavily is an AI-powered search API built specifically for LLM applications. It returns clean, relevant snippets without the noise of traditional search engines, making it a great fit for RAG pipelines.

TavilyWebSearch requires a Tavily API key to work. By default, it looks for a TAVILY_API_KEY environment variable. Alternatively, you can pass an api_key directly during initialization.

Usage

On its own

Here is a quick example of how TavilyWebSearch searches the web based on a query and returns a list of Documents.

```python
from haystack_integrations.components.websearch.tavily import TavilyWebSearch
from haystack.utils import Secret

web_search = TavilyWebSearch(
    api_key=Secret.from_env_var("TAVILY_API_KEY"),
    top_k=5,
)
query = "What is Haystack by deepset?"

response = web_search.run(query=query)

for doc in response["documents"]:
    print(doc.content)
```

In a pipeline

Here is an example of a Retrieval-Augmented Generation (RAG) pipeline that uses TavilyWebSearch to look up an answer on the web.

```python
from haystack import Pipeline
from haystack.utils import Secret
from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack_integrations.components.websearch.tavily import TavilyWebSearch
from haystack.dataclasses import ChatMessage

web_search = TavilyWebSearch(
    api_key=Secret.from_env_var("TAVILY_API_KEY"),
    top_k=3,
)

prompt_template = [
    ChatMessage.from_system("You are a helpful assistant."),
    ChatMessage.from_user(
        "Given the information below:\n"
        "{% for document in documents %}{{ document.content }}\n{% endfor %}\n"
        "Answer the following question: {{ query }}.\nAnswer:"
    ),
]

prompt_builder = ChatPromptBuilder(
    template=prompt_template,
    required_variables=["query", "documents"],
)

llm = OpenAIChatGenerator(
    api_key=Secret.from_env_var("OPENAI_API_KEY"),
)

pipe = Pipeline()
pipe.add_component("search", web_search)
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)

pipe.connect("search.documents", "prompt_builder.documents")
pipe.connect("prompt_builder.prompt", "llm.messages")

query = "What is Haystack by deepset?"

result = pipe.run(data={"search": {"query": query}, "prompt_builder": {"query": query}})

print(result["llm"]["replies"][0].text)
```