
SearchApiWebSearch

Search engine using Search API.

Name: SearchApiWebSearch
Folder path: /websearch/
Most common position in a pipeline: Before LinkContentFetcher or Converters
Mandatory input variables: "query": a string with your query
Output variables: "documents": a List of Documents; "links": a List of strings of resulting links

Overview

When you give SearchApiWebSearch a query, it returns a list of the URLs most relevant to your search. The answers come from page snippets (the short pieces of text displayed under the page title in search results), not from the whole pages.

To search the content of the web pages, use the LinkContentFetcher component.

SearchApiWebSearch requires a SearchApi key to work. It uses a SEARCHAPI_API_KEY environment variable by default. Otherwise, you can pass an api_key at initialization – see code examples below.
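
To make a missing key easier to diagnose, the environment variable can be checked before the component is created. The following is a minimal sketch, not part of Haystack: get_searchapi_key is a hypothetical helper that reads SEARCHAPI_API_KEY and raises a clear error if the variable is unset or blank.

```python
import os


def get_searchapi_key(env_var: str = "SEARCHAPI_API_KEY") -> str:
    """Return the key stored in env_var, or raise if it is unset or blank.

    Hypothetical helper for illustration; it is not part of Haystack.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable or pass api_key explicitly."
        )
    return key


if __name__ == "__main__":
    try:
        get_searchapi_key()
        print("SearchApi key found in the environment.")
    except RuntimeError as err:
        print(f"Configuration problem: {err}")
```

Running a check like this at application start-up keeps configuration errors separate from search errors.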

Alternative search

To use Serper Dev as an alternative, see its respective documentation page.

Usage

On its own

In this example, SearchApiWebSearch looks up the query on the web and returns the results as a List of Documents containing the content snippets, together with the result URLs as a List of strings.

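from haystack.components.websearch import SearchApiWebSearch
from haystack.utils import Secret

# Pass the key explicitly; omit api_key to read SEARCHAPI_API_KEY from the environment.
web_search = SearchApiWebSearch(api_key=Secret.from_token("<your-api-key>"))
query = "What is the capital of Germany?"

# The result is a dictionary with "documents" and "links".
response = web_search.run(query)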
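
The dictionary returned by run holds the "documents" and "links" outputs listed above. As a rough sketch of how it could be inspected, the hypothetical helper below (not part of Haystack) turns such a response into printable lines. It assumes each Document exposes content and a meta dictionary, and that meta may carry "title" and "link" keys. The sample response is made-up placeholder data standing in for a real API call.

```python
from types import SimpleNamespace


def describe_results(response: dict) -> list[str]:
    """Build one line per Document, then one line per link."""
    lines = []
    for doc in response.get("documents", []):
        meta = doc.meta or {}
        # "title" and "link" are assumed meta keys; fall back if they are absent.
        title = meta.get("title", "(no title)")
        link = meta.get("link", "(no link)")
        lines.append(f"{title} | {link} | {doc.content}")
    for url in response.get("links", []):
        lines.append(f"link: {url}")
    return lines


if __name__ == "__main__":
    # Placeholder data standing in for web_search.run(query); not real search output.
    sample = {
        "documents": [
            SimpleNamespace(
                content="Example snippet text.",
                meta={"title": "Example page", "link": "https://example.com/a"},
            )
        ],
        "links": ["https://example.com/a"],
    }
    for line in describe_results(sample):
        print(line)
```

With a real response, the same function can be called as describe_results(response).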

In a Pipeline

Here's an example of a RAG pipeline that uses SearchApiWebSearch to look up the answer to a query. The resulting links are passed to LinkContentFetcher, which fetches the full content from the URLs, and HTMLToDocument converts the fetched content into Documents. Finally, PromptBuilder and OpenAIGenerator work together to form the final answer.

from haystack import Pipeline
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.fetchers import LinkContentFetcher
from haystack.components.converters import HTMLToDocument
from haystack.components.generators import OpenAIGenerator
from haystack.components.websearch import SearchApiWebSearch
from haystack.utils import Secret

web_search = SearchApiWebSearch(api_key=Secret.from_token("<your-api-key>"), top_k=2)
link_content = LinkContentFetcher()
html_converter = HTMLToDocument()

template = """Given the information below: \n
            {% for document in documents %}
                {{ document.content }}
            {% endfor %}
            Answer question: {{ query }}. \n Answer:"""

prompt_builder = PromptBuilder(template=template)
# The generator needs an OpenAI key, which is separate from the SearchApi key above.
llm = OpenAIGenerator(api_key=Secret.from_token("<your-openai-api-key>"),
                      model="gpt-3.5-turbo")

pipe = Pipeline()
pipe.add_component("search", web_search)
pipe.add_component("fetcher", link_content)
pipe.add_component("converter", html_converter)
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)

pipe.connect("search.links", "fetcher.urls")
pipe.connect("fetcher.streams", "converter.sources")
pipe.connect("converter.documents", "prompt_builder.documents")
pipe.connect("prompt_builder.prompt", "llm.prompt")

query = "What is the most famous landmark in Berlin?"

pipe.run(data={"search":{"query":query}, "prompt_builder":{"query": query}})
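
pipe.run returns a dictionary keyed by component name, and OpenAIGenerator places its generated strings under "replies". A minimal sketch of reading the answer from that result follows. first_reply is a hypothetical helper, and the sample dictionary is placeholder data, not real model output.

```python
from typing import Optional


def first_reply(result: dict, component: str = "llm") -> Optional[str]:
    """Return the first reply of the named generator component, or None if there is none."""
    replies = result.get(component, {}).get("replies", [])
    return replies[0] if replies else None


if __name__ == "__main__":
    # Placeholder standing in for the dictionary returned by pipe.run(...).
    sample_result = {"llm": {"replies": ["<generated answer>"]}}
    print(first_reply(sample_result))
```

In the pipeline above, this would be used as first_reply(pipe.run(data=...)), since the generator was added under the name "llm".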

Related Links