Version: 3.0

SerperDevWebSearch

Search engine using the SerperDev API.


Most common position in a pipeline	Before `LinkContentFetcher` or Converters
Mandatory init variables	`api_key`: The SerperDev API key. Can be set with the `SERPERDEV_API_KEY` env var.
Mandatory run variables	`query`: A string with your query
Output variables	`documents`: A list of documents; `links`: A list of strings of resulting links
API reference	SerperDev
GitHub link	https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/serperdev
Package name	`serperdev-haystack`

Overview

When you give SerperDevWebSearch a query, it returns the URLs most relevant to your search, along with a list of documents. The documents contain page snippets (the short pieces of text displayed under the page title in search results), not the whole pages.

To search the content of the web pages, use the LinkContentFetcher component.

SerperDevWebSearch requires a SerperDev API key to work. By default, it reads the key from the SERPERDEV_API_KEY environment variable. Alternatively, you can pass an api_key at initialization, as in the code examples below.
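
As a small illustration of the environment-variable route, the sketch below checks that the key is set before the component is built, so that a missing key fails early with a clear message. The helper name `require_serperdev_key` is hypothetical and not part of Haystack or the integration; only the `SERPERDEV_API_KEY` variable name comes from this page.

```python
import os


def require_serperdev_key(env_var: str = "SERPERDEV_API_KEY") -> str:
    """Return the key stored in `env_var`, or raise if it is missing or empty.

    Hypothetical helper for illustration; not part of Haystack.
    """
    value = os.environ.get(env_var, "").strip()
    if not value:
        raise RuntimeError(
            f"{env_var} is not set. Export it or pass api_key at initialization."
        )
    return value
```

Calling `require_serperdev_key()` once at startup, before creating the component, is one way to surface a configuration problem before any search is attempted.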

Alternative search

To use Search API as an alternative, see its respective documentation page.

Usage

Install the serperdev-haystack package to use the SerperDevWebSearch component:

shell

pip install serperdev-haystack

On its own

This example shows SerperDevWebSearch looking up a query on the web. It returns the results as a list of documents containing content snippets, along with the result URLs as strings.

python

from haystack_integrations.components.websearch.serperdev import SerperDevWebSearch
from haystack.utils import Secret

web_search = SerperDevWebSearch(api_key=Secret.from_token("<your-api-key>"))
query = "What is the capital of Germany?"

response = web_search.run(query)
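
The returned value is a dictionary with the two outputs listed in the table at the top of this page, `documents` and `links`. As a rough sketch of how it might be inspected, the hypothetical helper below lists the snippets and the links. It assumes only that each document exposes a `content` attribute holding the snippet, and it is demonstrated on stand-in data rather than a live API call.

```python
from types import SimpleNamespace


def summarize_search(response: dict, max_chars: int = 80) -> str:
    """Build a short text summary of a web search response.

    Hypothetical helper: expects a dict with "documents" (objects with a
    `content` attribute) and "links" (strings).
    """
    lines = ["Snippets:"]
    for doc in response["documents"]:
        snippet = (doc.content or "").strip()
        # Truncate long snippets so the summary stays readable.
        if len(snippet) > max_chars:
            snippet = snippet[:max_chars].rstrip() + "..."
        lines.append(f"- {snippet}")
    lines.append("Links:")
    lines.extend(f"- {link}" for link in response["links"])
    return "\n".join(lines)


# Stand-in data shaped like the component's output, not real search results.
fake_response = {
    "documents": [SimpleNamespace(content="Berlin is the capital of Germany.")],
    "links": ["https://example.com/berlin"],
}
print(summarize_search(fake_response))
```

With a real key, the same helper could be applied to the `response` from the example above.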

In a pipeline

Here’s an example of a RAG pipeline that uses SerperDevWebSearch to look up the answer to a query. The resulting links are passed to LinkContentFetcher, which fetches the pages, and HTMLToDocument converts them into documents with the full text. Finally, ChatPromptBuilder and OpenAIChatGenerator work together to form the final answer.

python

from haystack import Pipeline
from haystack.utils import Secret
from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
from haystack.components.fetchers import LinkContentFetcher
from haystack.components.converters import HTMLToDocument
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack_integrations.components.websearch.serperdev import SerperDevWebSearch
from haystack.dataclasses import ChatMessage

web_search = SerperDevWebSearch(api_key=Secret.from_token("<your-api-key>"), top_k=2)
link_content = LinkContentFetcher()
html_converter = HTMLToDocument()

prompt_template = [
    ChatMessage.from_system("You are a helpful assistant."),
    ChatMessage.from_user(
        "Given the information below:\n"
        "{% for document in documents %}{{ document.content }}{% endfor %}\n"
        "Answer question: {{ query }}.\nAnswer:",
    ),
]

prompt_builder = ChatPromptBuilder(
    template=prompt_template,
    required_variables={"query", "documents"},
)
llm = OpenAIChatGenerator(
    api_key=Secret.from_token("<your-openai-api-key>"),
)

pipe = Pipeline()
pipe.add_component("search", web_search)
pipe.add_component("fetcher", link_content)
pipe.add_component("converter", html_converter)
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)

pipe.connect("search.links", "fetcher.urls")
pipe.connect("fetcher.streams", "converter.sources")
pipe.connect("converter.documents", "prompt_builder.documents")
pipe.connect("prompt_builder.prompt", "llm.messages")

query = "What is the most famous landmark in Berlin?"

pipe.run(data={"search": {"query": query}, "prompt_builder": {"query": query}})
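
`pipe.run` returns a dictionary keyed by component name. As a minimal sketch, assuming the generator's output sits under `replies` and each reply exposes a `text` attribute (as Haystack's `ChatMessage` does), a hypothetical helper for pulling out the answer could look like this. It is shown on stand-in data so that it runs without API keys.

```python
from types import SimpleNamespace
from typing import Optional


def first_reply_text(result: dict, component: str = "llm") -> Optional[str]:
    """Return the text of the first reply from `component`, or None if absent.

    Hypothetical helper for illustration; not part of Haystack.
    """
    replies = result.get(component, {}).get("replies", [])
    return replies[0].text if replies else None


# Stand-in for a pipeline result, not a real model reply.
fake_result = {"llm": {"replies": [SimpleNamespace(text="The Brandenburg Gate.")]}}
print(first_reply_text(fake_result))
```

In the pipeline above, this would mean assigning the return value of `pipe.run(...)` to a variable and passing it to `first_reply_text`.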

In YAML

This is the YAML representation of the RAG pipeline shown above. It searches the web, fetches the resulting pages, converts them to text, builds a prompt with the content, and generates an answer using a chat model.

yaml

components:
  converter:
    init_parameters:
      extraction_kwargs: {}
      store_full_path: false
    type: haystack.components.converters.html.HTMLToDocument
  fetcher:
    init_parameters:
      client_kwargs:
        follow_redirects: true
        timeout: 3
      http2: false
      raise_on_failure: true
      request_headers: {}
      retry_attempts: 2
      timeout: 3
      user_agents:
      - haystack/LinkContentFetcher/2.27.0rc0
    type: haystack.components.fetchers.link_content.LinkContentFetcher
  llm:
    init_parameters:
      api_base_url: null
      api_key:
        env_vars:
        - OPENAI_API_KEY
        strict: true
        type: env_var
      generation_kwargs: {}
      http_client_kwargs: null
      max_retries: null
      model: gpt-4o-mini
      organization: null
      streaming_callback: null
      timeout: null
      tools: null
      tools_strict: false
    type: haystack.components.generators.chat.openai.OpenAIChatGenerator
  prompt_builder:
    init_parameters:
      required_variables:
      - documents
      - query
      template:
      - content:
        - text: You are a helpful assistant.
        meta: {}
        name: null
        role: system
      - content:
        - text: 'Given the information below:

            {% for document in documents %}{{ document.content }}{% endfor %}

            Answer question: {{ query }}.

            Answer:'
        meta: {}
        name: null
        role: user
      variables: null
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
  search:
    init_parameters:
      allowed_domains: null
      api_key:
        env_vars:
        - SERPERDEV_API_KEY
        strict: true
        type: env_var
      exclude_subdomains: false
      search_params: {}
      top_k: 2
    type: haystack_integrations.components.websearch.serperdev.websearch.SerperDevWebSearch
connection_type_validation: true
connections:
- receiver: fetcher.urls
  sender: search.links
- receiver: converter.sources
  sender: fetcher.streams
- receiver: prompt_builder.documents
  sender: converter.documents
- receiver: llm.messages
  sender: prompt_builder.prompt
max_runs_per_component: 100
metadata: {}
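
A YAML definition like this one can be turned back into a pipeline object with `Pipeline.loads`. The sketch below assumes the YAML has been saved to a file; the `load_pipeline` helper and the `pipeline.yaml` file name are hypothetical, and both `haystack` and the `serperdev-haystack` package need to be installed for the component types to resolve.

```python
from pathlib import Path


def load_pipeline(path: str):
    """Load a Haystack pipeline from the YAML file at `path`.

    Hypothetical helper for illustration.
    """
    yaml_text = Path(path).read_text(encoding="utf-8")
    # Imported lazily so that a missing file is reported before Haystack is needed.
    from haystack import Pipeline

    return Pipeline.loads(yaml_text)
```

For example, `pipe = load_pipeline("pipeline.yaml")` followed by the same `pipe.run(...)` call as in the Python example. Because both `api_key` entries in the YAML are of type `env_var` with `strict: true`, the `SERPERDEV_API_KEY` and `OPENAI_API_KEY` environment variables will most likely need to be set before loading.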

Additional References

📓 Tutorial: Building Fallbacks to Websearch with Conditional Routing
