Components for query processing and expansion.
Module haystack_experimental.components.query.query_expander
QueryExpander
A component that returns a list of semantically similar queries to improve retrieval recall in RAG systems.
The component uses a chat generator to expand queries. The chat generator is expected to return a JSON response with the following structure:
{"queries": ["expanded query 1", "expanded query 2", "expanded query 3"]}
Usage example:
from haystack.components.generators.chat.openai import OpenAIChatGenerator
from haystack_experimental.components.query import QueryExpander
expander = QueryExpander(
    chat_generator=OpenAIChatGenerator(model="gpt-4.1-mini"),
    n_expansions=3
)
result = expander.run(query="green energy sources")
print(result["queries"])
# Output: ['alternative query 1', 'alternative query 2', 'alternative query 3', 'green energy sources']
# Note: Up to 3 additional queries + 1 original query (if include_original_query=True)
# To control total number of queries:
expander = QueryExpander(n_expansions=2, include_original_query=True) # Up to 3 total
# or
expander = QueryExpander(n_expansions=3, include_original_query=False) # Up to 3 total
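To illustrate why expanded queries can improve recall, here is a minimal, self-contained sketch that runs several queries against a toy keyword retriever and merges the hits. The corpus, `toy_search`, and `retrieve_for_all` are hypothetical stand-ins written for this example; they are not part of Haystack, and the query list is hard-coded rather than produced by an LLM.

```python
from typing import Dict, List

# Hypothetical three-document corpus, used only for illustration.
CORPUS: Dict[str, str] = {
    "d1": "solar power is a renewable energy source",
    "d2": "wind turbines generate clean electricity",
    "d3": "coal plants emit carbon dioxide",
}


def toy_search(query: str) -> List[str]:
    """Return ids of documents that share at least one word with the query."""
    terms = set(query.lower().split())
    return [doc_id for doc_id, text in CORPUS.items() if terms & set(text.split())]


def retrieve_for_all(queries: List[str]) -> List[str]:
    """Run every query and merge the hits, dropping duplicates but keeping order."""
    seen: Dict[str, None] = {}
    for q in queries:
        for doc_id in toy_search(q):
            seen.setdefault(doc_id, None)
    return list(seen)


# Stand-in for result["queries"]: two alternatives plus the original query.
queries = ["renewable power", "clean electricity", "green energy sources"]

original_only = toy_search("green energy sources")
merged = retrieve_for_all(queries)
print(original_only)  # ['d1']
print(merged)         # ['d1', 'd2']
```

In this toy setup the original query alone misses the wind document, while the merged result over all queries finds it. A real pipeline would use an actual retriever and typically re-rank the merged results.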
QueryExpander.__init__
def __init__(*,
             chat_generator: Optional[ChatGenerator] = None,
             prompt_template: Optional[str] = None,
             n_expansions: int = 4,
             include_original_query: bool = True)
Initialize the QueryExpander component.
Arguments:
chat_generator
: The chat generator component to use for query expansion. If None, a default OpenAIChatGenerator with the gpt-4.1-mini model is used.
prompt_template
: Custom PromptBuilder template for query expansion. The template should instruct the LLM to return a JSON response with the structure: {"queries": ["query1", "query2", "query3"]}. The template should include 'query' and 'n_expansions' variables.
n_expansions
: Number of alternative queries to generate (default: 4).
include_original_query
: Whether to include the original query in the output.
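As a rough sketch of what a custom prompt_template might look like, the example below defines a template with the two required variables and checks that both are present before use. The template wording, `CUSTOM_TEMPLATE`, and the `template_variables` helper are assumptions made for illustration; the check only recognizes simple `{{ name }}` placeholders in the Jinja2 syntax that PromptBuilder templates use.

```python
import re
from typing import Set

# Hypothetical template text; only the two variable names and the JSON shape
# come from the documentation above.
CUSTOM_TEMPLATE = """You are helping a search system.
Write {{ n_expansions }} alternative phrasings of the query below,
in the same language as the query.
Reply with JSON only, in the form {"queries": ["query1", "query2"]}.

Query: {{ query }}
"""


def template_variables(template: str) -> Set[str]:
    """Collect the names used in simple {{ name }} placeholders."""
    return set(re.findall(r"\{\{\s*(\w+)\s*\}\}", template))


# Fail early if a required variable is missing from the template.
missing = {"query", "n_expansions"} - template_variables(CUSTOM_TEMPLATE)
if missing:
    raise ValueError(f"Template is missing variables: {sorted(missing)}")

# The template would then be passed to the component, for example:
# expander = QueryExpander(prompt_template=CUSTOM_TEMPLATE)
```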
QueryExpander.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
QueryExpander.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "QueryExpander"
Deserializes the component from a dictionary.
Arguments:
data
: Dictionary with serialized data.
Returns:
Deserialized component.
QueryExpander.run
@component.output_types(queries=List[str])
def run(query: str,
        n_expansions: Optional[int] = None) -> Dict[str, List[str]]
Expand the input query into multiple semantically similar queries.
The language of the original query is preserved in the expanded queries.
Arguments:
query
: The original query to expand.
n_expansions
: Number of additional queries to generate (not including the original). If None, uses the value from initialization. Must be greater than 0.
Raises:
ValueError
: If n_expansions is not positive (less than or equal to 0).
RuntimeError
: If the component is not warmed up and the chat generator does not support warm up.
Returns:
Dictionary with "queries" key containing the list of expanded queries. If include_original_query=True, the original query will be included in addition to the n_expansions alternative queries.
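The relationship between the chat generator's JSON reply and the returned list can be sketched as follows. This is not the component's actual implementation: `assemble_queries` is a hypothetical helper, and truncating to n_expansions as well as falling back to an empty expansion list on malformed JSON are assumptions made for this example.

```python
import json
from typing import List


def assemble_queries(
    reply_text: str,
    query: str,
    n_expansions: int,
    include_original_query: bool = True,
) -> List[str]:
    """Parse a {"queries": [...]} reply and build the final query list."""
    try:
        candidates = json.loads(reply_text)["queries"]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Assumption: a malformed reply yields no expansions.
        candidates = []
    if not isinstance(candidates, list):
        candidates = []

    # Keep non-empty strings only, and at most n_expansions of them.
    expanded = [q.strip() for q in candidates if isinstance(q, str) and q.strip()]
    expanded = expanded[:n_expansions]

    # The original query is appended after the alternatives.
    if include_original_query:
        expanded.append(query)
    return expanded


reply = '{"queries": ["renewable power", "clean electricity", "sustainable energy"]}'
print(assemble_queries(reply, "green energy sources", n_expansions=2))
# ['renewable power', 'clean electricity', 'green energy sources']
```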
QueryExpander.warm_up
def warm_up()
Warm up the underlying LLM if it supports it.
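"If it supports it" suggests a duck-typed check on the wrapped generator. The sketch below shows one way such a check could look; `warm_up_if_supported` and the two stand-in classes are hypothetical and written for this example, not taken from the library.

```python
class WarmableGenerator:
    """Stand-in for a generator that loads resources on warm-up."""

    def __init__(self) -> None:
        self.ready = False

    def warm_up(self) -> None:
        self.ready = True


class PlainGenerator:
    """Stand-in for a generator with nothing to warm up."""


def warm_up_if_supported(generator: object) -> bool:
    """Call generator.warm_up() when it exists; report whether it was called."""
    warm_up = getattr(generator, "warm_up", None)
    if callable(warm_up):
        warm_up()
        return True
    return False


local = WarmableGenerator()
called_local = warm_up_if_supported(local)
called_plain = warm_up_if_supported(PlainGenerator())
print(called_local, local.ready, called_plain)  # True True False
```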