Version: 2.24

FalkorDB

haystack_integrations.components.retrievers.falkordb.cypher_retriever

FalkorDBCypherRetriever

A power-user retriever for executing arbitrary OpenCypher queries against FalkorDB.

This retriever allows you to leverage graph traversal and multi-hop queries in GraphRAG pipelines. The query must return nodes or dictionaries that can be mapped exactly to a Haystack Document.

Security warning: Raw Cypher queries must come only from trusted sources. Do not interpolate unsanitised user input into query strings; pass values through the parameters argument instead.
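
To make the warning concrete, the following is a minimal sketch contrasting string interpolation with parameter passing. The helper `build_safe_request` and both constants are hypothetical names introduced for illustration; they are not part of the integration, and no database connection is made.

```python
# Hypothetical names for illustration only; not part of the FalkorDB integration.
UNSAFE_TEMPLATE = (
    "MATCH (d:Document)-[:RELATES_TO]->(:Concept {{name: '{concept}'}}) RETURN d"
)
SAFE_QUERY = "MATCH (d:Document)-[:RELATES_TO]->(:Concept {name: $concept}) RETURN d"


def build_safe_request(user_input: str) -> tuple[str, dict[str, str]]:
    """Keep the query text fixed and carry the user input as a parameter."""
    return SAFE_QUERY, {"concept": user_input}


malicious = "x'}) DETACH DELETE d //"

# Interpolation lets the input rewrite the query text itself.
unsafe_query = UNSAFE_TEMPLATE.format(concept=malicious)

# Parameterisation keeps the input out of the query text entirely.
query, parameters = build_safe_request(malicious)
print(unsafe_query)
print(query)
print(parameters)
```

The resulting pair could then be passed as `retriever.run(query=query, parameters=parameters)`, matching the `run` signature documented on this page.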

Usage example:

python
from haystack_integrations.document_stores.falkordb import FalkorDBDocumentStore
from haystack_integrations.components.retrievers.falkordb import FalkorDBCypherRetriever

store = FalkorDBDocumentStore(host="localhost", port=6379)
retriever = FalkorDBCypherRetriever(
    document_store=store,
    custom_cypher_query="MATCH (d:Document)-[:RELATES_TO]->(:Concept {name: $concept}) RETURN d",
)

res = retriever.run(parameters={"concept": "GraphRAG"})
print(res["documents"])

__init__

python
__init__(
    document_store: FalkorDBDocumentStore,
    custom_cypher_query: str | None = None,
) -> None

Create a new FalkorDBCypherRetriever.

Parameters:

  • document_store (FalkorDBDocumentStore) – The FalkorDBDocumentStore instance.
  • custom_cypher_query (str | None) – A static OpenCypher query to execute. Can be overridden at runtime by passing query to run().

Raises:

  • ValueError – If the provided document_store is not a FalkorDBDocumentStore.

to_dict

python
to_dict() -> dict[str, Any]

Serialise the retriever to a dictionary.

Returns:

  • dict[str, Any] – Dictionary representation of the retriever.

from_dict

python
from_dict(data: dict[str, Any]) -> FalkorDBCypherRetriever

Deserialise a FalkorDBCypherRetriever produced by to_dict.

Parameters:

  • data (dict[str, Any]) – Serialised retriever dictionary.

Returns:

  • FalkorDBCypherRetriever – Reconstructed FalkorDBCypherRetriever instance.

run

python
run(
    query: str | None = None, parameters: dict[str, Any] | None = None
) -> dict[str, list[Document]]

Retrieve documents by executing an OpenCypher query.

If a query is provided here, it overrides the custom_cypher_query set during initialisation.

Parameters:

  • query (str | None) – Optional OpenCypher query string.
  • parameters (dict[str, Any] | None) – Optional dictionary of query parameters (referenced as $param_name in the Cypher string).

Returns:

  • dict[str, list[Document]] – Dictionary containing a "documents" key with the retrieved documents.

Raises:

  • ValueError – If no query string is provided, either here or at initialisation.
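
The precedence rule described above can be sketched in a few lines. This is an illustration of the documented behaviour under assumed details, not the retriever's actual code: `resolve_query` is a hypothetical helper, and treating an empty string as "no query" is an assumption.

```python
from __future__ import annotations


def resolve_query(runtime_query: str | None, init_query: str | None) -> str:
    """Pick the query to execute: a runtime value overrides the init value."""
    query = runtime_query if runtime_query is not None else init_query
    if not query:
        # Mirrors the documented ValueError when neither source supplies a query.
        raise ValueError("No Cypher query provided at initialisation or at run time.")
    return query


init_query = "MATCH (d:Document) RETURN d"
override = "MATCH (d:Document) RETURN d LIMIT 5"

print(resolve_query(None, init_query))      # falls back to the init query
print(resolve_query(override, init_query))  # the runtime query wins
```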

haystack_integrations.components.retrievers.falkordb.embedding_retriever

FalkorDBEmbeddingRetriever

A component for retrieving documents from a FalkorDBDocumentStore using vector similarity.

The retriever uses FalkorDB's native vector search index to find documents whose embeddings are most similar to the provided query embedding.

Usage example:

python
from haystack.dataclasses import Document
from haystack_integrations.document_stores.falkordb import FalkorDBDocumentStore
from haystack_integrations.components.retrievers.falkordb import FalkorDBEmbeddingRetriever

store = FalkorDBDocumentStore(host="localhost", port=6379)
store.write_documents([
    Document(content="GraphRAG is powerful.", embedding=[0.1, 0.2, 0.3]),
    Document(content="FalkorDB is fast.", embedding=[0.8, 0.9, 0.1]),
])

retriever = FalkorDBEmbeddingRetriever(document_store=store)
res = retriever.run(query_embedding=[0.1, 0.2, 0.3])
print(res["documents"][0].content) # "GraphRAG is powerful."
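
To see why the first document is the top hit in the example above, here is a brute-force cosine ranking over the same two embeddings. It is a pure-Python sketch of the similarity idea only; the retriever itself relies on FalkorDB's vector index, and `cosine_similarity` is a helper defined here for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Same contents and embeddings as the usage example.
corpus = {
    "GraphRAG is powerful.": [0.1, 0.2, 0.3],
    "FalkorDB is fast.": [0.8, 0.9, 0.1],
}
query_embedding = [0.1, 0.2, 0.3]

# Rank every document by similarity to the query, best first.
ranked = sorted(
    corpus,
    key=lambda content: cosine_similarity(query_embedding, corpus[content]),
    reverse=True,
)
top_k = 1
print(ranked[:top_k])
```

The query vector is identical to the first document's embedding, so its cosine similarity is 1 (up to floating-point rounding), while the second document scores roughly 0.64.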

__init__

python
__init__(
    document_store: FalkorDBDocumentStore,
    filters: dict[str, Any] | None = None,
    top_k: int = 10,
    filter_policy: FilterPolicy = FilterPolicy.REPLACE,
) -> None

Create a new FalkorDBEmbeddingRetriever.

Parameters:

  • document_store (FalkorDBDocumentStore) – The FalkorDBDocumentStore instance.
  • filters (dict[str, Any] | None) – Optional Haystack filters to narrow down the search space.
  • top_k (int) – Maximum number of documents to retrieve.
  • filter_policy (FilterPolicy) – Policy that determines how runtime filters are combined with the filters set at initialisation.

Raises:

  • ValueError – If the provided document_store is not a FalkorDBDocumentStore.

to_dict

python
to_dict() -> dict[str, Any]

Serialise the retriever to a dictionary.

Returns:

  • dict[str, Any] – Dictionary representation of the retriever.

from_dict

python
from_dict(data: dict[str, Any]) -> FalkorDBEmbeddingRetriever

Deserialise a FalkorDBEmbeddingRetriever produced by to_dict.

Parameters:

  • data (dict[str, Any]) – Serialised retriever dictionary.

Returns:

  • FalkorDBEmbeddingRetriever – Reconstructed FalkorDBEmbeddingRetriever instance.

run

python
run(
    query_embedding: list[float],
    filters: dict[str, Any] | None = None,
    top_k: int | None = None,
) -> dict[str, list[Document]]

Retrieve documents by vector similarity.

Parameters:

  • query_embedding (list[float]) – Query embedding vector.
  • filters (dict[str, Any] | None) – Optional Haystack filters to be combined with the init filters based on the configured filter policy.
  • top_k (int | None) – Maximum number of documents to return. If not provided, the top_k set at initialisation is used.

Returns:

  • dict[str, list[Document]] – Dictionary containing a "documents" key with the retrieved documents.
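
How the filter policy combines init and runtime filters can be sketched roughly as follows. `SketchFilterPolicy` is a stand-in enum defined here so the block runs on its own, and the MERGE branch is a deliberate simplification (a plain AND of both filter sets); Haystack's real merge logic handles more cases, so treat this as an approximation of the idea rather than the library's behaviour.

```python
from __future__ import annotations

from enum import Enum
from typing import Any


class SketchFilterPolicy(Enum):
    """Stand-in for Haystack's FilterPolicy so the sketch is self-contained."""

    REPLACE = "replace"
    MERGE = "merge"


def combine_filters(
    policy: SketchFilterPolicy,
    init_filters: dict[str, Any] | None,
    runtime_filters: dict[str, Any] | None,
) -> dict[str, Any] | None:
    # Nothing passed at run time: the init filters apply under either policy.
    if runtime_filters is None:
        return init_filters
    # REPLACE (or nothing to merge with): the runtime filters win outright.
    if policy is SketchFilterPolicy.REPLACE or init_filters is None:
        return runtime_filters
    # MERGE, simplified: a document must satisfy both filter sets.
    return {"operator": "AND", "conditions": [init_filters, runtime_filters]}


init_f = {"field": "meta.year", "operator": ">=", "value": 2020}
run_f = {"field": "meta.lang", "operator": "==", "value": "en"}

print(combine_filters(SketchFilterPolicy.REPLACE, init_f, run_f))
print(combine_filters(SketchFilterPolicy.MERGE, init_f, run_f))
```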

haystack_integrations.document_stores.falkordb.document_store

FalkorDBDocumentStore

Bases: DocumentStore

A Haystack DocumentStore backed by FalkorDB — a high-performance graph database.

Optimised for GraphRAG workloads.

Documents are stored as graph nodes (labelled Document by default) in a named FalkorDB graph. Document properties, including meta fields, are stored flat at the same level as id and content — exactly the same layout as the neo4j-haystack reference integration.
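
The flat layout can be pictured with a small round trip between a document and its node properties. This is a sketch under assumptions: `SketchDocument` is a stand-in for the Haystack `Document` class, both helper functions are hypothetical, and the way a meta key named `id` or `content` would be handled is a choice made here, not documented behaviour.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchDocument:
    """Stand-in for haystack.dataclasses.Document with only the fields used here."""

    id: str
    content: str
    meta: dict[str, Any] = field(default_factory=dict)


def to_node_properties(doc: SketchDocument) -> dict[str, Any]:
    # Meta keys sit next to id and content with no prefix.
    # Core fields are written last so they win on a key collision (an assumption).
    return {**doc.meta, "id": doc.id, "content": doc.content}


def from_node_properties(properties: dict[str, Any]) -> SketchDocument:
    remaining = dict(properties)
    # Everything that is not a core field is treated as meta.
    return SketchDocument(
        id=remaining.pop("id"),
        content=remaining.pop("content"),
        meta=remaining,
    )


doc = SketchDocument(id="doc-1", content="Hello, GraphRAG!", meta={"year": 2024})
node = to_node_properties(doc)
print(node)
print(from_node_properties(node) == doc)
```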

Vector search is performed via FalkorDB's native vector index — no APOC is required. All bulk writes use UNWIND + MERGE for safe, idiomatic OpenCypher upserts.

Usage example:

python
from haystack_integrations.document_stores.falkordb import FalkorDBDocumentStore
from haystack.dataclasses import Document

store = FalkorDBDocumentStore(host="localhost", port=6379)
store.write_documents([
    Document(content="Hello, GraphRAG!", meta={"year": 2024}),
])
print(store.count_documents()) # 1

__init__

python
__init__(
    *,
    host: str = "localhost",
    port: int = 6379,
    graph_name: str = "haystack",
    username: str | None = None,
    password: Secret | None = None,
    node_label: str = "Document",
    embedding_dim: int = 768,
    embedding_field: str = "embedding",
    similarity: SimilarityFunction = "cosine",
    write_batch_size: int = 100,
    recreate_graph: bool = False,
    verify_connectivity: bool = False
) -> None

Create a new FalkorDBDocumentStore.

Parameters:

  • host (str) – Hostname of the FalkorDB server.
  • port (int) – Port the FalkorDB server listens on.
  • graph_name (str) – Name of the FalkorDB graph to use. Each graph is an isolated namespace.
  • username (str | None) – Optional username for FalkorDB authentication.
  • password (Secret | None) – Optional haystack.utils.Secret holding the FalkorDB password. The secret value is resolved lazily on first connection.
  • node_label (str) – Label used for document nodes in the graph.
  • embedding_dim (int) – Dimensionality of the vector embeddings. Used when creating the vector index.
  • embedding_field (str) – Name of the node property that stores the embedding vector.
  • similarity (SimilarityFunction) – Similarity function for the vector index. Accepted values are "cosine" and "euclidean".
  • write_batch_size (int) – Number of documents written per UNWIND batch.
  • recreate_graph (bool) – When True the existing graph (and all its data) is dropped and recreated on initialisation. Useful for tests.
  • verify_connectivity (bool) – When True a connectivity probe is run immediately in __init__ — raises if the server is unreachable.

Raises:

  • ValueError – If similarity is not "cosine" or "euclidean".

to_dict

python
to_dict() -> dict[str, Any]

Serialise the store to a dictionary suitable for from_dict.

Returns:

  • dict[str, Any] – Dictionary representation of the store.

from_dict

python
from_dict(data: dict[str, Any]) -> FalkorDBDocumentStore

Deserialise a FalkorDBDocumentStore produced by to_dict.

Parameters:

  • data (dict[str, Any]) – Serialised store dictionary.

Returns:

  • FalkorDBDocumentStore – Reconstructed FalkorDBDocumentStore instance.
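
The to_dict / from_dict pair is meant to round-trip. The sketch below assumes the common Haystack convention of a dictionary with `type` and `init_parameters` keys; whether this integration emits exactly that shape is an assumption, and `SketchStore` is a stand-in class carrying only three of the documented init parameters.

```python
from __future__ import annotations

from typing import Any


class SketchStore:
    """Stand-in store used only to illustrate the serialisation round trip."""

    def __init__(
        self, *, host: str = "localhost", port: int = 6379, graph_name: str = "haystack"
    ) -> None:
        self.host = host
        self.port = port
        self.graph_name = graph_name

    def to_dict(self) -> dict[str, Any]:
        # Assumed layout: fully qualified class name plus the init arguments.
        return {
            "type": f"{type(self).__module__}.{type(self).__name__}",
            "init_parameters": {
                "host": self.host,
                "port": self.port,
                "graph_name": self.graph_name,
            },
        }

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "SketchStore":
        # Rebuild the instance from the stored init arguments.
        return cls(**data["init_parameters"])


original = SketchStore(graph_name="papers")
restored = SketchStore.from_dict(original.to_dict())
print(restored.to_dict() == original.to_dict())
```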

count_documents

python
count_documents() -> int

Return the number of documents currently stored in the graph.

Returns:

  • int – Integer count of document nodes.

filter_documents

python
filter_documents(filters: dict[str, Any] | None = None) -> list[Document]

Retrieve all documents that match the provided Haystack filters.

Parameters:

  • filters (dict[str, Any] | None) – Optional Haystack filter dict. When None, all documents are returned. For filter syntax, see Metadata filtering.

Returns:

  • list[Document] – List of matching haystack.dataclasses.Document objects.

Raises:

  • ValueError – If the filter dict is malformed.
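
For orientation, a Haystack filter dict nests comparison conditions under logical operators. The sketch below evaluates such a filter against a flat property dict in plain Python; it only illustrates the filter shape. The store presumably translates filters into Cypher instead, and the `matches` helper, the stripping of the `meta.` prefix, and the handling of missing fields are all assumptions made here.

```python
from __future__ import annotations

import operator
from typing import Any

# Comparison operators from the Haystack filter syntax mapped to Python callables.
COMPARATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "in": lambda value, options: value in options,
    "not in": lambda value, options: value not in options,
}


def matches(filters: dict[str, Any], properties: dict[str, Any]) -> bool:
    """Evaluate a filter dict against flat node properties (illustrative only)."""
    op = filters["operator"]
    if op in ("AND", "OR", "NOT"):
        results = [matches(cond, properties) for cond in filters["conditions"]]
        if op == "AND":
            return all(results)
        if op == "OR":
            return any(results)
        return not all(results)
    # Assumption: with the flat layout, "meta.year" refers to the "year" property.
    name = filters["field"].removeprefix("meta.")
    if name not in properties:
        return False
    return COMPARATORS[op](properties[name], filters["value"])


filters = {
    "operator": "AND",
    "conditions": [
        {"field": "meta.year", "operator": ">=", "value": 2020},
        {"field": "meta.year", "operator": "<", "value": 2025},
    ],
}
recent = {"id": "1", "content": "Hello, GraphRAG!", "year": 2024}
old = {"id": "2", "content": "Older note.", "year": 2019}
print(matches(filters, recent), matches(filters, old))
```

A dict of this shape is what would be passed as `store.filter_documents(filters=filters)`.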

write_documents

python
write_documents(
    documents: list[Document], policy: DuplicatePolicy = DuplicatePolicy.NONE
) -> int

Write documents to the FalkorDB graph using UNWIND + MERGE for batching.

Document meta fields are stored flat at the same level as id and content — no prefix is added. This matches the layout used by the neo4j-haystack reference integration.

Parameters:

  • documents (list[Document]) – List of haystack.dataclasses.Document objects.
  • policy (DuplicatePolicy) – How to handle documents whose id already exists. Defaults to DuplicatePolicy.NONE (treated as FAIL).

Returns:

  • int – Number of documents written or updated.

Raises:

  • ValueError – If documents contains non-Document elements.
  • DuplicateDocumentError – If policy is FAIL / NONE and a duplicate ID is encountered.
  • DocumentStoreError – If any other DB error occurs.
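
The batching described above can be sketched as follows: rows are split into chunks of write_batch_size, and each chunk would be sent as one parameterised UNWIND query. The Cypher string is an illustrative upsert, not necessarily the exact query the store issues, and `batched` is a hypothetical helper; nothing is sent to a database here.

```python
from __future__ import annotations

from collections.abc import Iterator
from typing import Any

# Illustrative upsert: one MERGE per row of the $rows parameter.
ILLUSTRATIVE_UPSERT = (
    "UNWIND $rows AS row MERGE (d:Document {id: row.id}) SET d += row"
)


def batched(
    rows: list[dict[str, Any]], batch_size: int
) -> Iterator[list[dict[str, Any]]]:
    """Yield consecutive slices of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start : start + batch_size]


rows = [{"id": f"doc-{i}", "content": f"text {i}"} for i in range(250)]

# Each batch would become the $rows parameter of a single query execution.
requests = [(ILLUSTRATIVE_UPSERT, {"rows": batch}) for batch in batched(rows, 100)]
batch_sizes = [len(params["rows"]) for _, params in requests]
print(batch_sizes)  # [100, 100, 50]
```

Larger batches mean fewer round trips to the server at the cost of bigger individual queries, which is the trade-off write_batch_size controls.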

delete_documents

python
delete_documents(document_ids: list[str]) -> None

Delete documents by their IDs using a single UNWIND-based query.

Parameters:

  • document_ids (list[str]) – List of document IDs to remove from the graph.