FalkorDB
haystack_integrations.components.retrievers.falkordb.cypher_retriever
FalkorDBCypherRetriever
A power-user retriever for executing arbitrary OpenCypher queries against FalkorDB.
This retriever allows you to leverage graph traversal and multi-hop queries in
GraphRAG pipelines. The query must return nodes or dictionaries that can be
mapped exactly to a Haystack Document.
Security Warning: Raw Cypher queries must only come from trusted sources. Do
not use un-sanitised user input directly in query strings. Use parameters instead.
Usage example:
from haystack_integrations.document_stores.falkordb import FalkorDBDocumentStore
from haystack_integrations.components.retrievers.falkordb import FalkorDBCypherRetriever

store = FalkorDBDocumentStore(host="localhost", port=6379)
retriever = FalkorDBCypherRetriever(
    document_store=store,
    custom_cypher_query="MATCH (d:Document)-[:RELATES_TO]->(:Concept {name: $concept}) RETURN d",
)
res = retriever.run(parameters={"concept": "GraphRAG"})
print(res["documents"])
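The security warning above can be illustrated without a running database. The following is a minimal sketch that contrasts string interpolation with parameter binding; `build_concept_query` and `build_concept_query_unsafe` are hypothetical helpers written for this illustration and are not part of the integration.

```python
# Hypothetical helpers for illustration only; not part of the FalkorDB integration.
CONCEPT_QUERY = (
    "MATCH (d:Document)-[:RELATES_TO]->(:Concept {name: $concept}) RETURN d"
)


def build_concept_query(user_input: str) -> tuple[str, dict[str, str]]:
    """Return a fixed query string plus a parameters dict carrying the user input."""
    # The query text never changes; untrusted input travels only as a parameter.
    return CONCEPT_QUERY, {"concept": user_input}


def build_concept_query_unsafe(user_input: str) -> str:
    """Anti-pattern: interpolated input becomes part of the query text itself."""
    return (
        "MATCH (d:Document)-[:RELATES_TO]->(:Concept {name: '"
        + user_input
        + "'}) RETURN d"
    )


malicious = "x'}) DETACH DELETE d //"
query, parameters = build_concept_query(malicious)
unsafe_query = build_concept_query_unsafe(malicious)

print(malicious in query)         # False: the query text is unchanged
print(malicious in unsafe_query)  # True: the input was spliced into the query
```

The safe pair could then be passed along as `retriever.run(query=query, parameters=parameters)`.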
init
__init__(
    document_store: FalkorDBDocumentStore,
    custom_cypher_query: str | None = None,
) -> None
Create a new FalkorDBCypherRetriever.
Parameters:
- document_store (FalkorDBDocumentStore) – The FalkorDBDocumentStore instance.
- custom_cypher_query (str | None) – A static OpenCypher query to execute. Can be overridden at runtime by passing the query argument to run().
Raises:
ValueError – If the provided document_store is not a FalkorDBDocumentStore.
to_dict
Serialise the retriever to a dictionary.
Returns:
dict[str, Any] – Dictionary representation of the retriever.
from_dict
Deserialise a FalkorDBCypherRetriever produced by to_dict.
Parameters:
- data (dict[str, Any]) – Serialised retriever dictionary.
Returns:
FalkorDBCypherRetriever – Reconstructed FalkorDBCypherRetriever instance.
run
run(
    query: str | None = None,
    parameters: dict[str, Any] | None = None,
) -> dict[str, list[Document]]
Retrieve documents by executing an OpenCypher query.
If a query is provided here, it overrides the custom_cypher_query
set during initialisation.
Parameters:
- query (str | None) – Optional OpenCypher query string.
- parameters (dict[str, Any] | None) – Optional dictionary of query parameters (referenced as $param_name in the Cypher string).
Returns:
dict[str, list[Document]] – Dictionary containing a "documents" key with the retrieved documents.
Raises:
ValueError – If no query string is provided, either here or at init.
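The precedence described for run() can be sketched in a few lines. This is an illustration of the documented behaviour, not the integration's source; `resolve_query` is a hypothetical name, and the sketch assumes that only None counts as "not provided".

```python
from typing import Optional


def resolve_query(runtime_query: Optional[str], init_query: Optional[str]) -> str:
    """Pick the query to execute: the runtime value wins over the init value."""
    # Assumption: only None means "not provided"; the real check may differ.
    query = runtime_query if runtime_query is not None else init_query
    if query is None:
        raise ValueError("No Cypher query provided at init or at runtime.")
    return query


init_query = "MATCH (d:Document) RETURN d"
runtime_query = "MATCH (d:Document) RETURN d LIMIT 5"

print(resolve_query(None, init_query))           # falls back to the init query
print(resolve_query(runtime_query, init_query))  # the runtime query overrides it
```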
haystack_integrations.components.retrievers.falkordb.embedding_retriever
FalkorDBEmbeddingRetriever
A component for retrieving documents from a FalkorDBDocumentStore using vector similarity.
The retriever uses FalkorDB's native vector search index to find documents whose embeddings are most similar to the provided query embedding.
Usage example:
from haystack.dataclasses import Document
from haystack_integrations.document_stores.falkordb import FalkorDBDocumentStore
from haystack_integrations.components.retrievers.falkordb import FalkorDBEmbeddingRetriever

# embedding_dim is set to 3 to match the toy vectors below (the default is 768).
store = FalkorDBDocumentStore(host="localhost", port=6379, embedding_dim=3)
store.write_documents([
    Document(content="GraphRAG is powerful.", embedding=[0.1, 0.2, 0.3]),
    Document(content="FalkorDB is fast.", embedding=[0.8, 0.9, 0.1]),
])

retriever = FalkorDBEmbeddingRetriever(document_store=store)
res = retriever.run(query_embedding=[0.1, 0.2, 0.3])
print(res["documents"][0].content)  # "GraphRAG is powerful."
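To see why the first document is ranked on top in the example above, the cosine similarity can be worked out by hand. The sketch below is a brute-force, pure-Python calculation over the same toy vectors; it only illustrates the similarity measure and makes no claim about how FalkorDB's vector index computes or approximates it.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


query_embedding = [0.1, 0.2, 0.3]
stored = {
    "GraphRAG is powerful.": [0.1, 0.2, 0.3],
    "FalkorDB is fast.": [0.8, 0.9, 0.1],
}

# Score every stored vector against the query and sort best-first.
scores = {text: cosine_similarity(query_embedding, vec) for text, vec in stored.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

print(ranked[0])  # GraphRAG is powerful.
```

The identical vector scores approximately 1.0, while the second document scores roughly 0.64, so it comes second.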
init
__init__(
    document_store: FalkorDBDocumentStore,
    filters: dict[str, Any] | None = None,
    top_k: int = 10,
    filter_policy: FilterPolicy = FilterPolicy.REPLACE,
) -> None
Create a new FalkorDBEmbeddingRetriever.
Parameters:
- document_store (FalkorDBDocumentStore) – The FalkorDBDocumentStore instance.
- filters (dict[str, Any] | None) – Optional Haystack filters to narrow down the search space.
- top_k (int) – Maximum number of documents to retrieve.
- filter_policy (FilterPolicy) – Policy that determines how runtime filters are combined with the filters set at initialisation.
Raises:
ValueError – If the provided document_store is not a FalkorDBDocumentStore.
to_dict
Serialise the retriever to a dictionary.
Returns:
dict[str, Any] – Dictionary representation of the retriever.
from_dict
Deserialise a FalkorDBEmbeddingRetriever produced by to_dict.
Parameters:
- data (dict[str, Any]) – Serialised retriever dictionary.
Returns:
FalkorDBEmbeddingRetriever – Reconstructed FalkorDBEmbeddingRetriever instance.
run
run(
    query_embedding: list[float],
    filters: dict[str, Any] | None = None,
    top_k: int | None = None,
) -> dict[str, list[Document]]
Retrieve documents by vector similarity.
Parameters:
- query_embedding (list[float]) – Query embedding vector.
- filters (dict[str, Any] | None) – Optional Haystack filters, combined with the init filters according to the configured filter policy.
- top_k (int | None) – Maximum number of documents to return. If not provided, the top_k set at initialisation is used.
Returns:
dict[str, list[Document]] – Dictionary containing a "documents" key with the retrieved documents.
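How the runtime arguments of run() interact with the values given at initialisation can be sketched as follows. The `FilterPolicy` enum here is a local stand-in that models only the REPLACE member named in the signature above, and `effective_filters` and `effective_top_k` are hypothetical helpers; the integration's own logic may differ in detail.

```python
from enum import Enum
from typing import Any, Optional


class FilterPolicy(Enum):
    """Local stand-in for Haystack's FilterPolicy; only REPLACE is modelled."""

    REPLACE = "replace"


def effective_filters(
    policy: FilterPolicy,
    init_filters: Optional[dict[str, Any]],
    runtime_filters: Optional[dict[str, Any]],
) -> Optional[dict[str, Any]]:
    """Under REPLACE, runtime filters take the place of the init filters."""
    if policy is FilterPolicy.REPLACE:
        return runtime_filters if runtime_filters is not None else init_filters
    raise NotImplementedError(f"Policy not modelled in this sketch: {policy}")


def effective_top_k(init_top_k: int, runtime_top_k: Optional[int]) -> int:
    """A runtime top_k overrides the init default when it is provided."""
    return runtime_top_k if runtime_top_k is not None else init_top_k


init_filters = {"field": "meta.year", "operator": "==", "value": 2023}
runtime_filters = {"field": "meta.year", "operator": "==", "value": 2024}

print(effective_filters(FilterPolicy.REPLACE, init_filters, runtime_filters))
print(effective_filters(FilterPolicy.REPLACE, init_filters, None))
print(effective_top_k(10, None))  # 10
print(effective_top_k(10, 3))     # 3
```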
haystack_integrations.document_stores.falkordb.document_store
FalkorDBDocumentStore
Bases: DocumentStore
A Haystack DocumentStore backed by FalkorDB, a high-performance graph database, and optimised for GraphRAG workloads.
Documents are stored as graph nodes (labelled Document by default) in a named
FalkorDB graph. Document properties, including meta fields, are stored
flat at the same level as id and content — exactly the same layout as
the neo4j-haystack reference integration.
Vector search is performed via FalkorDB's native vector index —
no APOC is required. All bulk writes use UNWIND + MERGE for safe,
idiomatic OpenCypher upserts.
Usage example:
from haystack_integrations.document_stores.falkordb import FalkorDBDocumentStore
from haystack.dataclasses import Document

store = FalkorDBDocumentStore(host="localhost", port=6379)
store.write_documents([
    Document(content="Hello, GraphRAG!", meta={"year": 2024}),
])
print(store.count_documents())  # 1
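The flat property layout described above can be made concrete with a small sketch. `to_node_properties` is a hypothetical helper, and the clash check on reserved keys is an assumption added for the illustration; the source does not say how the integration handles a meta key named id or content.

```python
from typing import Any


def to_node_properties(doc_id: str, content: str, meta: dict[str, Any]) -> dict[str, Any]:
    """Flatten a document into one single-level property map (no 'meta.' prefix)."""
    # Assumption for this sketch: meta keys must not shadow the reserved fields.
    clashes = meta.keys() & {"id", "content"}
    if clashes:
        raise ValueError(f"meta keys clash with reserved fields: {sorted(clashes)}")
    return {"id": doc_id, "content": content, **meta}


props = to_node_properties("doc-1", "Hello, GraphRAG!", {"year": 2024})
print(props)  # {'id': 'doc-1', 'content': 'Hello, GraphRAG!', 'year': 2024}
```

Because meta fields sit next to id and content, a Cypher query can address them directly, for example as d.year on a node bound to d.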
init
__init__(
    *,
    host: str = "localhost",
    port: int = 6379,
    graph_name: str = "haystack",
    username: str | None = None,
    password: Secret | None = None,
    node_label: str = "Document",
    embedding_dim: int = 768,
    embedding_field: str = "embedding",
    similarity: SimilarityFunction = "cosine",
    write_batch_size: int = 100,
    recreate_graph: bool = False,
    verify_connectivity: bool = False
) -> None
Create a new FalkorDBDocumentStore.
Parameters:
- host (str) – Hostname of the FalkorDB server.
- port (int) – Port the FalkorDB server listens on.
- graph_name (str) – Name of the FalkorDB graph to use. Each graph is an isolated namespace.
- username (str | None) – Optional username for FalkorDB authentication.
- password (Secret | None) – Optional haystack.utils.Secret holding the FalkorDB password. The secret value is resolved lazily on first connection.
- node_label (str) – Label used for document nodes in the graph.
- embedding_dim (int) – Dimensionality of the vector embeddings. Used when creating the vector index.
- embedding_field (str) – Name of the node property that stores the embedding vector.
- similarity (SimilarityFunction) – Similarity function for the vector index. Accepted values are "cosine" and "euclidean".
- write_batch_size (int) – Number of documents written per UNWIND batch.
- recreate_graph (bool) – When True, the existing graph (and all its data) is dropped and recreated on initialisation. Useful for tests.
- verify_connectivity (bool) – When True, a connectivity probe is run immediately in __init__ and raises if the server is unreachable.
Raises:
ValueError – If similarity is not "cosine" or "euclidean".
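The effect of write_batch_size can be sketched without a server: rows are split into chunks, and each chunk would be sent as the parameter of one UNWIND statement. The `batched` helper and the `UPSERT_QUERY` string are illustrative assumptions; the Cypher the integration actually issues may differ.

```python
from typing import Any, Iterator


def batched(rows: list[dict[str, Any]], batch_size: int) -> Iterator[list[dict[str, Any]]]:
    """Yield consecutive slices of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


# Illustrative upsert statement; not taken from the integration's source.
UPSERT_QUERY = "UNWIND $rows AS row MERGE (d:Document {id: row.id}) SET d += row"

rows = [{"id": f"doc-{i}", "content": f"text {i}"} for i in range(250)]
batches = list(batched(rows, batch_size=100))

# Each batch would be bound as {"rows": batch} when executing UPSERT_QUERY.
print([len(batch) for batch in batches])  # [100, 100, 50]
```

With the default of 100, writing 250 documents would therefore take three round trips under this scheme.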
to_dict
Serialise the store to a dictionary suitable for from_dict.
Returns:
dict[str, Any] – Dictionary representation of the store.
from_dict
Deserialise a FalkorDBDocumentStore produced by to_dict.
Parameters:
- data (dict[str, Any]) – Serialised store dictionary.
Returns:
FalkorDBDocumentStore – Reconstructed FalkorDBDocumentStore instance.
count_documents
Return the number of documents currently stored in the graph.
Returns:
int – Integer count of document nodes.
filter_documents
Retrieve all documents that match the provided Haystack filters.
Parameters:
- filters (dict[str, Any] | None) – Optional Haystack filter dict. When None, all documents are returned. For filter syntax, see "Metadata filtering" in the Haystack documentation.
Returns:
list[Document] – List of matching haystack.dataclasses.Document objects.
Raises:
ValueError – If the filter dict is malformed.
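For orientation, a Haystack filter dict nests comparison conditions under logical operators. The sketch below evaluates such a dict against flat property maps in plain Python; `matches` is a hypothetical helper covering only a handful of operators, and stripping a leading "meta." from field names is an assumption based on the flat storage layout, not confirmed behaviour of the integration.

```python
import operator
from typing import Any

COMPARATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(filters: dict[str, Any], props: dict[str, Any]) -> bool:
    """Evaluate a (subset of a) Haystack-style filter against flat properties."""
    if "operator" not in filters:
        raise ValueError("Malformed filter: missing 'operator'")
    op = filters["operator"]
    if op in ("AND", "OR"):
        results = [matches(cond, props) for cond in filters["conditions"]]
        return all(results) if op == "AND" else any(results)
    if op not in COMPARATORS:
        raise ValueError(f"Unsupported operator in this sketch: {op}")
    # Assumption: properties are stored flat, so a "meta." prefix is dropped.
    field = filters["field"].removeprefix("meta.")
    if field not in props:
        return False
    return COMPARATORS[op](props[field], filters["value"])


filters = {
    "operator": "AND",
    "conditions": [
        {"field": "meta.year", "operator": ">=", "value": 2024},
        {"field": "meta.topic", "operator": "==", "value": "graphs"},
    ],
}
nodes = [
    {"id": "a", "content": "new", "year": 2024, "topic": "graphs"},
    {"id": "b", "content": "old", "year": 2021, "topic": "graphs"},
]

print([node["id"] for node in nodes if matches(filters, node)])  # ['a']
```

The same `filters` dict is the shape that would be handed to filter_documents(filters=...).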
write_documents
write_documents(
    documents: list[Document],
    policy: DuplicatePolicy = DuplicatePolicy.NONE,
) -> int
Write documents to the FalkorDB graph using UNWIND + MERGE for batching.
Document meta fields are stored flat at the same level as id and
content — no prefix is added. This matches the layout used by the
neo4j-haystack reference integration.
Parameters:
- documents (list[Document]) – List of haystack.dataclasses.Document objects.
- policy (DuplicatePolicy) – How to handle documents whose id already exists. Defaults to DuplicatePolicy.NONE (treated as FAIL).
Returns:
int – Number of documents written or updated.
Raises:
- ValueError – If documents contains non-Document elements.
- DuplicateDocumentError – If policy is FAIL / NONE and a duplicate ID is encountered.
- DocumentStoreError – If any other DB error occurs.
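The duplicate-handling rules above can be modelled with an in-memory dictionary. The enum and exception below are local stand-ins that mirror the names used by Haystack, and `write` is a hypothetical function; this is a sketch of the documented semantics, not the store's UNWIND + MERGE implementation.

```python
from enum import Enum


class DuplicatePolicy(Enum):
    """Local stand-in mirroring the member names of Haystack's DuplicatePolicy."""

    NONE = "none"
    SKIP = "skip"
    OVERWRITE = "overwrite"
    FAIL = "fail"


class DuplicateDocumentError(Exception):
    """Local stand-in for Haystack's DuplicateDocumentError."""


def write(
    store: dict[str, str],
    docs: list[tuple[str, str]],
    policy: DuplicatePolicy = DuplicatePolicy.NONE,
) -> int:
    """Write (id, content) pairs into store and return how many were written."""
    if policy is DuplicatePolicy.NONE:
        policy = DuplicatePolicy.FAIL  # NONE is treated as FAIL, as documented
    written = 0
    for doc_id, content in docs:
        if doc_id in store:
            if policy is DuplicatePolicy.FAIL:
                raise DuplicateDocumentError(f"Document {doc_id!r} already exists")
            if policy is DuplicatePolicy.SKIP:
                continue  # keep the existing document untouched
        store[doc_id] = content  # new document, or OVERWRITE of an existing one
        written += 1
    return written


existing = {"doc-1": "old"}
docs = [("doc-1", "new"), ("doc-2", "fresh")]

print(write(dict(existing), docs, DuplicatePolicy.SKIP))       # 1
print(write(dict(existing), docs, DuplicatePolicy.OVERWRITE))  # 2
```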
delete_documents
Delete documents by their IDs using a single UNWIND-based query.
Parameters:
- document_ids (list[str]) – List of document IDs to remove from the graph.