GoogleGenAIMultimodalDocumentEmbedder
GoogleGenAIMultimodalDocumentEmbedder computes the embeddings of a list of non-textual documents and stores the obtained vectors in the embedding field of each document.
It uses Google AI multimodal embedding models with the ability to embed text, images, videos, and audio into the same vector space.
| Most common position in a pipeline | Before a DocumentWriter in an indexing pipeline |
| Mandatory init variables | api_key: The Google API key. Can be set with GOOGLE_API_KEY or GEMINI_API_KEY env var. |
| Mandatory run variables | documents: A list of documents, each with a meta field containing a file path to the image, video, audio, or PDF file to embed |
| Output variables | documents: A list of documents enriched with embeddings; meta: A dictionary of metadata |
| API reference | Google AI |
| GitHub link | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/google_genai |
Overview
GoogleGenAIMultimodalDocumentEmbedder expects a list of documents, each containing a file path in a meta field. The name of this meta field defaults to file_path and can be changed with the file_path_meta_field init parameter of this component.
The embedder loads the files, computes the embeddings using a Google AI model, and stores each resulting vector in the embedding field of the corresponding document.
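For example, if your documents store the path under a different meta key, you can point the embedder at it with file_path_meta_field. This sketch assumes a hypothetical meta key named source_path and requires a valid API key to run:

```python
from haystack import Document
from haystack_integrations.components.embedders.google_genai import (
    GoogleGenAIMultimodalDocumentEmbedder,
)

# The file path lives under a custom meta key, "source_path", instead of the default "file_path"
docs = [Document(meta={"source_path": "path/to/image.jpg"})]

# Tell the embedder which meta field holds the file path
embedder = GoogleGenAIMultimodalDocumentEmbedder(file_path_meta_field="source_path")
result = embedder.run(documents=docs)  # embeddings are stored on each document
```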
GoogleGenAIMultimodalDocumentEmbedder is commonly used in indexing pipelines. At retrieval time, you need to use the same model with a GoogleGenAITextEmbedder to embed the query, before using an Embedding Retriever.
This component is compatible with Gemini multimodal models: gemini-2-embedding-preview and later. For a complete list of supported models, see the Google AI documentation.
To embed a textual document, you should use the GoogleGenAIDocumentEmbedder.
To embed a string, you should use the GoogleGenAITextEmbedder.
To start using this integration with Haystack, install it with:
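Assuming the standard naming convention for Haystack integration packages used by the repository linked above:

```shell
pip install google-genai-haystack
```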
Authentication
Google Gen AI is compatible with both the Gemini Developer API and the Vertex AI API.
To use this component with the Gemini Developer API and get an API key, visit Google AI Studio. To use this component with the Vertex AI API, visit Google Cloud > Vertex AI.
The component uses the GOOGLE_API_KEY or GEMINI_API_KEY environment variable by default. Alternatively, you can pass an API key at initialization as a Secret, using the Secret.from_token static method:
from haystack.utils import Secret

embedder = GoogleGenAIMultimodalDocumentEmbedder(
    api_key=Secret.from_token("<your-api-key>"),
)
The following examples show how to use the component with the Gemini Developer API and the Vertex AI API.
Gemini Developer API (API Key Authentication)
from haystack_integrations.components.embedders.google_genai import (
GoogleGenAIMultimodalDocumentEmbedder,
)
# Set the GOOGLE_API_KEY or GEMINI_API_KEY environment variable before running
embedder = GoogleGenAIMultimodalDocumentEmbedder()
Vertex AI (Application Default Credentials)
from haystack_integrations.components.embedders.google_genai import (
GoogleGenAIMultimodalDocumentEmbedder,
)
# Using Application Default Credentials (requires gcloud auth setup)
embedder = GoogleGenAIMultimodalDocumentEmbedder(
api="vertex",
vertex_ai_project="my-project",
vertex_ai_location="us-central1",
)
Vertex AI (API Key Authentication)
from haystack_integrations.components.embedders.google_genai import (
GoogleGenAIMultimodalDocumentEmbedder,
)
# Set the GOOGLE_API_KEY or GEMINI_API_KEY environment variable before running
embedder = GoogleGenAIMultimodalDocumentEmbedder(api="vertex")
Usage
On its own
Here is how you can use the component on its own. You'll need to pass in your Google API key with a Secret or set it as an environment variable called GOOGLE_API_KEY or GEMINI_API_KEY.
The examples below assume you've set the environment variable.
from haystack import Document
from haystack_integrations.components.embedders.google_genai import (
GoogleGenAIMultimodalDocumentEmbedder,
)
docs = [
Document(meta={"file_path": "path/to/image.jpg"}),
Document(meta={"file_path": "path/to/video.mp4"}),
Document(meta={"file_path": "path/to/pdf.pdf", "page_number": 1}),
Document(meta={"file_path": "path/to/pdf.pdf", "page_number": 3}),
]
document_embedder = GoogleGenAIMultimodalDocumentEmbedder()
result = document_embedder.run(documents=docs)
print(result["documents"][0].embedding)
# [0.017020374536514282, -0.023255806416273117, ...]
Setting embedding dimensions
Models like gemini-2-embedding-preview have a default embedding dimension of 3072, but, thanks to Matryoshka Representation Learning, you can reduce the embedding size while keeping similar performance.
Check the Google AI documentation for more information.
from haystack import Document
from haystack_integrations.components.embedders.google_genai import (
GoogleGenAIMultimodalDocumentEmbedder,
)
docs = [Document(meta={"file_path": "path/to/image.jpg"})]
doc_multimodal_embedder = GoogleGenAIMultimodalDocumentEmbedder(
config={"output_dimensionality": 768},
)
docs_with_embeddings = doc_multimodal_embedder.run(documents=docs)["documents"]
In a pipeline
In the following example, we look for a specific plot in the "Scaling Instruction-Finetuned Language Models" paper (PDF format).
You first need to download the PDF file from https://arxiv.org/pdf/2210.11416.pdf.
from haystack import Document
from haystack import Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.embedders.google_genai import (
GoogleGenAITextEmbedder,
)
from haystack_integrations.components.embedders.google_genai import (
GoogleGenAIMultimodalDocumentEmbedder,
)
from haystack.components.writers import DocumentWriter
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
document_store = InMemoryDocumentStore(embedding_similarity_function="cosine")
paper_path = "2210.11416.pdf"
documents = [
Document(meta={"file_path": paper_path, "page_number": i}) for i in range(1, 16)
]
indexing_pipeline = Pipeline()
indexing_pipeline.add_component("embedder", GoogleGenAIMultimodalDocumentEmbedder())
indexing_pipeline.add_component("writer", DocumentWriter(document_store=document_store))
indexing_pipeline.connect("embedder", "writer")
indexing_pipeline.run({"embedder": {"documents": documents}})
query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", GoogleGenAITextEmbedder())
query_pipeline.add_component(
"retriever",
InMemoryEmbeddingRetriever(document_store=document_store),
)
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
query = "plot showing BBH accuracy"
result = query_pipeline.run({"text_embedder": {"text": query}})
print(result["retriever"]["documents"][0].meta)
# {'file_path': '2210.11416.pdf', 'page_number': 9}