Google Vertex integration for Haystack
Module haystack_integrations.components.generators.google_vertex.gemini
VertexAIGeminiGenerator
VertexAIGeminiGenerator enables text generation using Google Gemini models.
Usage example:
from haystack_integrations.components.generators.google_vertex import VertexAIGeminiGenerator
gemini = VertexAIGeminiGenerator()
result = gemini.run(parts = ["What is the most interesting thing you know?"])
for answer in result["replies"]:
    print(answer)
>>> 1. **The Origin of Life:** How and where did life begin? The answers to this ...
>>> 2. **The Unseen Universe:** The vast majority of the universe is ...
>>> 3. **Quantum Entanglement:** This eerie phenomenon in quantum mechanics allows ...
>>> 4. **Time Dilation:** Einstein's theory of relativity revealed that time can ...
>>> 5. **The Fermi Paradox:** Despite the vastness of the universe and the ...
>>> 6. **Biological Evolution:** The idea that life evolves over time through natural ...
>>> 7. **Neuroplasticity:** The brain's ability to adapt and change throughout life, ...
>>> 8. **The Goldilocks Zone:** The concept of the habitable zone, or the Goldilocks zone, ...
>>> 9. **String Theory:** This theoretical framework in physics aims to unify all ...
>>> 10. **Consciousness:** The nature of human consciousness and how it arises ...
VertexAIGeminiGenerator.__init__
def __init__(*,
model: str = "gemini-2.0-flash",
project_id: Optional[str] = None,
location: Optional[str] = None,
generation_config: Optional[Union[GenerationConfig,
Dict[str, Any]]] = None,
safety_settings: Optional[Dict[HarmCategory,
HarmBlockThreshold]] = None,
system_instruction: Optional[Union[str, ByteStream, Part]] = None,
streaming_callback: Optional[Callable[[StreamingChunk],
None]] = None)
Multi-modal generator using Gemini model via Google Vertex AI.
Authenticates using Google Cloud Application Default Credentials (ADCs). For more information see the official Google documentation.
Arguments:
project_id
: ID of the GCP project to use. By default, it is set during Google Cloud authentication.
model
: Name of the model to use. For available models, see https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models.
location
: The default location to use when making API calls. If not set, uses us-central1.
generation_config
: The generation config to use. Can be either a GenerationConfig object or a dictionary of parameters. Accepted fields are:
- temperature
- top_p
- top_k
- candidate_count
- max_output_tokens
- stop_sequences
safety_settings
: The safety settings to use. See the documentation for HarmBlockThreshold and HarmCategory for more details.
system_instruction
: Default system instruction to use for generating content.
streaming_callback
: A callback function that is called when a new token is received from the stream. The callback function accepts StreamingChunk as an argument.
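A minimal sketch of the dictionary form of generation_config, using only the field names listed above. The values are illustrative assumptions rather than recommended settings, and the helper check_generation_config is hypothetical (it is not part of the integration); it only guards against misspelled keys before the dictionary is handed to the component.

```python
# Field names taken from the list of accepted generation_config fields above.
ACCEPTED_FIELDS = {
    "temperature",
    "top_p",
    "top_k",
    "candidate_count",
    "max_output_tokens",
    "stop_sequences",
}


def check_generation_config(config: dict) -> dict:
    """Hypothetical helper: reject keys that are not in the accepted list."""
    unknown = sorted(set(config) - ACCEPTED_FIELDS)
    if unknown:
        raise ValueError(f"Unsupported generation_config fields: {unknown}")
    return config


# Illustrative values only.
generation_config = check_generation_config(
    {"temperature": 0.2, "max_output_tokens": 256, "stop_sequences": ["\n\n"]}
)

# With Google Cloud credentials configured, the dictionary would be passed as:
# gemini = VertexAIGeminiGenerator(generation_config=generation_config)
```

Checking the keys up front is a design choice of this sketch; the component itself may validate or ignore unknown fields differently.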
VertexAIGeminiGenerator.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
VertexAIGeminiGenerator.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "VertexAIGeminiGenerator"
Deserializes the component from a dictionary.
Arguments:
data
: Dictionary to deserialize from.
Returns:
Deserialized component.
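The to_dict and from_dict methods are meant to round-trip a component's configuration. The sketch below illustrates that contract with a hypothetical stand-in class, SketchGenerator, so that it runs without Google Cloud credentials. The dictionary layout (a type identifier plus init_parameters) is an assumption based on the usual Haystack serialization convention, not something stated in this reference.

```python
from typing import Any, Dict, Optional


class SketchGenerator:
    """Hypothetical stand-in for a component; not the real class."""

    def __init__(self, *, model: str = "gemini-2.0-flash", location: Optional[str] = None):
        self.model = model
        self.location = location

    def to_dict(self) -> Dict[str, Any]:
        # Assumed layout: a type identifier plus the constructor arguments.
        return {
            "type": f"{type(self).__module__}.{type(self).__name__}",
            "init_parameters": {"model": self.model, "location": self.location},
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "SketchGenerator":
        # Rebuild the object from the stored constructor arguments.
        return cls(**data["init_parameters"])


original = SketchGenerator(location="us-central1")
restored = SketchGenerator.from_dict(original.to_dict())
```

The restored object carries the same configuration as the original, which is the property that matters when a pipeline is saved and reloaded.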
VertexAIGeminiGenerator.run
@component.output_types(replies=List[str])
def run(parts: Variadic[Union[str, ByteStream, Part]],
streaming_callback: Optional[Callable[[StreamingChunk], None]] = None)
Generates content using the Gemini model.
Arguments:
parts
: Prompt for the model.
streaming_callback
: A callback function that is called when a new token is received from the stream.
Returns:
A dictionary with the following keys:
replies
: A list of generated content.
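A streaming callback receives one chunk at a time. The following sketch shows one possible callback that accumulates the streamed text. FakeChunk is a hypothetical stand-in for StreamingChunk, assumed here to expose the text in a content attribute, and the stream is simulated so that the example runs without calling the API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FakeChunk:
    """Hypothetical stand-in for StreamingChunk, assuming a `content` text field."""

    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


collected: List[str] = []


def collect_chunk(chunk: FakeChunk) -> None:
    """Callback: record each piece of text as it arrives."""
    collected.append(chunk.content)


# Simulate a stream of three chunks instead of calling the API.
for piece in ["Hello", ", ", "world"]:
    collect_chunk(FakeChunk(content=piece))

full_text = "".join(collected)

# With credentials configured, the same callback would be passed as:
# gemini.run(parts=["Say hello"], streaming_callback=collect_chunk)
```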
Module haystack_integrations.components.generators.google_vertex.captioner
VertexAIImageCaptioner
VertexAIImageCaptioner enables image captioning using the Google Vertex AI imagetext generative model.
Authenticates using Google Cloud Application Default Credentials (ADCs). For more information see the official Google documentation.
Usage example:
import requests
from haystack.dataclasses.byte_stream import ByteStream
from haystack_integrations.components.generators.google_vertex import VertexAIImageCaptioner
captioner = VertexAIImageCaptioner()
image = ByteStream(
    data=requests.get(
        "https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations/main/integrations/google_vertex/example_assets/robot1.jpg"
    ).content
)
result = captioner.run(image=image)
for caption in result["captions"]:
    print(caption)
>>> two gold robots are standing next to each other in the desert
VertexAIImageCaptioner.__init__
def __init__(*,
model: str = "imagetext",
project_id: Optional[str] = None,
location: Optional[str] = None,
**kwargs)
Generate image captions using a Google Vertex AI model.
Authenticates using Google Cloud Application Default Credentials (ADCs). For more information see the official Google documentation.
Arguments:
project_id
: ID of the GCP project to use. By default, it is set during Google Cloud authentication.
model
: Name of the model to use.
location
: The default location to use when making API calls. If not set, uses us-central1. Defaults to None.
kwargs
: Additional keyword arguments to pass to the model. For a list of supported arguments, see the ImageTextModel.get_captions() documentation.
VertexAIImageCaptioner.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
VertexAIImageCaptioner.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "VertexAIImageCaptioner"
Deserializes the component from a dictionary.
Arguments:
data
: Dictionary to deserialize from.
Returns:
Deserialized component.
VertexAIImageCaptioner.run
@component.output_types(captions=List[str])
def run(image: ByteStream)
Prompts the model to generate captions for the given image.
Arguments:
image
: The image to generate captions for.
Returns:
A dictionary with the following keys:
captions
: A list of captions generated by the model.
Module haystack_integrations.components.generators.google_vertex.code_generator
VertexAICodeGenerator
This component enables code generation using a Google Vertex AI generative model.
VertexAICodeGenerator supports code-bison, code-bison-32k, and code-gecko.
Usage example:
from haystack_integrations.components.generators.google_vertex import VertexAICodeGenerator
generator = VertexAICodeGenerator()
result = generator.run(prefix="def to_json(data):")
for answer in result["replies"]:
    print(answer)
>>> ```python
>>> import json
>>>
>>> def to_json(data):
>>> """Converts a Python object to a JSON string.
>>>
>>> Args:
>>> data: The Python object to convert.
>>>
>>> Returns:
>>> A JSON string representing the Python object.
>>> """
>>>
>>> return json.dumps(data)
>>> ```
VertexAICodeGenerator.__init__
def __init__(*,
model: str = "code-bison",
project_id: Optional[str] = None,
location: Optional[str] = None,
**kwargs)
Generate code using a Google Vertex AI model.
Authenticates using Google Cloud Application Default Credentials (ADCs). For more information see the official Google documentation.
Arguments:
project_id
: ID of the GCP project to use. By default, it is set during Google Cloud authentication.
model
: Name of the model to use.
location
: The default location to use when making API calls. If not set, uses us-central1.
kwargs
: Additional keyword arguments to pass to the model. For a list of supported arguments, see the TextGenerationModel.predict() documentation.
VertexAICodeGenerator.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
VertexAICodeGenerator.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "VertexAICodeGenerator"
Deserializes the component from a dictionary.
Arguments:
data
: Dictionary to deserialize from.
Returns:
Deserialized component.
VertexAICodeGenerator.run
@component.output_types(replies=List[str])
def run(prefix: str, suffix: Optional[str] = None)
Generate code using a Google Vertex AI model.
Arguments:
prefix
: Code before the current point.suffix
: Code after the current point.
Returns:
A dictionary with the following keys:
replies
: A list of generated code snippets.
Module haystack_integrations.components.generators.google_vertex.image_generator
VertexAIImageGenerator
This component enables image generation using a Google Vertex AI generative model.
Authenticates using Google Cloud Application Default Credentials (ADCs). For more information see the official Google documentation.
Usage example:
from pathlib import Path
from haystack_integrations.components.generators.google_vertex import VertexAIImageGenerator
generator = VertexAIImageGenerator()
result = generator.run(prompt="Generate an image of a cute cat")
result["images"][0].to_file(Path("my_image.png"))
VertexAIImageGenerator.__init__
def __init__(*,
model: str = "imagegeneration",
project_id: Optional[str] = None,
location: Optional[str] = None,
**kwargs)
Generates images using a Google Vertex AI model.
Authenticates using Google Cloud Application Default Credentials (ADCs). For more information see the official Google documentation.
Arguments:
project_id
: ID of the GCP project to use. By default, it is set during Google Cloud authentication.
model
: Name of the model to use.
location
: The default location to use when making API calls. If not set, uses us-central1.
kwargs
: Additional keyword arguments to pass to the model. For a list of supported arguments, see the ImageGenerationModel.generate_images() documentation.
VertexAIImageGenerator.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
VertexAIImageGenerator.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "VertexAIImageGenerator"
Deserializes the component from a dictionary.
Arguments:
data
: Dictionary to deserialize from.
Returns:
Deserialized component.
VertexAIImageGenerator.run
@component.output_types(images=List[ByteStream])
def run(prompt: str, negative_prompt: Optional[str] = None)
Produces images based on the given prompt.
Arguments:
prompt
: The prompt to generate images from.
negative_prompt
: A description of what you want to omit in the generated images.
Returns:
A dictionary with the following keys:
images
: A list of ByteStream objects, each containing an image.
Module haystack_integrations.components.generators.google_vertex.question_answering
VertexAIImageQA
This component answers questions about an image using Google Vertex AI generative models.
Authenticates using Google Cloud Application Default Credentials (ADCs). For more information see the official Google documentation.
Usage example:
from haystack.dataclasses.byte_stream import ByteStream
from haystack_integrations.components.generators.google_vertex import VertexAIImageQA
qa = VertexAIImageQA()
image = ByteStream.from_file_path("dog.jpg")
res = qa.run(image=image, question="What color is this dog")
print(res["replies"][0])
>>> white
VertexAIImageQA.__init__
def __init__(*,
model: str = "imagetext",
project_id: Optional[str] = None,
location: Optional[str] = None,
**kwargs)
Answers questions about an image using a Google Vertex AI model.
Authenticates using Google Cloud Application Default Credentials (ADCs). For more information see the official Google documentation.
Arguments:
project_id
: ID of the GCP project to use. By default, it is set during Google Cloud authentication.
model
: Name of the model to use.
location
: The default location to use when making API calls. If not set, uses us-central1.
kwargs
: Additional keyword arguments to pass to the model. For a list of supported arguments, see the ImageTextModel.ask_question() documentation.
VertexAIImageQA.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
VertexAIImageQA.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "VertexAIImageQA"
Deserializes the component from a dictionary.
Arguments:
data
: Dictionary to deserialize from.
Returns:
Deserialized component.
VertexAIImageQA.run
@component.output_types(replies=List[str])
def run(image: ByteStream, question: str)
Prompts model to answer a question about an image.
Arguments:
image
: The image to ask the question about.
question
: The question to ask.
Returns:
A dictionary with the following keys:
replies
: A list of answers to the question.
Module haystack_integrations.components.generators.google_vertex.text_generator
VertexAITextGenerator
This component enables text generation using Google Vertex AI generative models.
VertexAITextGenerator supports the text-bison, text-unicorn, and text-bison-32k models.
Authenticates using Google Cloud Application Default Credentials (ADCs). For more information see the official Google documentation.
Usage example:
from haystack_integrations.components.generators.google_vertex import VertexAITextGenerator
generator = VertexAITextGenerator()
res = generator.run("Tell me a good interview question for a software engineer.")
print(res["replies"][0])
>>> **Question:**
>>> You are given a list of integers and a target sum.
>>> Find all unique combinations of numbers in the list that add up to the target sum.
>>>
>>> **Example:**
>>>
>>> ```
>>> Input: [1, 2, 3, 4, 5], target = 7
>>> Output: [[1, 2, 4], [3, 4]]
>>> ```
>>>
>>> **Follow-up:** What if the list contains duplicate numbers?
VertexAITextGenerator.__init__
def __init__(*,
model: str = "text-bison",
project_id: Optional[str] = None,
location: Optional[str] = None,
**kwargs)
Generate text using a Google Vertex AI model.
Authenticates using Google Cloud Application Default Credentials (ADCs). For more information see the official Google documentation.
Arguments:
project_id
: ID of the GCP project to use. By default, it is set during Google Cloud authentication.
model
: Name of the model to use.
location
: The default location to use when making API calls. If not set, uses us-central1.
kwargs
: Additional keyword arguments to pass to the model. For a list of supported arguments, see the TextGenerationModel.predict() documentation.
VertexAITextGenerator.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
VertexAITextGenerator.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "VertexAITextGenerator"
Deserializes the component from a dictionary.
Arguments:
data
: Dictionary to deserialize from.
Returns:
Deserialized component.
VertexAITextGenerator.run
@component.output_types(replies=List[str],
safety_attributes=Dict[str, float],
citations=List[Dict[str, Any]])
def run(prompt: str)
Prompts the model to generate text.
Arguments:
prompt
: The prompt to use for text generation.
Returns:
A dictionary with the following keys:
replies
: A list of generated replies.
safety_attributes
: A dictionary with the safety scores of each answer.
citations
: A list of citations for each answer.
Module haystack_integrations.components.generators.google_vertex.chat.gemini
VertexAIGeminiChatGenerator
VertexAIGeminiChatGenerator enables chat completion using Google Gemini models.
Authenticates using Google Cloud Application Default Credentials (ADCs). For more information see the official Google documentation.
Usage example
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.google_vertex import VertexAIGeminiChatGenerator
gemini_chat = VertexAIGeminiChatGenerator()
messages = [ChatMessage.from_user("Tell me the name of a movie")]
res = gemini_chat.run(messages)
print(res["replies"][0].text)
>>> The Shawshank Redemption
#### With Tool calling:
```python
from typing import Annotated
from haystack.utils import Secret
from haystack.dataclasses.chat_message import ChatMessage
from haystack.components.tools import ToolInvoker
from haystack.tools import create_tool_from_function
from haystack_integrations.components.generators.google_vertex import VertexAIGeminiChatGenerator

# example function to get the current weather
def get_current_weather(
    location: Annotated[str, "The city for which to get the weather, e.g. 'San Francisco'"] = "Munich",
    unit: Annotated[str, "The unit for the temperature, e.g. 'celsius'"] = "celsius",
) -> str:
    return f"The weather in {location} is sunny. The temperature is 20 {unit}."

tool = create_tool_from_function(get_current_weather)
tool_invoker = ToolInvoker(tools=[tool])
gemini_chat = VertexAIGeminiChatGenerator(
    model="gemini-2.0-flash-exp",
    tools=[tool],
)
user_message = [ChatMessage.from_user("What is the temperature in celsius in Berlin?")]
replies = gemini_chat.run(messages=user_message)["replies"]
print(replies[0].tool_calls)

# actually invoke the tool
tool_messages = tool_invoker.run(messages=replies)["tool_messages"]
messages = user_message + replies + tool_messages

# transform the tool call result into a human readable message
final_replies = gemini_chat.run(messages=messages)["replies"]
print(final_replies[0].text)
```
VertexAIGeminiChatGenerator.__init__
def __init__(*,
model: str = "gemini-1.5-flash",
project_id: Optional[str] = None,
location: Optional[str] = None,
generation_config: Optional[Union[GenerationConfig,
Dict[str, Any]]] = None,
safety_settings: Optional[Dict[HarmCategory,
HarmBlockThreshold]] = None,
tools: Optional[List[Tool]] = None,
tool_config: Optional[ToolConfig] = None,
streaming_callback: Optional[StreamingCallbackT] = None)
VertexAIGeminiChatGenerator enables chat completion using Google Gemini models.
Authenticates using Google Cloud Application Default Credentials (ADCs). For more information see the official Google documentation.
Arguments:
model
: Name of the model to use. For available models, see https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models.
project_id
: ID of the GCP project to use. By default, it is set during Google Cloud authentication.
location
: The default location to use when making API calls. If not set, uses us-central1. Defaults to None.
generation_config
: Configuration for the generation process. See the [GenerationConfig documentation](https://cloud.google.com/python/docs/reference/aiplatform/latest/vertexai.generative_models.GenerationConfig) for a list of supported arguments.
safety_settings
: Safety settings to use when generating content. See the documentation for HarmBlockThreshold and HarmCategory for more details.
tools
: A list of tools for which the model can prepare calls.
tool_config
: The tool config to use. See the documentation for [ToolConfig](https://cloud.google.com/vertex-ai/generative-ai/docs/reference/python/latest/vertexai.generative_models.ToolConfig).
streaming_callback
: A callback function that is called when a new token is received from the stream. The callback function accepts StreamingChunk as an argument.
VertexAIGeminiChatGenerator.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
VertexAIGeminiChatGenerator.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "VertexAIGeminiChatGenerator"
Deserializes the component from a dictionary.
Arguments:
data
: Dictionary to deserialize from.
Returns:
Deserialized component.
VertexAIGeminiChatGenerator.run
@component.output_types(replies=List[ChatMessage])
def run(messages: List[ChatMessage],
streaming_callback: Optional[StreamingCallbackT] = None,
*,
tools: Optional[List[Tool]] = None)
Arguments:
messages
: A list of ChatMessage instances, representing the input messages.
streaming_callback
: A callback function that is called when a new token is received from the stream.
tools
: A list of tools for which the model can prepare calls. If set, it will override the tools parameter set during component initialization.
Returns:
A dictionary containing the following key:
replies
: A list containing the generated responses as ChatMessage instances.
VertexAIGeminiChatGenerator.run_async
@component.output_types(replies=List[ChatMessage])
async def run_async(messages: List[ChatMessage],
streaming_callback: Optional[StreamingCallbackT] = None,
*,
tools: Optional[List[Tool]] = None)
Async version of the run method. Generates text based on the provided messages.
Arguments:
messages
: A list of ChatMessage instances, representing the input messages.
streaming_callback
: A callback function that is called when a new token is received from the stream.
tools
: A list of tools for which the model can prepare calls. If set, it will override the tools parameter set during component initialization.
Returns:
A dictionary containing the following key:
replies
: A list containing the generated responses as ChatMessage instances.
Module haystack_integrations.components.embedders.google_vertex.document_embedder
VertexAIDocumentEmbedder
Embed text using Vertex AI Embedder API
Available models found here: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api#syntax
Usage example:
from haystack import Document
from haystack_integrations.components.embedders.google_vertex import VertexAIDocumentEmbedder
doc = Document(content="I love pizza!")
document_embedder = VertexAIDocumentEmbedder(model="text-embedding-005")
result = document_embedder.run([doc])
print(result['documents'][0].embedding)
# [-0.044606007635593414, 0.02857724390923977, -0.03549133986234665,
VertexAIDocumentEmbedder.__init__
def __init__(model: Literal[
"text-embedding-004",
"text-embedding-005",
"textembedding-gecko-multilingual@001",
"text-multilingual-embedding-002",
"text-embedding-large-exp-03-07",
],
task_type: Literal[
"RETRIEVAL_DOCUMENT",
"RETRIEVAL_QUERY",
"SEMANTIC_SIMILARITY",
"CLASSIFICATION",
"CLUSTERING",
"QUESTION_ANSWERING",
"FACT_VERIFICATION",
"CODE_RETRIEVAL_QUERY",
] = "RETRIEVAL_DOCUMENT",
gcp_region_name: Optional[Secret] = Secret.from_env_var(
"GCP_DEFAULT_REGION", strict=False),
gcp_project_id: Optional[Secret] = Secret.from_env_var(
"GCP_PROJECT_ID", strict=False),
batch_size: int = 32,
max_tokens_total: int = 20000,
time_sleep: int = 30,
retries: int = 3,
progress_bar: bool = True,
truncate_dim: Optional[int] = None,
meta_fields_to_embed: Optional[List[str]] = None,
embedding_separator: str = "\n") -> None
Creates a document embedder that uses a Google Vertex AI model.
Authenticates using Google Cloud Application Default Credentials (ADCs). For more information see the official Google documentation.
Arguments:
model
: Name of the model to use.
task_type
: The type of task for which the embeddings are being generated. For more information see the official Google documentation.
gcp_region_name
: The default location to use when making API calls. If not set, uses us-central1.
gcp_project_id
: ID of the GCP project to use. By default, it is set during Google Cloud authentication.
batch_size
: The number of documents to process in a single batch.
max_tokens_total
: The maximum number of tokens to process in total.
time_sleep
: The time to sleep between retries in seconds.
retries
: The number of retries in case of failure.
progress_bar
: Whether to display a progress bar during processing.
truncate_dim
: The dimension to truncate the embeddings to, if specified.
meta_fields_to_embed
: A list of metadata fields to include in the embeddings.
embedding_separator
: The separator to use between different embeddings.
Raises:
ValueError
: If the provided model is not in the list of supported models.
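The meta_fields_to_embed and embedding_separator parameters are easier to picture with a small example. The sketch below assumes the convention commonly used by Haystack embedders, in which the selected metadata values are joined with the document content using the separator before the text is embedded; whether this component does exactly that is an assumption. The helper build_text_to_embed is hypothetical and works on plain strings and dictionaries so that it runs without the library.

```python
from typing import Any, Dict, List


def build_text_to_embed(
    content: str,
    meta: Dict[str, Any],
    meta_fields_to_embed: List[str],
    embedding_separator: str = "\n",
) -> str:
    """Hypothetical helper: prepend selected metadata values to the content."""
    meta_values = [
        str(meta[key])
        for key in meta_fields_to_embed
        if key in meta and meta[key] is not None  # skip missing or empty fields
    ]
    return embedding_separator.join(meta_values + [content])


text = build_text_to_embed(
    content="I love pizza!",
    meta={"title": "Food diary", "author": None},
    meta_fields_to_embed=["title", "author"],
)
```

Here the title is prepended to the content on its own line, while the empty author field is skipped.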
VertexAIDocumentEmbedder.get_text_embedding_input
def get_text_embedding_input(
batch: List[Document]) -> List[TextEmbeddingInput]
Converts a batch of Document objects into a list of TextEmbeddingInput objects.
Arguments:
batch
List[Document] - A list of Document objects to be converted.
Returns:
List[TextEmbeddingInput]
- A list of TextEmbeddingInput objects created from the input documents.
VertexAIDocumentEmbedder.embed_batch_by_smaller_batches
def embed_batch_by_smaller_batches(batch: List[str],
subbatch=1) -> List[List[float]]
Embeds a batch of text strings by dividing them into smaller sub-batches.
Arguments:
batch
List[str] - A list of text strings to be embedded.
subbatch
int, optional - The size of the smaller sub-batches. Defaults to 1.
Returns:
List[List[float]]
- A list of embeddings, where each embedding is a list of floats.
Raises:
Exception
- If embedding fails at the item level, an exception is raised with the error details.
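One way to picture the sub-batch strategy is the sketch below. It is an illustration under stated assumptions, not the component's implementation: embed_in_subbatches and fake_embed are hypothetical names, the API call is replaced by a local stand-in, and the fallback to single items when a chunk fails is one plausible reading of "fails at the item level".

```python
from typing import Callable, List

Embedding = List[float]


def embed_in_subbatches(
    batch: List[str],
    embed_fn: Callable[[List[str]], List[Embedding]],
    subbatch: int = 1,
) -> List[Embedding]:
    """Hypothetical sketch: embed `batch` in chunks of `subbatch` items."""
    embeddings: List[Embedding] = []
    for start in range(0, len(batch), subbatch):
        chunk = batch[start:start + subbatch]
        try:
            embeddings.extend(embed_fn(chunk))
        except Exception as error:
            if len(chunk) == 1:
                # Nothing smaller to fall back to: surface the failing item.
                raise RuntimeError(f"Embedding failed for {chunk[0]!r}: {error}") from error
            # Retry the failing chunk one item at a time.
            embeddings.extend(embed_in_subbatches(chunk, embed_fn, subbatch=1))
    return embeddings


def fake_embed(texts: List[str]) -> List[Embedding]:
    """Stand-in for the API call: one 1-dimensional vector per text."""
    return [[float(len(text))] for text in texts]


vectors = embed_in_subbatches(["a", "bb", "ccc"], fake_embed, subbatch=2)
```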
VertexAIDocumentEmbedder.embed_batch
def embed_batch(batch: List[str]) -> List[List[float]]
Generate embeddings for a batch of text strings.
Arguments:
batch
List[str] - A list of text strings to be embedded.
Returns:
List[List[float]]
- A list of embeddings, where each embedding is a list of floats.
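How the retries and time_sleep parameters from __init__ interact with a batch call is not spelled out in this reference. One plausible reading is a fixed-delay retry loop around each call, sketched below. The function call_with_retries is hypothetical, and the sleep function is injected so that the example runs instantly.

```python
import time
from typing import Callable, List

Embedding = List[float]


def call_with_retries(
    embed_fn: Callable[[List[str]], List[Embedding]],
    batch: List[str],
    retries: int = 3,
    time_sleep: float = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Embedding]:
    """Hypothetical sketch: retry `embed_fn(batch)` with a fixed delay."""
    last_error = None
    for attempt in range(retries + 1):  # one initial try plus `retries` retries
        try:
            return embed_fn(batch)
        except Exception as error:
            last_error = error
            if attempt < retries:
                sleep(time_sleep)  # wait before the next attempt
    raise RuntimeError(f"Embedding failed after {retries} retries") from last_error


attempts = {"count": 0}


def flaky_embed(texts: List[str]) -> List[Embedding]:
    """Stand-in for the API call: fails twice, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return [[0.0] for _ in texts]


delays: List[float] = []  # records requested delays instead of sleeping
result = call_with_retries(flaky_embed, ["a", "b"], retries=3, time_sleep=30, sleep=delays.append)
```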
VertexAIDocumentEmbedder.run
@component.output_types(documents=List[Document])
def run(documents: List[Document])
Processes all documents in batches while adhering to the API's token limit per request.
Arguments:
documents
: A list of Document objects to embed.
Returns:
A dictionary with the following keys:
documents
: The input documents, each with its embedding field populated.
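Batching under both an item limit and a token limit can be sketched as a greedy grouping. The example below is an assumption about how batch_size and max_tokens_total might be combined, not the component's actual algorithm. The function batch_by_token_budget is hypothetical, and counting whitespace-separated words stands in for a real tokenizer.

```python
from typing import Callable, List


def batch_by_token_budget(
    texts: List[str],
    batch_size: int = 32,
    max_tokens_total: int = 20000,
    count_tokens: Callable[[str], int] = lambda text: len(text.split()),
) -> List[List[str]]:
    """Hypothetical sketch: greedily group texts under item and token limits."""
    batches: List[List[str]] = []
    current: List[str] = []
    current_tokens = 0
    for text in texts:
        tokens = count_tokens(text)
        full = len(current) >= batch_size
        over_budget = current_tokens + tokens > max_tokens_total
        if current and (full or over_budget):
            # Close the current batch before it exceeds either limit.
            batches.append(current)
            current, current_tokens = [], 0
        current.append(text)
        current_tokens += tokens
    if current:
        batches.append(current)
    return batches


batches = batch_by_token_budget(
    ["one two", "three", "four five six", "seven"],
    batch_size=2,
    max_tokens_total=4,
)
```

A single text that exceeds the token budget on its own still gets a batch to itself in this sketch; a real implementation would need to decide whether to truncate or reject it.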
VertexAIDocumentEmbedder.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
VertexAIDocumentEmbedder.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "VertexAIDocumentEmbedder"
Deserializes the component from a dictionary.
Arguments:
data
: Dictionary to deserialize from.
Returns:
Deserialized component.
Module haystack_integrations.components.embedders.google_vertex.text_embedder
VertexAITextEmbedder
Embed text using VertexAI Text Embedder API
Available models found here: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api#syntax
Usage example:
from haystack_integrations.components.embedders.google_vertex import VertexAITextEmbedder
text_to_embed = "I love pizza!"
text_embedder = VertexAITextEmbedder(model="text-embedding-005")
print(text_embedder.run(text_to_embed))
# {'embedding': [-0.08127457648515701, 0.03399784862995148, -0.05116401985287666, ...]
VertexAITextEmbedder.__init__
def __init__(model: Literal[
"text-embedding-004",
"text-embedding-005",
"textembedding-gecko-multilingual@001",
"text-multilingual-embedding-002",
"text-embedding-large-exp-03-07",
],
task_type: Literal[
"RETRIEVAL_DOCUMENT",
"RETRIEVAL_QUERY",
"SEMANTIC_SIMILARITY",
"CLASSIFICATION",
"CLUSTERING",
"QUESTION_ANSWERING",
"FACT_VERIFICATION",
"CODE_RETRIEVAL_QUERY",
] = "RETRIEVAL_QUERY",
gcp_region_name: Optional[Secret] = Secret.from_env_var(
"GCP_DEFAULT_REGION", strict=False),
gcp_project_id: Optional[Secret] = Secret.from_env_var(
"GCP_PROJECT_ID", strict=False),
progress_bar: bool = True,
truncate_dim: Optional[int] = None) -> None
Initializes the TextEmbedder with the specified model, task type, and GCP configuration.
Arguments:
model
: The model to be used for text embedding. One of "text-embedding-004", "text-embedding-005", "textembedding-gecko-multilingual@001", "text-multilingual-embedding-002", or "text-embedding-large-exp-03-07".
task_type
: The type of task for which the embedding model will be used. One of "RETRIEVAL_DOCUMENT", "RETRIEVAL_QUERY", "SEMANTIC_SIMILARITY", "CLASSIFICATION", "CLUSTERING", "QUESTION_ANSWERING", "FACT_VERIFICATION", or "CODE_RETRIEVAL_QUERY". For details, see the Vertex AI documentation: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api#tasktype
gcp_region_name
: The GCP region name. By default, it is read from the environment variable "GCP_DEFAULT_REGION".
gcp_project_id
: The GCP project ID. By default, it is read from the environment variable "GCP_PROJECT_ID".
progress_bar
: Whether to display a progress bar during operations. Defaults to True.
truncate_dim
: The dimension to which embeddings should be truncated. Defaults to None.
Returns:
None
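Both embedders read the region and project from the environment variables named in their signatures, with strict=False. The sketch below illustrates that kind of optional lookup with the standard library only. The helper resolve_optional_env is hypothetical, the project value is a placeholder, and the assumption is that an unset variable simply resolves to None rather than raising an error.

```python
import os
from typing import Optional


def resolve_optional_env(name: str) -> Optional[str]:
    """Hypothetical helper mirroring a non-strict lookup: missing variables give None."""
    return os.environ.get(name)


os.environ["GCP_PROJECT_ID"] = "my-example-project"  # placeholder value
os.environ.pop("GCP_DEFAULT_REGION", None)  # simulate an unset region

project_id = resolve_optional_env("GCP_PROJECT_ID")
region = resolve_optional_env("GCP_DEFAULT_REGION")

# With the variables exported, the embedder could then be created without
# passing gcp_project_id or gcp_region_name explicitly:
# text_embedder = VertexAITextEmbedder(model="text-embedding-005")
```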
VertexAITextEmbedder.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
VertexAITextEmbedder.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "VertexAITextEmbedder"
Deserializes the component from a dictionary.
Arguments:
data
: Dictionary to deserialize from.
Returns:
Deserialized component.