Enables the use of Large Language Models from different providers with PromptNode.
Module anthropic_claude
AnthropicClaudeInvocationLayer
class AnthropicClaudeInvocationLayer(PromptModelInvocationLayer)
Anthropic Claude Invocation Layer
This layer invokes the Claude API provided by Anthropic.
AnthropicClaudeInvocationLayer.__init__
def __init__(api_key: str,
model_name_or_path: str = "claude-2",
max_length=200,
**kwargs)
Creates an instance of PromptModelInvocationLayer for Claude models by Anthropic.
Arguments:

- `model_name_or_path`: The name or path of the underlying model.
- `max_tokens_to_sample`: The maximum length of the output text.
- `api_key`: The Anthropic API key.
- `kwargs`: Additional keyword arguments passed to the underlying model. The list of Anthropic-relevant kwargs includes: stop_sequences, temperature, top_p, top_k, and stream. For more details about these kwargs, see Anthropic's documentation.
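For illustration, here is a minimal sketch of using this layer through PromptNode; the model kwargs are examples drawn from the list above, and the API key is a placeholder:

```python
from haystack.nodes import PromptNode

# A sketch: PromptNode selects AnthropicClaudeInvocationLayer because the
# model name is a Claude model. Replace the placeholder key with your own.
pn = PromptNode(model_name_or_path="claude-2",
                api_key="<your-anthropic-api-key>",
                max_length=200,
                model_kwargs={"temperature": 0.5, "top_p": 0.9})
print(pn("Explain the difference between a list and a tuple in Python."))
```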
AnthropicClaudeInvocationLayer.invoke
def invoke(*args, **kwargs)
Invokes a prompt on the model. It takes in a prompt and returns a list of responses using a REST invocation.
Returns:
The responses generated by the model.
AnthropicClaudeInvocationLayer.supports
@classmethod
def supports(cls, model_name_or_path: str, **kwargs) -> bool
Ensures Anthropic Claude Invocation Layer is selected only when Claude models are specified in the model name.
Module azure_chatgpt
AzureChatGPTInvocationLayer
class AzureChatGPTInvocationLayer(ChatGPTInvocationLayer)
Azure ChatGPT Invocation Layer
This layer is used to invoke the ChatGPT API on Azure. It is essentially the same as the ChatGPTInvocationLayer with two additional parameters: `azure_base_url` and `azure_deployment_name`. The `azure_base_url` is the URL of the Azure OpenAI endpoint and the `azure_deployment_name` is the name of the deployment.
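For illustration, a minimal sketch of passing both parameters through PromptNode's model_kwargs; the resource URL, deployment name, and environment variable are placeholders, not real values:

```python
import os
from haystack.nodes import PromptNode

# A sketch, assuming a GPT-3.5 Turbo deployment on an Azure OpenAI resource;
# the base URL and deployment name below are placeholders.
pn = PromptNode(model_name_or_path="gpt-3.5-turbo",
                api_key=os.environ["AZURE_OPENAI_API_KEY"],
                model_kwargs={"azure_base_url": "https://<your-resource>.openai.azure.com",
                              "azure_deployment_name": "<your-deployment-name>"})
print(pn("What is the capital of France?"))
```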
AzureChatGPTInvocationLayer.supports
@classmethod
def supports(cls, model_name_or_path: str, **kwargs) -> bool
Ensures Azure ChatGPT Invocation Layer is selected when `azure_base_url` and `azure_deployment_name` are provided in addition to a supported model name.
Module azure_open_ai
AzureOpenAIInvocationLayer
class AzureOpenAIInvocationLayer(OpenAIInvocationLayer)
Azure OpenAI Invocation Layer
This layer is used to invoke the OpenAI API on Azure. It is essentially the same as the OpenAIInvocationLayer with two additional parameters: `azure_base_url` and `azure_deployment_name`. The `azure_base_url` is the URL of the Azure OpenAI endpoint and the `azure_deployment_name` is the name of the deployment.
AzureOpenAIInvocationLayer.supports
@classmethod
def supports(cls, model_name_or_path: str, **kwargs) -> bool
Ensures Azure OpenAI Invocation Layer is selected when `azure_base_url` and `azure_deployment_name` are provided in addition to a supported model name.
Module chatgpt
ChatGPTInvocationLayer
class ChatGPTInvocationLayer(OpenAIInvocationLayer)
ChatGPT Invocation Layer
PromptModelInvocationLayer implementation for OpenAI's ChatGPT API (GPT-3.5 and GPT-4 models). Invocations are made using the REST API. See the OpenAI ChatGPT API documentation for more details.
Note: kwargs other than init parameter names are ignored to enable reflective construction of the class as many variants of PromptModelInvocationLayer are possible and they may have different parameters.
ChatGPTInvocationLayer.__init__
def __init__(api_key: str,
model_name_or_path: str = "gpt-3.5-turbo",
max_length: Optional[int] = 500,
api_base: str = "https://api.openai.com/v1",
timeout: Optional[float] = None,
**kwargs)
Creates an instance of ChatGPTInvocationLayer for OpenAI's GPT-3.5 and GPT-4 models.
Arguments:

- `model_name_or_path`: The name or path of the underlying model.
- `max_length`: The maximum number of tokens the output text can have.
- `api_key`: The OpenAI API key.
- `api_base`: The OpenAI API base URL, defaults to `https://api.openai.com/v1`.
- `kwargs`: Additional keyword arguments passed to the underlying model. See the OpenAI documentation. Note: the additional model argument moderate_content will filter input and generated answers for potentially sensitive content using the OpenAI Moderation API if set. If the input or answers are flagged, an empty list is returned in place of the answers.
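For illustration, a minimal sketch that sets a couple of these kwargs, including moderate_content; the values are arbitrary examples:

```python
import os
from haystack.nodes import PromptNode

# A sketch: temperature is a regular OpenAI kwarg, while moderate_content
# additionally routes the input and answers through the OpenAI Moderation API.
pn = PromptNode(model_name_or_path="gpt-3.5-turbo",
                api_key=os.environ["OPENAI_API_KEY"],
                max_length=256,
                model_kwargs={"temperature": 0.7, "moderate_content": True})
print(pn("Summarize the plot of Hamlet in two sentences."))
```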
ChatGPTInvocationLayer.ainvoke
async def ainvoke(*args, **kwargs)
Invokes a prompt on the model. Depending on the model, it takes in a prompt (or a list of messages) and returns a list of responses using a REST invocation.
Returns:
The responses generated by the model. Note: Only kwargs relevant to OpenAI are passed to the OpenAI REST API. Other kwargs are ignored. For more details, see the OpenAI documentation.
ChatGPTInvocationLayer.invoke
def invoke(*args, **kwargs)
Invokes a prompt on the model. Depending on the model, it takes in a prompt (or a list of messages) and returns a list of responses using a REST invocation.
Returns:
The responses generated by the model. Note: Only kwargs relevant to OpenAI are passed to the OpenAI REST API. Other kwargs are ignored. For more details, see the OpenAI documentation.
Module cohere
CohereInvocationLayer
class CohereInvocationLayer(PromptModelInvocationLayer)
PromptModelInvocationLayer implementation for Cohere's command models. Invocations are made using the REST API.
CohereInvocationLayer.__init__
def __init__(api_key: str,
model_name_or_path: str,
max_length: Optional[int] = 100,
**kwargs)
Creates an instance of CohereInvocationLayer for the specified Cohere model.

Arguments:

- `api_key`: The Cohere API key.
- `model_name_or_path`: The Cohere model name.
- `max_length`: The maximum length of the output text.
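For illustration, a minimal sketch with one of Cohere's command models; the model name is an example and the API key is a placeholder:

```python
from haystack.nodes import PromptNode

# A sketch: a Cohere model name such as "command" makes PromptNode select
# the CohereInvocationLayer (see the supports check below).
pn = PromptNode(model_name_or_path="command",
                api_key="<your-cohere-api-key>",
                max_length=100)
print(pn("Write a one-sentence product description for a solar lamp."))
```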
CohereInvocationLayer.invoke
def invoke(*args, **kwargs)
Invokes a prompt on the model. It takes in a prompt and returns a list of responses using a REST invocation.
Returns:
The responses generated by the model.
CohereInvocationLayer.supports
@classmethod
def supports(cls, model_name_or_path: str, **kwargs) -> bool
Ensures CohereInvocationLayer is selected only when Cohere models are specified in the model name.
Module hugging_face
HFLocalInvocationLayer
class HFLocalInvocationLayer(PromptModelInvocationLayer)
A subclass of the PromptModelInvocationLayer class. It loads a pre-trained model from Hugging Face and passes a prepared prompt into that model.
Note: kwargs other than init parameter names are ignored to enable reflective construction of the class, as many variants of PromptModelInvocationLayer are possible and they may have different parameters.
HFLocalInvocationLayer.__init__
def __init__(model_name_or_path: str = "google/flan-t5-base",
max_length: int = 100,
use_auth_token: Optional[Union[str, bool]] = None,
use_gpu: Optional[bool] = True,
devices: Optional[List[Union[str, "torch.device"]]] = None,
**kwargs)
Creates an instance of HFLocalInvocationLayer used to invoke local Hugging Face models.
Arguments:

- `model_name_or_path`: The name or path of the underlying model.
- `max_length`: The maximum number of tokens the output text can have.
- `use_auth_token`: The token to use as HTTP bearer authorization for remote files.
- `use_gpu`: Whether to use GPU for inference.
- `devices`: The devices to use for inference.
- `kwargs`: Additional keyword arguments passed to the underlying model. Due to reflective construction of all PromptModelInvocationLayer instances, this instance of HFLocalInvocationLayer might receive some unrelated kwargs. Only kwargs relevant to the HFLocalInvocationLayer are considered. The list of supported kwargs includes: "task", "model", "config", "tokenizer", "feature_extractor", "revision", "use_auth_token", "device_map", "device", "torch_dtype", "trust_remote_code", "model_kwargs", and "pipeline_class". For more details about pipeline kwargs in general, see Hugging Face documentation.
This layer supports two additional kwargs: `generation_kwargs` and `model_max_length`.
The `generation_kwargs` are used to customize text generation for the underlying pipeline. See Hugging Face docs for more details.
The `model_max_length` is used to specify a custom sequence length for the underlying pipeline.
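For illustration, a minimal sketch that passes both kwargs to a local model; the values are arbitrary examples:

```python
from haystack.nodes import PromptNode

# A sketch: generation_kwargs customizes sampling in the underlying pipeline,
# and model_max_length overrides the tokenizer's default sequence length.
pn = PromptNode(model_name_or_path="google/flan-t5-base",
                max_length=100,
                model_kwargs={"generation_kwargs": {"do_sample": True, "top_k": 10},
                              "model_max_length": 512})
print(pn("Translate to German: Where is the train station?"))
```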
HFLocalInvocationLayer.invoke
def invoke(*args, **kwargs)
It takes a prompt and returns a list of generated texts using the local Hugging Face transformers model.
Returns:
A list of generated texts. Note: Only kwargs relevant to Text2TextGenerationPipeline and TextGenerationPipeline are passed to Hugging Face as model_input_kwargs. Other kwargs are ignored.
Module hugging_face_inference
HFInferenceEndpointInvocationLayer
class HFInferenceEndpointInvocationLayer(PromptModelInvocationLayer)
A PromptModelInvocationLayer that invokes a remote Hugging Face Inference Endpoint or the Inference API to prompt the model. For more details, see the Hugging Face Inference API documentation and the Hugging Face Inference Endpoints documentation.
The Inference API is free to use but rate limited. If you need an inference solution for production, you can use the Inference Endpoints service.
See the documentation for more details: https://huggingface.co/docs/inference-endpoints
HFInferenceEndpointInvocationLayer.__init__
def __init__(api_key: str,
model_name_or_path: str,
max_length: Optional[int] = 100,
**kwargs)
Creates an instance of HFInferenceEndpointInvocationLayer
Arguments:

- `model_name_or_path`: Can be either a) a Hugging Face Inference model name (e.g., google/flan-t5-xxl) or b) a Hugging Face Inference Endpoint URL (e.g., https://<your-endpoint>.us-east-1.aws.endpoints.huggingface.cloud).
- `max_length`: The maximum length of the output text.
- `api_key`: The Hugging Face API token. You'll need to provide your user token, which can be found in your Hugging Face account settings.
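For illustration, minimal sketches of both options; the endpoint URL and environment variable name are placeholders:

```python
import os
from haystack.nodes import PromptNode

# a) A sketch using the free, rate-limited Inference API via a model name.
pn = PromptNode(model_name_or_path="google/flan-t5-xxl",
                api_key=os.environ["HF_API_TOKEN"],  # your HF user token
                max_length=100)
print(pn("What is the capital of France?"))

# b) A sketch using a dedicated Inference Endpoint URL (placeholder URL).
pn_endpoint = PromptNode(
    model_name_or_path="https://<your-endpoint>.us-east-1.aws.endpoints.huggingface.cloud",
    api_key=os.environ["HF_API_TOKEN"],
    max_length=100)
```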
HFInferenceEndpointInvocationLayer.invoke
def invoke(*args, **kwargs)
Invokes a prompt on the model. It takes in a prompt and returns a list of responses using a REST invocation.
Returns:
The responses generated by the model.
Module open_ai
OpenAIInvocationLayer
class OpenAIInvocationLayer(PromptModelInvocationLayer)
PromptModelInvocationLayer implementation for OpenAI's GPT-3 InstructGPT models. Invocations are made using the REST API. See the OpenAI GPT-3 documentation for more details.
Note: kwargs other than init parameter names are ignored to enable reflective construction of the class as many variants of PromptModelInvocationLayer are possible and they may have different parameters.
OpenAIInvocationLayer.__init__
def __init__(api_key: str,
model_name_or_path: str = "gpt-3.5-turbo-instruct",
max_length: Optional[int] = 100,
api_base: str = "https://api.openai.com/v1",
openai_organization: Optional[str] = None,
timeout: Optional[float] = None,
**kwargs)
Creates an instance of OpenAIInvocationLayer for OpenAI's GPT-3 InstructGPT models.
Arguments:

- `model_name_or_path`: The name or path of the underlying model.
- `max_length`: The maximum number of tokens the output text can have.
- `api_key`: The OpenAI API key.
- `api_base`: The OpenAI API base URL, defaults to `https://api.openai.com/v1`.
- `openai_organization`: The OpenAI-Organization ID, defaults to `None`. For more details, see the OpenAI documentation.
- `kwargs`: Additional keyword arguments passed to the underlying model. Due to reflective construction of all PromptModelInvocationLayer instances, this instance of OpenAIInvocationLayer might receive some unrelated kwargs. Only the kwargs relevant to OpenAIInvocationLayer are considered. The list of OpenAI-relevant kwargs includes: suffix, temperature, top_p, presence_penalty, frequency_penalty, best_of, n, max_tokens, logit_bias, stop, echo, and logprobs. For more details about these kwargs, see the OpenAI documentation. Note: the additional model argument moderate_content will filter input and generated answers for potentially sensitive content using the OpenAI Moderation API if set. If the input or answers are flagged, an empty list is returned in place of the answers.
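For illustration, a minimal sketch that sets a few of the OpenAI-relevant kwargs listed above; the values are arbitrary examples:

```python
import os
from haystack.nodes import PromptNode

# A sketch using the default instruct model with two of the listed kwargs.
pn = PromptNode(model_name_or_path="gpt-3.5-turbo-instruct",
                api_key=os.environ["OPENAI_API_KEY"],
                max_length=100,
                model_kwargs={"temperature": 0.2, "stop": ["\n\n"]})
print(pn("Complete this proverb: A stitch in time"))
```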
OpenAIInvocationLayer.invoke
def invoke(*args, **kwargs)
Invokes a prompt on the model. Depending on the model, it takes in a prompt (or a list of messages) and returns a list of responses using a REST invocation.
Returns:
The responses generated by the model. Note: Only kwargs relevant to OpenAI are passed to the OpenAI REST API. Other kwargs are ignored. For more details, see the OpenAI documentation.
OpenAIInvocationLayer.ainvoke
async def ainvoke(*args, **kwargs)
Asyncio version of the `invoke` method.
Module sagemaker_base
SageMakerBaseInvocationLayer
class SageMakerBaseInvocationLayer(AWSBaseInvocationLayer, ABC)
Base class for SageMaker based invocation layers.
SageMakerBaseInvocationLayer.get_test_payload
@classmethod
@abstractmethod
def get_test_payload(cls) -> Dict[str, Any]
Return test payload for the model.
SageMakerBaseInvocationLayer.supports
@classmethod
def supports(cls, model_name_or_path: str, **kwargs) -> bool
Checks whether a model_name_or_path passed down (e.g. via PromptNode) is supported by this class.
Arguments:

- `model_name_or_path`: The model_name_or_path to check.
SageMakerBaseInvocationLayer.check_endpoint_in_service
@classmethod
def check_endpoint_in_service(cls, session: "boto3.Session", endpoint: str)
Checks if the SageMaker endpoint exists and is in service.
Arguments:

- `session`: The boto3 session.
- `endpoint`: The endpoint to check.
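For illustration, a sketch of calling this check directly; the import path is an assumption based on where Haystack keeps its invocation layers, so adjust it to your version if it differs:

```python
import boto3
# Assumed import path, not confirmed by this reference.
from haystack.nodes.prompt.invocation_layer import SageMakerBaseInvocationLayer

session = boto3.Session(profile_name="default")  # any authenticated boto3 session
# Verifies that the endpoint exists and is in service before it is used.
SageMakerBaseInvocationLayer.check_endpoint_in_service(session, "my-endpoint-name")
```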
SageMakerBaseInvocationLayer.format_custom_attributes
@classmethod
def format_custom_attributes(cls, attributes: dict) -> str
Formats the custom attributes for the SageMaker endpoint.
Arguments:

- `attributes`: The custom attributes to format.
Returns:
The formatted custom attributes.
SageMakerBaseInvocationLayer.check_model_input_format
@classmethod
def check_model_input_format(cls, session: "boto3.Session", endpoint: str,
test_payload: Any, **kwargs)
Checks if the SageMaker endpoint supports the test_payload model input format.
Arguments:

- `session`: The boto3 session.
- `endpoint`: The endpoint to hit.
- `test_payload`: The payload to send to the endpoint.
Returns:
True if the endpoint supports the test_payload model input format, False otherwise.
Module sagemaker_hf_infer
SageMakerHFInferenceInvocationLayer
class SageMakerHFInferenceInvocationLayer(SageMakerBaseInvocationLayer)
SageMaker HuggingFace Inference Invocation Layer
SageMakerHFInferenceInvocationLayer enables the use of Large Language Models (LLMs) hosted on a SageMaker Inference Endpoint via PromptNode. It supports text-generation and text2text-generation models from Hugging Face running on the SageMaker Inference Endpoint.
As of June 2023, this layer has been confirmed to support the following SageMaker-deployed models:
- MPT
- Dolly V2
- Flan-U2
- Flan-T5
- RedPajama
- Open Llama
- GPT-J-6B
- GPT NEO
- BloomZ
For guidance on how to deploy such a model to SageMaker, refer to the SageMaker JumpStart foundation models documentation and follow the instructions provided there.
Technical Note:
This layer is designed for models that expect an input format composed of the following keys/values: `{'text_inputs': 'prompt_text', **(params or {})}`. The `text_inputs` key represents the prompt text, with all additional parameters for the model added at the same dictionary level as `text_inputs`.
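For illustration, a sketch of a payload in this flat format; the parameter names are examples, not an exhaustive list:

```python
# Hypothetical payload: the prompt goes under "text_inputs" and any model
# parameters sit at the same dictionary level, as described above.
params = {"temperature": 0.7, "max_length": 100}
payload = {"text_inputs": "What is the capital of France?", **(params or {})}
```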
Example

Of course, in both examples your endpoints, region names, and other settings will be different. You can find them in the SageMaker AWS console.

```python
from haystack.nodes import PromptNode

# Pass the SageMaker endpoint name and authentication details
pn = PromptNode(model_name_or_path="jumpstart-dft-hf-textgeneration-dolly-v2-3b-bf16",
                model_kwargs={"aws_profile_name": "my_aws_profile_name", "aws_region_name": "eu-central-1"})
res = pn("what is the meaning of life?")
print(res)
```

Example using AWS env variables

```python
import os
from haystack.nodes import PromptNode

# We can also configure SageMaker via AWS environment variables without an AWS profile name
pn = PromptNode(model_name_or_path="jumpstart-dft-hf-textgeneration-dolly-v2-3b-bf16", max_length=128,
                model_kwargs={"aws_access_key_id": os.getenv("AWS_ACCESS_KEY_ID"),
                              "aws_secret_access_key": os.getenv("AWS_SECRET_ACCESS_KEY"),
                              "aws_session_token": os.getenv("AWS_SESSION_TOKEN"),
                              "aws_region_name": "us-east-1"})
response = pn("Tell me more about Berlin, be elaborate")
print(response)
```
SageMakerHFInferenceInvocationLayer.__init__
def __init__(model_name_or_path: str,
max_length: int = 100,
aws_access_key_id: Optional[str] = None,
aws_secret_access_key: Optional[str] = None,
aws_session_token: Optional[str] = None,
aws_region_name: Optional[str] = None,
aws_profile_name: Optional[str] = None,
**kwargs)
Instantiates the session with SageMaker using IAM-based authentication via boto3.

Arguments:

- `model_name_or_path`: The name of the SageMaker model endpoint.
- `max_length`: The maximum length of the output text.
- `aws_access_key_id`: AWS access key ID.
- `aws_secret_access_key`: AWS secret access key.
- `aws_session_token`: AWS session token.
- `aws_region_name`: AWS region name.
- `aws_profile_name`: AWS profile name.
SageMakerHFInferenceInvocationLayer.invoke
def invoke(*args, **kwargs) -> List[str]
Sends the prompt to the remote model and returns the generated response(s).
You can pass all parameters supported by the SageMaker model here via **kwargs (e.g. "temperature", "do_sample" ...).
Returns:
The generated responses from the model as a list of strings.
SageMakerHFInferenceInvocationLayer.get_test_payload
@classmethod
def get_test_payload(cls) -> Dict[str, str]
Returns a payload used for testing if the current endpoint supports the JSON payload format used by this class.
As of June 2023, SageMaker endpoints support the format where the payload is a JSON object with "text_inputs" as the key and the prompt as the value. All other parameters are passed as key/value pairs on the same level. See the _post method for more details.
Returns:
A payload used for testing if the current endpoint is working.
Module sagemaker_hf_text_gen
SageMakerHFTextGenerationInvocationLayer
class SageMakerHFTextGenerationInvocationLayer(SageMakerBaseInvocationLayer)
SageMaker HuggingFace TextGeneration Invocation Layer
SageMakerHFTextGenerationInvocationLayer enables the use of Large Language Models (LLMs) hosted on a SageMaker Inference Endpoint via PromptNode. It supports text-generation models from Hugging Face running on the SageMaker Inference Endpoint.
For guidance on how to deploy such a model to SageMaker, refer to the SageMaker JumpStart foundation models documentation and follow the instructions provided there.
As of June 2023, this layer has been confirmed to support the following SageMaker-deployed models:
- Falcon models
Technical Note: This layer is designed for models that expect an input format composed of the following keys/values: `{'inputs': 'prompt_text', 'parameters': params}`, where `'inputs'` represents the prompt and `'parameters'` the parameters for the model.
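For illustration, a sketch of a payload in this nested format; the parameter names are examples, not an exhaustive list:

```python
# Hypothetical payload: the prompt goes under "inputs" and the model
# parameters are nested under "parameters", as described above.
params = {"temperature": 0.7, "max_new_tokens": 100}
payload = {"inputs": "What is the capital of France?", "parameters": params}
```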
Example

Of course, in both examples your endpoints, region names, and other settings will be different. You can find them in the SageMaker AWS console.

```python
from haystack.nodes import PromptNode

# Pass the SageMaker endpoint name and authentication details
pn = PromptNode(model_name_or_path="falcon-40b-my-sagemaker-inference-endpoint",
                model_kwargs={"aws_profile_name": "my_aws_profile_name", "aws_region_name": "eu-central-1"})
res = pn("what is the meaning of life?")
print(res)
```

Example using AWS env variables

```python
import os
from haystack.nodes import PromptNode

# We can also configure SageMaker via AWS environment variables without an AWS profile name
pn = PromptNode(model_name_or_path="hf-llm-falcon-7b-instruct-bf16-2023-06-22-16-22-19-811", max_length=256,
                model_kwargs={"aws_access_key_id": os.getenv("AWS_ACCESS_KEY_ID"),
                              "aws_secret_access_key": os.getenv("AWS_SECRET_ACCESS_KEY"),
                              "aws_session_token": os.getenv("AWS_SESSION_TOKEN"),
                              "aws_region_name": "us-east-1"})
response = pn("Tell me more about Berlin, be elaborate")
print(response)
```
SageMakerHFTextGenerationInvocationLayer.__init__
def __init__(model_name_or_path: str,
max_length: int = 100,
aws_access_key_id: Optional[str] = None,
aws_secret_access_key: Optional[str] = None,
aws_session_token: Optional[str] = None,
aws_region_name: Optional[str] = None,
aws_profile_name: Optional[str] = None,
**kwargs)
Instantiates the session with SageMaker using IAM-based authentication via boto3.

Arguments:

- `model_name_or_path`: The name of the SageMaker model endpoint.
- `max_length`: The maximum length of the output text.
- `aws_access_key_id`: AWS access key ID.
- `aws_secret_access_key`: AWS secret access key.
- `aws_session_token`: AWS session token.
- `aws_region_name`: AWS region name.
- `aws_profile_name`: AWS profile name.
SageMakerHFTextGenerationInvocationLayer.invoke
def invoke(*args, **kwargs) -> List[str]
Sends the prompt to the remote model and returns the generated response(s).
You can pass all parameters supported by the Hugging Face Transformers `generate` method here via **kwargs (e.g. "temperature", "stop", ...).
Returns:
The generated responses from the model as a list of strings.
SageMakerHFTextGenerationInvocationLayer.get_test_payload
@classmethod
def get_test_payload(cls) -> Dict[str, Any]
Returns a payload used for testing if the current endpoint supports the JSON payload format used by this class.
As of June 2023, SageMaker endpoints support the JSON payload format from the https://github.com/huggingface/text-generation-inference project. At the time of writing this docstring, only Falcon models were deployed using this format. See the Python client implementation in the same project for more details.
Returns:
A payload used for testing if the current endpoint is working.
Module sagemaker_meta
SageMakerMetaInvocationLayer
class SageMakerMetaInvocationLayer(SageMakerBaseInvocationLayer)
SageMaker Meta Invocation Layer
SageMakerMetaInvocationLayer enables the use of Meta Large Language Models (LLMs) hosted on a SageMaker Inference Endpoint via PromptNode. It primarily focuses on Llama-2 models and supports both the chat and instruction-following variants. Other Meta models have not been tested.
For guidance on how to deploy such a model to SageMaker, refer to the SageMaker JumpStart foundation models documentation and follow the instructions provided there.
As of July 2023, this layer has been confirmed to support the following SageMaker-deployed models:
- Llama-2 models
Technical Note: This layer is designed for models that expect an input format composed of the following keys/values: `{'inputs': 'prompt_text', 'parameters': params}`, where `'inputs'` represents the prompt and `'parameters'` the parameters for the model.
Examples

Llama-2 models support the following inference payload parameters:

- `max_new_tokens`: The model generates text until the output length (excluding the input context length) reaches max_new_tokens. If specified, it must be a positive integer.
- `temperature`: Controls the randomness in the output. A higher temperature results in an output sequence with low-probability words; a lower temperature results in an output sequence with high-probability words. If temperature approaches 0, it results in greedy decoding. If specified, it must be a positive float.
- `top_p`: In each step of text generation, sample from the smallest possible set of words with cumulative probability top_p. If specified, it must be a float between 0 and 1.
- `return_full_text`: If True, the input text is part of the generated output text. If specified, it must be boolean. The default value is False.

Of course, in these examples your endpoints, region names, and other settings will be different. You can find them in the SageMaker AWS console.

```python
from haystack.nodes import PromptNode

# Pass the SageMaker endpoint name and authentication details
pn = PromptNode(model_name_or_path="llama-2-7b",
                model_kwargs={"aws_profile_name": "my_aws_profile_name"})
res = pn("Berlin is the capital of")
print(res)
```

Example using AWS env variables

```python
import os
from haystack.nodes import PromptNode

# We can also configure SageMaker via AWS environment variables without an AWS profile name
pn = PromptNode(model_name_or_path="llama-2-7b", max_length=512,
                model_kwargs={"aws_access_key_id": os.getenv("AWS_ACCESS_KEY_ID"),
                              "aws_secret_access_key": os.getenv("AWS_SECRET_ACCESS_KEY"),
                              "aws_session_token": os.getenv("AWS_SESSION_TOKEN"),
                              "aws_region_name": "us-east-1"})
response = pn("The secret for a good life is")
print(response)
```

Llama-2 also supports the chat format.

Example using chat format

```python
from haystack.nodes.prompt import PromptNode

pn = PromptNode(model_name_or_path="llama-2-7b-chat", max_length=512,
                model_kwargs={"aws_profile_name": "default",
                              "aws_custom_attributes": {"accept_eula": True}})
pn_input = [[{"role": "user", "content": "what is the recipe of mayonnaise?"}]]
response = pn(pn_input)
print(response)
```

Note that in the chat examples we can also include multiple turns between the user and the assistant. See the Llama-2 chat documentation for more details.

Example using chat format with multiple turns

```python
from haystack.nodes.prompt import PromptNode

pn = PromptNode(model_name_or_path="llama-2-7b-chat", max_length=512,
                model_kwargs={"aws_profile_name": "default",
                              "aws_custom_attributes": {"accept_eula": True}})
pn_input = [[
    {"role": "user", "content": "I am going to Paris, what should I see?"},
    {"role": "assistant", "content": (
        "Paris, the capital of France, is known for its stunning architecture, art museums, "
        "historical landmarks, and romantic atmosphere. Here are some of the top attractions "
        "to see in Paris:\n"
        "1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks "
        "in the world and offers breathtaking views of the city.\n"
        "2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, "
        "housing an impressive collection of art and artifacts, including the Mona Lisa.\n"
        "3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in "
        "Paris and is known for its Gothic architecture and stunning stained glass windows.\n"
        "These are just a few of the many attractions that Paris has to offer. With so much to see "
        "and do, it's no wonder that Paris is one of the most popular tourist destinations in the world."
    )},
    {"role": "user", "content": "What is so great about #1?"},
]]
response = pn(pn_input)
print(response)
```
SageMakerMetaInvocationLayer.__init__
def __init__(model_name_or_path: str,
max_length: int = 100,
aws_access_key_id: Optional[str] = None,
aws_secret_access_key: Optional[str] = None,
aws_session_token: Optional[str] = None,
aws_region_name: Optional[str] = None,
aws_profile_name: Optional[str] = None,
**kwargs)
Instantiates the session with SageMaker using IAM-based authentication via boto3.

Arguments:

- `model_name_or_path`: The name of the SageMaker model endpoint.
- `max_length`: The maximum length of the output text.
- `aws_access_key_id`: AWS access key ID.
- `aws_secret_access_key`: AWS secret access key.
- `aws_session_token`: AWS session token.
- `aws_region_name`: AWS region name.
- `aws_profile_name`: AWS profile name.
SageMakerMetaInvocationLayer.invoke
def invoke(*args, **kwargs) -> List[str]
Sends the prompt to the remote model and returns the generated response(s).
Returns:
The generated responses from the model as a list of strings.
SageMakerMetaInvocationLayer.is_proper_chat_conversation_format
def is_proper_chat_conversation_format(prompt: List[Any]) -> bool
Checks whether a chat conversation is in the proper format.
Arguments:

- `prompt`: The chat conversation to be checked.
Returns:
True if the chat conversation is in the proper format, False otherwise.
SageMakerMetaInvocationLayer.get_test_payload
@classmethod
def get_test_payload(cls) -> Dict[str, Any]
Return test payload for the model.
SageMakerMetaInvocationLayer.supports
@classmethod
def supports(cls, model_name_or_path: str, **kwargs) -> bool
Checks whether a model_name_or_path passed down (e.g. via PromptNode) is supported by this class.
Arguments:

- `model_name_or_path`: The model_name_or_path to check.