class VertexModelGardenLlama(_BaseVertexMaasModelGarden, BaseChatModel)

Integration for Llama 3.1 on Google Cloud Vertex AI Model-as-a-Service (MaaS).
Setup:
You need to enable the corresponding MaaS model (in the Google Cloud console: Vertex AI -> Model Garden -> search for the model you need and click Enable).
Then either:
- Have credentials configured for your environment (gcloud, workload
identity, etc...)
- Store the path to a service account JSON file as the
GOOGLE_APPLICATION_CREDENTIALS environment variable
This codebase uses the google.auth library, which first looks for the
application credentials variable mentioned above and then falls back to system-level auth.
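For example, to use the environment-variable option (the key file path below is a placeholder for your own service-account key):

```shell
# Point google.auth at a service-account key file (path is a placeholder):
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"

# Or rely on application-default credentials instead:
# gcloud auth application-default login
```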
Key init args — completion params:
model: str
Name of the MaaS model to use (e.g. 'meta/llama3-405b-instruct-maas').
append_tools_to_system_message: bool
Whether to append tool definitions to the system message.
Key init args — client params:
credentials: Optional[google.auth.credentials.Credentials]
The default custom credentials to use when making API calls. If not provided, credentials will be ascertained from the environment.
project: Optional[str]
The default GCP project to use when making Vertex API calls.
location: str = "us-central1"
The default location to use when making API calls.
See full list of supported init args and their descriptions in the params section.
Instantiate:
from langchain_google_vertexai import VertexModelGardenLlama
llm = VertexModelGardenLlama(
model="meta/llama3-405b-instruct-maas",
# other params...
)
Invoke:
messages = [
(
"system",
"You are a helpful translator. Translate the user sentence to French.",
),
("human", "I love programming."),
]
llm.invoke(messages)
AIMessage(
content="J'adore programmer. \n",
id="run-925ce305-2268-44c4-875f-dde9128520ad-0",
)
Stream:
for chunk in llm.stream(messages):
print(chunk)
AIMessageChunk(content="J", id="run-9df01d73-84d9-42db-9d6b-b1466a019e89")
AIMessageChunk(
content="'adore programmer. \n",
id="run-9df01d73-84d9-42db-9d6b-b1466a019e89",
)
AIMessageChunk(content="", id="run-9df01d73-84d9-42db-9d6b-b1466a019e89")
stream = llm.stream(messages)
full = next(stream)
for chunk in stream:
full += chunk
full
AIMessageChunk(
content="J'adore programmer. \n",
id="run-b7f7492c-4cb5-42d0-8fc3-dce9b293b0fb",
)