Google Vertex AI (Anthropic)
The Google Vertex AI (Anthropic) provider enables access to Claude models through Google Cloud’s Vertex AI platform. This guide covers setup, configuration, authentication, and usage options specific to the Vertex AI integration.
Overview
Google Vertex AI offers access to Anthropic’s Claude models through Google Cloud’s infrastructure. This provider leverages Google Cloud authentication to securely access Claude models with enterprise-grade security, compliance features, and the benefits of Google Cloud’s global infrastructure.
Google Cloud Authentication
Unlike the direct Anthropic API integration, this provider uses Google Cloud authentication mechanisms:
- Google Application Default Credentials (ADC): The provider uses standard Google authentication that automatically detects credentials from your environment.
- Project ID: You must specify a Google Cloud project ID where Vertex AI is enabled.
- Region: The Google Cloud region where the models will be accessed (defaults to `us-east1`).
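If you want the provider to pick up a specific service-account credential rather than whatever ADC discovers in your environment, the standard ADC override applies. A minimal sketch, assuming you have a key file (the path is illustrative):

```bash
# ADC checks GOOGLE_APPLICATION_CREDENTIALS before falling back to
# gcloud user credentials; point it at a service-account key file
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"
```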
For more details on setting up Google Cloud authentication, refer to Google Cloud's Application Default Credentials documentation at https://cloud.google.com/docs/authentication/application-default-credentials.
Setting Up Authentication
- Install the Google Cloud CLI:

```bash
# Follow the instructions at https://cloud.google.com/sdk/docs/install
```

- Log in and set up default credentials:

```bash
gcloud auth login
gcloud auth application-default login
```

- Set your default project:

```bash
gcloud config set project YOUR_PROJECT_ID
```
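To confirm the setup, you can check that ADC can mint an access token and that the expected project is active:

```bash
# Print an access token using Application Default Credentials
gcloud auth application-default print-access-token

# Show the currently configured project
gcloud config get-value project
```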
Example Usage
Below is a basic example of how to use the provider with Vertex AI:
```bash
./gateway discover \
  --ai-provider anthropic-vertexai \
  --vertexai-region your-gcp-region \
  --vertexai-project your-gcp-project-id \
  --config connection.yaml
```
Project and Region Configuration
The provider determines which Google Cloud project and region to use in the following order:
- Command-line flags: `--vertexai-project` and `--vertexai-region`
- Environment variables: `ANTHROPIC_VERTEXAI_PROJECT` and `ANTHROPIC_VERTEXAI_REGION`
- Default region fallback to `us-east1` (if no region is specified)
Example with explicit project and region:
```bash
./gateway discover \
  --ai-provider anthropic-vertexai \
  --vertexai-project your-gcp-project-id \
  --vertexai-region us-central1 \
  --config connection.yaml
```
Alternatively, using environment variables:
```bash
export ANTHROPIC_VERTEXAI_PROJECT="your-gcp-project-id"
export ANTHROPIC_VERTEXAI_REGION="us-central1"

./gateway discover \
  --ai-provider anthropic-vertexai \
  --config connection.yaml
```
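If both the flags and the environment variables are set, the flags win, per the precedence order above. A quick sketch (the region values are illustrative):

```bash
export ANTHROPIC_VERTEXAI_REGION="us-central1"

# The --vertexai-region flag overrides the environment variable,
# so this run targets us-east1
./gateway discover \
  --ai-provider anthropic-vertexai \
  --vertexai-project your-gcp-project-id \
  --vertexai-region us-east1 \
  --config connection.yaml
```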
Model Selection
By default, the Vertex AI provider uses `claude-3-7-sonnet@20250219`. Note that the model ID format differs from the direct Anthropic API: Vertex AI uses `@` instead of `-` before the date version, so `claude-3-7-sonnet-20250219` becomes `claude-3-7-sonnet@20250219`.
You can specify a different model using:
- Command-line flag: use the `--ai-model` flag.
- Environment variable: set `ANTHROPIC_MODEL_ID`.
Examples:
```bash
# Specify model via command line
./gateway discover \
  --ai-provider anthropic-vertexai \
  --ai-model claude-3-7-sonnet@20250219 \
  --config connection.yaml
```
```bash
# Or via environment variable
export ANTHROPIC_MODEL_ID=claude-3-7-sonnet@20250219

./gateway discover \
  --ai-provider anthropic-vertexai \
  --config connection.yaml
```
Advanced Configuration
Reasoning Mode
Enable Claude’s “thinking” mode for complex reasoning tasks:
```bash
./gateway discover \
  --ai-provider anthropic-vertexai \
  --ai-reasoning=true \
  --config connection.yaml
```
When reasoning mode is enabled, the provider:
- Activates Claude’s thinking capability
- Allocates a thinking token budget of 4096 tokens
- Sets the temperature to 1.0
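Under the hood, these settings correspond to the extended-thinking fields of the Anthropic Messages API as exposed on Vertex AI. The curl sketch below shows a roughly equivalent raw request; it is illustrative only (the gateway builds its payload internally), and the project ID, region, and prompt are placeholders:

```bash
# Roughly what the reasoning settings map to on the Vertex AI
# Anthropic endpoint (illustrative, not the gateway's actual code path)
curl -X POST \
  "https://us-east1-aiplatform.googleapis.com/v1/projects/YOUR_PROJECT_ID/locations/us-east1/publishers/anthropic/models/claude-3-7-sonnet@20250219:rawPredict" \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json" \
  -d '{
    "anthropic_version": "vertex-2023-10-16",
    "max_tokens": 64000,
    "temperature": 1.0,
    "thinking": {"type": "enabled", "budget_tokens": 4096},
    "messages": [{"role": "user", "content": "Explain the schema relationships."}]
  }'
```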
Response Length Control
Control the maximum token count in responses:
```bash
./gateway discover \
  --ai-provider anthropic-vertexai \
  --ai-max-tokens 8192 \
  --config connection.yaml
```
If not specified, the default maximum token count is 64000 tokens.
Temperature Adjustment
Adjust the randomness of responses with the temperature parameter:
```bash
./gateway discover \
  --ai-provider anthropic-vertexai \
  --ai-temperature 0.5 \
  --config connection.yaml
```
Lower values produce more deterministic outputs, while higher values increase creativity and randomness.
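For instance (the values here are illustrative; pick what suits your task):

```bash
# More deterministic output, e.g. for repeatable analysis
./gateway discover \
  --ai-provider anthropic-vertexai \
  --ai-temperature 0.1 \
  --config connection.yaml

# More varied, exploratory output
./gateway discover \
  --ai-provider anthropic-vertexai \
  --ai-temperature 0.9 \
  --config connection.yaml
```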
Usage Costs
The Vertex AI provider includes cost estimation for Claude Sonnet models:
- Input tokens: $3.75 per 1 million tokens
- Output tokens: $15.00 per 1 million tokens
Costs are calculated based on the actual token usage for each request. Note that actual billing is handled through your Google Cloud account, and pricing may vary based on your specific agreements with Google Cloud.
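As a worked example with hypothetical token counts, a request that consumes 10,000 input tokens and produces 2,000 output tokens would be estimated at about $0.07:

```bash
# input:  10000 / 1,000,000 * $3.75  = $0.0375
# output:  2000 / 1,000,000 * $15.00 = $0.0300
# total:                               $0.0675
echo "scale=4; 10000/1000000*3.75 + 2000/1000000*15.00" | bc
# -> .0675
```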
Recommended Best Practices
- Use Service Accounts: For production deployments, create a dedicated service account with minimal permissions (see the sketch after this list).
- Configure IAM Permissions: Ensure your account has the necessary Vertex AI permissions.
- Manage Token Count: Set a reasonable maximum token count to control costs.
- Enable Reasoning Mode: Use for complex analytical tasks.
- Regional Selection: Choose a region close to your users/applications for lower latency.
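For the first two recommendations, a minimal sketch using standard gcloud commands; the service-account name is illustrative, and `roles/aiplatform.user` assumes the standard Vertex AI user role is sufficient for your setup:

```bash
# Create a dedicated service account for the gateway (name is illustrative)
gcloud iam service-accounts create gateway-vertexai \
  --display-name "Gateway Vertex AI access"

# Grant it the Vertex AI user role on the project
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member "serviceAccount:gateway-vertexai@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
  --role "roles/aiplatform.user"
```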