Google Vertex AI Models

Google Vertex AI provides 98 AI models accessible via API.

Models Available: 98
Cheapest Input / 1M: $0.000
Largest Context: 10M

What is Google Vertex AI?

Google Vertex AI is an AI model provider offering 98 large language models for developers. Its cheapest model is listed at $0.000 per 1M input tokens, and its largest context window reaches 10M tokens.
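Because every price on this page is quoted per 1M tokens, the cost of a single request follows from simple proportional arithmetic. The sketch below is illustrative only: the function name `request_cost` and the sample token counts are assumptions, and it ignores any billing details (caching, batch discounts, tiered pricing) that this page does not describe. The prices used are the ones listed here for Gemini 2.5 Flash.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the USD cost of one request from per-1M-token list prices.

    Hypothetical helper: assumes cost scales linearly with token count.
    """
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


if __name__ == "__main__":
    # Gemini 2.5 Flash as listed on this page: $0.30 input, $2.50 output per 1M tokens.
    cost = request_cost(20_000, 1_000, 0.30, 2.50)
    print(f"Estimated cost: ${cost:.4f}")
```

Output tokens are often several times more expensive than input tokens, so a workload that generates long responses can cost more on a model with a low input price than the headline figure suggests.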

All Google Vertex AI Models

Model | Input $/1M | Output $/1M | Context | Max Output
Meta/Llama 3.1 70b Instruct Maas | $0.000 | $0.000 | 128K | 2,048
Meta/Llama 3.1 8b Instruct Maas | $0.000 | $0.000 | 128K | 2,048
Meta/Llama 3.2 90b Vision Instruct Maas | $0.000 | $0.000 | 128K | 2,048
Meta/Llama3 405b Instruct Maas | $0.000 | $0.000 | 32K | 32,000
Meta/Llama3 70b Instruct Maas | $0.000 | $0.000 | 32K | 32,000
Meta/Llama3 8b Instruct Maas | $0.000 | $0.000 | 32K | 32,000
Gemini 2.0 Flash Lite | $0.075 | $0.30 | 1.0M | 8,192
Gemini 2.0 Flash Lite 001 | $0.075 | $0.30 | 1.0M | 8,192
Openai/Gpt Oss 20b Maas | $0.075 | $0.30 | 131K | 32,768
Gemini 2.0 Flash | $0.10 | $0.40 | 1.0M | 8,192
Gemini 2.5 Flash Lite | $0.10 | $0.40 | 1.0M | 65,535
Gemini 2.5 Flash Lite Preview 09 2025 | $0.10 | $0.40 | 1.0M | 65,535
Gemini 2.5 Flash Lite Preview 06 17 | $0.10 | $0.40 | 1.0M | 65,535
Gemini 2.0 Flash 001 | $0.15 | $0.60 | 1.0M | 8,192
Mistral Nemo@Latest | $0.15 | $0.15 | 128K | 128,000
Openai/Gpt Oss 120b Maas | $0.15 | $0.60 | 131K | 32,768
Qwen/Qwen3 Next 80b A3b Instruct Maas | $0.15 | $1.20 | 262K | 262,144
Qwen/Qwen3 Next 80b A3b Thinking Maas | $0.15 | $1.20 | 262K | 262,144
Codestral 2501 | $0.20 | $0.60 | 128K | 128,000
Codestral | $0.20 | $0.60 | 128K | 128,000
Codestral@Latest | $0.20 | $0.60 | 128K | 128,000
Jamba 1.5 | $0.20 | $0.40 | 256K | 256,000
Jamba 1.5 Mini | $0.20 | $0.40 | 256K | 256,000
Jamba 1.5 Mini | $0.20 | $0.40 | 256K | 256,000
Gemini 3.1 Flash Lite Preview | $0.25 | $1.50 | 1.0M | 65,536
Claude 3 Haiku | $0.25 | $1.25 | 200K | 4,096
Claude 3 Haiku | $0.25 | $1.25 | 200K | 4,096
Gemini 3.1 Flash Lite Preview | $0.25 | $1.50 | 1.0M | 65,536
Meta/Llama 4 Scout 17b 128e Instruct Maas | $0.25 | $0.70 | 10M | 10,000,000
Meta/Llama 4 Scout 17b 16e Instruct Maas | $0.25 | $0.70 | 10M | 10,000,000
Qwen/Qwen3 235b A22b Instruct 2507 Maas | $0.25 | $1.00 | 262K | 16,384
Gemini 2.5 Flash | $0.30 | $2.50 | 1.0M | 65,535
Gemini 2.5 Flash Preview 09 2025 | $0.30 | $2.50 | 1.0M | 65,535
Gemini Robotics Er 1.5 Preview | $0.30 | $2.50 | 1.0M | 65,535
Mistralai/Codestral 2 | $0.30 | $0.90 | 128K | 128,000
Codestral 2 | $0.30 | $0.90 | 128K | 128,000
Codestral 2 | $0.30 | $0.90 | 128K | 128,000
Mistralai/Codestral 2 | $0.30 | $0.90 | 128K | 128,000
Minimaxai/Minimax M2 Maas | $0.30 | $1.20 | 197K | 196,608
Meta/Llama 4 Maverick 17b 128e Instruct Maas | $0.35 | $1.15 | 1M | 1,000,000
Meta/Llama 4 Maverick 17b 16e Instruct Maas | $0.35 | $1.15 | 1M | 1,000,000
Mistral Medium 3 | $0.40 | $2.00 | 128K | 8,191
Mistral Medium 3 | $0.40 | $2.00 | 128K | 8,191
Mistralai/Mistral Medium 3 | $0.40 | $2.00 | 128K | 8,191
Mistralai/Mistral Medium 3 | $0.40 | $2.00 | 128K | 8,191
Gemini 3 Flash Preview | $0.50 | $3.00 | 1.0M | 65,535
Gemini 3 Flash Preview | $0.50 | $3.00 | 1.0M | 65,535
Deepseek Ai/Deepseek V3.2 Maas | $0.56 | $1.68 | 164K | 32,768
Moonshotai/Kimi K2 Thinking Maas | $0.60 | $2.50 | 256K | 256,000
Zai Org/Glm 4.7 Maas | $0.60 | $2.20 | 200K | 128,000
Claude 3 5 Haiku | $1.00 | $5.00 | 200K | 8,192
Claude 3 5 Haiku | $1.00 | $5.00 | 200K | 8,192
Claude Haiku 4 5 | $1.00 | $5.00 | 200K | 8,192
Claude Haiku 4 5 | $1.00 | $5.00 | 200K | 8,192
Zai Org/Glm 5 Maas | $1.00 | $3.20 | 200K | 128,000
Mistral Small 2503 | $1.00 | $3.00 | 128K | 128,000
Mistral Small 2503 | $1.00 | $3.00 | 32K | 8,191
Qwen/Qwen3 Coder 480b A35b Instruct Maas | $1.00 | $4.00 | 262K | 32,768
Gemini 2.5 Pro | $1.25 | $10.00 | 1.0M | 65,535
Gemini 2.5 Pro Preview Tts | $1.25 | $10.00 | 1.0M | 65,535
Gemini 2.5 Computer Use Preview 10 2025 | $1.25 | $10.00 | 128K | 64,000
Deepseek Ai/Deepseek V3.1 Maas | $1.35 | $5.40 | 164K | 32,768
Deepseek Ai/Deepseek R1 0528 Maas | $1.35 | $5.40 | 65K | 8,192
Gemini 3 Pro Preview | $2.00 | $12.00 | 1.0M | 65,535
Gemini 3.1 Pro Preview | $2.00 | $12.00 | 1.0M | 65,536
Gemini 3.1 Pro Preview Customtools | $2.00 | $12.00 | 1.0M | 65,536
Gemini 3 Pro Preview | $2.00 | $12.00 | 1.0M | 65,535
Gemini 3.1 Pro Preview | $2.00 | $12.00 | 1.0M | 65,536
Gemini 3.1 Pro Preview Customtools | $2.00 | $12.00 | 1.0M | 65,536
Jamba 1.5 Large | $2.00 | $8.00 | 256K | 256,000
Jamba 1.5 Large | $2.00 | $8.00 | 256K | 256,000
Mistral Large 2411 | $2.00 | $6.00 | 128K | 8,191
Mistral Large | $2.00 | $6.00 | 128K | 8,191
Mistral Large@2411 001 | $2.00 | $6.00 | 128K | 8,191
Mistral Large@Latest | $2.00 | $6.00 | 128K | 8,191
Claude 3 5 Sonnet | $3.00 | $15.00 | 200K | 8,192
Claude 3 5 Sonnet | $3.00 | $15.00 | 200K | 8,192
Claude 3 7 Sonnet | $3.00 | $15.00 | 200K | 8,192
Claude 3 Sonnet | $3.00 | $15.00 | 200K | 4,096
Claude 3 Sonnet | $3.00 | $15.00 | 200K | 4,096
Claude Sonnet 4 5 | $3.00 | $15.00 | 200K | 64,000
Claude Sonnet 4 6 | $3.00 | $15.00 | 1M | 64,000
Claude Sonnet 4 5 | $3.00 | $15.00 | 200K | 64,000
Claude Sonnet 4 | $3.00 | $15.00 | 1M | 64,000
Claude Sonnet 4 | $3.00 | $15.00 | 1M | 64,000
Mistral Nemo | $3.00 | $3.00 | 128K | 128,000
Claude Sonnet 4 6@Default | $3.00 | $15.00 | 1M | 64,000
Claude Opus 4 5 | $5.00 | $25.00 | 200K | 64,000
Claude Opus 4 5 | $5.00 | $25.00 | 200K | 64,000
Claude Opus 4 6 | $5.00 | $25.00 | 1M | 128,000
Claude Opus 4 6@Default | $5.00 | $25.00 | 1M | 128,000
Meta/Llama 3.1 405b Instruct Maas | $5.00 | $16.00 | 128K | 2,048
Claude 3 Opus | $15.00 | $75.00 | 200K | 4,096
Claude 3 Opus | $15.00 | $75.00 | 200K | 4,096
Claude Opus 4 | $15.00 | $75.00 | 200K | 32,000
Claude Opus 4 1 | $15.00 | $75.00 | 200K | 32,000
Claude Opus 4 1 | $15.00 | $75.00 | 200K | 32,000
Claude Opus 4 | $15.00 | $75.00 | 200K | 32,000
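To compare rows of this table for a concrete workload, one option is to discard models whose context window or maximum output is too small and then rank the rest by estimated cost. The following is a minimal sketch under stated assumptions: the `Model` class and the helpers `context_tokens` and `cheapest` are hypothetical names, only four rows are copied from the table, "K" and "M" are treated as 1,000 and 1,000,000 (the table's context values appear to be rounded), and the context window is assumed to cover input and output tokens together.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens
    context: str         # context label as printed in the table, e.g. "128K"
    max_output: int      # maximum output tokens


def context_tokens(label: str) -> int:
    """Convert a label such as '128K' or '1.0M' to an approximate token count."""
    scale = {"K": 1_000, "M": 1_000_000}[label[-1]]
    return round(float(label[:-1]) * scale)


# A few rows copied from the table above (not the full catalog).
MODELS = [
    Model("Gemini 2.0 Flash Lite", 0.075, 0.30, "1.0M", 8_192),
    Model("Mistral Nemo@Latest", 0.15, 0.15, "128K", 128_000),
    Model("Gemini 2.5 Flash", 0.30, 2.50, "1.0M", 65_535),
    Model("Claude Haiku 4 5", 1.00, 5.00, "200K", 8_192),
]


def cheapest(models: list[Model], input_tokens: int, output_tokens: int) -> Optional[Model]:
    """Return the lowest-cost model that can hold the request, or None if none fits."""
    def cost(m: Model) -> float:
        return (input_tokens * m.input_per_m + output_tokens * m.output_per_m) / 1_000_000

    fits = [
        m for m in models
        if input_tokens + output_tokens <= context_tokens(m.context)
        and output_tokens <= m.max_output
    ]
    return min(fits, key=cost) if fits else None


if __name__ == "__main__":
    pick = cheapest(MODELS, input_tokens=100_000, output_tokens=20_000)
    print(pick.name if pick else "no model fits")
```

Filtering before ranking matters here: the model with the lowest input price is not necessarily usable once the required output length exceeds its Max Output value.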

Model Details

Meta/Llama 3.1 70b Instruct Maas

Meta/Llama 3.1 70b Instruct Maas is available via Google Vertex AI with a 128K context window and up to 2,048 output tokens. Pricing: $0.000/1M input tokens, $0.000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 128K
text vision

Meta/Llama 3.1 8b Instruct Maas

Meta/Llama 3.1 8b Instruct Maas is available via Google Vertex AI with a 128K context window and up to 2,048 output tokens. Pricing: $0.000/1M input tokens, $0.000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 128K
text vision

Meta/Llama 3.2 90b Vision Instruct Maas

Meta/Llama 3.2 90b Vision Instruct Maas is available via Google Vertex AI with a 128K context window and up to 2,048 output tokens. Pricing: $0.000/1M input tokens, $0.000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 128K
text vision

Meta/Llama3 405b Instruct Maas

Meta/Llama3 405b Instruct Maas is available via Google Vertex AI with a 32K context window and up to 32,000 output tokens. Pricing: $0.000/1M input tokens, $0.000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 32K
text

Meta/Llama3 70b Instruct Maas

Meta/Llama3 70b Instruct Maas is available via Google Vertex AI with a 32K context window and up to 32,000 output tokens. Pricing: $0.000/1M input tokens, $0.000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 32K
text

Meta/Llama3 8b Instruct Maas

Meta/Llama3 8b Instruct Maas is available via Google Vertex AI with a 32K context window and up to 32,000 output tokens. Pricing: $0.000/1M input tokens, $0.000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 32K
text

Gemini 2.0 Flash Lite

Gemini 2.0 Flash Lite is available via Google Vertex AI with a 1.0M context window and up to 8,192 output tokens. Pricing: $0.0750/1M input tokens, $0.3000/1M output tokens.

Input: $0.075/1M Output: $0.30/1M Context: 1.0M
text vision function calling web search json mode

Gemini 2.0 Flash Lite 001

Gemini 2.0 Flash Lite 001 is available via Google Vertex AI with a 1.0M context window and up to 8,192 output tokens. Pricing: $0.0750/1M input tokens, $0.3000/1M output tokens.

Input: $0.075/1M Output: $0.30/1M Context: 1.0M
text vision function calling web search json mode

Openai/Gpt Oss 20b Maas

Openai/Gpt Oss 20b Maas is available via Google Vertex AI with a 131K context window and up to 32,768 output tokens. Pricing: $0.0750/1M input tokens, $0.3000/1M output tokens.

Input: $0.075/1M Output: $0.30/1M Context: 131K
text reasoning

Gemini 2.0 Flash

Gemini 2.0 Flash is available via Google Vertex AI with a 1.0M context window and up to 8,192 output tokens. Pricing: $0.1000/1M input tokens, $0.4000/1M output tokens.

Input: $0.10/1M Output: $0.40/1M Context: 1.0M
text vision function calling audio web search json mode

Gemini 2.5 Flash Lite

Gemini 2.5 Flash Lite is available via Google Vertex AI with a 1.0M context window and up to 65,535 output tokens. Pricing: $0.1000/1M input tokens, $0.4000/1M output tokens.

Input: $0.10/1M Output: $0.40/1M Context: 1.0M
text vision function calling reasoning pdf web search json mode

Gemini 2.5 Flash Lite Preview 09 2025

Gemini 2.5 Flash Lite Preview 09 2025 is available via Google Vertex AI with a 1.0M context window and up to 65,535 output tokens. Pricing: $0.1000/1M input tokens, $0.4000/1M output tokens.

Input: $0.10/1M Output: $0.40/1M Context: 1.0M
text vision function calling reasoning pdf web search json mode

Gemini 2.5 Flash Lite Preview 06 17

Gemini 2.5 Flash Lite Preview 06 17 is available via Google Vertex AI with a 1.0M context window and up to 65,535 output tokens. Pricing: $0.1000/1M input tokens, $0.4000/1M output tokens.

Input: $0.10/1M Output: $0.40/1M Context: 1.0M
text vision function calling reasoning pdf web search json mode

Gemini 2.0 Flash 001

Gemini 2.0 Flash 001 is available via Google Vertex AI with a 1.0M context window and up to 8,192 output tokens. Pricing: $0.1500/1M input tokens, $0.6000/1M output tokens.

Input: $0.15/1M Output: $0.60/1M Context: 1.0M
text vision function calling web search json mode

Mistral Nemo@Latest

Mistral Nemo@Latest is available via Google Vertex AI with a 128K context window and up to 128,000 output tokens. Pricing: $0.1500/1M input tokens, $0.1500/1M output tokens.

Input: $0.15/1M Output: $0.15/1M Context: 128K
text function calling

Openai/Gpt Oss 120b Maas

Openai/Gpt Oss 120b Maas is available via Google Vertex AI with a 131K context window and up to 32,768 output tokens. Pricing: $0.1500/1M input tokens, $0.6000/1M output tokens.

Input: $0.15/1M Output: $0.60/1M Context: 131K
text reasoning

Qwen/Qwen3 Next 80b A3b Instruct Maas

Qwen/Qwen3 Next 80b A3b Instruct Maas is available via Google Vertex AI with a 262K context window and up to 262,144 output tokens. Pricing: $0.1500/1M input tokens, $1.20/1M output tokens.

Input: $0.15/1M Output: $1.20/1M Context: 262K
text function calling

Qwen/Qwen3 Next 80b A3b Thinking Maas

Qwen/Qwen3 Next 80b A3b Thinking Maas is available via Google Vertex AI with a 262K context window and up to 262,144 output tokens. Pricing: $0.1500/1M input tokens, $1.20/1M output tokens.

Input: $0.15/1M Output: $1.20/1M Context: 262K
text function calling

Codestral 2501

Codestral 2501 is available via Google Vertex AI with a 128K context window and up to 128,000 output tokens. Pricing: $0.2000/1M input tokens, $0.6000/1M output tokens.

Input: $0.20/1M Output: $0.60/1M Context: 128K
text function calling

Codestral

Codestral is available via Google Vertex AI with a 128K context window and up to 128,000 output tokens. Pricing: $0.2000/1M input tokens, $0.6000/1M output tokens.

Input: $0.20/1M Output: $0.60/1M Context: 128K
text function calling

Codestral@Latest

Codestral@Latest is available via Google Vertex AI with a 128K context window and up to 128,000 output tokens. Pricing: $0.2000/1M input tokens, $0.6000/1M output tokens.

Input: $0.20/1M Output: $0.60/1M Context: 128K
text function calling

Jamba 1.5

Jamba 1.5 is available via Google Vertex AI with a 256K context window and up to 256,000 output tokens. Pricing: $0.2000/1M input tokens, $0.4000/1M output tokens.

Input: $0.20/1M Output: $0.40/1M Context: 256K
text

Jamba 1.5 Mini

Jamba 1.5 Mini is available via Google Vertex AI with a 256K context window and up to 256,000 output tokens. Pricing: $0.2000/1M input tokens, $0.4000/1M output tokens.

Input: $0.20/1M Output: $0.40/1M Context: 256K
text

Jamba 1.5 Mini

Jamba 1.5 Mini is available via Google Vertex AI with a 256K context window and up to 256,000 output tokens. Pricing: $0.2000/1M input tokens, $0.4000/1M output tokens.

Input: $0.20/1M Output: $0.40/1M Context: 256K
text

Gemini 3.1 Flash Lite Preview

Gemini 3.1 Flash Lite Preview is available via Google Vertex AI with a 1.0M context window and up to 65,536 output tokens. Pricing: $0.2500/1M input tokens, $1.50/1M output tokens.

Input: $0.25/1M Output: $1.50/1M Context: 1.0M
text vision function calling reasoning audio pdf web search json mode

Claude 3 Haiku

Claude 3 Haiku is available via Google Vertex AI with a 200K context window and up to 4,096 output tokens. Pricing: $0.2500/1M input tokens, $1.25/1M output tokens.

Input: $0.25/1M Output: $1.25/1M Context: 200K
text vision function calling

Claude 3 Haiku

Claude 3 Haiku is available via Google Vertex AI with a 200K context window and up to 4,096 output tokens. Pricing: $0.2500/1M input tokens, $1.25/1M output tokens.

Input: $0.25/1M Output: $1.25/1M Context: 200K
text vision function calling

Gemini 3.1 Flash Lite Preview

Gemini 3.1 Flash Lite Preview is available via Google Vertex AI with a 1.0M context window and up to 65,536 output tokens. Pricing: $0.2500/1M input tokens, $1.50/1M output tokens.

Input: $0.25/1M Output: $1.50/1M Context: 1.0M
text vision function calling reasoning audio pdf web search json mode

Meta/Llama 4 Scout 17b 128e Instruct Maas

Meta/Llama 4 Scout 17b 128e Instruct Maas is available via Google Vertex AI with a 10M context window and up to 10,000,000 output tokens. Pricing: $0.2500/1M input tokens, $0.7000/1M output tokens.

Input: $0.25/1M Output: $0.70/1M Context: 10M
text function calling

Meta/Llama 4 Scout 17b 16e Instruct Maas

Meta/Llama 4 Scout 17b 16e Instruct Maas is available via Google Vertex AI with a 10M context window and up to 10,000,000 output tokens. Pricing: $0.2500/1M input tokens, $0.7000/1M output tokens.

Input: $0.25/1M Output: $0.70/1M Context: 10M
text function calling

Qwen/Qwen3 235b A22b Instruct 2507 Maas

Qwen/Qwen3 235b A22b Instruct 2507 Maas is available via Google Vertex AI with a 262K context window and up to 16,384 output tokens. Pricing: $0.2500/1M input tokens, $1.00/1M output tokens.

Input: $0.25/1M Output: $1.00/1M Context: 262K
text function calling

Gemini 2.5 Flash

Gemini 2.5 Flash is available via Google Vertex AI with a 1.0M context window and up to 65,535 output tokens. Pricing: $0.3000/1M input tokens, $2.50/1M output tokens.

Input: $0.30/1M Output: $2.50/1M Context: 1.0M
text vision function calling reasoning pdf web search json mode

Gemini 2.5 Flash Preview 09 2025

Gemini 2.5 Flash Preview 09 2025 is available via Google Vertex AI with a 1.0M context window and up to 65,535 output tokens. Pricing: $0.3000/1M input tokens, $2.50/1M output tokens.

Input: $0.30/1M Output: $2.50/1M Context: 1.0M
text vision function calling reasoning pdf web search json mode

Gemini Robotics Er 1.5 Preview

Gemini Robotics Er 1.5 Preview is available via Google Vertex AI with a 1.0M context window and up to 65,535 output tokens. Pricing: $0.3000/1M input tokens, $2.50/1M output tokens.

Input: $0.30/1M Output: $2.50/1M Context: 1.0M
text vision function calling reasoning json mode

Mistralai/Codestral 2

Mistralai/Codestral 2 is available via Google Vertex AI with a 128K context window and up to 128,000 output tokens. Pricing: $0.3000/1M input tokens, $0.9000/1M output tokens.

Input: $0.30/1M Output: $0.90/1M Context: 128K
text function calling

Codestral 2

Codestral 2 is available via Google Vertex AI with a 128K context window and up to 128,000 output tokens. Pricing: $0.3000/1M input tokens, $0.9000/1M output tokens.

Input: $0.30/1M Output: $0.90/1M Context: 128K
text function calling

Codestral 2

Codestral 2 is available via Google Vertex AI with a 128K context window and up to 128,000 output tokens. Pricing: $0.3000/1M input tokens, $0.9000/1M output tokens.

Input: $0.30/1M Output: $0.90/1M Context: 128K
text function calling

Mistralai/Codestral 2

Mistralai/Codestral 2 is available via Google Vertex AI with a 128K context window and up to 128,000 output tokens. Pricing: $0.3000/1M input tokens, $0.9000/1M output tokens.

Input: $0.30/1M Output: $0.90/1M Context: 128K
text function calling

Minimaxai/Minimax M2 Maas

Minimaxai/Minimax M2 Maas is available via Google Vertex AI with a 197K context window and up to 196,608 output tokens. Pricing: $0.3000/1M input tokens, $1.20/1M output tokens.

Input: $0.30/1M Output: $1.20/1M Context: 197K
text function calling

Meta/Llama 4 Maverick 17b 128e Instruct Maas

Meta/Llama 4 Maverick 17b 128e Instruct Maas is available via Google Vertex AI with a 1M context window and up to 1,000,000 output tokens. Pricing: $0.3500/1M input tokens, $1.15/1M output tokens.

Input: $0.35/1M Output: $1.15/1M Context: 1M
text function calling

Meta/Llama 4 Maverick 17b 16e Instruct Maas

Meta/Llama 4 Maverick 17b 16e Instruct Maas is available via Google Vertex AI with a 1M context window and up to 1,000,000 output tokens. Pricing: $0.3500/1M input tokens, $1.15/1M output tokens.

Input: $0.35/1M Output: $1.15/1M Context: 1M
text function calling

Mistral Medium 3

Mistral Medium 3 is available via Google Vertex AI with a 128K context window and up to 8,191 output tokens. Pricing: $0.4000/1M input tokens, $2.00/1M output tokens.

Input: $0.40/1M Output: $2.00/1M Context: 128K
text function calling

Mistral Medium 3

Mistral Medium 3 is available via Google Vertex AI with a 128K context window and up to 8,191 output tokens. Pricing: $0.4000/1M input tokens, $2.00/1M output tokens.

Input: $0.40/1M Output: $2.00/1M Context: 128K
text function calling

Mistralai/Mistral Medium 3

Mistralai/Mistral Medium 3 is available via Google Vertex AI with a 128K context window and up to 8,191 output tokens. Pricing: $0.4000/1M input tokens, $2.00/1M output tokens.

Input: $0.40/1M Output: $2.00/1M Context: 128K
text function calling

Mistralai/Mistral Medium 3

Mistralai/Mistral Medium 3 is available via Google Vertex AI with a 128K context window and up to 8,191 output tokens. Pricing: $0.4000/1M input tokens, $2.00/1M output tokens.

Input: $0.40/1M Output: $2.00/1M Context: 128K
text function calling

Gemini 3 Flash Preview

Gemini 3 Flash Preview is available via Google Vertex AI with a 1.0M context window and up to 65,535 output tokens. Pricing: $0.5000/1M input tokens, $3.00/1M output tokens.

Input: $0.50/1M Output: $3.00/1M Context: 1.0M
text vision function calling reasoning audio pdf web search json mode

Gemini 3 Flash Preview

Gemini 3 Flash Preview is available via Google Vertex AI with a 1.0M context window and up to 65,535 output tokens. Pricing: $0.5000/1M input tokens, $3.00/1M output tokens.

Input: $0.50/1M Output: $3.00/1M Context: 1.0M
text vision function calling reasoning pdf web search json mode

Deepseek Ai/Deepseek V3.2 Maas

Deepseek Ai/Deepseek V3.2 Maas is available via Google Vertex AI with a 164K context window and up to 32,768 output tokens. Pricing: $0.5600/1M input tokens, $1.68/1M output tokens.

Input: $0.56/1M Output: $1.68/1M Context: 164K
text function calling reasoning

Moonshotai/Kimi K2 Thinking Maas

Moonshotai/Kimi K2 Thinking Maas is available via Google Vertex AI with a 256K context window and up to 256,000 output tokens. Pricing: $0.6000/1M input tokens, $2.50/1M output tokens.

Input: $0.60/1M Output: $2.50/1M Context: 256K
text function calling web search

Zai Org/Glm 4.7 Maas

Zai Org/Glm 4.7 Maas is available via Google Vertex AI with a 200K context window and up to 128,000 output tokens. Pricing: $0.6000/1M input tokens, $2.20/1M output tokens.

Input: $0.60/1M Output: $2.20/1M Context: 200K
text function calling reasoning

Claude 3 5 Haiku

Claude 3 5 Haiku is available via Google Vertex AI with a 200K context window and up to 8,192 output tokens. Pricing: $1.00/1M input tokens, $5.00/1M output tokens.

Input: $1.00/1M Output: $5.00/1M Context: 200K
text function calling pdf

Claude 3 5 Haiku

Claude 3 5 Haiku is available via Google Vertex AI with a 200K context window and up to 8,192 output tokens. Pricing: $1.00/1M input tokens, $5.00/1M output tokens.

Input: $1.00/1M Output: $5.00/1M Context: 200K
text function calling pdf

Claude Haiku 4 5

Claude Haiku 4 5 is available via Google Vertex AI with a 200K context window and up to 8,192 output tokens. Pricing: $1.00/1M input tokens, $5.00/1M output tokens.

Input: $1.00/1M Output: $5.00/1M Context: 200K
text vision function calling reasoning pdf json mode

Claude Haiku 4 5

Claude Haiku 4 5 is available via Google Vertex AI with a 200K context window and up to 8,192 output tokens. Pricing: $1.00/1M input tokens, $5.00/1M output tokens.

Input: $1.00/1M Output: $5.00/1M Context: 200K
text vision function calling reasoning pdf json mode

Zai Org/Glm 5 Maas

Zai Org/Glm 5 Maas is available via Google Vertex AI with a 200K context window and up to 128,000 output tokens. Pricing: $1.00/1M input tokens, $3.20/1M output tokens.

Input: $1.00/1M Output: $3.20/1M Context: 200K
text function calling reasoning

Mistral Small 2503

Mistral Small 2503 is available via Google Vertex AI with a 128K context window and up to 128,000 output tokens. Pricing: $1.00/1M input tokens, $3.00/1M output tokens.

Input: $1.00/1M Output: $3.00/1M Context: 128K
text vision function calling

Mistral Small 2503

Mistral Small 2503 is available via Google Vertex AI with a 32K context window and up to 8,191 output tokens. Pricing: $1.00/1M input tokens, $3.00/1M output tokens.

Input: $1.00/1M Output: $3.00/1M Context: 32K
text function calling

Qwen/Qwen3 Coder 480b A35b Instruct Maas

Qwen/Qwen3 Coder 480b A35b Instruct Maas is available via Google Vertex AI with a 262K context window and up to 32,768 output tokens. Pricing: $1.00/1M input tokens, $4.00/1M output tokens.

Input: $1.00/1M Output: $4.00/1M Context: 262K
text function calling

Gemini 2.5 Pro

Gemini 2.5 Pro is available via Google Vertex AI with a 1.0M context window and up to 65,535 output tokens. Pricing: $1.25/1M input tokens, $10.00/1M output tokens.

Input: $1.25/1M Output: $10.00/1M Context: 1.0M
text vision function calling reasoning audio pdf web search json mode

Gemini 2.5 Pro Preview Tts

Gemini 2.5 Pro Preview Tts is available via Google Vertex AI with a 1.0M context window and up to 65,535 output tokens. Pricing: $1.25/1M input tokens, $10.00/1M output tokens.

Input: $1.25/1M Output: $10.00/1M Context: 1.0M
text vision function calling web search json mode

Gemini 2.5 Computer Use Preview 10 2025

Gemini 2.5 Computer Use Preview 10 2025 is available via Google Vertex AI with a 128K context window and up to 64,000 output tokens. Pricing: $1.25/1M input tokens, $10.00/1M output tokens.

Input: $1.25/1M Output: $10.00/1M Context: 128K
text vision function calling computer use

Deepseek Ai/Deepseek V3.1 Maas

Deepseek Ai/Deepseek V3.1 Maas is available via Google Vertex AI with a 164K context window and up to 32,768 output tokens. Pricing: $1.35/1M input tokens, $5.40/1M output tokens.

Input: $1.35/1M Output: $5.40/1M Context: 164K
text function calling reasoning

Deepseek Ai/Deepseek R1 0528 Maas

Deepseek Ai/Deepseek R1 0528 Maas is available via Google Vertex AI with a 65K context window and up to 8,192 output tokens. Pricing: $1.35/1M input tokens, $5.40/1M output tokens.

Input: $1.35/1M Output: $5.40/1M Context: 65K
text function calling reasoning

Gemini 3 Pro Preview

Gemini 3 Pro Preview is available via Google Vertex AI with a 1.0M context window and up to 65,535 output tokens. Pricing: $2.00/1M input tokens, $12.00/1M output tokens.

Input: $2.00/1M Output: $12.00/1M Context: 1.0M
text vision function calling reasoning audio pdf web search json mode

Gemini 3.1 Pro Preview

Gemini 3.1 Pro Preview is available via Google Vertex AI with a 1.0M context window and up to 65,536 output tokens. Pricing: $2.00/1M input tokens, $12.00/1M output tokens.

Input: $2.00/1M Output: $12.00/1M Context: 1.0M
text vision function calling reasoning audio pdf web search json mode

Gemini 3.1 Pro Preview Customtools

Gemini 3.1 Pro Preview Customtools is available via Google Vertex AI with a 1.0M context window and up to 65,536 output tokens. Pricing: $2.00/1M input tokens, $12.00/1M output tokens.

Input: $2.00/1M Output: $12.00/1M Context: 1.0M
text vision function calling reasoning audio pdf web search json mode

Gemini 3 Pro Preview

Gemini 3 Pro Preview is available via Google Vertex AI with a 1.0M context window and up to 65,535 output tokens. Pricing: $2.00/1M input tokens, $12.00/1M output tokens.

Input: $2.00/1M Output: $12.00/1M Context: 1.0M
text vision function calling reasoning audio pdf web search json mode

Gemini 3.1 Pro Preview

Gemini 3.1 Pro Preview is available via Google Vertex AI with a 1.0M context window and up to 65,536 output tokens. Pricing: $2.00/1M input tokens, $12.00/1M output tokens.

Input: $2.00/1M Output: $12.00/1M Context: 1.0M
text vision function calling reasoning audio pdf web search json mode

Gemini 3.1 Pro Preview Customtools

Gemini 3.1 Pro Preview Customtools is available via Google Vertex AI with a 1.0M context window and up to 65,536 output tokens. Pricing: $2.00/1M input tokens, $12.00/1M output tokens.

Input: $2.00/1M Output: $12.00/1M Context: 1.0M
text vision function calling reasoning audio pdf web search json mode

Jamba 1.5 Large

Jamba 1.5 Large is available via Google Vertex AI with a 256K context window and up to 256,000 output tokens. Pricing: $2.00/1M input tokens, $8.00/1M output tokens.

Input: $2.00/1M Output: $8.00/1M Context: 256K
text

Jamba 1.5 Large

Jamba 1.5 Large is available via Google Vertex AI with a 256K context window and up to 256,000 output tokens. Pricing: $2.00/1M input tokens, $8.00/1M output tokens.

Input: $2.00/1M Output: $8.00/1M Context: 256K
text

Mistral Large 2411

Mistral Large 2411 is available via Google Vertex AI with a 128K context window and up to 8,191 output tokens. Pricing: $2.00/1M input tokens, $6.00/1M output tokens.

Input: $2.00/1M Output: $6.00/1M Context: 128K
text function calling

Mistral Large

Mistral Large is available via Google Vertex AI with a 128K context window and up to 8,191 output tokens. Pricing: $2.00/1M input tokens, $6.00/1M output tokens.

Input: $2.00/1M Output: $6.00/1M Context: 128K
text function calling

Mistral Large@2411 001

Mistral Large@2411 001 is available via Google Vertex AI with a 128K context window and up to 8,191 output tokens. Pricing: $2.00/1M input tokens, $6.00/1M output tokens.

Input: $2.00/1M Output: $6.00/1M Context: 128K
text function calling

Mistral Large@Latest

Mistral Large@Latest is available via Google Vertex AI with a 128K context window and up to 8,191 output tokens. Pricing: $2.00/1M input tokens, $6.00/1M output tokens.

Input: $2.00/1M Output: $6.00/1M Context: 128K
text function calling

Claude 3 5 Sonnet

Claude 3 5 Sonnet is available via Google Vertex AI with a 200K context window and up to 8,192 output tokens. Pricing: $3.00/1M input tokens, $15.00/1M output tokens.

Input: $3.00/1M Output: $15.00/1M Context: 200K
text vision function calling pdf computer use

Claude 3 5 Sonnet

Claude 3 5 Sonnet is available via Google Vertex AI with a 200K context window and up to 8,192 output tokens. Pricing: $3.00/1M input tokens, $15.00/1M output tokens.

Input: $3.00/1M Output: $15.00/1M Context: 200K
text vision function calling pdf

Claude 3 7 Sonnet

Claude 3 7 Sonnet is available via Google Vertex AI with a 200K context window and up to 8,192 output tokens. Pricing: $3.00/1M input tokens, $15.00/1M output tokens.

Input: $3.00/1M Output: $15.00/1M Context: 200K
text vision function calling reasoning pdf computer use json mode

Claude 3 Sonnet

Claude 3 Sonnet is available via Google Vertex AI with a 200K context window and up to 4,096 output tokens. Pricing: $3.00/1M input tokens, $15.00/1M output tokens.

Input: $3.00/1M Output: $15.00/1M Context: 200K
text vision function calling

Claude 3 Sonnet

Claude 3 Sonnet is available via Google Vertex AI with a 200K context window and up to 4,096 output tokens. Pricing: $3.00/1M input tokens, $15.00/1M output tokens.

Input: $3.00/1M Output: $15.00/1M Context: 200K
text vision function calling

Claude Sonnet 4 5

Claude Sonnet 4 5 is available via Google Vertex AI with a 200K context window and up to 64,000 output tokens. Pricing: $3.00/1M input tokens, $15.00/1M output tokens.

Input: $3.00/1M Output: $15.00/1M Context: 200K
text vision function calling reasoning pdf computer use json mode

Claude Sonnet 4 6

Claude Sonnet 4 6 is available via Google Vertex AI with a 1M context window and up to 64,000 output tokens. Pricing: $3.00/1M input tokens, $15.00/1M output tokens.

Input: $3.00/1M Output: $15.00/1M Context: 1M
text vision function calling reasoning pdf computer use json mode

Claude Sonnet 4 5

Claude Sonnet 4 5 is available via Google Vertex AI with a 200K context window and up to 64,000 output tokens. Pricing: $3.00/1M input tokens, $15.00/1M output tokens.

Input: $3.00/1M Output: $15.00/1M Context: 200K
text vision function calling reasoning pdf computer use json mode

Claude Sonnet 4

Claude Sonnet 4 is available via Google Vertex AI with a 1M context window and up to 64,000 output tokens. Pricing: $3.00/1M input tokens, $15.00/1M output tokens.

Input: $3.00/1M Output: $15.00/1M Context: 1M
text vision function calling reasoning pdf computer use json mode

Claude Sonnet 4

Claude Sonnet 4 is available via Google Vertex AI with a 1M context window and up to 64,000 output tokens. Pricing: $3.00/1M input tokens, $15.00/1M output tokens.

Input: $3.00/1M Output: $15.00/1M Context: 1M
text vision function calling reasoning pdf computer use json mode

Mistral Nemo

Mistral Nemo is available via Google Vertex AI with a 128K context window and up to 128,000 output tokens. Pricing: $3.00/1M input tokens, $3.00/1M output tokens.

Input: $3.00/1M Output: $3.00/1M Context: 128K
text function calling

Claude Sonnet 4 6@Default

Claude Sonnet 4 6@Default is available via Google Vertex AI with a 1M context window and up to 64,000 output tokens. Pricing: $3.00/1M input tokens, $15.00/1M output tokens.

Input: $3.00/1M Output: $15.00/1M Context: 1M
text vision function calling reasoning pdf computer use json mode

Claude Opus 4 5

Claude Opus 4 5 is available via Google Vertex AI with a 200K context window and up to 64,000 output tokens. Pricing: $5.00/1M input tokens, $25.00/1M output tokens.

Input: $5.00/1M Output: $25.00/1M Context: 200K
text vision function calling reasoning pdf computer use json mode

Claude Opus 4 5

Claude Opus 4 5 is available via Google Vertex AI with a 200K context window and up to 64,000 output tokens. Pricing: $5.00/1M input tokens, $25.00/1M output tokens.

Input: $5.00/1M Output: $25.00/1M Context: 200K
text vision function calling reasoning pdf computer use json mode

Claude Opus 4 6

Claude Opus 4 6 is available via Google Vertex AI with a 1M context window and up to 128,000 output tokens. Pricing: $5.00/1M input tokens, $25.00/1M output tokens.

Input: $5.00/1M Output: $25.00/1M Context: 1M
text vision function calling reasoning pdf computer use json mode

Claude Opus 4 6@Default

Claude Opus 4 6@Default is available via Google Vertex AI with a 1M context window and up to 128,000 output tokens. Pricing: $5.00/1M input tokens, $25.00/1M output tokens.

Input: $5.00/1M Output: $25.00/1M Context: 1M
text vision function calling reasoning pdf computer use json mode

Meta/Llama 3.1 405b Instruct Maas

Meta/Llama 3.1 405b Instruct Maas is available via Google Vertex AI with a 128K context window and up to 2,048 output tokens. Pricing: $5.00/1M input tokens, $16.00/1M output tokens.

Input: $5.00/1M Output: $16.00/1M Context: 128K
text vision

Claude 3 Opus

Claude 3 Opus is available via Google Vertex AI with a 200K context window and up to 4,096 output tokens. Pricing: $15.00/1M input tokens, $75.00/1M output tokens.

Input: $15.00/1M Output: $75.00/1M Context: 200K
text vision function calling

Claude 3 Opus

Claude 3 Opus is available via Google Vertex AI with a 200K context window and up to 4,096 output tokens. Pricing: $15.00/1M input tokens, $75.00/1M output tokens.

Input: $15.00/1M Output: $75.00/1M Context: 200K
text vision function calling

Claude Opus 4

Claude Opus 4 is available via Google Vertex AI with a 200K context window and up to 32,000 output tokens. Pricing: $15.00/1M input tokens, $75.00/1M output tokens.

Input: $15.00/1M Output: $75.00/1M Context: 200K
text vision function calling reasoning pdf computer use json mode

Claude Opus 4 1

Claude Opus 4 1 is available via Google Vertex AI with a 200K context window and up to 32,000 output tokens. Pricing: $15.00/1M input tokens, $75.00/1M output tokens.

Input: $15.00/1M Output: $75.00/1M Context: 200K
text vision function calling

Claude Opus 4 1

Claude Opus 4 1 is available via Google Vertex AI with a 200K context window and up to 32,000 output tokens. Pricing: $15.00/1M input tokens, $75.00/1M output tokens.

Input: $15.00/1M Output: $75.00/1M Context: 200K
text vision function calling

Claude Opus 4

Claude Opus 4 is available via Google Vertex AI with a 200K context window and up to 32,000 output tokens. Pricing: $15.00/1M input tokens, $75.00/1M output tokens.

Input: $15.00/1M Output: $75.00/1M Context: 200K
text vision function calling reasoning pdf computer use json mode
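Each model entry above ends with a space-separated capability line in which some tags span two words ("function calling", "web search", "json mode", "computer use"). A rough way to turn such a line into discrete tags is to check for a known two-word tag before falling back to a single word. This is only a sketch: the names `MULTI_WORD_TAGS`, `parse_tags`, and `supports` are hypothetical, and the tag vocabulary is limited to what appears on this page.

```python
# Two-word capability tags that appear in the entries on this page.
MULTI_WORD_TAGS = {"function calling", "web search", "json mode", "computer use"}


def parse_tags(line: str) -> list[str]:
    """Split a capability line into tags, keeping known two-word tags together."""
    words = line.split()
    tags = []
    i = 0
    while i < len(words):
        pair = " ".join(words[i:i + 2])
        if pair in MULTI_WORD_TAGS:
            tags.append(pair)
            i += 2
        else:
            tags.append(words[i])
            i += 1
    return tags


def supports(line: str, *required: str) -> bool:
    """True if every required tag is present in the capability line."""
    return set(required) <= set(parse_tags(line))


if __name__ == "__main__":
    line = "text vision function calling reasoning pdf web search json mode"
    print(parse_tags(line))
    print(supports(line, "function calling", "json mode"))
```

A plain `line.split()` would break "function calling" into two unrelated words, which is why the two-word lookup comes first.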
