Llamagate Models

Llamagate provides 14 AI models accessible via API.

Models Available: 14

Cheapest Input / 1M: $0.030

Largest Context: 131K

What is Llamagate?

Llamagate is an AI model provider offering 14 large language models to developers through an API. Its cheapest model starts at $0.030 per 1M input tokens, and its largest context window reaches 131K tokens.

All Llamagate Models

Model	Input $/1M	Output $/1M	Context	Max Output	Released
Llama 3.1 8b	$0.030	$0.050	131K	8,192	—
Gemma3 4b	$0.030	$0.080	128K	8,192	—
Llama 3.2 3b	$0.040	$0.080	131K	8,192	—
Qwen3 8b	$0.040	$0.14	33K	8,192	—
Qwen2.5 Coder 7b	$0.060	$0.12	33K	8,192	—
Deepseek Coder 6.7b	$0.060	$0.12	16K	4,096	—
Codellama 7b	$0.060	$0.12	16K	4,096	—
Dolphin3 8b	$0.080	$0.15	128K	8,192	—
Deepseek R1 7b Qwen	$0.080	$0.15	131K	16,384	—
Openthinker 7b	$0.080	$0.15	33K	8,192	—
Mistral 7b V0.3	$0.10	$0.15	33K	8,192	—
Deepseek R1 8b	$0.10	$0.20	66K	16,384	—
Llava 7b	$0.10	$0.20	4K	2,048	—
Qwen3 Vl 8b	$0.15	$0.55	33K	8,192	—
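Because prices are quoted per 1M tokens, the cost of a request is the token count times the listed price, divided by one million, summed over input and output. The following is a minimal sketch of that arithmetic using three rows from the table above; the names `PRICES` and `request_cost` are hypothetical and not part of any Llamagate API.

```python
# Prices in USD per 1M tokens, copied from the table above as
# (input price, output price). Only three rows are included for brevity.
PRICES = {
    "Llama 3.1 8b": (0.030, 0.050),
    "Qwen3 8b": (0.040, 0.14),
    "Qwen3 Vl 8b": (0.15, 0.55),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a request under per-1M-token pricing."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# 2M input tokens and 500K output tokens on Llama 3.1 8b:
# 2 * $0.030 + 0.5 * $0.050 = $0.085
print(f"${request_cost('Llama 3.1 8b', 2_000_000, 500_000):.4f}")  # $0.0850
```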

Model Details

Llama 3.1 8b

Llama 3.1 8b is available via Llamagate with a 131K context window and up to 8,192 output tokens. Pricing: $0.0300/1M input tokens, $0.0500/1M output tokens.

Input: $0.030/1M Output: $0.050/1M Context: 131K

Capabilities: text, function calling, JSON mode

Gemma3 4b

Gemma3 4b is available via Llamagate with a 128K context window and up to 8,192 output tokens. Pricing: $0.0300/1M input tokens, $0.0800/1M output tokens.

Input: $0.030/1M Output: $0.080/1M Context: 128K

Capabilities: text, vision, function calling, JSON mode

Llama 3.2 3b

Llama 3.2 3b is available via Llamagate with a 131K context window and up to 8,192 output tokens. Pricing: $0.0400/1M input tokens, $0.0800/1M output tokens.

Input: $0.040/1M Output: $0.080/1M Context: 131K

Capabilities: text, function calling, JSON mode

Qwen3 8b

Qwen3 8b is available via Llamagate with a 33K context window and up to 8,192 output tokens. Pricing: $0.0400/1M input tokens, $0.1400/1M output tokens.

Input: $0.040/1M Output: $0.14/1M Context: 33K

Capabilities: text, function calling, JSON mode

Qwen2.5 Coder 7b

Qwen2.5 Coder 7b is available via Llamagate with a 33K context window and up to 8,192 output tokens. Pricing: $0.0600/1M input tokens, $0.1200/1M output tokens.

Input: $0.060/1M Output: $0.12/1M Context: 33K

Capabilities: text, function calling, JSON mode

Deepseek Coder 6.7b

Deepseek Coder 6.7b is available via Llamagate with a 16K context window and up to 4,096 output tokens. Pricing: $0.0600/1M input tokens, $0.1200/1M output tokens.

Input: $0.060/1M Output: $0.12/1M Context: 16K

Capabilities: text, function calling, JSON mode

Codellama 7b

Codellama 7b is available via Llamagate with a 16K context window and up to 4,096 output tokens. Pricing: $0.0600/1M input tokens, $0.1200/1M output tokens.

Input: $0.060/1M Output: $0.12/1M Context: 16K

Capabilities: text, function calling, JSON mode

Dolphin3 8b

Dolphin3 8b is available via Llamagate with a 128K context window and up to 8,192 output tokens. Pricing: $0.0800/1M input tokens, $0.1500/1M output tokens.

Input: $0.080/1M Output: $0.15/1M Context: 128K

Capabilities: text, function calling, JSON mode

Deepseek R1 7b Qwen

Deepseek R1 7b Qwen is available via Llamagate with a 131K context window and up to 16,384 output tokens. Pricing: $0.0800/1M input tokens, $0.1500/1M output tokens.

Input: $0.080/1M Output: $0.15/1M Context: 131K

Capabilities: text, function calling, reasoning, JSON mode

Openthinker 7b

Openthinker 7b is available via Llamagate with a 33K context window and up to 8,192 output tokens. Pricing: $0.0800/1M input tokens, $0.1500/1M output tokens.

Input: $0.080/1M Output: $0.15/1M Context: 33K

Capabilities: text, function calling, reasoning, JSON mode

Mistral 7b V0.3

Mistral 7b V0.3 is available via Llamagate with a 33K context window and up to 8,192 output tokens. Pricing: $0.1000/1M input tokens, $0.1500/1M output tokens.

Input: $0.10/1M Output: $0.15/1M Context: 33K

Capabilities: text, function calling, JSON mode

Deepseek R1 8b

Deepseek R1 8b is available via Llamagate with a 66K context window and up to 16,384 output tokens. Pricing: $0.1000/1M input tokens, $0.2000/1M output tokens.

Input: $0.10/1M Output: $0.20/1M Context: 66K

Capabilities: text, function calling, reasoning, JSON mode

Llava 7b

Llava 7b is available via Llamagate with a 4K context window and up to 2,048 output tokens. Pricing: $0.1000/1M input tokens, $0.2000/1M output tokens.

Input: $0.10/1M Output: $0.20/1M Context: 4K

Capabilities: text, vision, JSON mode

Qwen3 Vl 8b

Qwen3 Vl 8b is available via Llamagate with a 33K context window and up to 8,192 output tokens. Pricing: $0.1500/1M input tokens, $0.5500/1M output tokens.

Input: $0.15/1M Output: $0.55/1M Context: 33K

Capabilities: text, vision, function calling, JSON mode

Compare Llamagate model pricing

Use our pricing calculator to find the cheapest Llamagate model for your workload.
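The comparison such a calculator performs can be approximated by hand: filter the models by the context window and output length a workload needs, then pick the lowest total cost. The sketch below is an illustration under stated assumptions, not the site's calculator. The names `Model`, `MODELS`, `workload_cost`, and `cheapest` are hypothetical, and context sizes are kept in thousands of tokens exactly as listed (for example 131 for "131K") rather than guessed exact token counts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens
    context_k: int       # context window in thousands of tokens, as listed
    max_output: int      # maximum output tokens


# Rows copied from the "All Llamagate Models" table.
MODELS = [
    Model("Llama 3.1 8b", 0.030, 0.050, 131, 8192),
    Model("Gemma3 4b", 0.030, 0.080, 128, 8192),
    Model("Llama 3.2 3b", 0.040, 0.080, 131, 8192),
    Model("Qwen3 8b", 0.040, 0.14, 33, 8192),
    Model("Qwen2.5 Coder 7b", 0.060, 0.12, 33, 8192),
    Model("Deepseek Coder 6.7b", 0.060, 0.12, 16, 4096),
    Model("Codellama 7b", 0.060, 0.12, 16, 4096),
    Model("Dolphin3 8b", 0.080, 0.15, 128, 8192),
    Model("Deepseek R1 7b Qwen", 0.080, 0.15, 131, 16384),
    Model("Openthinker 7b", 0.080, 0.15, 33, 8192),
    Model("Mistral 7b V0.3", 0.10, 0.15, 33, 8192),
    Model("Deepseek R1 8b", 0.10, 0.20, 66, 16384),
    Model("Llava 7b", 0.10, 0.20, 4, 2048),
    Model("Qwen3 Vl 8b", 0.15, 0.55, 33, 8192),
]


def workload_cost(model: Model, input_tokens: int, output_tokens: int) -> float:
    """Total USD cost of a workload under per-1M-token pricing."""
    return (input_tokens * model.input_price
            + output_tokens * model.output_price) / 1_000_000


def cheapest(models, input_tokens, output_tokens,
             min_context_k: int = 0, min_output: int = 0) -> Optional[Model]:
    """Return the lowest-cost model meeting the limits, or None if none does."""
    candidates = [m for m in models
                  if m.context_k >= min_context_k and m.max_output >= min_output]
    if not candidates:
        return None
    return min(candidates,
               key=lambda m: workload_cost(m, input_tokens, output_tokens))


# 10M input tokens and 2M output tokens, no other requirements.
print(cheapest(MODELS, 10_000_000, 2_000_000).name)  # Llama 3.1 8b

# Same workload, but needing at least 100K context and 16,384 output tokens.
print(cheapest(MODELS, 10_000_000, 2_000_000,
               min_context_k=100, min_output=16_384).name)  # Deepseek R1 7b Qwen
```

The cheapest model depends on the input-to-output ratio as well as the headline input price, which is why the sketch totals both sides of the bill before comparing.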
