Llamagate Models

Llamagate provides 14 AI models accessible via API.

Models available: 14

Cheapest input price: $0.030 per 1M tokens

Largest context window: 131K

What is Llamagate?

Llamagate is an AI model provider offering 14 large language models for developers. Its cheapest models start at $0.030 per 1M input tokens, and its largest context window is 131K tokens.

All Llamagate Models

Model                  Input $/1M   Output $/1M   Context   Max Output
Llama 3.1 8b           $0.030       $0.050        131K      8,192
Gemma3 4b              $0.030       $0.080        128K      8,192
Llama 3.2 3b           $0.040       $0.080        131K      8,192
Qwen3 8b               $0.040       $0.14         33K       8,192
Qwen2.5 Coder 7b       $0.060       $0.12         33K       8,192
Deepseek Coder 6.7b    $0.060       $0.12         16K       4,096
Codellama 7b           $0.060       $0.12         16K       4,096
Dolphin3 8b            $0.080       $0.15         128K      8,192
Deepseek R1 7b Qwen    $0.080       $0.15         131K      16,384
Openthinker 7b         $0.080       $0.15         33K       8,192
Mistral 7b V0.3        $0.10        $0.15         33K       8,192
Deepseek R1 8b         $0.10        $0.20         66K       16,384
Llava 7b               $0.10        $0.20         4K        2,048
Qwen3 Vl 8b            $0.15        $0.55         33K       8,192
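
To make the per-1M-token prices concrete, here is a minimal sketch of the arithmetic for a single request. The function name `request_cost` and the example token counts are illustrative assumptions; only the two prices come from the table above.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the estimated USD cost of one request.

    Prices are quoted per 1M tokens, so the weighted token total
    is divided by 1,000,000.
    """
    weighted = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return weighted / 1_000_000


# Llama 3.1 8b: $0.030 input and $0.050 output per 1M tokens (from the table).
# The token counts below are made-up example values.
cost = request_cost(10_000, 2_000, 0.030, 0.050)
print(f"${cost:.6f}")  # prints $0.000400
```

The same function works for any row in the table; only the two price arguments change.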

Model Details

Llama 3.1 8b

Llama 3.1 8b is available via Llamagate with a 131K context window and up to 8,192 output tokens. Pricing: $0.0300/1M input tokens, $0.0500/1M output tokens.

Input: $0.030/1M Output: $0.050/1M Context: 131K
Capabilities: text, function calling, JSON mode
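
One way to read the context and max-output figures together is as a budget check before sending a request. The sketch below assumes that the context window covers the prompt plus the generated output, and that "131K" means 131,072 tokens; neither is stated on this page, so treat both as assumptions. `fits_limits` is a hypothetical helper, not part of any Llamagate SDK.

```python
def fits_limits(prompt_tokens: int, requested_output: int,
                context_window: int, max_output: int) -> bool:
    """Check a request against a model's listed limits.

    Assumption: prompt and output tokens share one context window.
    """
    if requested_output > max_output:
        return False  # asks for more output than the model allows
    return prompt_tokens + requested_output <= context_window


# Llama 3.1 8b limits. 131_072 is an assumed reading of "131K".
CONTEXT_WINDOW = 131_072
MAX_OUTPUT = 8_192

print(fits_limits(120_000, 8_192, CONTEXT_WINDOW, MAX_OUTPUT))  # True
print(fits_limits(125_000, 8_192, CONTEXT_WINDOW, MAX_OUTPUT))  # False
print(fits_limits(1_000, 10_000, CONTEXT_WINDOW, MAX_OUTPUT))   # False
```

The same check applies to the other models by swapping in their context and max-output values.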

Gemma3 4b

Gemma3 4b is available via Llamagate with a 128K context window and up to 8,192 output tokens. Pricing: $0.0300/1M input tokens, $0.0800/1M output tokens.

Input: $0.030/1M Output: $0.080/1M Context: 128K
Capabilities: text, vision, function calling, JSON mode

Llama 3.2 3b

Llama 3.2 3b is available via Llamagate with a 131K context window and up to 8,192 output tokens. Pricing: $0.0400/1M input tokens, $0.0800/1M output tokens.

Input: $0.040/1M Output: $0.080/1M Context: 131K
Capabilities: text, function calling, JSON mode

Qwen3 8b

Qwen3 8b is available via Llamagate with a 33K context window and up to 8,192 output tokens. Pricing: $0.0400/1M input tokens, $0.1400/1M output tokens.

Input: $0.040/1M Output: $0.14/1M Context: 33K
Capabilities: text, function calling, JSON mode

Qwen2.5 Coder 7b

Qwen2.5 Coder 7b is available via Llamagate with a 33K context window and up to 8,192 output tokens. Pricing: $0.0600/1M input tokens, $0.1200/1M output tokens.

Input: $0.060/1M Output: $0.12/1M Context: 33K
Capabilities: text, function calling, JSON mode

Deepseek Coder 6.7b

Deepseek Coder 6.7b is available via Llamagate with a 16K context window and up to 4,096 output tokens. Pricing: $0.0600/1M input tokens, $0.1200/1M output tokens.

Input: $0.060/1M Output: $0.12/1M Context: 16K
Capabilities: text, function calling, JSON mode

Codellama 7b

Codellama 7b is available via Llamagate with a 16K context window and up to 4,096 output tokens. Pricing: $0.0600/1M input tokens, $0.1200/1M output tokens.

Input: $0.060/1M Output: $0.12/1M Context: 16K
Capabilities: text, function calling, JSON mode

Dolphin3 8b

Dolphin3 8b is available via Llamagate with a 128K context window and up to 8,192 output tokens. Pricing: $0.0800/1M input tokens, $0.1500/1M output tokens.

Input: $0.080/1M Output: $0.15/1M Context: 128K
Capabilities: text, function calling, JSON mode

Deepseek R1 7b Qwen

Deepseek R1 7b Qwen is available via Llamagate with a 131K context window and up to 16,384 output tokens. Pricing: $0.0800/1M input tokens, $0.1500/1M output tokens.

Input: $0.080/1M Output: $0.15/1M Context: 131K
Capabilities: text, function calling, reasoning, JSON mode

Openthinker 7b

Openthinker 7b is available via Llamagate with a 33K context window and up to 8,192 output tokens. Pricing: $0.0800/1M input tokens, $0.1500/1M output tokens.

Input: $0.080/1M Output: $0.15/1M Context: 33K
Capabilities: text, function calling, reasoning, JSON mode

Mistral 7b V0.3

Mistral 7b V0.3 is available via Llamagate with a 33K context window and up to 8,192 output tokens. Pricing: $0.1000/1M input tokens, $0.1500/1M output tokens.

Input: $0.10/1M Output: $0.15/1M Context: 33K
Capabilities: text, function calling, JSON mode

Deepseek R1 8b

Deepseek R1 8b is available via Llamagate with a 66K context window and up to 16,384 output tokens. Pricing: $0.1000/1M input tokens, $0.2000/1M output tokens.

Input: $0.10/1M Output: $0.20/1M Context: 66K
Capabilities: text, function calling, reasoning, JSON mode

Llava 7b

Llava 7b is available via Llamagate with a 4K context window and up to 2,048 output tokens. Pricing: $0.1000/1M input tokens, $0.2000/1M output tokens.

Input: $0.10/1M Output: $0.20/1M Context: 4K
Capabilities: text, vision, JSON mode

Qwen3 Vl 8b

Qwen3 Vl 8b is available via Llamagate with a 33K context window and up to 8,192 output tokens. Pricing: $0.1500/1M input tokens, $0.5500/1M output tokens.

Input: $0.15/1M Output: $0.55/1M Context: 33K
Capabilities: text, vision, function calling, JSON mode

Compare Llamagate model pricing

Use our pricing calculator to find the cheapest Llamagate model for your workload.
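
As a rough illustration of what such a comparison involves, the sketch below picks the cheapest model for a given token volume, optionally filtered by capability and context size. It is not the site's calculator: `Model`, `workload_cost`, and `cheapest_model` are hypothetical names, the workload numbers are invented, and only a subset of the listed models is included for brevity. Prices, context sizes, and capabilities are copied from the listings on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_price: float        # USD per 1M input tokens
    output_price: float       # USD per 1M output tokens
    context_k: int            # context window in thousands of tokens, as listed
    capabilities: frozenset   # capability labels as listed on the page


# Subset of the models listed on this page.
MODELS = [
    Model("Llama 3.1 8b", 0.030, 0.050, 131,
          frozenset({"text", "function calling", "json mode"})),
    Model("Gemma3 4b", 0.030, 0.080, 128,
          frozenset({"text", "vision", "function calling", "json mode"})),
    Model("Deepseek R1 7b Qwen", 0.080, 0.15, 131,
          frozenset({"text", "function calling", "reasoning", "json mode"})),
    Model("Llava 7b", 0.10, 0.20, 4,
          frozenset({"text", "vision", "json mode"})),
    Model("Qwen3 Vl 8b", 0.15, 0.55, 33,
          frozenset({"text", "vision", "function calling", "json mode"})),
]


def workload_cost(model: Model, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a workload on one model."""
    return (input_tokens * model.input_price
            + output_tokens * model.output_price) / 1_000_000


def cheapest_model(models, input_tokens, output_tokens,
                   required=frozenset(), min_context_k=0):
    """Return the lowest-cost model meeting the filters, or None if none do."""
    candidates = [m for m in models
                  if set(required) <= m.capabilities
                  and m.context_k >= min_context_k]
    if not candidates:
        return None
    return min(candidates,
               key=lambda m: workload_cost(m, input_tokens, output_tokens))


# Invented example workload: 50M input tokens and 10M output tokens.
best = cheapest_model(MODELS, 50_000_000, 10_000_000)
best_vision = cheapest_model(MODELS, 50_000_000, 10_000_000, required={"vision"})
print(best.name)         # Llama 3.1 8b
print(best_vision.name)  # Gemma3 4b
```

Because output tokens are priced higher than input tokens on every model listed, the ranking can change with the input-to-output ratio, so it is worth rerunning the comparison with realistic volumes.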
