DeepInfra Models

DeepInfra provides 67 AI models accessible via API.

Models Available: 67
Cheapest Input / 1M: $0.020
Largest Context: 1.0M

What is DeepInfra?

DeepInfra is an AI model provider offering 67 large language models to developers through an API. Its cheapest model starts at $0.020 per 1M input tokens, and its largest context window reaches 1.0M tokens.

DeepInfra Strengths

All DeepInfra Models

Model	Input $/1M	Output $/1M	Context	Max Output	Released
Meta Llama/Llama 3.2 3B Instruct	$0.020	$0.020	131K	131,072	—
Meta Llama/Meta Llama 3.1 8B Instruct Turbo	$0.020	$0.030	131K	131,072	—
Mistralai/Mistral Nemo Instruct 2407	$0.020	$0.040	131K	131,072	—
Meta Llama/Meta Llama 3 8B Instruct	$0.030	$0.060	8K	8,192	—
Meta Llama/Meta Llama 3.1 8B Instruct	$0.030	$0.050	131K	131,072	—
Qwen/Qwen2.5 7B Instruct	$0.040	$0.10	33K	32,768	—
Sao10K/L3 8B Lunaris V1 Turbo	$0.040	$0.050	8K	8,192	—
Google/Gemma 3 4b It	$0.040	$0.080	131K	131,072	—
Nvidia/NVIDIA Nemotron Nano 9B	$0.040	$0.16	131K	131,072	—
Openai/Gpt Oss 20b	$0.040	$0.15	131K	131,072	—
Meta Llama/Llama 3.2 11B Vision Instruct	$0.049	$0.049	131K	131,072	—
Google/Gemma 3 12b It	$0.050	$0.10	131K	131,072	—
Mistralai/Mistral Small 24B Instruct 2501	$0.050	$0.080	33K	32,768	—
Openai/Gpt Oss 120b	$0.050	$0.45	131K	131,072	—
Meta Llama/Llama Guard 3 8B	$0.055	$0.055	131K	131,072	—
Qwen/Qwen3 14B	$0.060	$0.24	41K	40,960	—
Microsoft/Phi 4	$0.070	$0.14	16K	16,384	—
Mistralai/Mistral Small 3.2 24B Instruct 2506	$0.075	$0.20	128K	128,000	—
Gryphe/MythoMax L2 13b	$0.080	$0.090	4K	4,096	—
Qwen/Qwen3 30B A3B	$0.080	$0.29	41K	40,960	—
Meta Llama/Llama 4 Scout 17B 16E Instruct	$0.080	$0.30	328K	327,680	—
Qwen/Qwen3 235B A22B Instruct 2507	$0.090	$0.60	262K	262,144	—
Google/Gemma 3 27b It	$0.090	$0.16	131K	131,072	—
Qwen/Qwen3 32B	$0.10	$0.28	41K	40,960	—
Google/Gemini 2.0 Flash 001	$0.10	$0.40	1M	1,000,000	—
Meta Llama/Meta Llama 3.1 70B Instruct Turbo	$0.10	$0.28	131K	131,072	—
Nvidia/Llama 3.3 Nemotron Super 49B V1.5	$0.10	$0.40	131K	131,072	—
Qwen/Qwen2.5 72B Instruct	$0.12	$0.39	33K	32,768	—
Meta Llama/Llama 3.3 70B Instruct Turbo	$0.13	$0.39	131K	131,072	—
Qwen/Qwen3 Next 80B A3B Instruct	$0.14	$1.40	262K	262,144	—
Qwen/Qwen3 Next 80B A3B Thinking	$0.14	$1.40	262K	262,144	—
Qwen/QwQ 32B	$0.15	$0.40	131K	131,072	—
Meta Llama/Llama 4 Maverick 17B 128E Instruct FP8	$0.15	$0.60	1.0M	1,048,576	—
Qwen/Qwen3 235B A22B	$0.18	$0.54	41K	40,960	—
Meta Llama/Llama Guard 4 12B	$0.18	$0.18	164K	163,840	—
Qwen/Qwen2.5 VL 32B Instruct	$0.20	$0.60	128K	128,000	—
Deepseek Ai/DeepSeek R1 Distill Llama 70B	$0.20	$0.60	131K	131,072	—
Meta Llama/Llama 3.3 70B Instruct	$0.23	$0.40	131K	131,072	—
Deepseek Ai/DeepSeek V3 0324	$0.25	$0.88	164K	163,840	—
Allenai/OlmOCR 7B 0725 FP8	$0.27	$1.50	16K	16,384	—
Deepseek Ai/DeepSeek R1 Distill Qwen 32B	$0.27	$0.27	131K	131,072	—
Deepseek Ai/DeepSeek V3.1	$0.27	$1.00	164K	163,840	—
Deepseek Ai/DeepSeek V3.1 Terminus	$0.27	$1.00	164K	163,840	—
Qwen/Qwen3 Coder 480B A35B Instruct Turbo	$0.29	$1.20	262K	262,144	—
NousResearch/Hermes 3 Llama 3.1 70B	$0.30	$0.30	131K	131,072	—
Qwen/Qwen3 235B A22B Thinking 2507	$0.30	$2.90	262K	262,144	—
Google/Gemini 2.5 Flash	$0.30	$2.50	1M	1,000,000	—
Deepseek Ai/DeepSeek V3	$0.38	$0.89	164K	163,840	—
Qwen/Qwen3 Coder 480B A35B Instruct	$0.40	$1.60	262K	262,144	—
Meta Llama/Meta Llama 3.1 70B Instruct	$0.40	$0.40	131K	131,072	—
Mistralai/Mixtral 8x7B Instruct V0.1	$0.40	$0.40	33K	32,768	—
Zai Org/GLM 4.5	$0.40	$1.60	131K	131,072	—
Microsoft/WizardLM 2 8x22B	$0.48	$0.48	66K	65,536	—
Deepseek Ai/DeepSeek R1 0528	$0.50	$2.15	164K	163,840	—
Moonshotai/Kimi K2 Instruct	$0.50	$2.00	131K	131,072	—
Moonshotai/Kimi K2 Instruct 0905	$0.50	$2.00	262K	262,144	—
Nvidia/Llama 3.1 Nemotron 70B Instruct	$0.60	$0.60	131K	131,072	—
Sao10K/L3.1 70B Euryale V2.2	$0.65	$0.75	131K	131,072	—
Sao10K/L3.3 70B Euryale V2.3	$0.65	$0.75	131K	131,072	—
Deepseek Ai/DeepSeek R1	$0.70	$2.40	164K	163,840	—
NousResearch/Hermes 3 Llama 3.1 405B	$1.00	$1.00	131K	131,072	—
Deepseek Ai/DeepSeek R1 0528 Turbo	$1.00	$3.00	33K	32,768	—
Deepseek Ai/DeepSeek R1 Turbo	$1.00	$3.00	41K	40,960	—
Google/Gemini 2.5 Pro	$1.25	$10.00	1M	1,000,000	—
Anthropic/Claude 3 7 Sonnet Latest	$3.30	$16.50	200K	200,000	—
Anthropic/Claude 4 Sonnet	$3.30	$16.50	200K	200,000	—
Anthropic/Claude 4 Opus	$16.50	$82.50	200K	200,000	—
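
One way to use the table is to filter it by context window and then rank by price. The sketch below is illustrative only: it parses five rows copied from the table and picks the cheapest model by input price that meets a minimum context size. The helper names (`parse_context`, `parse_table`, `cheapest_with_context`) are hypothetical, and the context labels are rounded (for example, "131K" is read as 131,000 tokens), so treat the result as approximate.

```python
import csv
import io

# Five rows copied from the table above, tab-separated with the same header.
SAMPLE = "\n".join([
    "Model\tInput $/1M\tOutput $/1M\tContext\tMax Output\tReleased",
    "Meta Llama/Llama 3.2 3B Instruct\t$0.020\t$0.020\t131K\t131,072\t—",
    "Meta Llama/Meta Llama 3 8B Instruct\t$0.030\t$0.060\t8K\t8,192\t—",
    "Qwen/Qwen3 235B A22B Instruct 2507\t$0.090\t$0.60\t262K\t262,144\t—",
    "Google/Gemini 2.0 Flash 001\t$0.10\t$0.40\t1M\t1,000,000\t—",
    "Meta Llama/Llama 4 Maverick 17B 128E Instruct FP8\t$0.15\t$0.60\t1.0M\t1,048,576\t—",
])


def parse_context(text):
    """Turn a rounded context label such as '131K' or '1.0M' into a token count."""
    units = {"K": 1_000, "M": 1_000_000}
    return int(float(text[:-1]) * units[text[-1]])


def parse_table(tsv_text):
    """Parse the tab-separated table into a list of dicts with numeric fields."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        {
            "model": row["Model"],
            "input_price": float(row["Input $/1M"].lstrip("$")),
            "output_price": float(row["Output $/1M"].lstrip("$")),
            "context": parse_context(row["Context"]),
        }
        for row in reader
    ]


def cheapest_with_context(models, min_context):
    """Return the lowest input-price model whose context is at least min_context."""
    candidates = [m for m in models if m["context"] >= min_context]
    return min(candidates, key=lambda m: m["input_price"]) if candidates else None


models = parse_table(SAMPLE)
print(cheapest_with_context(models, 200_000)["model"])
# prints "Qwen/Qwen3 235B A22B Instruct 2507"
```

Ranking by input price alone suits prompt-heavy workloads; for generation-heavy workloads, the output price may matter more and the sort key would need to change accordingly.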

Model Details

Meta Llama/Llama 3.2 3B Instruct

Meta Llama/Llama 3.2 3B Instruct is available via DeepInfra with a 131K context window and up to 131,072 output tokens. Pricing: $0.020 per 1M input tokens and $0.020 per 1M output tokens.
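
Because prices are quoted per 1M tokens, the cost of a single request is the token count divided by one million, multiplied by the listed rate, summed over input and output. A minimal sketch of that arithmetic follows; the function name `request_cost` and the example token counts (50,000 input, 2,000 output) are assumptions chosen for illustration, not figures from DeepInfra.

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Estimate the USD cost of one request from per-1M-token prices."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Hypothetical request against Llama 3.2 3B Instruct at $0.020 / $0.020 per 1M tokens.
cost = request_cost(50_000, 2_000, 0.020, 0.020)
print(f"${cost:.5f}")  # prints "$0.00104"
```

The same function applies to any row of the pricing table by swapping in that row's input and output rates.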
