Skip to content

Ollama Models

Ollama provides 21 AI models accessible via API.

Visit Ollama →

21

Models Available

$0.000

Cheapest Input / 1M

262K

Largest Context

What is Ollama?

Ollama is an AI model provider offering 21 large language models for developers. Their cheapest model starts at $0.000 per 1M input tokens, and their largest context window reaches 262K. Ollama provides 21 AI models accessible via API.

Ollama Strengths

All Ollama Models

Model Input $/1M Output $/1M Context Max Output Released
Codegeex4 $0.000 $0.000 33K 8,192
Deepseek Coder V2 Instruct $0.000 $0.000 33K 8,192
Deepseek Coder V2 Lite Instruct $0.000 $0.000 33K 8,192
Deepseek V3.1:671b Cloud $0.000 $0.000 164K 163,840
Gpt Oss:120b Cloud $0.000 $0.000 131K 131,072
Gpt Oss:20b Cloud $0.000 $0.000 131K 131,072
Internlm2 5 20b Chat $0.000 $0.000 33K 8,192
Llama2 $0.000 $0.000 4K 4,096
Llama2:13b $0.000 $0.000 4K 4,096
Llama2:70b $0.000 $0.000 4K 4,096
Llama2:7b $0.000 $0.000 4K 4,096
Llama3 $0.000 $0.000 8K 8,192
Llama3.1 $0.000 $0.000 8K 8,192
Llama3:70b $0.000 $0.000 8K 8,192
Llama3:8b $0.000 $0.000 8K 8,192
Mistral 7B Instruct V0.1 $0.000 $0.000 8K 8,192
Mistral 7B Instruct V0.2 $0.000 $0.000 33K 32,768
Mistral Large Instruct 2407 $0.000 $0.000 66K 8,192
Mixtral 8x22B Instruct V0.1 $0.000 $0.000 66K 65,536
Mixtral 8x7B Instruct V0.1 $0.000 $0.000 33K 32,768
Qwen3 Coder:480b Cloud $0.000 $0.000 262K 262,144

Model Details

Codegeex4

Codegeex4 is available via Ollama with a 33K context window and up to 8,192 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 33K
text

Deepseek Coder V2 Instruct

Deepseek Coder V2 Instruct is available via Ollama with a 33K context window and up to 8,192 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 33K
text function calling

Deepseek Coder V2 Lite Instruct

Deepseek Coder V2 Lite Instruct is available via Ollama with a 33K context window and up to 8,192 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 33K
text function calling

Deepseek V3.1:671b Cloud

Deepseek V3.1:671b Cloud is available via Ollama with a 164K context window and up to 163,840 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 164K
text function calling

Gpt Oss:120b Cloud

Gpt Oss:120b Cloud is available via Ollama with a 131K context window and up to 131,072 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 131K
text function calling

Gpt Oss:20b Cloud

Gpt Oss:20b Cloud is available via Ollama with a 131K context window and up to 131,072 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 131K
text function calling

Internlm2 5 20b Chat

Internlm2 5 20b Chat is available via Ollama with a 33K context window and up to 8,192 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 33K
text function calling

Llama2

Llama2 is available via Ollama with a 4K context window and up to 4,096 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 4K
text

Llama2:13b

Llama2:13b is available via Ollama with a 4K context window and up to 4,096 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 4K
text

Llama2:70b

Llama2:70b is available via Ollama with a 4K context window and up to 4,096 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 4K
text

Llama2:7b

Llama2:7b is available via Ollama with a 4K context window and up to 4,096 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 4K
text

Llama3

Llama3 is available via Ollama with a 8K context window and up to 8,192 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 8K
text

Llama3.1

Llama3.1 is available via Ollama with a 8K context window and up to 8,192 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 8K
text function calling

Llama3:70b

Llama3:70b is available via Ollama with a 8K context window and up to 8,192 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 8K
text

Llama3:8b

Llama3:8b is available via Ollama with a 8K context window and up to 8,192 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 8K
text

Mistral 7B Instruct V0.1

Mistral 7B Instruct V0.1 is available via Ollama with a 8K context window and up to 8,192 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 8K
text function calling

Mistral 7B Instruct V0.2

Mistral 7B Instruct V0.2 is available via Ollama with a 33K context window and up to 32,768 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 33K
text function calling

Mistral Large Instruct 2407

Mistral Large Instruct 2407 is available via Ollama with a 66K context window and up to 8,192 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 66K
text function calling

Mixtral 8x22B Instruct V0.1

Mixtral 8x22B Instruct V0.1 is available via Ollama with a 66K context window and up to 65,536 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 66K
text function calling

Mixtral 8x7B Instruct V0.1

Mixtral 8x7B Instruct V0.1 is available via Ollama with a 33K context window and up to 32,768 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 33K
text function calling

Qwen3 Coder:480b Cloud

Qwen3 Coder:480b Cloud is available via Ollama with a 262K context window and up to 262,144 output tokens. Pricing: $0.000000/1M input tokens, $0.000000/1M output tokens.

Input: $0.000/1M Output: $0.000/1M Context: 262K
text function calling

Compare Ollama model pricing

Use our pricing calculator to find the cheapest Ollama model for your workload.

Pricing Calculator Compare Models All Models Directory

Related Reading

OpenAI vs Anthropic vs Google: Which AI API Should You Choose? → Cheapest LLM API in 2026: Complete Pricing Comparison → OpenAI API Pricing Guide 2026 →