Calculate embedding costs for your document collection. Compare pricing across OpenAI, Cohere, Voyage, and Google embedding models with chunking estimates.
FreeNo SignupNo Server UploadsZero Tracking
Select models to compare
Chunks / Doc
6
Total Chunks
6,000
Total Tokens
3,000,000
Model
Provider
$/1M Tokens
Dimensions
Cost/Doc
Total Cost
Doubao Embedding
Volcengine
Free
2,560
$0.00
$0.00
Doubao Embedding Large
Volcengine
Free
2,048
$0.00
$0.00
Doubao Embedding Large Text 240915
Volcengine
Free
4,096
$0.00
$0.00
Doubao Embedding Large Text 250515
Volcengine
Free
2,048
$0.00
$0.00
Cost Comparison
Doubao Embedding
$0.00
Doubao Embedding Large
$0.00
Doubao Embedding Large Text 240915
$0.00
Doubao Embedding Large Text 250515
$0.00
Token estimation assumes ~1.33 tokens per word or ~4 characters per token. Chunks may exceed model max input length for some providers; real implementations truncate accordingly. Google Gemini Embedding is free within usage limits. Prices reflect current published rates.
Export
How to Use Embeddings Cost Calculator
1
Enter document details
Input the number of documents, average length (in words or characters), chunk size, and overlap.
2
Select models to compare
Toggle embedding models on or off to compare 3 or more models side by side.
3
Review chunking stats
See how many chunks your documents produce and total tokens to embed.
4
Compare costs
View total embedding cost and cost per document for each model, sorted cheapest first.
Frequently Asked Questions
Chunks per document = ceil(document_tokens / (chunk_size - chunk_overlap)). For example, a 2000-word document (~2660 tokens) with 500-token chunks and 50-token overlap produces ceil(2660/450) = 6 chunks.
Chunk overlap means each chunk shares some tokens with the previous chunk, improving retrieval quality at chunk boundaries. Common values are 10-20% of chunk size. More overlap means more chunks and higher cost.
We estimate 1.33 tokens per word for English text (the average across major tokenizers). For character input, we use 4 characters per token. Actual counts vary by language and content.
Google offers Gemini Embedding at no cost within certain usage limits. For high-volume production use, check Google's current terms as free tier limits may apply.
For most use cases, OpenAI text-embedding-3-small offers the best balance of cost and quality. For higher retrieval accuracy, text-embedding-3-large or Voyage 3 are good choices. Cohere embed-v4 excels at multilingual content.