Cost-performance analysis

Which Model Is Best for Coding?

A cost-performance leaderboard of the models coding agents run on - benchmark scores, real token-mix pricing, latency, and decoding speed.

Models used by coding agents

Model	Context Window	DF Coding Score	Avg Coding Task Cost
GPT-5.6 Sol OpenAI	1.05M	88.1	$0.486 USD / task
GPT-5.5 OpenAI	1.05M	85.6	$0.490 USD / task
Gemini 3.5 Flash Google	1M	81.8	$0.620 USD / task
GPT-5.6 Terra OpenAI	1.05M	81.1	$0.293 USD / task
Claude Fable 5 Anthropic	1M	79.4	$2.36 USD / task
GPT-5.4 OpenAI	1M	76.4	$0.254 USD / task
Gemini 3.1 Pro (Preview) Google	1M	74.9	$0.783 USD / task
GPT-5.6 Luna OpenAI	1.05M	71.1	$0.151 USD / task
Claude Sonnet 5 Anthropic	1M	70.2	$0.969 USD / task
Claude Opus 4.7 Anthropic	1M	70.1	$1.58 USD / task
Claude Opus 4.8 Anthropic	1M	68.6	$1.16 USD / task
GPT-5.3 Codex OpenAI	400k	62.4	—
GPT-5.4 mini OpenAI	400k	60.2	$0.183 USD / task
Gemini 3 Flash (Preview) Google	1M	57.2	—
GPT-5.4 nano OpenAI	400k	53.9	—
Claude Sonnet 4.6 Anthropic	1M	50.3	—
Kimi-K2.7-Code Moonshot AI	256k	49.3	—
Claude Opus 4.6 Anthropic	1M	45.7	$0.581 USD / task
Claude Opus 4.5 Anthropic	200k	37.5	—
Claude Sonnet 4.5 Anthropic	1M	34.1	—
Claude Haiku 4.5 Anthropic	200k	28.2	$0.159 USD / task
GPT-5 mini OpenAI	400k	26.9	—
GPT-OSS 120B OpenAI	131k	16.0	—
Gemini 2.5 Pro Google	1M	12.2	—
GPT-5.3 Codex Spark OpenAI	—	—	—
Claude Opus 4.8 Fast Anthropic	1M	—	—
MAI-Code-1-Flash Microsoft	—	—	—
Raptor mini GitHub	400k	—	—
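
One way to read the table as a cost-performance trade-off is to compute score per dollar and the set of models that no other model beats on both score and cost. The sketch below is only an illustration: the rows are copied from the table above (models without a published task cost are left out), while `score_per_dollar` and `pareto_frontier` are hypothetical helpers, not metrics this leaderboard publishes.

```python
# (model, DF coding score, average task cost in USD), copied from the table above.
# Models without a published task cost are omitted.
ROWS = [
    ("GPT-5.6 Sol", 88.1, 0.486),
    ("GPT-5.5", 85.6, 0.490),
    ("Gemini 3.5 Flash", 81.8, 0.620),
    ("GPT-5.6 Terra", 81.1, 0.293),
    ("Claude Fable 5", 79.4, 2.36),
    ("GPT-5.4", 76.4, 0.254),
    ("Gemini 3.1 Pro (Preview)", 74.9, 0.783),
    ("GPT-5.6 Luna", 71.1, 0.151),
    ("Claude Sonnet 5", 70.2, 0.969),
    ("Claude Opus 4.7", 70.1, 1.58),
    ("Claude Opus 4.8", 68.6, 1.16),
    ("GPT-5.4 mini", 60.2, 0.183),
    ("Claude Opus 4.6", 45.7, 0.581),
    ("Claude Haiku 4.5", 28.2, 0.159),
]


def score_per_dollar(rows):
    """Rank rows by score divided by task cost (an assumed, illustrative metric)."""
    ranked = [(name, score / cost) for name, score, cost in rows]
    return sorted(ranked, key=lambda r: r[1], reverse=True)


def pareto_frontier(rows):
    """Keep rows that no other row matches or beats on both score and cost."""
    frontier = []
    for name, score, cost in rows:
        dominated = any(
            s >= score and c <= cost and (s > score or c < cost)
            for _, s, c in rows
        )
        if not dominated:
            frontier.append((name, score, cost))
    # Cheapest first, so the list reads as "what more money buys".
    return sorted(frontier, key=lambda r: r[2])


if __name__ == "__main__":
    for name, ratio in score_per_dollar(ROWS)[:5]:
        print(f"{name}: {ratio:.0f} score points per USD")
    for name, score, cost in pareto_frontier(ROWS):
        print(f"frontier: {name} (score {score}, ${cost:.3f} / task)")
```

Score per dollar favors cheap models even when their absolute score is low, so the frontier is usually the more useful view when a minimum quality bar matters.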

Rank	Model	DF Score	Avg task cost	Latency	Context window
#1	GPT-5.6 Sol OpenAI	88.1	$0.486	—	1.05M
#2	GPT-5.5 OpenAI	85.6	$0.490	118.46s	1.05M
#3	Gemini 3.5 Flash Google	81.8	$0.620	20.59s	1M
#4	GPT-5.6 Terra OpenAI	81.1	$0.293	—	1.05M
#5	Claude Fable 5 Anthropic	79.4	$2.36	239.01s	1M
#6	GPT-5.4 OpenAI	76.4	$0.254	113.80s	1M
#7	Gemini 3.1 Pro (Preview) Google	74.9	$0.783	25.74s	1M
#8	GPT-5.6 Luna OpenAI	71.1	$0.151	—	1.05M
#9	Claude Sonnet 5 Anthropic	70.2	$0.969	144.37s	1M
#10	Claude Opus 4.7 Anthropic	70.1	$1.58	—	1M
#11	Claude Opus 4.8 Anthropic	68.6	$1.16	30.05s	1M
#12	GPT-5.3 Codex OpenAI	62.4	—	79.89s	400k
#13	GPT-5.4 mini OpenAI	60.2	$0.183	11.87s	400k
#14	Gemini 3 Flash (Preview) Google	57.2	—	7.60s	1M
#15	GPT-5.4 nano OpenAI	53.9	—	4.74s	400k
#16	Claude Sonnet 4.6 Anthropic	50.3	—	1.16s	1M
#17	Kimi-K2.7-Code Moonshot AI	49.3	—	2.92s	256k
#18	Claude Opus 4.6 Anthropic	45.7	$0.581	—	1M
#19	Claude Opus 4.5 Anthropic	37.5	—	15.04s	200k
#20	Claude Sonnet 4.5 Anthropic	34.1	—	1.36s	1M
#21	Claude Haiku 4.5 Anthropic	28.2	$0.159	0.81s	200k
#22	GPT-5 mini OpenAI	26.9	—	85.30s	400k
#23	GPT-OSS 120B OpenAI	16.0	—	0.94s	131k
#24	Gemini 2.5 Pro Google	12.2	—	22.54s	1M
#25	GPT-5.3 Codex Spark OpenAI	—	—	—	—
#26	Claude Opus 4.8 Fast Anthropic	—	—	—	1M
#27	MAI-Code-1-Flash Microsoft	—	—	—	—
#28	Raptor mini GitHub	—	—	—	400k

Help us detect changes in plan values

The prices and token mixes above are driven by real coding usage. Share yours via letmecode so that we can detect changes in plan values.

$ npx letmecode@latest
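
To make "token-mix pricing" concrete, here is a minimal sketch of how a per-task cost can be derived from a token mix and per-token prices. Everything in it is an assumption for illustration: the split into fresh input, cached input, and output tokens, the `TokenMix` and `Pricing` types, and all numbers are hypothetical placeholders, not this site's methodology or any vendor's actual prices.

```python
from dataclasses import dataclass


@dataclass
class TokenMix:
    """Tokens consumed by one coding task (hypothetical categories)."""
    input_tokens: int
    cached_input_tokens: int
    output_tokens: int


@dataclass
class Pricing:
    """Prices in USD per one million tokens (placeholder values only)."""
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


def task_cost(mix: TokenMix, pricing: Pricing) -> float:
    """Weight each token category by its price and sum to a per-task cost."""
    million = 1_000_000
    return (
        mix.input_tokens / million * pricing.input_per_m
        + mix.cached_input_tokens / million * pricing.cached_input_per_m
        + mix.output_tokens / million * pricing.output_per_m
    )


if __name__ == "__main__":
    # Hypothetical task: mostly cached context, with a small amount of output.
    mix = TokenMix(input_tokens=200_000, cached_input_tokens=800_000, output_tokens=10_000)
    pricing = Pricing(input_per_m=2.0, cached_input_per_m=0.2, output_per_m=10.0)
    print(f"${task_cost(mix, pricing):.3f} USD / task")
```

Because agent sessions reread large amounts of context, the cached-input share of the mix can move the per-task cost as much as the headline input and output prices do, which is why the mix matters alongside the price list.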

The agent decision

Your model choice sets the budget. The agent sets the workflow.

Compare leading coding agents on the models they bundle, limits, and pricing to see which one turns your model choice into the best engineering results.
