Requesty

Prompt-cache hit rate by coding agent, April 2026

Cache hit rate = cached_tokens / input_tokens. A higher cache hit rate means a lower effective input cost.

Claude Code (92%) achieves the highest cache efficiency. Kilo Code (46%) trails, reflecting different prompt construction patterns.

Which coding agents use prompt caching most effectively? In April 2026, Claude Code led at 92% cache hit rate (cached_tokens / input_tokens), followed by OpenCode at 89%. Kilo Code sits at 46% with 62K avg input tokens. The gap is architectural: agents that maintain consistent context prefixes across sequential calls achieve dramatically higher cache reuse.
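The prefix argument above can be illustrated with a toy model. This is a minimal sketch, not how any provider's cache actually works: it assumes a call can reuse only the leading tokens it shares with the immediately preceding call, and it ignores real-world details such as block granularity, minimum cacheable length, and expiry. The function names and the sample token lists are hypothetical.

```python
def common_prefix_len(a: list[str], b: list[str]) -> int:
    """Count the leading tokens that two sequences share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def simulated_hit_rate(calls: list[list[str]]) -> float:
    """Toy cached_tokens / input_tokens over a sequence of calls.

    Assumption: a call can reuse only the prefix it shares with the
    previous call. Real caches behave differently in detail.
    """
    cached = 0
    total = 0
    prev: list[str] = []
    for tokens in calls:
        cached += common_prefix_len(prev, tokens)
        total += len(tokens)
        prev = tokens
    return cached / total if total else 0.0


system = ["sys"] * 8

# Stable prefix: every call extends the previous context.
stable = [system + ["turn"] * i for i in range(1, 5)]

# Unstable prefix: a value that changes per call (for example a timestamp)
# sits at the very front, so nothing after it can match.
unstable = [[f"ts{i}"] + system + ["turn"] * i for i in range(1, 5)]

print(f"stable:   {simulated_hit_rate(stable):.2f}")
print(f"unstable: {simulated_hit_rate(unstable):.2f}")
```

In this toy setup the append-only sequence reuses most of its input, while a single changing token at the front of the prompt drops reuse to zero, even though the two sequences carry almost identical content.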

Why it matters

Cache efficiency is the single biggest lever on coding agent economics. At a 92% cache hit rate, Claude Code pays roughly $0.30 per million effective input tokens versus the $3.00 list price. At 46%, Kilo Code pays $1.62 per million. That 5.4x cost difference compounds across every call in every session, letting high-cache agents sustain intensive workflows at a fraction of the cost.
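The blended price per million input tokens can be sketched as a weighted average of the cached and uncached rates. The $3.00 list price comes from the text above; the `cached_multiplier` parameter (the fraction of list price charged for cached reads) is an assumption, since the report does not state how cached tokens are billed. Cache-write surcharges, which some providers apply, are ignored here.

```python
def effective_input_price(
    hit_rate: float,
    list_price: float,
    cached_multiplier: float,
) -> float:
    """Blended price per million input tokens.

    hit_rate          -- cached_tokens / input_tokens, between 0 and 1
    list_price        -- price per million uncached input tokens
    cached_multiplier -- ASSUMED fraction of list price billed for cached
                         tokens (0.0 = free, 1.0 = no discount)
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    cached_part = hit_rate * list_price * cached_multiplier
    uncached_part = (1.0 - hit_rate) * list_price
    return cached_part + uncached_part


LIST_PRICE = 3.00  # USD per million input tokens, as quoted in the text

for multiplier in (0.0, 0.1):
    for rate in (0.92, 0.46):
        price = effective_input_price(rate, LIST_PRICE, multiplier)
        print(f"multiplier={multiplier:.1f} hit_rate={rate:.0%} -> ${price:.2f}/M")
```

With cached tokens treated as free (a multiplier of 0.0), a 46% hit rate works out to $1.62 per million, which matches the Kilo Code figure quoted above. Results for other hit rates shift with the multiplier chosen, so the billing model should be pinned down before comparing agents on cost.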

Period: Apr 2026
Updated: May 16, 2026
ID: coding-agent-cache-apr26
§ 01

Key findings

  • 01 Claude Code: 92% cache hit rate, the highest of the eight agents measured.
  • 02 OpenCode: 89%. Second only to Claude Code despite a different architecture.
  • 03 Roo Code: 74%. Solid but significantly behind Claude Code.
  • 04 Kilo Code: 46%. Smaller average inputs (62K vs 84K tokens) reduce the opportunity for prefix reuse.
  • 05 Higher cache rates correlate strongly with lower per-call costs across all agents.
§ 02

Data

Agent          Cache hit rate (%)
Claude Code    91.90
OpenCode       89.00
Aider          84.00
Zed            80.10
Roo Code       73.60
Forge          63.90
Cline          61.40
Kilo Code      45.50
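A per-agent rate like those in the table could be derived from request-level usage records roughly as follows. This is a sketch under stated assumptions: `cached_tokens` and `input_tokens` are the quantities named in the report's formula, but the record shape, the `agent` field, and the sample values are hypothetical. The aggregation is token-weighted (sum of cached tokens over sum of input tokens); the report does not say whether it uses this or a per-request average.

```python
from collections import defaultdict


def hit_rate_by_agent(records: list[dict]) -> dict[str, float]:
    """Token-weighted cached_tokens / input_tokens for each agent."""
    cached: dict[str, int] = defaultdict(int)
    total: dict[str, int] = defaultdict(int)
    for r in records:
        cached[r["agent"]] += r["cached_tokens"]
        total[r["agent"]] += r["input_tokens"]
    # Skip agents with no input tokens to avoid dividing by zero.
    return {a: cached[a] / total[a] for a in total if total[a] > 0}


# Hypothetical sample records, not taken from the report.
sample = [
    {"agent": "agent-a", "input_tokens": 100, "cached_tokens": 90},
    {"agent": "agent-a", "input_tokens": 100, "cached_tokens": 70},
    {"agent": "agent-b", "input_tokens": 200, "cached_tokens": 50},
]

for agent, rate in sorted(hit_rate_by_agent(sample).items()):
    print(f"{agent}: {rate:.1%}")
```

Token weighting lets long requests dominate the result, which tracks billed cost more closely than a simple mean of per-request rates.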
§ 03

Cite as

ID: coding-agent-cache-apr26 · Updated May 16, 2026 · Period Apr 2026