---
id: coding-agent-cache-apr26
slug: coding-agent-cache-hit-rate-apr-2026
title: "Prompt-cache hit rate by coding agent, April 2026"
topic: agentic
period: Apr 2026
updated: 2026-05-16
license: CC BY 4.0
canonical: https://requesty.ai/data/coding-agent-cache-hit-rate-apr-2026
---

# Prompt-cache hit rate by coding agent, April 2026

> Which coding agents use prompt caching most effectively? In April 2026, Claude Code led with a 92% cache hit rate (cached_tokens / input_tokens), followed by OpenCode at 89%. Kilo Code trailed at 46%, with 62K average input tokens. The gap is architectural: agents that keep a consistent context prefix across sequential calls reuse far more of the cache.

*Topic: Agentic workloads. Period: Apr 2026. Last updated 2026-05-16.*

## Why it matters

Cache efficiency is the single biggest lever on coding agent economics. At a 92% cache hit rate, Claude Code pays roughly $0.30 per million effective input tokens against a $3.00 list price. At 46%, Kilo Code pays about $1.62 per million. That 5.4x cost difference compounds across every call in every session, letting high-cache agents sustain intensive workflows at a fraction of the cost.
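The effective price per million input tokens is a weighted average of the cached and uncached rates. The sketch below shows that arithmetic under stated assumptions: the helper name `blended_input_price` is hypothetical, the $0.30 cached-read rate (10% of the $3.00 list price) is an assumption rather than a figure from the dataset, and cache-write surcharges are ignored. The absolute results move with the cached rate chosen, so treat the output as illustrative rather than a reproduction of the figures above.

```python
def blended_input_price(hit_rate: float, list_price: float, cached_price: float) -> float:
    """Blended USD price per million input tokens at a given cache hit rate.

    hit_rate is cached_tokens / input_tokens, expressed as a fraction in [0, 1].
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Cached tokens pay the discounted rate; the remainder pays list price.
    return hit_rate * cached_price + (1.0 - hit_rate) * list_price


LIST_PRICE = 3.00    # USD per million input tokens, as quoted in the text
CACHED_PRICE = 0.30  # ASSUMPTION: cached reads billed at 10% of list price

for agent, rate in [("Claude Code", 0.92), ("Kilo Code", 0.46)]:
    price = blended_input_price(rate, LIST_PRICE, CACHED_PRICE)
    print(f"{agent}: ${price:.2f} per million input tokens")
```

Keeping both prices as explicit parameters makes it easy to rerun the comparison for a provider with a different cached-token discount.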

## Questions this answers

- Which coding agent has the best prompt caching efficiency?
- How much does prompt caching reduce coding agent costs?
- How does Claude Code achieve 92% cache hit rate?

## Key findings

1. Claude Code: 92% cache hit rate, the highest of the eight agents listed.
2. OpenCode: 89%. Second only to Claude Code despite a different architecture.
3. Roo Code: 74%. Solid, but about 18 percentage points behind Claude Code.
4. Kilo Code: 46%. Smaller average inputs (62K vs 84K tokens) reduce the opportunity for prefix reuse.
5. Higher cache rates correlate strongly with lower per-call costs across all agents.
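The prefix-reuse effect behind these findings can be illustrated with a toy model. The sketch below is not how any provider implements caching: it assumes, for simplicity, that only the immediately preceding prompt is cached and that tokens are reusable only up to the first point where two prompts differ. All names (`common_prefix_len`, `simulated_hit_rate`) and the token lists are hypothetical.

```python
def common_prefix_len(a: list[str], b: list[str]) -> int:
    """Number of leading tokens that two prompts share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def simulated_hit_rate(prompts: list[list[str]]) -> float:
    """cached_tokens / input_tokens over sequential calls (toy model).

    ASSUMPTION: only the previous call's prompt is cached, and reuse
    stops at the first token that differs.
    """
    cached = total = 0
    previous: list[str] = []
    for prompt in prompts:
        cached += common_prefix_len(previous, prompt)
        total += len(prompt)
        previous = prompt
    return cached / total if total else 0.0


# Hypothetical session: a 50-token system prompt plus five 10-token turns.
system = ["sys"] * 50
turns = [[f"turn{i}"] * 10 for i in range(5)]

# Append-only agent: every call extends the previous prompt unchanged.
stable, history = [], list(system)
for turn in turns:
    history = history + turn
    stable.append(history)

# Same content, but a per-call marker at the very front breaks the prefix.
volatile = [[f"ts{i}"] + prompt for i, prompt in enumerate(stable)]

print(f"stable prefix:   {simulated_hit_rate(stable):.2f}")    # 0.75
print(f"volatile prefix: {simulated_hit_rate(volatile):.2f}")  # 0.00
```

In this model a single changing token at the start of the prompt drops reuse to zero, while the append-only agent reuses everything except each new turn and the first call.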

## Data

| Agent | Cache hit rate (percent) |
| --- | --- |
| Claude Code | 91.90% |
| OpenCode | 89.00% |
| Aider | 84.00% |
| Zed | 80.10% |
| Roo Code | 73.60% |
| Forge | 63.90% |
| Cline | 61.40% |
| Kilo Code | 45.50% |
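
The rates above are defined as cached_tokens / input_tokens. As a rough sketch of how such a figure could be derived from per-call usage records, the example below sums tokens per agent before dividing. The record shape (`CallUsage`), the function name, and the sample values are hypothetical, and token-weighted aggregation is an assumption: the dataset does not state whether rates are token-weighted or averaged per call.

```python
from dataclasses import dataclass


@dataclass
class CallUsage:
    """One API call's usage (hypothetical record shape)."""
    agent: str
    input_tokens: int   # total input tokens, cached and uncached
    cached_tokens: int  # portion of input_tokens served from cache


def cache_hit_rates(calls: list[CallUsage]) -> dict[str, float]:
    """Token-weighted cached_tokens / input_tokens per agent."""
    totals: dict[str, tuple[int, int]] = {}
    for call in calls:
        inp, cached = totals.get(call.agent, (0, 0))
        totals[call.agent] = (inp + call.input_tokens, cached + call.cached_tokens)
    # Skip agents with no input tokens to avoid dividing by zero.
    return {agent: cached / inp for agent, (inp, cached) in totals.items() if inp > 0}


# Illustrative values only, not taken from the dataset.
sample = [
    CallUsage("agent-a", input_tokens=80_000, cached_tokens=76_000),
    CallUsage("agent-a", input_tokens=20_000, cached_tokens=14_000),
    CallUsage("agent-b", input_tokens=50_000, cached_tokens=25_000),
]

for agent, rate in cache_hit_rates(sample).items():
    print(f"{agent}: {rate:.1%}")
```

Summing before dividing weights long calls more heavily than short ones; averaging per-call ratios instead would give a different number for the same traffic.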

## Caveats

- Cache hit rate depends on both agent architecture and model provider. Anthropic, Bedrock, and Vertex have different caching implementations.
- Agents with very low traffic (Cursor, GitHub Copilot, Codex CLI) are excluded due to insufficient sample size.

## Cite as

**APA.** Requesty (2026). Prompt-cache hit rate by coding agent, April 2026. Requesty Data. https://requesty.ai/data/coding-agent-cache-hit-rate-apr-2026

```bibtex
@misc{requesty_coding_agent_cache_hit_rate_apr_2026,
  author       = {{Requesty}},
  title        = {Prompt-cache hit rate by coding agent, April 2026},
  year         = {2026},
  howpublished = {\url{https://requesty.ai/data/coding-agent-cache-hit-rate-apr-2026}},
  note         = {Requesty Data}
}
```

---

Downloads: [JSON](https://requesty.ai/data/coding-agent-cache-hit-rate-apr-2026/data.json) · [CSV](https://requesty.ai/data/coding-agent-cache-hit-rate-apr-2026/data.csv) · [Markdown](https://requesty.ai/data/coding-agent-cache-hit-rate-apr-2026/data.md)