Claude Code median latency by provider and model, April 2026
Median provider_latency in seconds for Claude Code traffic routed through Anthropic, Bedrock, and Vertex.

How does Claude Code latency vary by cloud provider? In April 2026, Anthropic Haiku is the fastest at 1.8s median provider latency. Opus latency is remarkably consistent across providers (4.5-4.9s). Vertex Sonnet is the slowest at 6.2s, roughly 40% slower than the same model on Anthropic direct.
Why it matters
Provider choice affects both latency and reliability for the same model. Anthropic direct offers the lowest median latency for Haiku and Sonnet, while cache hit rates favor Bedrock for Sonnet (94.14%) and Vertex for Haiku and Opus (94.75% and 95.59%). Success rates range from 83.43% (Bedrock Haiku) to 98.73% (Anthropic Opus). Vertex delivers the fastest TTFT for Sonnet but the slowest total completion time. These tradeoffs matter for coding agents that make 50-200 API calls per session.
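To get a rough sense of scale, the per-call medians can be multiplied by the 50-200 calls per session quoted above. The sketch below assumes that calls run sequentially and that the median is a reasonable stand-in for each call's latency. Both are simplifications, and `session_latency_seconds` is a hypothetical helper written for this illustration, not part of any Claude Code tooling.

```python
# Median Sonnet latency per call in seconds, copied from the Data table.
MEDIAN_LATENCY_S = {
    "Anthropic": 4.4,
    "Bedrock": 4.8,
    "Vertex": 6.2,
}


def session_latency_seconds(median_s: float, calls: int) -> float:
    """Rough cumulative wait, assuming sequential calls at the median latency."""
    return median_s * calls


for provider, median_s in MEDIAN_LATENCY_S.items():
    low = session_latency_seconds(median_s, 50)    # light session
    high = session_latency_seconds(median_s, 200)  # heavy session
    print(f"{provider} Sonnet: {low / 60:.1f} to {high / 60:.1f} min per session")
```

Under these assumptions, the 1.8s per-call gap between Anthropic and Vertex Sonnet adds up to roughly 1.5 to 6 minutes of extra waiting per session.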
Key findings
- Anthropic Haiku: 1.8s median, the fastest Claude Code path, with sub-second TTFT at 0.79s.
- Opus latency is nearly identical across Anthropic (4.9s), Bedrock (4.9s), and Vertex (4.5s).
- Vertex has the lowest Opus latency (4.5s) and the best Sonnet TTFT (1.4s), but the highest Sonnet total latency (6.2s).
- Cache hit rates peak at 94-96%, but no provider leads everywhere: Vertex is highest for Haiku (94.75%) and Opus (95.59%), Bedrock for Sonnet (94.14%).
- P95 latency ranges from 8.1s (Anthropic Haiku) to 32.1s (Vertex Sonnet), a roughly 4x spread in tail latency across provider and model combinations.
Data
| Provider (Model) | Median latency | P95 latency | Median TTFT | Success rate (%) | Cache hit rate (%) |
|---|---|---|---|---|---|
| Anthropic (Haiku) | 1.8s | 8.1s | 0.8s | 95.96% | 90.33% |
| Vertex (Haiku) | 2.1s | 9.3s | 0.9s | 92.82% | 94.75% |
| Bedrock (Haiku) | 2.6s | 17.1s | 1.4s | 83.43% | 84.38% |
| Anthropic (Sonnet) | 4.4s | 24.3s | 1.9s | 97.01% | 91.71% |
| Vertex (Opus) | 4.5s | 15.8s | 1.9s | 96.46% | 95.59% |
| Bedrock (Sonnet) | 4.8s | 24.7s | 2.1s | 97.61% | 94.14% |
| Bedrock (Opus) | 4.9s | 27.4s | 2.3s | 95.99% | 94.64% |
| Anthropic (Opus) | 4.9s | 27.1s | 2.5s | 98.73% | 92.48% |
| Vertex (Sonnet) | 6.2s | 32.1s | 1.4s | 97.08% | 85.93% |
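
The derived figures quoted in the summary (the relative Sonnet slowdown, the P95 spread, and the per-model cache hit leaders) can be recomputed from the table. The following is a minimal sketch with the rows transcribed by hand; the `Row` type and the helper names are hypothetical and exist only for this illustration.

```python
from typing import NamedTuple


class Row(NamedTuple):
    provider: str
    model: str
    median_s: float
    p95_s: float
    ttft_s: float
    success_pct: float
    cache_hit_pct: float


# Rows transcribed from the Data table above.
ROWS = [
    Row("Anthropic", "Haiku", 1.8, 8.1, 0.8, 95.96, 90.33),
    Row("Vertex", "Haiku", 2.1, 9.3, 0.9, 92.82, 94.75),
    Row("Bedrock", "Haiku", 2.6, 17.1, 1.4, 83.43, 84.38),
    Row("Anthropic", "Sonnet", 4.4, 24.3, 1.9, 97.01, 91.71),
    Row("Vertex", "Opus", 4.5, 15.8, 1.9, 96.46, 95.59),
    Row("Bedrock", "Sonnet", 4.8, 24.7, 2.1, 97.61, 94.14),
    Row("Bedrock", "Opus", 4.9, 27.4, 2.3, 95.99, 94.64),
    Row("Anthropic", "Opus", 4.9, 27.1, 2.5, 98.73, 92.48),
    Row("Vertex", "Sonnet", 6.2, 32.1, 1.4, 97.08, 85.93),
]


def lookup(provider: str, model: str) -> Row:
    """Return the single row for a provider and model pair."""
    return next(r for r in ROWS if r.provider == provider and r.model == model)


def percent_slower(slow: Row, fast: Row) -> float:
    """Relative increase in median latency of `slow` over `fast`, in percent."""
    return (slow.median_s / fast.median_s - 1) * 100


def p95_spread() -> float:
    """Ratio of the largest to the smallest P95 latency in the table."""
    p95s = [r.p95_s for r in ROWS]
    return max(p95s) / min(p95s)


def top_cache_provider(model: str) -> str:
    """Provider with the highest cache hit rate for the given model."""
    candidates = [r for r in ROWS if r.model == model]
    return max(candidates, key=lambda r: r.cache_hit_pct).provider


slowdown = percent_slower(lookup("Vertex", "Sonnet"), lookup("Anthropic", "Sonnet"))
print(f"Vertex Sonnet vs Anthropic Sonnet: {slowdown:.1f}% slower")
print(f"P95 spread (max / min): {p95_spread():.2f}x")
for model in ("Haiku", "Sonnet", "Opus"):
    print(f"Highest cache hit rate for {model}: {top_cache_provider(model)}")
```

The first two printed values work out to about 40.9% and 3.96x, which match the "roughly 40%" and "4x" figures in the summary.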