---
id: reasoning-share-april-2026
slug: reasoning-token-share-by-provider-april-2026
title: "Reasoning-token share of provider output, April 2026"
topic: agentic
period: Apr 2026
updated: 2026-05-09
license: CC BY 4.0
canonical: https://requesty.ai/data/reasoning-token-share-by-provider-april-2026
---

# Reasoning-token share of provider output, April 2026

> How much of LLM output is reasoning/thinking tokens? In April 2026 on the Requesty gateway, Groq led at 82%, followed by Coding (79%), xAI (60%) and z.ai (51%). These routes are dominated by thinking models. Frontier routes ran around a third: Vertex (Gemini) 40%, OpenAI 36%, OpenAI Responses 33%. Anthropic and Bedrock report 0% because Anthropic does not surface reasoning tokens separately; extended thinking is delivered inline.

*Topic: Agentic workloads. Period: Apr 2026. Last updated 2026-05-09.*

## Why it matters

The industry narrative is "everything is reasoning now", but the data says reasoning is concentrated in a specific subset of routes, and even there, absolute volume is dwarfed by regular completion output. The Anthropic and Bedrock 0% is a measurement artefact, not a usage signal, which matters for any cost or quality comparison that relies on the reasoning-tokens column.

## Questions this answers

- How much LLM output is reasoning tokens?
- Which providers use the most reasoning models in 2026?
- Why does Anthropic show 0% reasoning tokens?
- Are AI agents mostly thinking or mostly responding?

## Key findings

1. High-reasoning routes: Groq 82%, Coding 79%, xAI 60%, z.ai 51%.
2. Frontier routes around a third: Vertex (Gemini) 40%, OpenAI 36%, OpenAI Responses 33%.
3. Vertex (Claude) does not appear here: Anthropic does not report reasoning tokens separately, so Claude thinking output is not counted.
4. Azure sits at 18%, leaning on GPT-4.1-class models more than the latest reasoning checkpoints.
5. Anthropic, Bedrock, Mistral, Moonshot: 0%. Anthropic does not report reasoning tokens separately (thinking is inline). Mistral and Moonshot have no reasoning models routed.
6. Contrary to the "everything is reasoning now" narrative, reasoning output is concentrated in a specific subset of providers, and even there the absolute volume is dwarfed by regular completion output.

## Data

| Provider | Reasoning share |
| --- | --- |
| Groq | 82.30% |
| Coding | 79.00% |
| xAI | 59.70% |
| z.ai | 51.30% |
| Vertex (Gemini) | 39.90% |
| Minimaxi | 37.20% |
| OpenAI | 35.90% |
| OpenAI Responses | 32.50% |
| Azure | 18.10% |
| Novita | 3.00% |
| DeepSeek | 2.70% |
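
The shares above are ratios of reasoning tokens to total output tokens per route, expressed as percentages. A minimal sketch of that computation, using hypothetical token counts for illustration (not the actual gateway data):

```python
def reasoning_share(reasoning_tokens: int, total_output_tokens: int) -> float:
    """Reasoning-token share as a percentage of total output tokens."""
    if total_output_tokens == 0:
        return 0.0
    return round(100 * reasoning_tokens / total_output_tokens, 1)

# Hypothetical per-route token counts, for illustration only.
routes = {
    "ExampleHighReasoning": (823, 1_000),  # a thinking-model-dominated route
    "ExampleMixed": (181, 1_000),          # a route mixing reasoning and non-reasoning models
}
for name, (reasoning, total) in routes.items():
    print(name, reasoning_share(reasoning, total))
```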

## Caveats

- Reasoning tokens were not tracked before 2026, so this is April 2026 only. Year-over-year comparison is not possible.
- A 0% reading does not necessarily mean a provider has no reasoning models; it only means reasoning output is not reported separately on that route (e.g. Anthropic delivers thinking inline).
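
The second caveat can be made concrete: when a route delivers thinking inline, reasoning tokens are indistinguishable from completion tokens at the gateway, so the observed share collapses to zero regardless of actual thinking volume. A hedged sketch of how a measurable share differs from an unmeasurable one (function and flag names are illustrative, not the gateway's actual schema):

```python
from typing import Optional

def reported_reasoning_share(reasoning_tokens: int,
                             total_output_tokens: int,
                             reports_separately: bool) -> Optional[float]:
    """Return the reasoning share a gateway can observe for a route,
    or None when the provider folds thinking into regular output
    (the share is unmeasurable, not zero)."""
    if not reports_separately:
        # This is the Anthropic/Bedrock case: the table shows 0%,
        # but the true share is simply not observable.
        return None
    if total_output_tokens == 0:
        return 0.0
    return 100 * reasoning_tokens / total_output_tokens
```

Distinguishing `None` from `0.0` is the point: any cost or quality comparison built on the reasoning-tokens column should treat inline-thinking routes as missing data, not as zero reasoning.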

## Cite as

**APA.** Requesty. (2026). Reasoning-token share of provider output, April 2026. Requesty Data. https://requesty.ai/data/reasoning-token-share-by-provider-april-2026

```bibtex
@misc{requesty_reasoning_token_share_by_provider_april_2026,
  author       = {{Requesty}},
  title        = {Reasoning-token share of provider output, April 2026},
  year         = {2026},
  howpublished = {\url{https://requesty.ai/data/reasoning-token-share-by-provider-april-2026}},
  note         = {Requesty Data}
}
```

## Cited in

- [What the gateway saw in April 2026](https://requesty.ai/blog/provider-trends-april-2026-agentic-share-latency)

---

Downloads: [JSON](https://requesty.ai/data/reasoning-token-share-by-provider-april-2026/data.json) · [CSV](https://requesty.ai/data/reasoning-token-share-by-provider-april-2026/data.csv) · [Markdown](https://requesty.ai/data/reasoning-token-share-by-provider-april-2026/data.md)