
Reasoning-token share of provider output, April 2026

Metric: reasoning_tokens / output_tokens within each provider, comparing pure reasoning routes with mixed routes.
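The ratio above can be sketched in a few lines. This is a minimal illustration, not the gateway's actual pipeline: the record layout (`provider`, `reasoning_tokens`, `output_tokens` as dictionary keys) and the sample numbers are hypothetical, and it assumes the share is token-weighted, meaning tokens are summed per provider before dividing.

```python
from collections import defaultdict

def reasoning_share(records):
    """Return {provider: reasoning_tokens / output_tokens}, summed per provider."""
    totals = defaultdict(lambda: [0, 0])  # provider -> [reasoning, output]
    for r in records:
        totals[r["provider"]][0] += r["reasoning_tokens"]
        totals[r["provider"]][1] += r["output_tokens"]
    # Providers with no output at all are skipped to avoid dividing by zero.
    return {p: rs / out for p, (rs, out) in totals.items() if out > 0}

# Made-up records, for illustration only (not Requesty data).
sample = [
    {"provider": "a", "reasoning_tokens": 800, "output_tokens": 1000},
    {"provider": "a", "reasoning_tokens": 100, "output_tokens": 1000},
    {"provider": "b", "reasoning_tokens": 0, "output_tokens": 500},
]
shares = reasoning_share(sample)
```

Summing before dividing means a provider's share is dominated by its largest requests; averaging per-request ratios instead would weight every request equally and can give a noticeably different number.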

High-share providers (Groq, Coding, xAI, z.ai) almost exclusively serve thinking models. Frontier providers are around a third reasoning. Anthropic, Bedrock, Mistral and Moonshot are at zero: either no thinking models are routed there, or thinking content is surfaced differently. Providers with negligible reasoning output in April are omitted (ratio too noisy).

How much of LLM output is reasoning/thinking tokens? In April 2026 on the Requesty gateway, Groq led at 82%, followed by Coding (79%), xAI (60%) and z.ai (51%). These routes are dominated by thinking models. Frontier routes ran around a third: Vertex (Gemini) 40%, OpenAI 36%, OpenAI Responses 33%. Anthropic and Bedrock report 0% because Anthropic does not surface reasoning tokens separately; extended thinking is delivered inline.

Why it matters

The industry narrative is "everything is reasoning now", but the data says reasoning is concentrated in a specific subset of routes, and even there, absolute volume is dwarfed by regular completion output. The Anthropic and Bedrock 0% is a measurement artefact, not a usage signal, which matters for any cost or quality comparison that relies on the reasoning-tokens column.
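One way to keep a measurement artefact from being read as a usage signal is to represent "not reported" separately from "zero". The sketch below is an assumption about how a consumer of this data might do that; the function name and the use of `None` for an unreported count are hypothetical, not part of any provider's API.

```python
from typing import Optional

def share_or_unknown(reasoning_tokens: Optional[int],
                     output_tokens: int) -> Optional[float]:
    """Return the reasoning share, or None when it cannot be measured."""
    # None means the provider did not report reasoning tokens separately,
    # so the share is unknown rather than 0%.
    if reasoning_tokens is None or output_tokens <= 0:
        return None
    return reasoning_tokens / output_tokens

reported = share_or_unknown(360, 1000)     # reasoning count is reported
unreported = share_or_unknown(None, 1000)  # reasoning count is not reported
true_zero = share_or_unknown(0, 1000)      # reported, and genuinely zero
```

Any downstream cost or quality comparison can then exclude the `None` rows instead of averaging them in as zeros.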

Period: Apr 2026
Updated: May 9, 2026
ID: reasoning-share-april-2026
§ 01

Key findings

  • 01 High-reasoning routes: Groq 82%, Coding 79%, xAI 60%, z.ai 51%.
  • 02 Frontier routes are around a third: Vertex (Gemini) 40%, OpenAI 36%, OpenAI Responses 33%.
  • 03 Vertex (Claude) does not appear here: Anthropic does not report reasoning tokens separately, so Claude thinking output is not counted.
  • 04 Azure is at 18%; it leans on GPT-4.1-class models more than on the latest reasoning checkpoints.
  • 05 Anthropic, Bedrock, Mistral, Moonshot: 0%. Anthropic does not report reasoning tokens separately (thinking is inline). Mistral and Moonshot have no reasoning models routed.
  • 06 The industry narrative is "everything is reasoning now". The data says reasoning is concentrated in a specific subset of providers, and even there the absolute volume is dwarfed by regular completion output.
§ 02

Data

Provider            Reasoning share (percent)
Groq                82.30%
Coding              79.00%
xAI                 59.70%
z.ai                51.30%
Vertex (Gemini)     39.90%
Minimaxi            37.20%
OpenAI              35.90%
OpenAI Responses    32.50%
Azure               18.10%
Novita               3.00%
DeepSeek             2.70%
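As a quick check of the groupings against the table, the figures can be loaded into a dictionary. This is a sketch under stated assumptions: the 50% cut-off for "high-share" is chosen here for illustration and is not defined by the source, and the frontier figure is an unweighted mean of three percentages, not a token-weighted share.

```python
# Reasoning share per provider (percent), copied from the table above.
shares = {
    "Groq": 82.30, "Coding": 79.00, "xAI": 59.70, "z.ai": 51.30,
    "Vertex (Gemini)": 39.90, "Minimaxi": 37.20, "OpenAI": 35.90,
    "OpenAI Responses": 32.50, "Azure": 18.10, "Novita": 3.00,
    "DeepSeek": 2.70,
}

# Assumed threshold: a provider counts as "high-share" above 50%.
high_share = [p for p, s in shares.items() if s > 50.0]

# Unweighted mean of the three routes the text calls "frontier".
frontier = ["Vertex (Gemini)", "OpenAI", "OpenAI Responses"]
frontier_mean = sum(shares[p] for p in frontier) / len(frontier)

print(high_share)
print(round(frontier_mean, 1))
```

With these inputs the high-share list is Groq, Coding, xAI and z.ai, and the frontier mean comes out at about 36.1%, which is consistent with "around a third".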