Requesty

Compare 472+ AI models.

Flagship and open-weight models from OpenAI, Anthropic, Google, AWS Bedrock, Azure, DeepSeek, Meta, xAI, Mistral, Moonshot and more, through one OpenAI-compatible API with zero markup.

Frequently asked questions

Everything you need to know about accessing hundreds of AI models through a single API.

How many AI models can I access through Requesty?
Requesty routes to 472+ models across 24 providers, including OpenAI, Anthropic, Google Gemini, AWS Bedrock, Azure OpenAI, Google Vertex AI, DeepSeek, Meta Llama, xAI Grok, Mistral, Moonshot Kimi, Alibaba Qwen, Zhipu GLM and MiniMax. Use any of them through a single OpenAI-compatible API.
Does Requesty charge markup on top of provider pricing?
No. Requesty passes through exactly what the upstream provider charges. You pay the same per-token rate as going direct to OpenAI, Anthropic or Google — and you get smart routing, automatic failover, prompt caching, analytics, and a single unified API included. Requesty makes money on a small platform fee for enterprise features, not on per-token markup.
Which model is best for coding?
On SWE-Bench Verified — the most realistic coding benchmark, based on real GitHub issues — GPT-5.2 Codex, Claude Opus 4.7 and Claude Sonnet 4.6 currently lead. MiniMax M2.5 is the strongest open-weights option. See the "Best for coding" leaderboard above for live rankings, and each model detail page for full benchmark charts.
Which model is best for reasoning and math?
For graduate-level reasoning (GPQA Diamond), GPT-5.4, Grok 4 and Claude Opus 4.7 lead the pack. For math (AIME, MATH benchmarks), GPT-5.4 and Grok 4 currently top the charts, with DeepSeek R1 offering strong performance at a fraction of the price.
What is the longest context window available?
Several models now support 1M+ token context windows — great for whole-codebase analysis or long document reasoning. Gemini 2.5 Pro and some Claude variants lead on context length. Note that effective quality often degrades past 128K tokens; prompt caching (supported on many models) is usually a better approach for repeated long context.
Are there free AI models I can use?
Yes. A number of models on Requesty have a zero-cost tier, including several Llama variants and DeepSeek models via third-party hosts. They're ideal for prototyping and development. You can filter by "Free" in the model explorer above.
How do I switch between models in my code?
Requesty is OpenAI-SDK compatible. Point base_url to "https://router.requesty.ai/v1", set your API key, and change the "model" parameter to any supported model ID (e.g. "anthropic/claude-opus-4-7", "openai/gpt-5.2", "google/gemini-2.5-pro"). No library changes needed — the same code works across providers.
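Because the router speaks the OpenAI wire format, the same switch also works at the raw HTTP level. Here is a minimal stdlib-only sketch; the /chat/completions path follows the standard OpenAI convention, and the API key placeholder is illustrative:

```python
import json
import urllib.request

REQUESTY_BASE = "https://router.requesty.ai/v1"

def build_chat_request(model_id: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-compatible /chat/completions request.

    Only the `model` field changes when switching providers.
    """
    body = json.dumps({
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{REQUESTY_BASE}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Swapping providers is a one-string change:
req = build_chat_request("anthropic/claude-opus-4-7", "Explain mutexes.", "YOUR_API_KEY")
```

Sending it is then just `urllib.request.urlopen(req)`; with the OpenAI SDK you would instead pass `base_url` and `api_key` to the client constructor and call it normally.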
Is my data private? Is it used for training?
Most major providers (Anthropic, Vertex AI, Azure OpenAI, AWS Bedrock) do not use API data for training by default. OpenAI offers zero-retention deployments via enterprise tiers. Each model detail page shows the specific data retention and training policy for that provider. Requesty itself never uses your data for training.
Can I get regional deployments (EU, US, APAC)?
Yes. Models available through AWS Bedrock, Azure OpenAI, and Google Vertex AI can be pinned to specific regions (eu-west-1, us-east5, etc.) using the @region suffix. Useful for GDPR, HIPAA, and data residency requirements. Filter by Region in the explorer to see all options.
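As a sketch of how that might look in code, assuming the @region suffix attaches directly to the model ID (the exact placement and the model ID shown are illustrative, not confirmed syntax):

```python
def pin_region(model_id: str, region: str) -> str:
    """Append an @region suffix to a model ID, e.g. for EU data residency.

    Both the model ID and the suffix placement here are assumptions
    for illustration; check the explorer for the exact supported form.
    """
    return f"{model_id}@{region}"

model = pin_region("anthropic/claude-opus-4-7", "eu-west-1")
# model == "anthropic/claude-opus-4-7@eu-west-1"
```

The pinned ID then goes in the same "model" parameter as any other model, so region selection stays a one-string change too.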
How are benchmark scores calculated?
Benchmark scores shown on Requesty are sourced from official model cards, Artificial Analysis, and public leaderboards (LiveBench, SWE-Bench, Vellum). Scores measure specific skills and do not capture every aspect of model quality — always test on your own workload. Each model detail page links the canonical benchmark sources.