We’re thrilled to announce that Claude 4—both Opus 4 and Sonnet 4—is now live on Requesty, your go-to LLM gateway and routing platform. Experience Anthropic’s most powerful models seamlessly integrated into your favorite developer tools, complete with Requesty’s reliable prompt-caching layer to accelerate responses and reduce costs.
What’s New?
1. Hybrid Models for Every Use Case
- Claude Opus 4
- World’s best coding model: 72.5 % on SWE-bench, 43.2 % on Terminal-bench.
- Sustained multi-hour workflows: Tackle thousands of steps without losing context.
- Frontier agent performance: Ideal for complex, autonomous pipelines.
- Claude Sonnet 4
- Industry-leading reasoning: 72.7 % on SWE-bench with extended thinking.
- Steerable and efficient: Great balance of speed, precision, and cost.
- Free-tier availability: Perfect for prototyping and lighter tasks.
2. Enforced Prompt Caching on Requesty
- Cache Validity: Up to one hour per prompt, ensuring instant cold-start performance.
- Cost Savings: Cached inputs incur just 25% of the normal input rate—ideal for repeat calls in agent loops.
- Cache Control: Developers can tag, invalidate, or override caches via API flags.
3. Full Tool Support, Optimized for Parallel Calls
- Extended Thinking with Web Search, Code Execution, and Image Transformations all available through Requesty’s standard tool interface.
- Parallel Tool Execution: Fire off web searches, invoke your Python sandbox, and call local file tools simultaneously for lightning-fast, multi-faceted reasoning.
4. Deep IDE Integrations
Plug Claude 4 into the coding tools you already love:
| Tool | Highlights |
|---|---|
| Roo Code | Inline multi-file refactoring, advanced debugging, and CI feedback loops. |
| Cline | Terminal-first experience with toggleable “extended reasoning” mode. |
| Aider | Code suggestions in your local editor, now harnessing Claude 4’s precision. |
| Continue | Session persistence across your workflows, leveraging enforced caching. |
Just point your IDE at Requesty’s API endpoint, choose claude-opus-4 or claude-sonnet-4, and you’re off to the races.
Why Choose Requesty for Claude 4?
- Unified Billing & Transparent Pricing
- Input Tokens: $15 / $3 per million (Opus 4 / Sonnet 4)
- Cached Input: 25% of input rate
- Output Tokens: $75 / $15 per million
- Custom Tool Combinations
- Chain web searches, code runs, and file edits in a single request.
- Fine-tune call order or parallelism via our JSON workflow spec.
- Superior Throughput & Reliability
- Optimized request routing with retry logic and fallbacks to secondary zones.
- Detailed analytics dashboard for token usage, cache hit rates, and latency.
Getting Started
- Sign In or Create a Requesty Account
- Add Claude 4 to Your Plan: Head to the Models page and enable
claude-opus-4orclaude-sonnet-4. - Configure Prompt Caching: Toggle “Enforced Caching” in your project settings to start saving instantly.
- Integrate with Your Tools:
- Roo Code: In advanced settings, set your Requesty endpoint and pick Claude 4.
- Cline: Update your
requesty.config.jsonto"model": "claude-opus-4"(or Sonnet). - Aider & Continue: Select Requesty under “Providers” and choose your model.
Try It Today
Empower your applications, bots, and agentic workflows with the next generation of Claude models, now turbo-charged by Requesty’s prompt caching and robust routing. We can’t wait to see what you build—whether it’s a multi-hour code refactor, a sophisticated research assistant, or the next great AI-powered product.
Get Started → Visit Requesty Dashboard
Stay tuned for more features, benchmarks, and deep dives coming soon. As always, your feedback fuels our improvements—drop us a line on Discord or in our support portal.
Happy building!

