Requesty

Best AI models for agentic coding

Terminal-Bench Hard measures how well a model operates as a coding agent in a real terminal — running commands, editing files, and fixing repositories end-to-end. It is the closest proxy to how models perform inside tools like Claude Code, Cursor and Codex.

  1. 🥇
    OpenAI Inc. logo
    gpt-5.5
    OpenAI Inc.·$5.00 / $30.00 per 1M
    60.6%
  2. 🥈
    Anthropic PBC logo
    claude-opus-4-8
    Anthropic PBC·$5.00 / $25.00 per 1M
    58.3%
  3. 🥉
    OpenAI Inc. logo
    gpt-5.4
    OpenAI Inc.·$2.50 / $15.00 per 1M
    57.6%
  4. 4
    Google LLC (Gemini API) logo
    gemini-3.1-pro-preview
    Google LLC (Gemini API)·$2.00 / $12.00 per 1M
    53.8%
  5. 5
    Anthropic PBC logo
    claude-sonnet-4-6
    Anthropic PBC·$3.00 / $15.00 per 1M
    53.0%
  6. 6
    OpenAI Responses logo
    gpt-5.3-codex
    OpenAI Responses·$1.75 / $14.00 per 1M
    53.0%
  7. 7
    OpenAI Inc. logo
    gpt-5.4-mini
    OpenAI Inc.·$0.75 / $4.50 per 1M
    52.3%
  8. 8
    Anthropic PBC logo
    claude-opus-4-7
    Anthropic PBC·$5.00 / $25.00 per 1M
    51.5%
  9. 9
    Alibaba Cloud logo
    qwen3.7-max
    Alibaba Cloud·$2.50 / $7.50 per 1M
    50.8%
  10. 10
    Anthropic PBC logo
    claude-opus-4-5
    Anthropic PBC·$5.00 / $25.00 per 1M
    47.0%
  11. 11
    OpenAI Inc. logo
    gpt-5.2-chat
    OpenAI Inc.·$1.75 / $14.00 per 1M
    47.0%
  12. 12
    Anthropic PBC logo
    claude-opus-4-6
    Anthropic PBC·$5.00 / $25.00 per 1M
    46.2%
  13. 13
    DeepSeek logo
    deepseek-v4-pro
    DeepSeek·$0.43 / $0.87 per 1M
    46.2%
  14. 14
    OpenAI Inc. logo
    gpt-5.1
    OpenAI Inc.·$1.25 / $10.00 per 1M
    45.5%
  15. 15
    Moonshot AI logo
    kimi-k2.6
    Moonshot AI·$0.95 / $4.00 per 1M
    43.9%
  16. 16
    Alibaba Cloud logo
    qwen3.6-plus
    Alibaba Cloud·$0.50 / $3.00 per 1M
    43.9%
  17. 17
    Z AI logo
    GLM-5.1
    Z AI·$1.40 / $4.40 per 1M
    43.2%
  18. 18
    Z AI logo
    GLM-5
    Z AI·$1.00 / $3.20 per 1M
    43.2%
  19. 19
    DeepInfra Inc. logo
    XiaomiMiMo/MiMo-V2.5-Pro
    DeepInfra Inc.·$1.00 / $3.00 per 1M
    43.2%
  20. 20
    OpenAI Inc. logo
    gpt-5.4-nano
    OpenAI Inc.·$0.20 / $1.25 per 1M
    42.4%
  21. 21
    minimax-m3
    MiniMax·$0.30 / $1.20 per 1M
    42.4%
  22. 22
    Google LLC (Gemini API) logo
    gemini-3-pro-preview
    Google LLC (Gemini API)·$2.00 / $12.00 per 1M
    41.7%
  23. 23
    Google LLC (Vertex AI) logo
    gemini-3.5-flash
    Google LLC (Vertex AI)·$1.50 / $9.00 per 1M
    40.9%
  24. 24
    Novita AI logo
    qwen/qwen3.5-397b-a17b
    Novita AI·$0.60 / $3.60 per 1M
    40.9%
  25. 25
    Novita AI logo
    xiaomimimo/mimo-v2-pro
    Novita AI·$2.00 / $6.00 per 1M
    40.9%
  26. 26
    MiniMax-M2.7
    MiniMax·$0.30 / $1.20 per 1M
    39.4%
  27. 27
    Google LLC (Gemini API) logo
    gemini-3-flash-preview
    Google LLC (Gemini API)·$0.50 / $3.00 per 1M
    38.6%
  28. 28
    OpenAI Responses logo
    gpt-5-codex
    OpenAI Responses·$1.25 / $10.00 per 1M
    37.9%
  29. 29
    grok-4
    xAI Corp.·$3.00 / $15.00 per 1M
    37.9%
  30. 30
    grok-4.3
    xAI Corp.·$1.25 / $2.50 per 1M
    37.9%

How we rank

Scores for Terminal-Bench Hard come from Artificial Analysis, an independent AI benchmarking service. When a model is available through multiple providers (e.g. Anthropic direct, AWS Bedrock, Google Vertex), we show one canonical entry per model family so the ranking isn't polluted by duplicates. Benchmarks measure specific skills — always validate on your own workload before committing.

One API for every model on this list

Requesty is OpenAI-compatible and routes to 400+ models. Switch between any of the models above by changing one parameter in your code.

Get started free