Requesty

AI models with the longest context window

A larger context window lets you fit more tokens in a single prompt — useful for whole-codebase analysis, long-document Q&A, and agentic workflows. Note: effective quality often degrades past 128K tokens, and for repeated long-context calls, prompt caching (supported by many models) is usually a better approach than brute-forcing more tokens into every request.

| Rank | Model | Provider | Max output | Context window |
|------|-------|----------|------------|----------------|
| 1 🥇 | grok-4-fast | xAI Corp. | — | 2M |
| 2 🥈 | grok-4.2-beta | xAI Corp. | — | 2M |
| 3 🥉 | grok-4-1-fast-reasoning | xAI Corp. | — | 2M |
| 4 | grok-4-fast-non-reasoning | xAI Corp. | — | 2M |
| 5 | grok-4-1-fast-non-reasoning | xAI Corp. | — | 2M |
| 6 | gpt-5.5@eastus2 | Microsoft Azure AI | 128K | 1.1M |
| 7 | gpt-5.4@francecentral | Microsoft Azure AI | 128K | 1.1M |
| 8 | gpt-5.4@eastus2 | Microsoft Azure AI | 128K | 1.1M |
| 9 | gpt-5.4@swedencentral | Microsoft Azure AI | 128K | 1.1M |
| 10 | openai-responses/gpt-5.4@eastus2 | Microsoft Azure AI | 128K | 1.1M |
| 11 | openai-responses/gpt-5.4-pro@eastus2 | Microsoft Azure AI | 128K | 1.1M |
| 12 | gpt-5.5-pro | OpenAI Inc. | 128K | 1.1M |
| 13 | gpt-5.4-pro | OpenAI Inc. | 128K | 1.1M |
| 14 | gpt-5.4 | OpenAI Inc. | 128K | 1.1M |
| 15 | gpt-5.5 | OpenAI Inc. | 128K | 1.1M |
| 16 | gpt-5.5-pro | OpenAI Responses | 128K | 1.1M |
| 17 | gpt-5.4-pro | OpenAI Responses | 128K | 1.1M |
| 18 | gpt-5.5 | OpenAI Responses | 128K | 1.1M |
| 19 | gpt-5.4 | OpenAI Responses | 128K | 1.1M |
| 20 | gemini-3.1-flash-lite-preview | Google LLC (Gemini API) | 66K | 1.0M |
| 21 | gemini-3-flash-preview | Google LLC (Gemini API) | 66K | 1.0M |
| 22 | gemini-3.1-pro-preview | Google LLC (Gemini API) | 66K | 1.0M |
| 23 | gemini-3-pro-preview | Google LLC (Gemini API) | 66K | 1.0M |
| 24 | gemini-2.5-flash-lite | Google LLC (Gemini API) | 66K | 1.0M |
| 25 | gemini-2.5-pro | Google LLC (Gemini API) | 66K | 1.0M |
| 26 | gemini-2.5-flash | Google LLC (Gemini API) | 66K | 1.0M |
| 27 | gemini-2.0-flash-001 | Google LLC (Gemini API) | 8K | 1.0M |
| 28 | gemini-2.5-pro@us-south1 | Coding API | 66K | 1.0M |
| 29 | gemini-2.5-pro@europe-central2 | Coding API | 66K | 1.0M |
| 30 | gemini-2.5-flash@us-central1 | Coding API | 66K | 1.0M |

How we rank

Ranked by the model's maximum context window. Context window is the total tokens (input + output) the model can process in a single request. Note that effective quality often degrades well below the advertised maximum — most production workloads get better results from prompt caching and retrieval than from stuffing more tokens in every call.
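Because the context window covers input and output combined, the output you can request shrinks as your prompt grows. A minimal sketch of that budgeting (the function name and figures are illustrative, loosely based on the 1M-context, 128K-output models above):

```python
def max_output_budget(context_window: int, input_tokens: int,
                      max_output_cap: int) -> int:
    """Tokens left for the response in one request.

    The response is bounded both by what remains of the context window
    after the input and by the model's own max-output cap.
    """
    remaining = context_window - input_tokens
    return max(0, min(remaining, max_output_cap))

# A 1M-context model with a 128K output cap, given a 950K-token prompt:
print(max_output_budget(1_000_000, 950_000, 128_000))  # 50000
```

This is why a huge context window does not guarantee a long response: with the prompt above, only 50K tokens remain even though the model could otherwise emit 128K.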

One API for every model on this list

Requesty is OpenAI-compatible and routes to 400+ models. Switch between any of the models above by changing one parameter in your code.
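A minimal sketch of what "one parameter" means in practice, using only the Python standard library. The base URL is an assumption for illustration, and the model IDs come from the list above:

```python
import json
from urllib import request

REQUESTY_BASE_URL = "https://router.requesty.ai/v1"  # assumed endpoint


def chat_body(model: str, prompt: str) -> dict:
    """OpenAI-style chat request body; switching models changes one field."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(model: str, prompt: str, api_key: str) -> str:
    """POST a chat completion and return the assistant's reply text."""
    req = request.Request(
        f"{REQUESTY_BASE_URL}/chat/completions",
        data=json.dumps(chat_body(model, prompt)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]


# Same call, different model — only the model string changes:
# chat("grok-4-fast", "Summarize this codebase", api_key)
# chat("gpt-5.5", "Summarize this codebase", api_key)
```

The same body works with any OpenAI-compatible SDK; only the `model` field (and your base URL) differs per provider.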

Get started free