Capability Patterns

Use this page for durable capability classes, not leaderboard snapshots. Specific rankings move too quickly to build the guide around.

Deep reasoning models

Best for:

  • multi-file changes
  • architecture decisions
  • debugging with several interacting causes
  • long agent loops that need to recover from failure

Tradeoff:

  • slower and often more expensive in time or usage budget

Fast iteration models

Best for:

  • autocomplete and short edits
  • drafting tests or boilerplate
  • quick review loops where latency matters more than depth

Tradeoff:

  • weaker at holding long plans and resolving ambiguous requirements

Multimodal models

Best for:

  • UI implementation from screenshots or mockups
  • debugging visual regressions
  • working from diagrams, design files, or image-based documentation

Tradeoff:

  • not every multimodal model is equally strong at coding depth

Long-context models

Best for:

  • large investigations
  • broad repository mapping
  • document-heavy workflows

Tradeoff:

  • large context windows help only when context is selective and well-structured

Local and open-weight models

Best for:

  • sensitive code
  • offline or air-gapped environments
  • teams that prioritize control over frontier performance

Tradeoff:

  • capability may lag top hosted models, especially on hard agentic tasks

| Workflow | Start with this capability class | Why |
| --- | --- | --- |
| Complex bug fix | Deep reasoning | Root-cause analysis matters more than speed |
| New feature with many moving parts | Deep reasoning | Planning and recovery matter |
| UI build from design references | Multimodal | Visual understanding changes the result |
| Tight edit loop | Fast iteration | Lower latency keeps the workflow moving |
| Large codebase exploration | Long-context | Breadth helps when paired with context hygiene |
| Sensitive or regulated work | Local or open-weight | Operational boundaries may matter more than peak capability |
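The routing above is simple enough to express as a lookup. A minimal sketch, with hypothetical workflow and class names (they are illustrative labels, not any vendor's taxonomy):

```python
# Hypothetical mapping from workflow type to a starting capability class.
# The names are illustrative, not an official taxonomy.
CAPABILITY_FOR_WORKFLOW = {
    "complex-bug-fix": "deep-reasoning",
    "new-feature": "deep-reasoning",
    "ui-from-design": "multimodal",
    "tight-edit-loop": "fast-iteration",
    "codebase-exploration": "long-context",
    "sensitive-work": "local-open-weight",
}

def starting_class(workflow: str) -> str:
    """Return the capability class to try first.

    Unknown workflows default to deep reasoning, since a stronger model
    is the safer starting point when the task shape is unclear.
    """
    return CAPABILITY_FOR_WORKFLOW.get(workflow, "deep-reasoning")
```

The point is the default: when a task does not fit a known pattern, start from the most capable class and trade down for speed or cost once the workflow stabilizes.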

Advertised context is not the same thing as reliable context. Once the prompt gets noisy, even very large windows help less than people expect.

  • prefer selective retrieval over giant prompt dumps
  • treat long context as a tool for breadth, not permission to include everything
  • keep core rules pushed into project context files and retrieve the rest on demand
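The "selective retrieval over giant prompt dumps" idea can be sketched in a few lines. This is a deliberately naive example, assuming a hypothetical `select_context` helper that scores candidate files by keyword overlap with the task description and keeps only the top few:

```python
# Minimal sketch of selective context: rank candidate files by keyword
# overlap with the task and keep the top k, instead of pasting the whole
# repository into the prompt. The scoring is illustrative only.
def select_context(task: str, files: dict[str, str], k: int = 3) -> list[str]:
    task_words = set(task.lower().split())

    def overlap(item: tuple[str, str]) -> int:
        _, text = item
        return len(task_words & set(text.lower().split()))

    ranked = sorted(files.items(), key=overlap, reverse=True)
    return [name for name, _ in ranked[:k]]

files = {
    "auth.py": "login token session refresh",
    "billing.py": "invoice charge refund",
    "readme.md": "project overview setup",
}

print(select_context("fix the login token refresh bug", files, k=1))
# → ['auth.py']
```

In practice the scoring step is usually embedding-based retrieval or a code-search index rather than keyword overlap, but the shape is the same: select, then prompt, rather than dump, then hope.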

See Context Engineering for the workflow implications.

This page deliberately does not:

  • maintain live model rankings
  • promise a single “best model”
  • freeze benchmark snapshots into durable guidance

For time-sensitive benchmark details, use Benchmarks That Matter and confirm current data before making team-level decisions.

Why this framing:

  • Research-backed: verification, selective context, and review costs matter more than leaderboard chasing
  • Practitioner-backed: capability classes are how many teams actually choose models in daily work

The taxonomy on this page is a workflow-first simplification, not one benchmark’s official ontology.