Context Engineering

“Context engineering is building dynamic systems to provide the right information and tools in the right format such that the LLM can plausibly accomplish the task.” — Harrison Chase, LangChain

Prompt engineering is writing a good instruction. Context engineering is deciding what the model sees, what tools it gets, and what stays out of view.

Ask yourself: “Can the model plausibly accomplish this task with the context I’m providing?”

If the answer is no, the problem is context, not the model.

This stopped being a prompt-writing trick a while ago. Good teams now treat context like a systems problem: what to include, what to leave out, what to fetch on demand, and what to isolate in another agent.

| Component | What It Means | Example |
| --- | --- | --- |
| Right Information | Model has what it needs | Relevant files, not entire codebase |
| Right Tools | Model can look things up | grep, file read, test runner |
| Right Format | Structured for consumption | Short errors, not raw JSON blobs |
| Plausibility Check | Could a human succeed? | If you couldn't, neither can the AI |

1. Static Context (CLAUDE.md, .cursorrules)


Project-specific information that doesn’t change per-task:

  • Project architecture overview
  • Commands to run (dev, test, lint)
  • Known gotchas
  • Code style preferences

Keep ruthlessly short. For each line, ask: “Would removing this cause the AI to make mistakes?” If not, cut it.

The best static context is usually a list of gotchas, not a full project handbook.

Information the AI gathers based on the current task:

  • Relevant files based on the task
  • Recent git history
  • Test results
  • Error messages

Good AI tools do this automatically. They read files, grep for patterns, check git status.

Explore the project first. Then reach for the docs. If you reverse that order, the model often latches onto the clean example from the docs instead of the mess in front of it.

Evidence tags: Practitioner-backed (Workflow Archetypes); Research-backed (Productivity Research).

Give the model tools to look things up rather than stuffing everything in the prompt:

  • Let it grep the codebase
  • Let it read files on demand
  • Let it run commands

Tools > pre-loaded context for large codebases.
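As a minimal sketch of the idea, here is a search tool an agent could call on demand instead of receiving every file up front. The function name and output format are illustrative, not any particular framework's API:

```python
import re
from pathlib import Path

def grep_tool(pattern: str, root: str = ".", max_hits: int = 20) -> list[str]:
    """Let the model search the codebase on demand instead of
    pre-loading every file into the prompt. Returns path:line: text hits."""
    hits = []
    rx = re.compile(pattern)
    for path in sorted(Path(root).rglob("*.py")):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if rx.search(line):
                hits.append(f"{path}:{lineno}: {line.strip()}")
                if len(hits) >= max_hits:
                    return hits
    return hits
```

Capping `max_hits` matters: an unbounded search tool just recreates the context-stuffing problem one tool call at a time.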

What the AI should remember between sessions:

  • Decisions made
  • Patterns established
  • What worked before

Most tools don’t have this yet. You simulate it with context files.

The more robust version is a small harness: PLAN.md, STATE.md, and a spec or scratchpad that can survive session resets and compaction.
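A harness like that can be simulated with a few lines of glue. This sketch assumes the STATE.md convention named above; the helper names are made up for illustration:

```python
from pathlib import Path

def load_state(path: str = "STATE.md") -> str:
    """Restore the agent's persistent notes after a session reset
    or context compaction; empty string on first run."""
    p = Path(path)
    return p.read_text() if p.exists() else ""

def save_state(notes: str, path: str = "STATE.md") -> None:
    """Persist decisions and progress so the next session can resume."""
    Path(path).write_text(notes)
```

The file itself is the memory: anything not written down does not survive the next compaction.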

Push-based project guidance such as AGENTS.md, CLAUDE.md, or .cursorrules often works better than hoping the model remembers to fetch the right rule at the right time.

This is grounded partly in eval evidence and partly in practitioner convergence. The safest claim is not that push-based context always wins, but that core rules and gotchas should be present without relying on retrieval luck.

Evidence tags: Practitioner-backed (Agent Harness). The push-vs-pull framing here is an editorial judgment built on those patterns.

Why:

  • Core rules stay visible every turn
  • The model does not need to remember to ask for the rule
  • You control instruction ordering and reduce retrieval misses

Use push-based files for rules and gotchas. Use tools and retrieval for everything else.

The order matters:

  1. Explore the local project — files, patterns, tests, commands
  2. Load only relevant context — not the whole repo
  3. Consult external docs if needed — only after the codebase frame is clear
  4. Execute with verification

This avoids a common failure mode: the model grabs a neat doc example that does not match your codebase at all.

A popular heuristic says to keep context utilization under roughly 40% of the window. There does not appear to be a strong primary-source citation for that exact threshold. Treat it as a practitioner heuristic, not a measured law.

The better-supported point is broader: context quality degrades before the window is “full,” and long-running agents need compaction, selective retrieval, and persistent artifacts rather than giant prompt dumps.

If you want a more defensible working rule, think in terms of budget discipline rather than a magic cutoff. For example, the local context-engineering playbook uses a practical target range of roughly 60-85% utilization depending on model size and task shape.

| Context Window | 40% Threshold | Practical Limit |
| --- | --- | --- |
| 128k tokens | ~51k tokens | ~40 files |
| 200k tokens | ~80k tokens | ~65 files |
| 1M tokens | ~400k tokens | ~325 files |
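The thresholds above are just arithmetic, which makes the budget easy to check mechanically. A hypothetical helper, with the caveat that the 40% default is the heuristic, not a measured law:

```python
def context_budget(window_tokens: int, target: float = 0.40) -> int:
    """Token budget at a given utilization target.
    target=0.40 mirrors the practitioner heuristic; adjust per model/task."""
    return int(window_tokens * target)
```

Swapping `target` for something in the 0.60-0.85 range gives the looser budget-discipline framing instead of the hard 40% cutoff.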

Compaction strategies:

  1. Summarize — “Here’s what we’ve learned so far: [summary]”
  2. Clear and restart — Start fresh with only essential context
  3. Use subagents — Delegate research to separate AI instances
  4. Be selective — Include only files relevant to current task

For long logs and large command output, keep the beginning and end, not the full middle. Many teams now use “head-tail” compaction because it preserves setup and failure context without flooding the model with noise.
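Head-tail compaction is a few lines of code. A sketch, assuming line-oriented output (the function name and omission marker are illustrative):

```python
def head_tail(text: str, head_lines: int = 30, tail_lines: int = 30) -> str:
    """Keep the start (setup) and end (failure) of long command output,
    dropping the noisy middle with an explicit omission marker."""
    lines = text.splitlines()
    if len(lines) <= head_lines + tail_lines:
        return text  # short enough to keep whole
    omitted = len(lines) - head_lines - tail_lines
    return "\n".join(
        lines[:head_lines]
        + [f"... [{omitted} lines omitted] ..."]
        + lines[-tail_lines:]
    )
```

The explicit marker matters: the model should know content was removed rather than assume the log is complete.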

Subagents are not just a convenience feature. They are a way to keep your main context from turning into a junk drawer.

Use them when:

  • You need to search a large codebase from multiple angles
  • You need external documentation or examples
  • You want one agent exploring while another stays focused on implementation

The main benefit is not just parallelism. It is protecting the implementation context from search noise.

That claim is stronger than “multi-agent is always better.” In fact, research and tool practice both suggest orchestration overhead can erase gains unless the subagents have clearly separated jobs.

Evidence tags: Research-backed (Productivity Research); Practitioner-backed (Subagent Architectures).
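The noise-isolation point can be sketched in a few lines. Here `subagent` is a stand-in for whatever runs the research agent; the point is that only a bounded digest crosses back into the main context:

```python
def delegate(subagent, task: str, max_digest_chars: int = 500) -> str:
    """Run a noisy research task in an isolated subagent and pass only
    a compact digest back to the main implementation context."""
    raw = subagent(task)           # may be pages of search output
    return raw[:max_digest_chars]  # only the distilled slice crosses over
```

The raw exploration output stays in the subagent's context and is discarded; the implementation agent never sees it.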

Do:

  • Give the model search tools instead of pre-loading everything
  • Keep context files short and focused
  • Include only relevant files for the current task
  • Provide verification methods (run this test, check this file)
  • Put non-obvious repo rules in AGENTS.md, CLAUDE.md, or equivalent
  • Use a persistent plan file for multi-session work

Don't:

  • Dump your entire codebase into context
  • Include verbose documentation in context files
  • Expect the AI to remember previous sessions
  • Fight over 40% context utilization
  • Start with external docs before the agent has explored your project
  • Mix research and implementation in one giant context when subagents would keep it clean

Bad context file (too verbose):

```md
# Project Context
This is a web application built with React and Node.js. We started
this project in 2023 and it has grown significantly over time. The
team consists of 5 developers and we follow agile methodology...
[500 more lines of background]
```

Good context file (actionable):

```md
# Project Context
TypeScript + React + Express. Tests with Vitest.

## Commands
- `npm run dev` - Start dev server
- `npm test` - Run tests

## Gotchas
- Auth tokens in cookies, not localStorage
- Use `date-fns`, not moment
```