
Agent Harness

Long-running agent work usually falls apart for a boring reason: the system loses the thread. A harness is how you stop that from happening.

A harness is the small amount of structure around the model that keeps the work legible:

  • persistent task files
  • verification commands
  • clear constraints
  • a repeatable review loop

If the session resets, compacts, or goes sideways, the harness is what lets the next pass pick the work back up without starting from scratch.

You do not need much. For most projects, three files are enough:

The spec: the source of truth for what should be built.

  • requirements
  • acceptance criteria
  • out of scope

The plan: the executable task list.

  • current step
  • remaining steps
  • blockers
  • verification needed

The notes: the compact memory of what already happened.

  • decisions made
  • files touched
  • important constraints
  • unresolved risks
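
As a sketch, the spec and notes files could be as small as this. File names, headings, and contents here are illustrative, not a required layout:

```markdown
<!-- SPEC.md: source of truth -->
# Token refresh

## Requirements
- Refresh expired access tokens transparently

## Acceptance criteria
- Existing login behavior unchanged

## Out of scope
- New auth providers

<!-- NOTES.md: compact memory -->
## Decisions
- Refresh tokens stored server-side only

## Constraints and risks
- Do not change login behavior
- Token rotation interval still undecided
```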

Without a harness, long sessions turn into a pile of stale context, half-finished attempts, and forgotten constraints. With one, the task survives even when the conversation does not.

This matches current harness practice in Anthropic-style long-running agent setups, Codex-style operational harnesses, and open-source agent architecture analyses.

  • Anthropic long-running agents - initializer + coding-agent workflow with durable progress artifacts and incremental commits
  • Codex / AGENTS.md - project instructions and verification commands discovered from the repo itself
  • Cline / implementation plans - structured planning files used before deep execution
  • GitHub Spec Kit / plan.md - project metadata and plan artifacts used to keep agent context aligned
A minimal PLAN.md might look like this:

```markdown
- [x] Define API contract
- [x] Add tests for auth middleware
- [ ] Implement token refresh flow
- [ ] Update docs

## Current focus

Implement token refresh flow without changing login behavior.

## Verification

- `npm test -- auth`
- `npm run build`
```
A few habits keep the harness useful:

  1. Keep files short enough to reread quickly.
  2. Update the harness when the task changes, not hours later.
  3. Put constraints in writing so the next session cannot forget them.
  4. Store the verification commands next to the plan.
  5. Treat specs and plans like code: review them, tighten them, and keep them current.
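
Keeping the verification commands next to the plan also means they can be run mechanically. As a sketch, assuming the plan layout shown above, a small POSIX-shell helper (`run_verification` is a hypothetical name, not a standard tool) can pull every backticked command out of the plan's `## Verification` section and run them in order, stopping at the first failure:

```shell
# run_verification: extract each backticked command from the "## Verification"
# section of a plan file and run it, stopping at the first failure.
# Assumes commands are written as `cmd` list items, as in the example above.
run_verification() {
  plan="${1:-PLAN.md}"
  # Print the Verification section, keep only backticked spans, strip backticks.
  sed -n '/^## Verification/,/^## /p' "$plan" |
    grep -o '`[^`]*`' | tr -d '`' |
    while IFS= read -r cmd; do
      printf '>> %s\n' "$cmd"
      sh -c "$cmd" || exit 1   # exits the pipeline subshell; function returns 1
    done
}
```

At the end of a session (or the start of the next one), `run_verification PLAN.md` re-checks the work against the plan's own criteria instead of relying on memory.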
Use a harness for:

  • multi-session feature work
  • long refactors
  • parallel subagent research
  • tasks where verification has several steps

Skip it for:

  • small one-file edits
  • typo fixes
  • work you will finish before the context gets messy