
Effective Patterns

These are the patterns that keep showing up when AI-assisted development actually goes well.

“Give Claude a way to verify its work. This is the single highest-leverage thing you can do.” — Anthropic

Verification is not optional. It’s the foundation everything else builds on.

  • AI produces plausible-looking code that may be subtly wrong
  • Without verification, you’re trusting output you can’t validate
  • Verification closes the loop. The agent can see its own mistakes and respond to them.
| Method | Example prompt | Best for |
| --- | --- | --- |
| Tests | “Run pytest after changes” | Logic correctness |
| Type checker | “Run mypy / tsc” | Type safety |
| Linter | “Run eslint / ruff” | Style, common bugs |
| Build | “Run cargo build” | Compilation |
| Screenshot | “Take a screenshot” | UI work |
| Expected output | “Result should be X” | Specific behavior |
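Any of these checks can be wired into a loop the agent runs itself. A minimal sketch in Python, with a stand-in command where your real suite would go (the `subprocess` usage is standard library; the specific commands are assumptions about your project):

```python
# Minimal verification loop: run each check and report pass/fail so the
# agent (or a wrapper script) can react to failures.
# The command list is a stand-in; substitute pytest, mypy, ruff, etc.
import subprocess
import sys

def run_checks(commands: list[list[str]]) -> list[tuple[str, bool]]:
    """Run each command; record whether it exited cleanly."""
    results = []
    for cmd in commands:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results.append((" ".join(cmd), proc.returncode == 0))
    return results

# Stand-in check; in practice: ["pytest", "-q"], ["mypy", "src/"], ...
checks = [[sys.executable, "-c", "print('ok')"]]
for name, passed in run_checks(checks):
    print(f"{'PASS' if passed else 'FAIL'}: {name}")
```

The point is not the script itself but the shape: every change ends with a machine-checkable pass/fail the agent can see.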

One reliable version looks like this:

1. Write the test first (or have the AI write it)
2. Commit the test
3. Prompt: "Make this test pass. Don't modify the test."

This forces the AI to produce code that demonstrably works.
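As a concrete sketch of the workflow (the `slugify` function is a made-up example, not from any particular project):

```python
# Hypothetical test-first example. The test below is written and committed
# first; the prompt is then "Make test_slugify pass. Don't modify the test."

def slugify(text: str) -> str:
    # The implementation the AI produces to satisfy the committed test.
    return "-".join(text.lower().split())

def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Extra   spaces ") == "extra-spaces"

test_slugify()  # passes silently when the implementation is correct
```

Because the test is committed before the prompt, the AI can't quietly weaken the success criteria to make its own output look correct.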

See Veracode GenAI Code Security Report and METR uplift update for why verification matters so consistently even when the raw productivity results are mixed.

“The big secret is always close the loop. The model needs to be able to debug and test itself.” — Peter Steinberger

Set the workflow up so the agent can check its own work:

  • Have it run tests, not just write them
  • Use linters that catch errors immediately
  • Build CLIs for common operations
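One way to sketch the “build CLIs for common operations” idea (subcommand names and underlying tools here are assumptions; adapt to your stack):

```python
# Hypothetical "check" CLI: one entry point the agent can run to verify work,
# instead of remembering each tool's exact invocation.
import argparse
import subprocess

# Map each subcommand to the underlying tool; adjust for your project.
COMMANDS = {
    "tests": ["pytest", "-q"],
    "types": ["mypy", "."],
    "lint": ["ruff", "check", "."],
}

def main(argv=None) -> int:
    parser = argparse.ArgumentParser(prog="check")
    parser.add_argument("what", choices=sorted(COMMANDS))
    args = parser.parse_args(argv)
    return subprocess.run(COMMANDS[args.what]).returncode
```

In a real script you would add an `if __name__ == "__main__": sys.exit(main())` guard; the agent then runs `check tests` or `check lint` and gets a clean exit code either way.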

Instead of: “Build a login system”

Try: “Let’s discuss how authentication should work in this app. What are my options?”

This prevents premature building and surfaces better solutions.

Never ask the AI to “build the whole app.” Break it down:

  1. “Define the data structures in models.py”
  2. “Implement the repository pattern for these models”
  3. “Write unit tests for the business logic”

Each step is verifiable before moving to the next.
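For example, step 1 might yield something like this (the domain and field names are purely illustrative):

```python
# Illustrative output of step 1: plain data structures that can be
# reviewed and tested on their own before any logic is layered on top.
from dataclasses import dataclass, field

@dataclass
class Customer:
    id: int
    email: str

@dataclass
class Order:
    id: int
    customer: Customer
    items: list[str] = field(default_factory=list)

    def add_item(self, sku: str) -> None:
        self.items.append(sku)
```

A reviewable unit this small is easy to verify; only once it's confirmed do steps 2 and 3 build on it.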

“The more the model knows, the dumber it gets.” — Theo (t3.gg)

  • Don’t dump your entire codebase into context
  • Do provide only relevant files
  • Do give tools to search rather than pre-loading
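A “tool to search” can be as small as a grep-like helper the model calls on demand (a sketch; real setups often expose ripgrep or the editor's search instead):

```python
# Sketch of a search tool: return targeted matches instead of whole files,
# so only relevant lines ever enter the model's context.
from pathlib import Path

def search(root: str, needle: str, limit: int = 20) -> list[tuple[str, int, str]]:
    """Find up to `limit` (path, line number, line) hits for a substring."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        for no, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if needle in line:
                hits.append((str(path), no, line.strip()))
                if len(hits) >= limit:
                    return hits
    return hits
```

The model asks for `search(".", "login")` and receives a handful of lines, not the whole codebase.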

See Context Engineering for the evidence and caveats behind this claim.

“Clone datasette/datasette-enrichments from GitHub to /tmp and imitate the testing patterns it uses.” — Simon Willison

The fastest way to get consistent output is to show an example:

Clone https://github.com/simonw/datasette to /tmp.
Look at how tests are structured in tests/.
Now write tests for my new plugin following the same patterns.

Use this pattern for:

  • Setting up test patterns
  • Adopting library conventions
  • Replicating a coding style

If research lives in the same context as implementation, the main thread gets noisy fast. Let a subagent do the reading and come back with file paths and patterns.

Use subagents to investigate how authentication is implemented
in this codebase. Report back with file paths and patterns.

What you get:

  • Main context stays clean
  • Research happens in isolation
  • You get a summary, not raw exploration

Best fit:

  • Exploring unfamiliar codebases
  • Looking up documentation
  • Investigating multiple approaches
  • Any task that’s “read a lot, summarize a little”

Loose prompts are fine for tiny changes. They break down on real feature work.

Before a larger task, write a small spec with:

  • requirements
  • acceptance criteria
  • out-of-scope items
  • constraints or non-negotiables
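A spec this small fits in a dozen lines. The contents below are illustrative, not a required template:

```markdown
<!-- spec.md — illustrative example -->
# Feature: password reset

## Requirements
- User can request a reset link by email

## Acceptance criteria
- Expired tokens are rejected with a clear error

## Out of scope
- SMS-based reset

## Constraints
- Reuse the existing mailer; no new dependencies
```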

Then prompt the model to read the spec and discuss the plan before writing code.

This is the easiest step up from improvising.

When work spans multiple sessions, keep a tiny set of persistent artifacts:

  • PLAN.md — what remains to be done
  • STATE.md — current status and decisions
  • spec.md or equivalent — the source of truth for intent

This keeps the task stable even when the model’s conversational context gets compacted or cleared.
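These files can stay terse; the entries below are illustrative:

```markdown
<!-- STATE.md — illustrative example -->
## Status
- Auth endpoints done; token refresh in progress

## Decisions
- Chose JWT over server-side sessions (simpler deploy)

## Next (mirrors PLAN.md)
- Wire the refresh flow into the middleware
```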

| Anti-pattern | Problem | Fix |
| --- | --- | --- |
| No verification | Can’t tell if code works | Always include a test/lint step |
| Giant prompts | Context rot | Break into smaller asks |
| “Fix it” loops | Failed attempts pollute context | Clear context, rewrite the prompt |
| Skipping review | Shipping code you don’t understand | Always read diffs |