Part 1 covered configuration — CLAUDE.md files, rules, and skills. That’s the foundation. But configuration without workflow is like having a well-organized toolbox and no blueprint.
This part covers what happens after setup: how to plan work so Claude executes well, the discipline that keeps you safe during execution, and how to run multiple instances when one Claude isn’t enough.
Same caveat as Part 1: these are mostly notes to myself on what’s working now. The space is evolving fast.
Planning: Specs Before Code
The single most important thing I’ve learned using Claude Code: always start with a plan. Not having one almost always leads to failure — Claude generates something, but it solves the wrong problem or misses constraints you hadn’t articulated. Output quality scales with plan quality too, but the first-order lesson is simpler: plan first, code second.
Why Planning Matters
Current LLMs have implicit planning — they reason through steps internally before generating code. So why bother using “Plan Mode”? Two reasons:
- Review before execution. A plan is cheaper to change than implemented code. You catch misunderstandings before they become refactoring sessions.
- Persistence across sessions. Plans and plan files survive context compaction — this is paramount for keeping multi-session development on track. Internal reasoning doesn’t persist. When Claude’s context window fills and gets summarized, the reasoning chain is gone. A markdown file in your repo stays.
PRD, Plan, Execute
The workflow I’ve settled on is three phases:
- PRD — Define what and why. What feature, what problem it solves, what constraints exist. The scale matches the work: a paragraph for a bug fix, a full phased spec for a product. For a chess openings app I built with Claude Code, the PRD was ~900 lines — phases with priorities, TypeScript interfaces, acceptance criteria per feature, dependency chains, and pricing decisions. All in a docs/PRD.md that Claude references during implementation. The PRD lives in your repo as a persistent, version-controlled artifact.
- Plan — Define how. File paths, function signatures, data flow, edge cases. Claude Code's plan mode (/plan) works well here — it explores the codebase and proposes an approach for your approval before writing anything. Plan mode output is ephemeral and session-scoped (stored in .claude/plans/). It answers "how do I implement this specific task right now" — it's not a replacement for the PRD. The PRD defines the product; plan mode figures out how to build the next piece of it.
- Execute — Mechanical implementation of the plan. This is where Claude Code does its best work, because the decisions are already made.
The temptation is to skip to execute. Resist it. Five minutes of planning saves an hour of debugging code that solves the wrong problem.
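To make the PRD concrete: a skeleton of what such a file might look like, written as a heredoc so it drops straight into a repo. The section names and the chess-app content are illustrative, not a required format.

```shell
# Scaffold a minimal PRD skeleton (illustrative structure — adapt to your project)
mkdir -p docs
cat > docs/PRD.md <<'EOF'
# PRD: Chess Openings Trainer

## Problem
Players lack a focused way to drill opening lines.

## Phase 1 (P0): Core drills
- Acceptance: user can step through a line move by move
- Depends on: nothing

## Phase 2 (P1): Spaced repetition
- Acceptance: missed lines resurface on a decay schedule
- Depends on: Phase 1
EOF
```

The phased structure with explicit acceptance criteria and dependencies is what lets Claude pick up the next unit of work in a fresh session without re-deriving the product's shape.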
Spec-Driven Development
Martin Fowler’s recent article on spec-driven development formalizes this idea. SDD treats specifications as primary artifacts, not secondary documentation. Three levels:
- Spec-first — Write a spec before each coding task. The spec defines behavior, acceptance criteria, and constraints.
- Spec-anchored — Specs persist beyond initial creation. As the feature evolves, the spec evolves with it.
- Spec-as-source — Specs become the maintainable source files. Humans edit specs, never code directly.
Most of my work sits at level 1 or 2. Level 3 is interesting but requires tooling that isn’t mature yet. The practical takeaway: write a structured markdown spec before asking Claude to build anything non-trivial. Include what it should do, how it should integrate with existing code, and what success looks like. Feed that spec to Claude Code alongside the codebase context.
Beads for Tracking
For work that spans multiple sessions, I use Beads — a distributed, git-backed issue tracker designed for AI agents. Issues persist as structured data in your repo, surviving context compaction and session boundaries. You can define dependencies between issues, track blockers, and recover full context after starting a new session.
But “multi-session work” undersells it. I use Beads for bug tracking, feature ideas, and creative brainstorming too — this site’s future article pipeline and feature ideas live in Beads (keep an eye out for some fun easter eggs coming soon). Beads also serves as the task coordination layer in AI agent harnesses — tracking and distributing work across agents in multi-agent workflows.
It’s not required — a markdown TODO file works for simple projects. But when you’re coordinating multi-session work with dependencies, having a tracker that lives in git and understands the agent workflow is valuable.
The Verify-Commit Loop
If planning is how you start well, the verify-commit loop is how you stay safe during execution.
The Pattern
Make a change. Verify it works. Commit. Every time.
That’s it. The discipline is in the consistency. Not “make five changes, verify, commit.” Not “make a change, it probably works, commit.” One change, one verification, one commit.
Why Commits Matter
git commit is your cheapest undo. If Claude goes off the rails — and it will, eventually — you revert to the last good state. The cost of reverting is proportional to the distance between commits.
Without frequent commits, you’re debugging a diff of unknown size. Did the bug come from the auth change or the database migration or the CSS refactor? With one commit per change, you know exactly what introduced the problem.
What “Verify” Means
Verification depends on context:
- Build check — Does it compile? Does the dev server start?
- Test suite — Do existing tests pass? Did you add tests for new behavior?
- Manual check — Does the output look right? Is the UI rendering correctly?
Claude Code can run these checks itself, but you need to tell it to. Add verification steps to your CLAUDE.md or plan: “After each change, run npm test and confirm the dev server renders correctly.” Otherwise Claude will claim the code works based on its own reasoning, which is not the same as actually running it.
Practical Rhythm
The natural rhythm with Claude Code looks like this:
- Describe the change (or reference the plan step)
- Claude implements it
- You (or Claude) run the build/tests
- Review the diff
- Commit with a descriptive message
- Move to the next change
Each commit should represent a working state. If you need to git bisect later, every commit in the history should build and pass tests. This is standard software engineering practice — Claude Code just makes it easier to maintain because the changes tend to be small and focused.
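The rhythm above can be sketched as a shell helper that refuses to commit unless every verification command succeeds. The checks are passed in as arguments because they're project-specific — "npm run build" and "npm test" in the usage line are placeholders for whatever your project uses.

```shell
# Commit only when every verification command succeeds.
# Checks are passed in, since verification is project-specific.
verify_and_commit() {
  local message="$1"; shift
  local check
  for check in "$@"; do
    $check || { echo "verification failed: $check" >&2; return 1; }
  done
  git add -A
  git commit -m "$message"
}

# Usage: verify_and_commit "feat: add search endpoint" "npm run build" "npm test"
```

Because the function returns before `git add` when a check fails, the working tree keeps the broken change visible for review while the history stays clean — every commit it creates is a verified state.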
Running Parallel Instances
When one Claude isn’t enough.
Why Parallel
Independent tasks don’t need to be serial. Code review, test writing, and documentation can happen simultaneously. Each Claude Code session gets its own context window, so parallel instances don’t compete for context space. For a deeper look at the infrastructure that makes multi-agent workflows reliable, see AI Agent Harnesses.
The prerequisite: the tasks must be independent. Two instances editing the same file is a recipe for merge conflicts and lost work. Two instances working on separate features in separate files is fine.
Git Worktrees
Git worktrees are the foundation for safe parallel work. Each worktree is an independent checkout of the same repository, on a separate branch in a separate directory.
```shell
# Create worktrees for parallel work
git worktree add ../project-feature-auth feature/auth
git worktree add ../project-feature-search feature/search

# Each directory is a full working copy
cd ../project-feature-auth && claude    # Instance 1
cd ../project-feature-search && claude  # Instance 2

# When done, merge both branches and clean up
git checkout main
git merge feature/auth
git merge feature/search
git worktree remove ../project-feature-auth
git worktree remove ../project-feature-search
```
No merge conflicts between instances because they’re on separate branches working on separate files. Each Claude instance sees a clean working directory with the full project context.
Tmux Multi-Agent Setup
For coordinated parallel work, I use a tmux layout with a coordinator and workers:
```
┌──────────────────┬──────────────────┐
│   COORDINATOR    │     WORKER-1     │
│    (Pane 1)      │     (Pane 3)     │
├──────────────────┼──────────────────┤
│    WORKER-2      │      OUTPUT      │
│    (Pane 2)      │     (Pane 4)     │
└──────────────────┴──────────────────┘
```
The coordinator assigns tasks and monitors progress. Workers execute independently. The output pane runs watch commands to track status.
For inter-agent communication, file-based messaging is the most reliable approach. Each worker reads tasks from and writes results to files in a shared .claude-agents/ directory:
```shell
# Coordinator assigns tasks
echo "Review src/api/auth.ts" > .claude-agents/worker1-task.txt
echo "Write integration tests for search" > .claude-agents/worker2-task.txt

# Workers report completion
echo "COMPLETE" > .claude-agents/worker1.status

# Coordinator checks progress
cat .claude-agents/worker*.status
File-based communication is simple, debuggable, and persists across sessions. Named pipes and tmux send-keys are alternatives, but files are harder to get wrong.
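The coordinator's side of this protocol is a polling loop over the status files. A minimal sketch, following the directory layout and COMPLETE convention from the example above — the function name and worker names are placeholders:

```shell
# Block until every named worker has written COMPLETE to its status file
wait_for_workers() {
  local dir="$1"; shift
  local worker
  for worker in "$@"; do
    until grep -q "COMPLETE" "$dir/$worker.status" 2>/dev/null; do
      sleep 2
    done
    echo "$worker done"
  done
}

# Usage: wait_for_workers .claude-agents worker1 worker2
```

The `2>/dev/null` matters: a worker that hasn't started yet has no status file at all, and the loop should treat that the same as "not done" rather than erroring out.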
One gotcha: if you use oh-my-tmux, panes use 1-based indexing. The default tmux config is 0-based. If your send-keys commands fail with “can’t find window: 0”, check tmux show-options -g base-index. A dedicated walkthrough of this tmux multi-agent setup with working examples is coming in a future post.
Ralph Wiggum Loops
Geoffrey Huntley’s Ralph Wiggum pattern is a bash loop for continuous iteration:
```shell
while :; do cat PROMPT.md | claude-code ; done
```
Each iteration runs the full prompt from scratch. If the previous run failed or produced incomplete results, the next iteration picks up where it left off — the codebase state carries forward even though the Claude session doesn’t.
The key insight: failures are deterministic and correctable. When Ralph produces bad output, you refine the prompt, not the code. The loop runs again with better instructions. Huntley describes it as “tuning a guitar” — each iteration gets closer to correct.
When to use this: defined greenfield projects with clear prompts and measurable success criteria. It works less well for ambiguous tasks or complex refactoring where each iteration needs human judgment.
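In practice I'd bound the loop and give the agent a way to signal completion. A sketch — the function wrapper, the DONE.marker sentinel, and the iteration cap are my own additions, not part of Huntley's raw pattern:

```shell
# Bounded Ralph loop: stop when the agent writes DONE.marker,
# or after a fixed number of iterations (both are additions to the raw pattern).
ralph_loop() {
  local prompt_file="$1" run_cmd="$2" max="$3"
  local i=0
  while [ ! -f DONE.marker ] && [ "$i" -lt "$max" ]; do
    i=$((i + 1))
    echo "iteration $i"
    $run_cmd < "$prompt_file"
  done
}

# Usage: ralph_loop PROMPT.md claude-code 10
```

The cap keeps a misbehaving prompt from burning tokens overnight, and the sentinel file lets you add a line to PROMPT.md like "when all acceptance criteria pass, create DONE.marker" so the loop terminates itself.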
Gas Town
Steve Yegge’s Gas Town takes multi-agent coordination further. Where the tmux setup is manual, Gas Town is a full workspace manager built on Beads.
The architecture uses playful Mad Max terminology:
- The Mayor — Your primary coordinator agent. It maintains context about the workspace and distributes tasks.
- Polecats — Worker agents with persistent identities but ephemeral sessions. They can be spawned, complete work, and shut down while maintaining history.
- Convoys — Work bundles that group multiple Beads issues and assign them to agents.
The key advantage over manual tmux orchestration: Gas Town scales to 20-30 agents through git-backed persistent storage. Work state survives crashes and restarts because everything is tracked in Beads, not in session memory.
Agent Teams (Built-in)
Claude Code has experimental built-in agent teams support:
```shell
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
```
This enables automatic teammate spawning, built-in peer-to-peer communication via mailboxes, shared task lists with dependency management, and auto-detection of tmux vs iTerm2 for pane management. It handles all the orchestration that the manual tmux approach requires you to build yourself.
It’s still experimental, but it’s the direction things are heading — the coordination layer moving from user-managed scripts to native tooling. See the Agent Teams documentation for current capabilities.
What’s Next
The remaining parts of this series:
- Part 3: SRE, Documentation & Team Management — Incident response, runbook generation, documentation workflows, team onboarding
- Part 4: Personal Life & Knowledge Management — Obsidian integration, daily notes, reading logs, personal automation
Resources
- Martin Fowler: Spec-Driven Development — SDD levels and tooling landscape
- Geoffrey Huntley: Ralph Wiggum — Continuous iteration loops for AI development
- Gas Town on GitHub — Multi-agent workspace manager
- Beads on GitHub — Git-backed issue tracker for AI agents
- Claude Code Agent Teams — Built-in multi-agent support (experimental)
- AI Agent Harnesses — How harness quality determines agent reliability
- Part 1: Setup and Configuration — CLAUDE.md hierarchy, rules, and skills
