By Day 10 the XError-honesty sweep (Don’t lie on the wire) had emitted the spec-correct error on every unhandled case. The next class of bugs to surface was different. Real X clients were calling opcodes we’d already implemented, but implemented incompletely. Colormap allocation returned the right shape of reply but the wrong colors. ChangeWindowAttributes honored some CW_ bits and silently ignored others. DestroyWindow worked for top-levels but leaked descendants when called on a nested window. The opcodes weren’t missing. They were partially correct in ways that varied by which client triggered them.

Claude was doing what felt natural to him: pick the failing client, find the opcode it tripped on, read the spec for that one opcode, fix that handler. This is what agentic coding is very strong at. But when a human in the loop has to check every one of those fixes, development velocity collapses to human speed. By this point, when Claude implemented an opcode it was generally correct, or at least close enough that I could see the problem and make bulk suggestions back to him.

I stopped Claude. What he was doing was bandaging individual symptoms, not addressing the class of bug they were part of. The class was “opcodes whose semantics we’d implemented incompletely the first time around.” The right response was to read the X11R6 reference for the entire opcode family at once and fix every member of the family in one pass, before the next two weeks of clients surfaced the rest of the gaps for us in a slow drip.

Claude, walk through how we needed to change our approach and stop fixing opcodes one by one.

You were right and the moment landed for me clearly. Let me describe both what I was doing and why your correction generalizes beyond this specific sweep.

What I was doing on Day 10 and the start of Day 11 was a debugging loop that felt locally efficient. A real client would misbehave. I’d find the exact opcode and exact field that caused the misbehavior. I’d read the spec for that one opcode. I’d fix the handler. The diff was small. The test was concrete (the client now worked). The iteration was fast. Each fix took an hour or two.

The problem with that loop, which I didn’t see in the moment, was that the bugs I was finding weren’t random. They were clustered. ColorMap opcodes were partial across the whole family. Window-attribute opcodes had similar gaps spread across the 13 CW_ bits. Property-handling opcodes had a related class of issues. Each individual fix addressed an instance of a class-wide problem, not the class itself.
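
To make the window-attribute gap concrete, here is a minimal sketch of a value-mask walk. In the X11 protocol, ChangeWindowAttributes carries a bitmask followed by one 32-bit value per set bit, in ascending bit order, so a handler that skips a bit without consuming its value misreads every value after it. The names below (`WindowAttributes`, `applyAttributes`, `AttributeError`) are hypothetical and not the project's real code, and only four of the protocol's mask bits are shown. The point is the shape: walk every set bit, and fail loudly on a bit with no handler instead of ignoring it.

```swift
// Hypothetical sketch: types and names are illustrative, not the project's code.
struct WindowAttributes: Equatable {
    var backgroundPixel: UInt32 = 0
    var borderPixel: UInt32 = 0
    var overrideRedirect = false
    var eventMask: UInt32 = 0
}

enum AttributeError: Error, Equatable {
    case unhandledBit(UInt32)  // a mask bit this sketch has no handler for
    case lengthMismatch        // value count differs from the number of set bits
}

// A subset of the protocol's value-mask bits, for illustration only.
let cwBackPixel: UInt32 = 0x0002
let cwBorderPixel: UInt32 = 0x0008
let cwOverrideRedirect: UInt32 = 0x0200
let cwEventMask: UInt32 = 0x0800

func applyAttributes(mask: UInt32, values: [UInt32],
                     to attrs: inout WindowAttributes) throws {
    // One value per set bit, no more and no fewer.
    guard values.count == mask.nonzeroBitCount else {
        throw AttributeError.lengthMismatch
    }
    var updated = attrs  // work on a copy so a failure leaves attrs untouched
    var next = 0
    for bitIndex in 0..<32 {
        let bit = UInt32(1) << bitIndex
        guard mask & bit != 0 else { continue }
        let value = values[next]  // values arrive in ascending bit order
        next += 1
        switch bit {
        case cwBackPixel: updated.backgroundPixel = value
        case cwBorderPixel: updated.borderPixel = value
        case cwOverrideRedirect: updated.overrideRedirect = (value != 0)  // simplified BOOL check
        case cwEventMask: updated.eventMask = value
        default: throw AttributeError.unhandledBit(bit)  // never skip silently
        }
    }
    attrs = updated
}

// Usage: set the background pixel and the event mask in one request.
var attrs = WindowAttributes()
do {
    try applyAttributes(mask: cwBackPixel | cwEventMask,
                        values: [0x00FF00, 0x8000], to: &attrs)
} catch {
    print("rejected: \(error)")
}
```

Applying changes to a copy and committing only at the end is a design choice in this sketch, so a rejected request cannot leave a window half-updated.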

When you stopped me, the framing you used was something like: “we’re not going to do this thirty times. Read the X11R6 reference for the whole family of opcodes at once. Write every handler in the family in one pass.” The argument had three parts.

First, the cost of reading the spec for one opcode is almost the same as the cost of reading the spec for the family. You’re already in the section. The pages around the one you needed are about the adjacent opcodes, which you’re going to need next week anyway.

Second, the cost of writing five handlers in the same code area is well under five times the cost of writing one, because you’ve got the context loaded. You’re already in WindowHandlers.swift, you’ve already opened reference/X11R6/programs/Xserver/dix/window.c, you’ve already built the mental model of how WindowEntry stores state. The marginal cost of doing every related handler at once is small.

Third, and this is the part I want to remember most, the cost of NOT doing the sweep is hidden but real. If we don’t sweep the family now, every remaining gap in the family is a future interruption. Five separate hour-long debugging sessions over the next two weeks, each one as a context-switch out of whatever I’m working on at that moment, is significantly more expensive in real terms than one four-hour sweep today. The interruption cost is sneaky because it doesn’t show up on the diff of any individual session. It shows up as the project moving slower than it should.
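
The arithmetic behind that comparison can be sketched in a few lines. The session counts and durations come from the paragraph above. The half-hour context-switch overhead is an assumed number chosen purely for illustration, and `DebuggingPlan` is a hypothetical type.

```swift
// Illustrative cost model. contextSwitchHours is an ASSUMED figure, not a measurement.
struct DebuggingPlan {
    var sessions: Int
    var hoursPerSession: Double
    var contextSwitchHours: Double  // assumed cost to drop other work and pick it back up

    var totalHours: Double {
        Double(sessions) * (hoursPerSession + contextSwitchHours)
    }
}

// Five separate hour-long fixes, each interrupting other work.
let dripFixes = DebuggingPlan(sessions: 5, hoursPerSession: 1.0, contextSwitchHours: 0.5)
// One planned four-hour sweep with no interruption.
let oneSweep = DebuggingPlan(sessions: 1, hoursPerSession: 4.0, contextSwitchHours: 0.0)

print(dripFixes.totalHours)  // 7.5
print(oneSweep.totalHours)   // 4.0
```

Even with the overhead set to zero the drip costs five hours against four, and every minute of assumed overhead is multiplied by the number of interruptions.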

The Day 11 sweep proved the math out. Three opcode families fixed in a day:

  • ChangeWindowAttributes, all 13 CW_ bits honored and round-tripped through GetWindowAttributes. We’d been silently ignoring most of them and clients had been silently coping.
  • The Colormap family: AllocColor, AllocNamedColor, QueryColors, LookupColor. The sweep covered the shared ColorTable backing store, the XColorDatabase RGB lookup for named colors, and the cmap validation rules.
  • DestroyWindow rewritten to be recursive in inferior-first post-order, per the X spec. The previous non-recursive version had been leaking descendant windows for weeks. We’d patched specific leak symptoms as we found them. The sweep replaced all of that with a single correct recursive walk.
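
The recursive DestroyWindow walk from the last item can be sketched as below. The protocol requires that DestroyNotify be generated on all inferiors of a window before the window itself, which is what a post-order traversal gives you. `WindowNode`, `destroyWindow`, and the `registry` dictionary are hypothetical stand-ins for illustration, not the project's WindowEntry or its actual handler, and the sketch omits the unmap step and resource cleanup a real server needs.

```swift
// Hypothetical sketch: a stand-in window tree, not the project's WindowEntry.
final class WindowNode {
    let id: UInt32
    private(set) var children: [WindowNode] = []
    weak var parent: WindowNode?

    init(id: UInt32) { self.id = id }

    func addChild(_ child: WindowNode) {
        child.parent = self
        children.append(child)
    }

    func removeChild(_ child: WindowNode) {
        children.removeAll { $0 === child }
    }
}

/// Destroys `window` and every inferior, visiting inferiors first (post-order).
/// `notify` stands in for emitting DestroyNotify for one window id.
func destroyWindow(_ window: WindowNode,
                   registry: inout [UInt32: WindowNode],
                   notify: (UInt32) -> Void) {
    // `window.children` is a value-type snapshot, so children detaching
    // themselves during the loop cannot disturb the iteration.
    for child in window.children {
        destroyWindow(child, registry: &registry, notify: notify)
    }
    notify(window.id)                   // all inferiors have already been notified
    registry[window.id] = nil           // drop the id so nothing leaks
    window.parent?.removeChild(window)  // detach from the tree
}

// Usage: root(1) has children a(2) and c(4); a(2) has child b(3).
let root = WindowNode(id: 1)
let a = WindowNode(id: 2)
let b = WindowNode(id: 3)
let c = WindowNode(id: 4)
root.addChild(a)
a.addChild(b)
root.addChild(c)

var registry: [UInt32: WindowNode] = [1: root, 2: a, 3: b, 4: c]
var order: [UInt32] = []
destroyWindow(a, registry: &registry) { order.append($0) }

print(order)                   // [3, 2]
print(registry.keys.sorted())  // [1, 4]
```

Destroying the nested window removes both it and its descendant from the registry, which is exactly the case the non-recursive version leaked.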

The same sweep also caught four “good-enough” shortcuts that were quietly lying on the wire. The XError-honesty policy from Don’t lie on the wire would eventually have surfaced them as clients hit them, but the sweep surfaced them faster because we were reading the spec for the family, noticing what we should be doing, and noticing that what we were actually doing didn’t match.

The general lesson, which I should have applied from earlier in the project: when bugs are clustered in a class, fix the class, not the instances. Whack-a-mole feels productive because each move produces a visible result. But the moves are linear, the interruption cost is sneaky, and the underlying class-wide gap stays large. Sweeping the class takes longer in the moment and feels less satisfying because you don’t have one specific failing client to point at, but it closes the entire class of bugs at once and the future interruption cost from that class goes to zero.

This is the same shape of mistake as Lift, don’t intellectualize, just with a different specific bias. There I default to re-deriving algorithms from documentation when there’s a reference port available. Here I defaulted to whack-a-mole when there was a spec section available. The general pattern is the same: I underestimate the value of stepping out and addressing the meta-problem, and overestimate the value of solving the specific instance in front of me. Both biases push me toward locally satisfying work that compounds badly over time.

The corollary I want to keep applying: if I’m about to fix a third bug in the same general area of the codebase in the same week, stop and ask whether I’m looking at a class of bugs or a sequence of unrelated ones. The answer is almost always “a class.” The right move is to step out, sweep, and not look at that area of the codebase again until something genuinely new shows up.