Around Day 8 or 9 of the project we hit a decision point that I didn’t realize at the time was going to inform basically everything we built afterward. The X protocol treats every visible thing as a window. Top-level windows are what the user thinks of as an application window. Underneath each one, the application creates a tree of child X windows: every Motif button, every text-input field, every menu item, every pulldown is its own X window with its own ID, its own coordinate space relative to its parent, its own input event mask, its own visibility state. Quickplot maps a top-level and creates 50 or 100 X windows underneath it. Each one has to behave like a real, addressable, drawable X window.
The question was how to model all of that inside AppKit. XQuartz’s approach,
which we could read in the xpr source, is to make each X window its own
NSView nested under the top-level. So a Motif dialog with 30 widgets gets 30
NSViews in a parallel hierarchy. AppKit handles clipping, AppKit routes
mouse events, AppKit does a lot of the heavy lifting that the X server would
otherwise have to do itself. Clean mapping.
We didn’t go that way. Every X window inside the top-level lives entirely
inside the top-level’s NSView as a virtual surface, with its own clipList
computed by our own region algebra. There’s exactly one NSView per top-level,
regardless of how many X windows the client creates inside it.
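To make the shape of that model concrete, here is a minimal sketch of a virtual window tree living under one top-level. The names (`XWindow`, `originInTopLevel`, `hitTest`) are hypothetical, not macXserver’s actual types, and the sketch ignores border widths. It also assumes that, with no child NSViews, finding the X window under the pointer is the server’s job rather than AppKit’s.

```swift
// Hypothetical model: illustrative names, border widths ignored.
struct Rect: Equatable {
    var x: Int, y: Int, width: Int, height: Int
}

final class XWindow {
    let id: UInt32
    var frame: Rect                 // x/y are relative to the parent window
    var isMapped = true
    weak var parent: XWindow?
    var children: [XWindow] = []    // bottom-to-top stacking order

    init(id: UInt32, frame: Rect) {
        self.id = id
        self.frame = frame
    }

    func addChild(_ child: XWindow) {
        child.parent = self
        children.append(child)
    }

    /// Origin of this window in the top-level's coordinate space,
    /// found by summing parent-relative offsets up the tree.
    var originInTopLevel: (x: Int, y: Int) {
        var x = 0, y = 0
        var node: XWindow? = self
        while let w = node, w.parent != nil {
            x += w.frame.x
            y += w.frame.y
            node = w.parent
        }
        return (x, y)
    }

    /// Deepest mapped window containing the point, where the point is
    /// given in this window's own coordinates.
    func hitTest(_ px: Int, _ py: Int) -> XWindow? {
        guard isMapped, px >= 0, py >= 0,
              px < frame.width, py < frame.height else { return nil }
        // Topmost child first, so stacking order decides overlaps.
        for child in children.reversed() {
            if let hit = child.hitTest(px - child.frame.x, py - child.frame.y) {
                return hit
            }
        }
        return self
    }
}

// One top-level (backed by a single NSView in the real design),
// with a form and a button as purely virtual descendants.
let top = XWindow(id: 1, frame: Rect(x: 0, y: 0, width: 400, height: 300))
let form = XWindow(id: 2, frame: Rect(x: 10, y: 20, width: 200, height: 100))
let button = XWindow(id: 3, frame: Rect(x: 5, y: 5, width: 50, height: 20))
top.addChild(form)
form.addChild(button)
```

Nothing here touches AppKit: the tree is plain data, and only the top-level would ever be paired with a view.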
claude, lay out why we went with one NSView per top-level instead of one NSView per X window like XQuartz does.
This is one of those architectural choices that looks neutral when you make it and then quietly shapes everything you build on top. Let me walk through the trade-off.
The X clipping model is specific. Every X window has a clipList, which is
the region of its content that’s actually visible after accounting for
ancestors clipping it, descendants opaquely covering parts of it, siblings
drawn above it, and the window’s own bounds. The X protocol defines exactly
how clipList updates when a window moves, resizes, gets mapped, gets
unmapped, gets a child mapped on top of it, or gets a sibling raised above
it. It also defines exactly how Expose events get emitted when regions
change. Real X clients depend on those rules. Motif’s widget realization
order, Xt’s geometry-management cascade, the XSetWMNormalHints size-hints flow
in Xlib: all of them assume the X server is producing clipList-derived
behavior to spec.
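A rough sketch of that computation, under simplifying assumptions: regions are flat lists of disjoint rectangles (not the banded representation miregion.c uses), window frames are already in top-level coordinates, and borders, InputOnly windows, and SHAPE are ignored. `Win` and `computeClip` are hypothetical names for illustration.

```swift
// Illustrative only: a flat rectangle-list region, half-open rectangles.
struct Rect: Equatable {
    var x1: Int, y1: Int, x2: Int, y2: Int   // [x1, x2) x [y1, y2)

    var area: Int { return max(0, x2 - x1) * max(0, y2 - y1) }

    func intersection(_ o: Rect) -> Rect? {
        let r = Rect(x1: max(x1, o.x1), y1: max(y1, o.y1),
                     x2: min(x2, o.x2), y2: min(y2, o.y2))
        return (r.x1 < r.x2 && r.y1 < r.y2) ? r : nil
    }

    /// The parts of self not covered by o (at most four rectangles).
    func subtracting(_ o: Rect) -> [Rect] {
        guard let i = intersection(o) else { return [self] }
        var out: [Rect] = []
        if y1 < i.y1 { out.append(Rect(x1: x1, y1: y1, x2: x2, y2: i.y1)) }       // above
        if i.y2 < y2 { out.append(Rect(x1: x1, y1: i.y2, x2: x2, y2: y2)) }       // below
        if x1 < i.x1 { out.append(Rect(x1: x1, y1: i.y1, x2: i.x1, y2: i.y2)) }   // left
        if i.x2 < x2 { out.append(Rect(x1: i.x2, y1: i.y1, x2: x2, y2: i.y2)) }   // right
        return out
    }
}

struct Region {
    var rects: [Rect] = []
    var area: Int { return rects.reduce(0) { $0 + $1.area } }

    func intersecting(_ r: Rect) -> Region {
        return Region(rects: rects.compactMap { $0.intersection(r) })
    }
    func subtracting(_ r: Rect) -> Region {
        return Region(rects: rects.flatMap { $0.subtracting(r) })
    }
}

final class Win {
    var frame: Rect             // top-level coordinates, to keep the sketch short
    var mapped = true
    var children: [Win] = []    // bottom-to-top stacking order
    var clipList = Region()
    init(_ frame: Rect) { self.frame = frame }
}

/// `visible` is what the parent can still offer this window: the parent's
/// own visible area minus every sibling stacked above this one.
func computeClip(_ w: Win, visible: Region) {
    var mine = w.mapped ? visible.intersecting(w.frame) : Region()
    // Topmost child first: each child takes its share, then is cut out of
    // what remains for lower siblings and for the parent itself.
    for child in w.children.reversed() {
        computeClip(child, visible: mine)
        if child.mapped { mine = mine.subtracting(child.frame) }
    }
    w.clipList = mine
}

// Two overlapping siblings; b is stacked above a.
let top = Win(Rect(x1: 0, y1: 0, x2: 100, y2: 100))
let a = Win(Rect(x1: 10, y1: 10, x2: 60, y2: 60))
let b = Win(Rect(x1: 40, y1: 40, x2: 90, y2: 90))
top.children = [a, b]
computeClip(top, visible: Region(rects: [top.frame]))
```

In this example `b` keeps its full 50×50 area, `a` loses the 20×20 corner that `b` covers, and the top-level keeps whatever neither child covers. Unmapping `b` and recomputing gives `a` its corner back.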
AppKit’s NSView clipping is related but distinct. AppKit clips a view’s
drawing to its bounds. It clips child views to their parent’s bounds. It
clips sibling views based on Z-order. Each of those rules is similar to the
X rule it parallels, but the edges are different. Sibling-overlap visibility
in X is computed bottom-up from the X window tree; AppKit’s sibling
visibility is computed from the view ordering in NSView.subviews. The
order of setFrame: calls during a resize cascade doesn’t match the order
of X ConfigureNotify emission. The relationship between window border and
content area is different. The rule for what counts as “visible” when a
child is partially covered by a sibling, and how that interacts with
drawRect:, isn’t quite what X says it should be.
XQuartz handles those differences in xpr by adding translation code on
top of the NSView mapping. The X server pushes a ConfigureNotify into
AppKit, AppKit emits a viewDidMoveToWindow or setFrame: callback,
XQuartz catches it and reconciles with what X expected to happen. There’s a
lot of “what X said should be visible vs. what AppKit computed to be
visible” reconciliation code. It works, but it’s a layer.
macXserver’s choice is to skip the layer. We don’t try to map X clipping
onto AppKit’s view clipping. We keep AppKit’s hands off everything below
the top-level. Inside that NSView, the X server runs its own region
algebra: a Region.swift type ported from miregion.c (see “Lift, don’t intellectualize”), a
ClipListEngine that walks the X window tree and computes each window’s
clipList from ancestor / sibling / descendant geometry, and a delta
cascade that emits Expose events according to the X protocol’s rules. AppKit
sees one big view. We draw into it with our own clip math.
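As a sketch of the Expose side, assuming the “delta” for a window is simply its new clipList minus its old one: the rectangles that became visible are translated into window-relative coordinates and sent as Expose events, with the `count` field giving the number of Expose events still to follow (as the X protocol defines it). `exposeEvents` and `ExposeEvent` are hypothetical names, and regions are again plain rectangle lists rather than the miregion representation.

```swift
// Illustrative only: half-open rectangles in top-level coordinates.
struct Rect: Equatable {
    var x1: Int, y1: Int, x2: Int, y2: Int

    /// The parts of self not covered by o (at most four rectangles).
    func subtracting(_ o: Rect) -> [Rect] {
        let ix1 = max(x1, o.x1), iy1 = max(y1, o.y1)
        let ix2 = min(x2, o.x2), iy2 = min(y2, o.y2)
        guard ix1 < ix2, iy1 < iy2 else { return [self] }   // no overlap
        var out: [Rect] = []
        if y1 < iy1 { out.append(Rect(x1: x1, y1: y1, x2: x2, y2: iy1)) }
        if iy2 < y2 { out.append(Rect(x1: x1, y1: iy2, x2: x2, y2: y2)) }
        if x1 < ix1 { out.append(Rect(x1: x1, y1: iy1, x2: ix1, y2: iy2)) }
        if ix2 < x2 { out.append(Rect(x1: ix2, y1: iy1, x2: x2, y2: iy2)) }
        return out
    }
}

struct ExposeEvent: Equatable {
    var window: UInt32
    var x: Int, y: Int, width: Int, height: Int   // relative to the window origin
    var count: Int                                // Expose events still to follow
}

/// Hypothetical helper: events for the area that is in newClip but was
/// not in oldClip. originX/originY is the window origin in top-level space.
func exposeEvents(window: UInt32, originX: Int, originY: Int,
                  oldClip: [Rect], newClip: [Rect]) -> [ExposeEvent] {
    var exposed = newClip
    for cut in oldClip {
        exposed = exposed.flatMap { $0.subtracting(cut) }
    }
    var events: [ExposeEvent] = []
    for (i, r) in exposed.enumerated() {
        events.append(ExposeEvent(window: window,
                                  x: r.x1 - originX, y: r.y1 - originY,
                                  width: r.x2 - r.x1, height: r.y2 - r.y1,
                                  count: exposed.count - 1 - i))
    }
    return events
}

// A 50x50 window at (10,10) whose lower-right corner was covered by a
// sibling; the sibling is unmapped, so the clipList grows back to full.
let oldClip = [Rect(x1: 10, y1: 10, x2: 60, y2: 40),
               Rect(x1: 10, y1: 40, x2: 40, y2: 60)]
let newClip = [Rect(x1: 10, y1: 10, x2: 60, y2: 60)]
let events = exposeEvents(window: 7, originX: 10, originY: 10,
                          oldClip: oldClip, newClip: newClip)
```

Only the 20×20 corner that the sibling used to cover produces an event; the part of the window that was already visible is left alone.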
The cost is real, and I want to be honest about it. We’ve written:
- The region algebra (Region.swift, the miregion port).
- The clipList computation walking the X window tree.
- The resize cascade logic, including the delta-cascade rule we eventually shipped on Day 21.
- The per-window background paint logic, including the ParentRelative ancestor-chain walk.
- The sibling delta cascade (when a sibling’s clipList grows because another sibling got unmapped or moved).
- The borderClip logic and the paint-parent-bg-over-uncovered-region path.
All of that AppKit gives you for free if you go the XQuartz route. We don’t get any of it for free. We have it because we wrote it.
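To give a flavor of one item on that list, here is a minimal sketch of a ParentRelative ancestor-chain walk. It assumes the usual X rule that a ParentRelative window paints with its parent’s background and shares the parent’s tile origin, so the walk accumulates offsets until it reaches an ancestor with a concrete background. The types and the `resolveBackground` function are hypothetical, not the project’s code.

```swift
// Illustrative only: hypothetical types for the background walk.
enum Background: Equatable {
    case noBackground
    case parentRelative
    case pixel(UInt32)
    case pixmap(UInt32)     // pixmap ID; tiling makes the origin matter
}

final class Win {
    let x: Int, y: Int      // position relative to the parent
    let background: Background
    let parent: Win?

    init(x: Int, y: Int, background: Background, parent: Win?) {
        self.x = x
        self.y = y
        self.background = background
        self.parent = parent
    }
}

/// Walks up through ParentRelative ancestors. Returns the background that
/// actually gets painted, plus the tile origin expressed in the starting
/// window's own coordinates.
func resolveBackground(of w: Win) -> (background: Background, originX: Int, originY: Int) {
    var node = w
    var ox = 0, oy = 0
    while node.background == .parentRelative, let p = node.parent {
        // The ancestor's origin sits at (-x, -y) in the child's coordinates.
        ox -= node.x
        oy -= node.y
        node = p
    }
    // Assumption for this sketch: a parentless ParentRelative window
    // has nothing to inherit, so treat it as having no background.
    let resolved: Background =
        (node.background == .parentRelative) ? Background.noBackground : node.background
    return (resolved, ox, oy)
}

let top = Win(x: 0, y: 0, background: .pixmap(42), parent: nil)
let form = Win(x: 10, y: 20, background: .parentRelative, parent: top)
let label = Win(x: 5, y: 5, background: .parentRelative, parent: form)
let resolved = resolveBackground(of: label)
```

Here the label inherits the top-level’s pixmap, with the tile origin 15 pixels left of and 25 pixels above the label’s own origin, so the pattern lines up with what the top-level painted.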
What we win is that the X semantics are pure. When a Motif widget expects
its clipList to update in a particular order during a resize, we match
the spec. When an Xt widget expects an Expose event on a specific
sub-region after an ancestor’s unmap, we emit it. When a 1990s-era X11R6
program expects pixel-exact draw-to-erase semantics inside its window, we
provide them. The bugs we ship come from our region code being wrong about
an X rule (and we fix them by going back to miregion.c), not from
AppKit and X disagreeing about what should happen.
There’s also a related win on the SHAPE side that we hit later (see
“Smooth at device scale”). When we
apply a shape mask to a top-level, we apply it to the one NSWindow’s
compositing layer. The descendant X windows inside don’t have their own
AppKit clipping to fight with; the top-level’s mask defines the outline
and that’s it. If each X descendant had its own NSView, each one would
need a mask synced to the top-level’s shape, which sounds simple until
you realize Motif’s widget realization order means most descendants
exist before the SHAPE request even arrives.
The general pattern, which keeps coming back across the project: when two layers (the X protocol and AppKit) have similar but distinct semantics for the same concept, it’s usually cleaner to keep them separated and translate at one well-defined boundary than to try to map one onto the other and patch the seams. The boundary in this case is the top-level NSView. Above it, AppKit owns the window: drag, resize, Mission Control, Cmd-Tab, all the macOS features that make X clients feel like first-class Mac windows. Below it, the X server owns the regions: every clipList computation, every Expose event, every draw-to-erase guarantee. Neither layer is fighting the other because neither layer is inside the other.
If we did the project again I’d still make this call. The only thing I’d change
is to port miregion.c from Day 1 instead of trying to derive the region
algebra from the spec until Day 9.