An Experiment in Heterogeneous Software Builds

A typed builder DSL we built to test one idea

What is a builder?

Pick any two real things you build at work. A Node HTTP service has a runtime version, an entrypoint, some JavaScript or TypeScript sources, a package manager, a list of npm dependencies, test commands, a health endpoint, an HTTP protocol declaration. A C library with generated code has a compiler, header and source files, generated config headers, native dependencies pulled in through pkg-config, link steps, a test binary, output metadata. They differ in tools, dependency models, output shapes, and operational semantics. And yet both of them are builders. In each case something typed and structured turns into something Nix can realize.

You feel this in the small. Each ad-hoc builder grows its own validation pass, its own "show me the plan" debug mode (usually a --dry-run flag whose semantics have drifted from the next builder's --dry-run), and its own way to hand dependencies to a downstream tool. By the fourth time you have re-implemented the same four-way fan-out of validate, deps, dry-run, and materialize, it has stopped feeling like a coincidence.

The deeper problem is that we lack a formal account of what relates these builders. Build-system theory has a great deal to say about engines. Mokhov, Mitchell, and Peyton Jones ^[1] sort Make, Shake, Bazel, and Nix by their Task abstraction, whether Functor, Applicative, or Monad, paired with a scheduler and a rebuilder, which tells you what kind of engine you are sitting inside. Some of the theory does take on the interpretation question. Pieters, Rivas, and Schrijvers ^[19] generalize effects and handlers to a setting whose motivating example is build systems, where one task is written as a free computation and run through several handlers. That is the shape we wanted. It interprets a single Mokhov-style task across computation classes, with no typed builder spec at the centre, no relation between heterogeneous builder kinds, and no forget map. What stayed missing was an account of what it means for two unlike descriptions, an IDL codegen and a sandbox profile, to be principled refinements of one underlying thing.

This post is about that gap.

metaBuilder is a typed builder DSL that runs entirely at nix eval time. It treats a build as a description, a typed list of operations, and uses ornaments to relate different kinds of build to one base. We built it to test one idea, that ornaments are what relate heterogeneous build descriptions, where most people would reach for modules, record subtyping, or Provider-style edges. That is still a conjecture. We have not settled it. What we have is a working system that assumes it, and a lot of pushing on that system to find where it breaks. When a sentence here sounds certain, there is a theorem or a test or a shipped example behind it. Where we are guessing, we say so.

The basic idea

Before any formalism, the plain version. A builder, in metaBuilder, is a list of typed operations over typed inputs and outputs. The operations are things like "read this source", "declare this tool", "run this tool with these arguments", "write this file", "materialize a derivation". The list is the program. You write a spec by writing the list down, with a name and typed records for the parameters, dependencies, tools, and outputs.

Once you have a spec, a small family of interpreters fold over it. The bundled set:

validate asks whether every operation has what it needs.
deps returns the dependency graph.
dry-run shows the steps without producing derivations.
plan-view renders the shell that would run.
describe produces documentation.
materialize produces the actual pkgs.runCommand derivation.
plan-export serializes the plan as plain data, with the derivation-carrying fields omitted.

The same spec and the same operations list run through each of these folds, and the catalogue is open to extension; we describe the bundled set throughout.

Each bundled interpreter folds the same program into its own result. The spec is written once, and every interpreter reads it without modification.

You can try this below. Snap operations into a program, then fold it through six views, from a typed description to a dependency graph to a Dockerfile. The presets mirror the worked examples described later in the post.

What can you build?

Stack operation bricks on the plate to write a builder: BuilderOp in teal, RuntimeOp in amber. Then fold the program through an interpreter below. Every fold reads the same bricks; each one makes a different artifact.

load an example

BuilderOp

RuntimeOp

readSource

declareTool

writeFile

runTool

emitDescriptor

transformOutput

materializeDerivation

declareCapability

declareProtocol

declareService

materializeUnit

builder name

fold through

BuilderSpecnode-service-demo-artifactNodeServiceBuilder

BuilderOp · 10

readSourceserver.js <- ./server.js

declareToolnode = nodejs

declareToolbash = bash

writeFileshare/node-service-demo/package.json

runToolsyntax-check: node --check server.js

runToolinstall-service: bash -c install-node-service

emitDescriptornode-service-demo-metadata

transformOutputservice-tree ($out)

transformOutputpackage-json (share/node-service-demo/package.json)

materializeDerivationnode-service-demo-artifact (runCommand)

RuntimeOp · 4

declareCapabilitylifecycle

declareProtocolhttp :3000

declareServicenode-service-demo-service (http :3000)

materializeUnitnode-service-demo-service

toolsnode, bash

outputsservice-tree, package-json, share/node-service-demo/package.json

readSource

declareTool

resolveDependency

runTool

writeFile

copyPath

transformOutput

validateValue

emitDescriptor

declareEvidence

materializeDerivation

declareCapability

declareProtocol

declareService

materializeUnit

readSource

declareTool

writeFile

runTool

emitDescriptor

transformOutput

materializeDerivation

declareCapability

declareProtocol

declareService

materializeUnit

readSource

declareTool

writeFile

runTool

writeFile

emitDescriptor

transformOutput

materializeDerivation

declareTool

copyPath

runTool

writeFile

runTool

writeFile

runTool

writeFile

runTool

transformOutput

materializeDerivation

declareEvidence

readSource

declareTool

resolveDependency

runTool

writeFile

emitDescriptor

transformOutput

materializeDerivation

The structure earns its keep across different kinds of builder. A Node service, a C library, and a multi-implementation language project all share the same operation language, while each carries its own domain-specific structure on top. A Node service builder knows about a runtime version, a package manager, npm dependencies, and a health-check protocol; a C codegen builder knows about a compiler, header files, generated sources, and pkg-config resolution; a project builder knows about a language name. Adding that structure without disturbing the underlying interpretation is the job ornaments do.

When you write a NodeServiceBuilder value, you write a typed record with extra fields, and when the interpreters run they see a BuilderSpec, because the ornament's forget map strips the extras automatically. The extras it strips are data fields, not operations. A Node service still declares its capabilities, its protocol, and its service unit, but the smart constructor writes those into the one shared operation list as it builds the value, drawing on the same fixed alphabet every builder uses. The typed service field is the surface you author against; its content already rides into the program as a declareService operation, so dropping the field at forget time costs the fold nothing. You get type-checked Node-service semantics while the trampoline sees a plain builder program, and the path between the two is functorial.

Three properties have to hold at once for the account we are after, and no existing approach brings all three together. They are (a) a typed spec as the central object, (b) several typed interpretations of that same spec that we can tie together by theorem, and (c) a formal relation between heterogeneous specs, the kind that lets us say this IDL builder and that multi-implementation builder are both refinements of one shared base.

The construction we reached for in that gap is ornaments, which come from dependently-typed programming, introduced by McBride ^[7] and developed further by Dagand and McBride ^[8], ^[9]. An ornament is a principled refinement of an inductive datatype, one that adds structure while keeping a forgetful map back to the base. As far as we know no prior build system has put them to this use. The nearest precedent we can find is Dagand, Tabareau, and Tanter ^[10], who relate types across an interoperability boundary using partial type equivalences and partial Galois connections, a setting where ornaments themselves turn out to give rise to Galois connections. That work concerns type interoperability. Ours concerns builder interoperability, a different question that borrows related machinery.

The conjecture

The conjecture we are testing is that ornaments are the right formal account of how heterogeneous build descriptions relate, ahead of modules, record subtyping, or Provider-style edges. We have not proven it. Those three are all well established and run in production at scale, and any of them is the obvious first answer. We did not prototype metaBuilder three more times to rule them out. The decision that set the shape was wanting the whole thing in Nix, evaluated at nix eval time, one description carrying many interpretations. What follows takes each alternative against that goal and says where it comes up short.

A typed builder algebra with an ornament layer on top turns out to give four properties at once, in a single design, sharing one set of interpreters. There is a single base type for a build, BuilderSpec, whose programs are lists of typed operations, and the operation language is fixed once at the base. Sitting on top of that base is a family of refinements (CodeGenBuilder, IdlBuilder, ProjectBuilder, ServiceSpec, SandboxProfile, and so on) that add domain-specific structure while staying typed, composable, and named. Each refinement comes with a forgetful functor back to the base. forget is a generated function, derived by the kernel, so the base operation list survives the projection by construction rather than by anyone remembering to preserve it. And because every refinement forgets down to the base, a small fixed family of interpreters that fold over the base program applies, unchanged, to every refinement. That is what makes the implementation cost additive instead of multiplicative, $N$ builder kinds and $k$ interpreters costing $N + k$ implementation sites rather than $N \times k$. Each interpreter is written once against the base, and each new builder kind supplies only its ornament definition.

metaBuilder is the thing the conjecture predicts, built so that we could test it. It is a typed builder DSL evaluated entirely at nix eval time, on top of the nix-effects experimental description-interpretation kernel.

The obvious alternatives

Each of these is the obvious answer if you have not seen ornaments. The question for us is narrower. Inside an all-in-Nix, eval-time setting, does this account give us one description with many interpretations and a structural relation between heterogeneous kinds. None of them quite does, for reasons that take a little work to lay out in each case.

Take the NixOS module system, which shows up here as drv-parts ^[3], dream2nix v1, and flake-parts. It has typed options, merging, defaults, and a real schema layer over mkDerivation. We use it every day, and the drv-parts team did about the hardest engineering anyone has done on top of mkDerivation. The limit is subtler than a missing feature. A drv-parts module refines a configuration whose output is, in the end, a derivation. The same configuration does not also become some second kind of thing. There is no dry-run interpreter running over the same module values without building anything, and no describe interpreter walking the same structure to produce documentation. The $N + k$ collapse never gets its chance, because $k$ is one here, in effect. A fair objection is that the module system can be introspected; nixos-option, the docs generators, and getSubOptions all read a module. What they read is the option-declaration schema, the types and defaults and descriptions, not a value-level description of the build, which is the thing the metaBuilder interpreters fold over. Write NodeServiceBuilder as a drv-parts module and you can express the runtime version, the package manager, the npm dependency list, the service record. The operation-list-as-data is the part that goes missing, because the module is the derivation recipe and not a description of one, so there is no typed program left for validate, deps, dry-run, plan-view, and describe to fold over. The Thunk-invariant carrier goes with it (the Thunk invariant section below says what that is), since module merging works on attrset values that carry no forget map to strip the Thunks back out.

Consider record subtyping next. A structural-typing discipline, like TypeScript's structural subtyping or ML-flavoured row polymorphism, could in principle hand us NodeServiceBuilder <: BuilderSpec by row extension. The base operations : [BuilderEff] field is there in both, the extra fields widen the type, and that much works. What it does not give us is any guarantee about the supertype projection, so nothing promises that the operation list comes through position by position with no rewrites. Ornaments generate the forget map field by field from a keep/insert algebra, which puts the relationship between NodeServiceBuilder and BuilderSpec in actual code rather than in a typing judgement. The old failure mode of inheritance-based config languages shows up here too. Structural widening lets you override fields the supertype never meant to expose, and the supertype cannot declare which projections it treats as structural. This is roughly the family the Nix ecosystem's own contract work belongs to. NixOS RFC 0189 "Contracts", and Nickel with Organist beside it, attach single-level structural predicates to a service interface and check them. That is useful and real, and it transports no interpreter across the relation and generates no forget map, so it sits next to the ornament account as a complement and not a counterexample.

The third candidate is the Provider model from Bazel and Buck2. It lives in another build ecosystem, which the all-in-Nix goal already rules out as an implementation, though the model is worth weighing on its own terms. Providers are typed records that one rule's outputs hand to the next rule's inputs, the industry's best answer to how a C++ rule and a Rust rule talk about a shared library without either side knowing the other. The type discipline is real, and it sits at the edges between rules. The rules themselves stay untyped in this sense. A Bazel cc_library is a Starlark function that emits a Provider, and there is no first-class object that says what a cc_library is. Ornaments put a type on the rule. A Provider tells you what a rule will output. An ornament also tells you what operations the rule will perform before any of them run, and it answers the same for every rule kind through one bundled set of folds. There is no Provider analog to the Thunk-invariant machinery, because nothing in the Provider model has to survive a deepSeq over a cyclic derivation attrset. A Provider only carries the result of running a rule. The recipe never travels with it.

None of this makes modules, subtyping, or Providers wrong for the problems they were built for. For the one question of how heterogeneous build descriptions formally relate, each comes up short in the way just described, and ornaments are the account we landed on that does not.

Three worked examples

Different builders look the same from the interpreter's side; that is what the shared operation list buys you. The formal treatment that follows names the machinery these examples lean on, so a term like BuilderEff or the agreement theorem is defined just after. Each of the three worked examples we ship comes in two layers, a builder design (the typed vocabulary, smart constructor, and generated program) and a worked artifact (a small real thing built with that vocabulary). A vocabulary with nothing built from it would only show that the algebra typechecks, so the artifact matters as much as the design.

A Node HTTP service. The builder design adds typed domain fields to BuilderSpec, namely runtimeVersion, entrypoint, packageManager, scripts, testCommand, and a service of type ServiceSpec (whose embedded protocol carries the port and the health path). The generated operation list mixes both domains, using BuilderOp for source reads, tool declarations, install and syntax-check runs, and final materialization, and using RuntimeOp for capability declarations (start, stop, health), the HTTP protocol declaration, the service declaration itself, and a materializeUnit that emits a JSON service descriptor as a Nix derivation. The smart constructor emits both domains into the single operations list of BuilderEff values, and the program is built by the canonical fromOrnamentedSpec entry point rather than by hand-concatenating two lists. The worked artifact is a real tiny HTTP server checked into examples/node-service/server.js that serves /health, /version, and /metrics endpoints, runs a smoke check as its test step, and ends up as a Nix derivation alongside a materialized unit descriptor that captures the service contract.

A C library with generated sources. The builder design adds typed domain fields to BuilderSpec, namely compiler (the compiler identity, a string), schema (the source-of-truth schema input, an arbitrary path), configDefines (a #define map for the generated config header), and artifactKind (a tag distinguishing static-library-with-CLI from other shapes). The actual source files, generated headers, output paths, and toolchain references are not ornament fields, since they live in the base BuilderSpec's inputs, tools, outputs, and operations. The operation list stays in the builder domain. Source reads bring in the .c files and the schema, tool declarations name the compiler, the archive tool, and a bash generator shell, and a writeFile lays down the generated config header. From there, runTool steps carry codegen, compile, archive, link, and test execution, transformOutput exposes the library, header, CLI, and pkg-config outputs, emitDescriptor records compiler/ABI metadata, and a final materializeDerivation closes the program. The worked artifact starts from a checked-in schema at examples/c-codegen/messages.def from which the generator emits generated_messages.h and generated_messages.c, alongside a pkg-config-style descriptor and a smoke-test binary that exercises the result. The output is a static library and CLI inside a Nix derivation that contains both the generated files and the compiled artifacts.

An OCI container image. The builder design adds typed image-configuration fields, and the generated program is an 18-operation pipeline over six of the existing BuilderOp constructors with zero interpreter changes. The count is parametric in the number of layers $L$. The program is $16 + 2L$ operations, of which $6 + 2L$ are runTool steps, so this single-layer image compiles to 18 operations with 8 runTool steps. Spec evidence adds one trailing declareEvidence operation per entry, which is what carries this example's single-evidence build to 19 operations over seven constructors. The whole digest chain runs as build-time bash in named runTool steps. The layer tarball is hashed for its diff id, the config references the diff id, the manifest references the config and layer digests, and the index references the manifest. Layers are uncompressed, which keeps the diff id equal to the layer digest. The worked artifact is a busybox shell image assembled without any container tooling at build time. The resulting image runs under podman directly, and reaches docker through skopeo's docker-archive bridge since docker load does not accept a bare OCI layout.

The three builders have nothing structural in common at the domain layer. One produces a long-running service with a typed protocol and capability surface, one a static library and an executable with native dependencies, and one a content-addressed image laid out for a container runtime. Yet all three end up as values of types that are ornaments of BuilderSpec, with the operation list at the centre, so once the spec is written the same interpreters run:

nix

{
  builder;       # the typed ornament definition
  spec;          # the typed domain value (a builder instance)
  program;       # the free-monad term, ready to fold
  validation;    # mb.program.validate.run     program
  deps;          # mb.program.deps.run         program
  dryRun;        # mb.program."dry-run".run    program
  planView;      # mb.program."plan-view".run  program
  docs;          # mb.program.describe.run     program
  materialize;   # mb.program.materialize.run  program
  selfView;      # mb.program.introspect.run   program
}

Every example exposes that attribute set, and there are four of them, node-service, c-codegen, oci-image, and idl. mb.examples.node-service.validation.ok and mb.examples.c-codegen.validation.ok are both true, and mb.examples.node-service.materialize.derivation and mb.examples.c-codegen.materialize.derivation are both real Nix derivations. Both of those examples expose non-empty docs.model and dryRun.steps and a materialize.plan that validates against BuildPlan.T, with the full test suite staying green under nix-unit. The node and C examples additionally expose materializeShell, materializeShellCheck, materializeDockerfile, and planExport, the renderings and the serialization of the same plan through the backends and the plan exporter.

In the produced attribute set, materialize.derivation is a pkgs.runCommand you can hand to nix-build, planView.shell is that same shell rendered as a string and suitable for showing in a tutorial, docs.markdown is documentation, and deps.nodes is the dependency graph. selfView is the composed "one build, many views" rendering produced by introspect, a meta-interpreter that runs the bundled interpreters against the same program and emits a markdown-safe summary covering validation, dependencies, dry-run, plan-view, self-docs, and materialization in one document. The coherence between these views rests on a kernel theorem. mattagfoldk-agrees-with-drytagfoldk-forall-p proves that materialize and dry-run agree on the constructor structure of their step sequences for every program, by induction over the program. Payloads, the shell renderer, and the store sit outside the theorem by design, and an extraction drift gate checks that the proven fold reproduces the live interpreter, with its two halves carrying different coverage. The materialize-step facets cover all four shipped examples (node, C, OCI, and IDL), while the rendered-shell facets cover the two examples that render shells, node and C. The remaining store-level property, that plan-view's shell when executed produces the same outputs as materialize's derivation, is asserted by an automated test rather than a kernel theorem. It builds the materialize derivation, runs the rendered plan-view shell inside its own sandbox carrying the plan's tool closure, and diffs the resulting tree against the derivation outputs byte for byte, so the build fails on any divergence. It is wired on the C example; the node example is out of scope because its launcher embeds /nix/store paths. Appendix A.3 states exactly where the line between proof and test sits.

The formal treatment

Prerequisite contract. To follow the main thread of this section you need a sum type, a fold over a tree, and rough familiarity with the free monad (in our case a list of operations you interpret later, since each operation has arity one). The main body does not assume category theory and presents the machinery operationally. Appendix A states the same content in categorical vocabulary (signature functors, free monads as initial algebras, catamorphisms, sections of forgetful natural transformations, the Dagand–McBride ornament algebra) for readers who want the full mathematical account, and nothing in the main body depends on the appendix.

The operation algebra

The central types are generated through nix-effects' HOAS kernel. BuilderOp is a closed sum of typed operations (currently readSource, resolveDependency, declareTool, runTool, writeFile, copyPath, transformOutput, validateValue, emitDescriptor, materializeDerivation, declareEvidence) where each constructor carries typed fields, and the list of constructors grows only when a deliberate base decision adds one. Alongside it is a second operation domain, RuntimeOp, whose constructors cover capability, protocol, and service declaration together with service materialization. The two fuse into a single closed sum, the fixed total signature that every program ranges over.

$$\mathsf{BuilderEff} \;=\; \mathsf{BuilderOp} \,\oplus\, \mathsf{RuntimeOp}$$

In Nix this is one line.

nix

BuilderEff = H.sum BuilderOp.T RuntimeOp.T

This line is not written by hand. A kind former derives it together with the plan carrier from a single definition of the builder kind, which the plan-state section below describes, so the operation sum and the state it folds into cannot drift apart.

The alphabet is closed in two senses at once. Every program is a list over this one fixed signature, and the signature itself changes only by a deliberate base decision, never by a refinement. BuilderSpec is the typed product that carries that list.

nix

BuilderSpec = H.product "MetaBuilderSpec" [
  (H.field "name"         H.string)
  (H.field "parameters"   (H.listOf ParameterSpec.T))
  (H.field "inputs"       (H.listOf SourceSpec.T))
  (H.field "dependencies" (H.listOf DependencySpec.T))
  (H.field "tools"        (H.listOf ToolSpec.T))
  (H.field "operations"   (H.listOf BuilderEff))   # the program
  (H.field "outputs"      (H.listOf OutputSpec.T))
  (H.field "evidence"     (H.listOf EvidenceSpec.T))
];

The H is the HOAS (higher-order abstract syntax) kernel from nix-effects. Every type here is a first-class Nix value with a derived JSON schema and a descriptor, where BuilderSpec.T is the type, BuilderSpec.schema is the JSON Schema for it, and BuilderSpec.descriptor is the structural projection used for documentation and JSON export.

The operation list is just data. Nothing executes when you write a spec, and that is what makes multiple interpretations possible.

The evidence field is part of that story rather than a side annotation some fold might forget to read. The canonical spec-to-program entry point compiles each evidence entry into a trailing declareEvidence operation, so provenance travels in the program itself. describe renders it as an Evidence section, validate checks it, and the folds that do not care ignore it explicitly. One program, many folds.

Programs and the free monad

A program is a free-monad term over BuilderEff, the signature fixed in the previous section. We get the constructors from nix-effects' description-interpretation kernel, namely K.pure for identity, K.bind for sequencing, and K.send for injecting an operation into the term. Sequencing a spec's operations is a left fold.

nix

sequence = ops:
  builtins.foldl'
    (comp: op: K.bind comp (_: emit op))
    (K.pure tt)
    ops;

The program is a lazy Nix value, so nothing runs until you hand it to the trampoline T.handle together with a dispatch function and an initial state. The trampoline then loops, calling dispatch { op; state } for each operation and threading the new state through; a handful of lines of glue per interpreter is the whole interpretation machinery.

This is the same pattern realized by Swierstra's Data Types à la Carte ^[11] and Carette-Kiselyov-Shan's Finally Tagless ^[12], where you write the program once as data, then write multiple folds. The categorical reading (programs as free-monad terms on a signature functor, interpreters as catamorphisms) is in Appendix A.

The interpreters

The interpreters share a common shape and differ only in the (dispatch, initialState) pair:

Interpreter	Initial state	Result
`validate`	`[]`	`{ ok; diagnostics }`
`deps`	`{ nodes; nodeIds; edges; services }`	dependency graph
`dry-run`	`{ steps }`	step list, no derivations
`materialize`	`emptyPlan` (PlanState)	`{ plan; derivation; value }`
`plan-view`	via materialize	rendered shell + artifact views
`describe`	`emptyModel`	`{ model; markdown }`
`plan-export`	via materialize	plan as plain data, boxed fields dropped

The shell and Dockerfile backends are not rows in this table. They translate the finished plan downstream of materialize, so they never dispatch on operations at all.

Every interpreter terminates when the program is finite and every dispatch is total on the operations that occur. Totality of the dispatches is tested rather than observed. Each interpreter does explicit case analysis per constructor and throws on anything it does not know, the dispatch combinator has no silent fall-through default, and a generated coverage grid asserts for every interpreter-and-constructor pair that the pair is handled or explicitly ignored. Termination we still observe rather than prove.

Ornaments, at two levels

We use ornaments twice, in two places, for two reasons.

HOAS-level ornaments refine BuilderSpec itself. The combinator H.ornament takes a base HOAS type and a field spec (a list of keep and insert markers) and produces a new named HOAS type that preserves the base structure while adding new fields. To take a small example, the NodeServiceBuilder ornament inserts a handful of fields on top of the BuilderSpec fields it keeps, namely a runtime version, an entrypoint, a package manager tag, a scripts map, a test command, and a typed service whose type is ServiceSpec (carrying name, description, package, capabilities, protocols, and config, with port and health path living inside the protocol record). The smart constructor emits this builder kind's runtime operations directly into the single operations list alongside the build-time ones, both being BuilderEff values, so the ornament contributes domain data only and never a second operation list. The full catalogue we ship includes generic codegen, project-style builders with language refinements, vendoring and test-suite refinements, a dependency-shape ornament that classifies how transitive resolution works for a given ecosystem, and an IDL ornament for multi-language schema codegen.

Every ornamented type reuses the base constructor name _con = "MetaBuilderSpec", the base operations field operations : [BuilderEff] is preserved everywhere, and the forgetful functor at this level is "project out the base fields, ignore the inserts," which is the identity on the operation list.

In practice each ornamented value first projects to its BuilderSpec view through forget, and the interpreters only ever see that view. Appendix A states this as the precomposition of each interpreter with a forgetful natural transformation in the Dagand–McBride ^[8] sense.

Four builder kinds share one base. Each ornament adds domain fields and forgets back to BuilderSpec, so the interpreter family lifts to all of them. The idl kind refines code-gen first, so its forget composes through two levels.

Generic-level ornaments refine runtime state, meaning the plan being built up during interpretation. The combinator here is G.ornaments.ornament, and its field specs are richer than the HOAS case, supporting keep, keep with sub-ornament morphism, and insert. The central case is PlanState, which refines BuildPlan:

nix

builderKind = mb.program.former.mkBuilderKind {
  name = "metaBuilder";
  base = BuildPlan;
  builderOps = BuilderOp;
  runtimeOps = RuntimeOp;
  stateSpec = {
    name = "MetaBuilderPlanState";
    constructors.MetaBuilderPlan.fields = [
      { keep = "name"; }
      { keep = "builder"; }
      { keep = "tools"; sub = G.ornaments.lift.list ToolStateForget; }
      { keep = "outputs"; }
      { keep = "pathMap"; }
      { keep = "nativePaths"; sub = G.ornaments.lift.list libRootOrnament; }
      { keep = "steps"; }
      { keep = "declaredCapabilities"; }
      { keep = "declaredProtocols"; }
      { keep = "declaredServices"; sub = G.ornaments.lift.list ServiceStateForget; }
      { keep = "runtimeArtifacts"; sub = G.ornaments.lift.list RuntimeArtifactStateForget; }
      { insert = "declaredTools"; type = H.attrs; }
      { insert = "serviceIndex"; type = H.attrs; }
    ];
  };
};

PlanState = builderKind.State;

The refined type has more fields than the base, with two inserts for O(1) sideband lookup tables, and every derivation-carrying field is kept through a named sub-ornament that boxes the derivation in flight and projects it to a store path on the way out. ToolStateForget forgets a boxed tool package to its dev-output store path, libRootOrnament to a lib-output store path, ServiceStateForget and RuntimeArtifactStateForget to plain store paths. The forget function is generated from the same definition, so finalizePlan = builderKind.forgetWith flattenListFields is the one-line call that produces a BuildPlan validating against BuildPlan.T, and that BuildPlan contains no live derivation values at all. Strings and steps only. The sole place a live derivation enters the picture is toDerivation, where the plan meets pkgs.runCommand.

The wrapper mkBuilderKind is doing more than packaging. The operation coproduct and the state ornament have to agree, and drift between two separate definitions of them is the one carrier bug we have shipped. The former derives both projections from one kind definition, which closes that gap by construction. The node, C, and OCI examples are spec-layer ornaments consuming this single kind definition, not three instances of the former.

The Thunk invariant, and the work ornaments do

PlanState boxes every derivation-carrying field in a Thunk ornament, and that boxing is what keeps a derivation safe to carry inside a plan that downstream consumers will walk.

A Nix derivation is a self-referential attrset. Its output attributes point back at the derivation itself, so d.out resolves to d and d.all contains d, and a structural walk that lacks cycle detection loops or overflows when it reaches one. It is tempting to blame normal-form forcing in general, but probing every evaluator back to Nix 2.3 shows that builtins.deepSeq detects the cycle by object identity and terminates; its only divergence mode is a lazy graph that regenerates a fresh object on each force, which no real plan state does. The walks that do fail are the cycle-blind ones, the ones that keep no seen-set. builtins.toJSON overflows the stack on a cyclic attrset without an outPath in a way tryEval cannot catch, and any user-written recursive walk does the same. Serialization is also exactly the walk metaBuilder ships.

The Thunk ornament wraps each derivation-carrying field in a nullary closure, written $\eta_X(v) = \lambda().\,v$, and a structural walk that treats functions as leaves cannot enter a closure body. The boxed field is therefore inert under exactly those cycle-blind, function-leaf walks, deepSeq and the fuel-walker among them; it is not magically safe under everything, since toJSON would throw uncatchably on the closure itself. The serialization path stays safe by omission rather than by the box being inert. plan-export drops the boxed positions before toJSON ever runs. When interpretation is done the kind's forget map projects each boxed derivation to its store path, so the finished BuildPlan carries no cycles at all.

The consumer that makes the boxing load-bearing is plan-export. It serializes the pre-finalize PlanState, keeping the BuildPlan fields and dropping the thunk-boxed positions, and its output feeds builtins.toJSON. metaBuilder's own trampoline never deep-forces the threaded state, since its per-step force is a scalar pin and its genericClosure worklist keeps long runs stack-safe, so the boxing earns its keep at the serialization boundary rather than inside interpretation. A fuel-bounded walker test pins the necessity. Walking the boxed state completes with fuel to spare, while walking a hand-unboxed twin of the same state exhausts any finite fuel on the derivation self-cycle. Delete the boxing and a shipped feature fails with a red test.

This is the sense in which the ornaments do real work. The forget map is the boundary between the boxed derivations the carrier holds during interpretation and the store paths the reader takes off the finished plan.

The end-to-end pipeline

A short chain of typed stages connects the user-facing spec to the Nix store derivation. Each arrow is a structural map; each intermediate type validates against its declared HOAS type.

The pipeline as a three-layer composition. Each layer holds typed objects related by structural maps (sequence, G.ornaments.forget); each cross-layer arrow names the morphism that bridges levels (T.handle (αₘₐₜ, s₀) runs the materialize interpreter, toDerivation calls pkgs.runCommand). The Nix-store layer (dashed) is the only stage outside the categorical world.

In symbols, the type-level chain is

$$\mathsf{BuilderSpec}\;\xrightarrow{\text{sequence}}\;\mathsf{Free}\,(\mathsf{BuilderOp}\oplus\mathsf{RuntimeOp})\;\xrightarrow{\text{T.handle}}\;\mathsf{PlanState}\;\xrightarrow{\text{forget}}\;\mathsf{BuildPlan}\;\xrightarrow{\text{toDerivation}}\;\mathsf{Derivation}$$

Every one of these boundaries is a structural map, and each of them carries a test.

Trade-offs and limitations

Eval-time only. Everything in metaBuilder runs at nix eval time, and we don't model build-time effects. The things that happen inside pkgs.runCommand are opaque to us. That's deliberate, since pushing more into Nix would mean reimplementing Nix, but it's also a real limit because we can't reason about what a build does once it's running.

Not incremental at the spec level. We delegate incrementality to Nix. In Mokhov's taxonomy ^[1] metaBuilder is not a build engine; it runs inside one, classified along a different axis, the spec-to-interpretations relation instead of the task-to-schedule relation. The two are orthogonal, and a Mokhov-style classification of metaBuilder would say "Nix is the build engine; metaBuilder is a description language compiled by it."

The ornament algebra is restricted. We use ornaments in the forgetful shape from Dagand-McBride ^[8], meaning refinements that add fields and admit a forget map back to the base. The full McBride 2011 ^[7] algebraic ornament machinery, which lets refinements change the index of an inductive family, is present in the kernel but unused by the ornaments we ship. We haven't needed it yet, and we suspect it would be useful for cross-language schema refinement, but that suspicion is a hypothesis rather than a result.

The closed alphabet. The additive cost story assumes new builder kinds keep landing inside the fixed operation alphabet, and the OCI image kind was the first real test of that. It held. The kind landed with zero interpreter edits and zero new operations, using six of the existing constructors. We also found the place where it would not hold. Closure-aware layer grouping needs derivation attributes that toDerivation hardcodes today, and supporting it would mean threading derivationArgs through the plan into toDerivation, an extension of the one backend rather than of the operation alphabet. The alphabet grows only by a deliberate base-level decision, and it has done so once, from ten constructors to eleven, when declareEvidence joined BuilderOp so that spec provenance compiles into the program. Ornaments never extend the alphabet.

Performance. The interpreter scalar-pins the threaded state after each step rather than deep-forcing it, and the carrier accumulates its list fields as cons-cells that flatten once at the end, so a program of $N$ operations stays linear in $N$. We have run it to 40,000 operations under a 4 GB cap with linear wall-clock and no stack overflow, and real specs of a few hundred operations sit far below any cliff.

Nix-substrate assumption. materialize produces a pkgs.runCommand, but the plan it folds out does not assume Nix. Every derivation-valued field is projected to a store-path string at the forget boundary, so BuildPlan is plain data and toDerivation is the single place a live derivation appears. Two non-Nix backends consume that same plan. mb.program.backends.shell emits a standalone bash script, mb.program.backends.dockerfile emits a Dockerfile, and a plan a backend cannot express makes it throw with unsupportedReasons rather than miscompile silently. The emitted c-codegen script reproduced the Nix derivation's outputs byte for byte in a clean directory, and podman built both generated Dockerfiles with the in-image output matching the derivation. One caveat survives. The node launcher step embeds /nix/store paths because the plan content says so, which means the launcher will not exec inside a container even though the image builds. Substrate independence is a tested property of the plan, though a Bazel backend remains unbuilt.

Three things we explicitly don't do. Cross-derivation dependency analysis (the deps interpreter is per-spec, not whole-system); sandbox enforcement beyond compilation to bubblewrap/landlock/seccomp/systemd directives (the runtime is delegated to those tools); and anything resembling a build cache (Nix's store is the cache, and we don't touch it).

Prior art

We surveyed the surrounding literature and a range of industry systems while writing this, reading closely the ones that bear on the design and collecting them in the bibliography at the end.

If you cluster the existing approaches by what each of them is optimizing for, something like five groups fall out.

Graph-and-protocol systems like Bazel and Buck2 lean hard on Providers. A Provider is a typed record that one rule's outputs hand to the next rule's inputs, so a C++ rule and a Rust rule can agree about a shared library without either of them knowing much about the other. There is real type discipline here, and it lives at the edges between rules. It says less about what a builder is on the inside, or about what makes two builder kinds related.

Type-dispatch systems like Pants v2 and Mill build a rule graph out of Python or Scala type signatures. The closest thing the industry has to interpretation selection is Pants' Get(Output, Input), where the engine picks a rule by matching types. That surface has shifted over time, from Get to a call-by-name form in recent versions, while the type-driven dispatch underneath has stayed the same. The spec itself stays implicit, since rules are first-class and a spec only ever gets reconstructed back out of their signatures.

Monadic and theoretical systems sit closest to the à-la-Carte theory. Shake ^[2] lives at the Monad end of the Tasks lattice, which is to say a Shake rule can work out what it depends on while it runs. The theory in that corner is mostly about when a scheduler can be sound and how incremental it can manage to be. It has fairly little to say about whether two heterogeneous descriptions are formally related to each other.

Schema-over-derivation systems in the Nix ecosystem, drv-parts ^[3], dream2nix v1, flake-parts and the like, put the NixOS module system to work on derivation definitions. In spirit that is the closest predecessor to what we wanted, a typed schema laid over mkDerivation. It is not quite right to say such a module produces a single output, since flake-parts emits packages, checks, devShells, and apps from one perSystem module. The outputs are still all derivations, produced once, with no second interpretation that runs over the same description, and no account of how two of these schemas might relate.

Pragmatic multi-derivation systems like crane ^[4] and Dune arrive at "one spec, multiple derivations" empirically. Crane's buildDepsOnly and buildPackage split is one description run through two interpretations to share artifact caches, a working instance of the pattern that is never named as one.

IDL orchestrators are the closest industry analog to what we want, and the cluster keeps growing. Buf ^[5], Smithy ^[6], and more recently Microsoft's TypeSpec ^[21] each take a single schema, a .proto, a .smithy model, a TypeSpec definition, and drive validation, linting, breaking-change detection, and codegen for several languages off it. That is one spec read through many typed interpretations. The formal story is the thin part. Smithy has a typed model-to-model transform layer, and Buf's breaking-change detector is a structural comparison across file, package, and wire levels, not a loose heuristic. What runs on convention is the spec-to-codegen fan-out, TypeSpec's imperative $onEmit emitters among them, where nothing ties the interpretations together type-theoretically.

Build systems theory. Mokhov, Mitchell, and Peyton Jones ^[1] is the canonical classification, pairing schedulers with rebuilders and lifting Tasks into Applicative or Monad. Erdweg, Lichter, and Weiel ^[13] pushed the dynamic-dependency story into the OOPSLA literature with Pluto, and Dolstra ^[14] is the original Nix model. The nearest build-and-workflow neighbour is the funflow and kernmantle line from Tweag, set out by Parès, Bernardy, and Eisenberg ^[20], where workflow tasks are typed GADTs and an arrow is interpreted across several phases, one of them an execution-free static analysis. That covers legs (a) and (b), more or less. The missing piece is a generated forget between heterogeneous task kinds; kernmantle composes effects by summing strands and by inclusion, with nothing in the role of leg (c). None of these is aimed at the spec-to-interpretation axis we care about.

Ornaments and datatype-generic programming. McBride ^[7] introduced ornaments, Dagand and McBride ^[8], ^[9] developed the function-transport theory, Ko and Gibbons ^[15] worked through the practice, and Williams and Rémy ^[16] brought ornaments into ML. The closest existing application is Dagand, Tabareau, and Tanter ^[10], who relate types across an interoperability boundary via partial type equivalences and Galois connections, a setting where ornaments themselves give rise to Galois connections. That is the precedent we build on.

DSL interpretation. Swierstra ^[11] for sum-of-effects free monads, Carette, Kiselyov, and Shan ^[12] for finally-tagless, and Plotkin and Pretnar ^[17] for algebraic effects and handlers. metaBuilder's program model is the algebraic-effects-and-handlers pattern realized in pure Nix evaluation. Taha and Sheard ^[18] supply the multi-stage view, in which the eval-time and store-realization boundary is a stage boundary.

Adjacent formal traditions. A few neighbouring traditions press on the same nerve without landing in the same place. Model-Driven Engineering, with EMF and Ecore metamodels and QVT-R transformations, has typed models, many model-to-model transformations, and a notion of metamodel refinement, so on paper it touches all three legs. The transformations are kept honest by convention and testing instead of a theorem, and the refinement is not a generated Dagand–McBride forget map, which leaves it a near-miss. The lenses and bidirectional-transformations literature, going back to Foster, Greenwald, Moore, Pierce, and Schmitt ^[22], is the established account of relating two descriptions under round-trip laws. A lens relates two peer artifacts; it does not project a refinement down to a shared base, and it brings no central typed spec with a bundled family of interpreters, so it speaks to leg (c) alone and not to the conjunction. CUE is the sharpest of these. Its unification is the greatest lower bound, the meet, in a subsumption lattice over values and schemas, and that is a formal relation between descriptions. The relation is single-level, all inside one lattice, and it carries no interpreter along with it. The ornament forget is the cross-level counterpart that lifts the whole interpreter family by precomposition, which is the move CUE's meet does not make.

As far as we know, no prior build system brings together (a) one typed spec, (b) several typed interpretations of that spec, and (c) a formal forget map between heterogeneous spec refinements. That claim is narrower than it sounds. Each leg, and even the abstract (a)+(b)+(c) shape, turns up elsewhere; interaction trees ^[23] and layered monadic interpreters give (a)+(b) with an agreement theorem, and the ornament algebra itself gives (a)+(c) with many folds. What looks new is the application of the whole shape to heterogeneous builds. And the (c) leg here is checked rather than hoped for. The kernel verifies, for each shipped builder kind, that forget computes the base builder spec and that program compilation factors through forget.

Across the systems above, the three legs land like this.

System	(a) central typed spec	(b) many typed interpretations of one spec	(c) generated forget between heterogeneous kinds
funflow / kernmantle ^[20]	yes	yes	no
Buf / Smithy / TypeSpec	yes	partial	no
drv-parts / dream2nix ^[3]	yes	no	no
CUE	yes	no	partial
MDE / QVT-R	yes	yes	partial
lenses ^[22]	no	no	partial
metaBuilder	yes	yes	yes

Here "partial" marks a leg present only by convention, only at a single level, or without a generated forget map.

What's launched

metaBuilder is available now at github.com/kleisli-io/metaBuilder. What you get:

The typed descriptions (BuilderOp, BuilderSpec, RuntimeOp, plus the runtime types for capabilities, protocols, services, sandboxes).
The bundled interpreters with the trampoline already wired, plus introspect as the composed selfView over them, plan-export for serializing plans, and the shell and Dockerfile backends that translate a finished plan off the Nix substrate.
The ornament catalogue, with code-gen, project-builder, implementations, vendoring, testing, dependencies, idl, and oci-image, plus the typed vocabularies for capabilities, protocol, service, sandbox, toolEnv, transform, and replServer.
End-to-end examples under examples/. Three of them ship a builder and a worked artifact. node-service ships an HTTP server with /health, /version, /metrics endpoints alongside the builder that produces it; c-codegen ships a real native library and CLI generated from a checked-in messages.def schema alongside the builder that drives the generation, compilation, and link steps; oci-image ships a runnable busybox shell image assembled digest by digest at build time alongside the builder that lays it out. The fourth, idl, demonstrates multi-language protobuf codegen through the idl ornament without a runtime artifact, since the artifact in that case is the generated code itself.
A doc generator (mb.mkDocsContent) that produces typed API documentation from the same descriptors, in which the examples appear as first-class pages with domain spec, operation summary, dry-run steps, plan-view shell, self-documentation, and materialization result.

Some parts are still rough. The describe interpreter is new, and we expect its output format to change. Sandbox backend coverage is uneven, with bwrap and systemd well-exercised but landlock and seccomp much less so. The ornament catalogue is still growing, the OCI-image ornament is recent, and a build-from-tarball ornament isn't written yet.

If you want to try the conjecture without leaving your editor, clone the repo, open examples/node-service/builder.nix, change the port or the health path, and run any of the interpreters. nix-build the materialized derivation and you have a real HTTP server you can hit on the new port. The C codegen example is the same loop in another domain. Edit messages.def, run the interpreters, build, and you get a static library carrying generated code from your edited schema. The OCI example closes the loop in a third. Build the image, run it under podman or bridge it to docker with skopeo, and the digests you see are the ones the plan computed.

Appendix A. The categorical reading

This appendix re-states the metaBuilder pipeline in category-theoretic vocabulary. It assumes familiarity with polynomial endofunctors on $\mathsf{Set}$, free monads as initial algebras, and the basic ornament algebra of Dagand and McBride ^[8], ^[9].

A.1 Signature, programs, interpreters

Let $E$ denote the closed coproduct $\mathsf{BuilderOp} \oplus \mathsf{RuntimeOp}$, viewed as a finite sum-of-products datatype. The kernel response functor $R : E \to \mathcal{U}$ assigns the result type of each operation, where $\mathcal{U}$ is the kernel's universe of response types; inspection of the kernel signature shows $R(e) = \mathbf{1}$ for every $e \in E$, and since that terminal response is shared by $\mathcal{U}$ and $\mathsf{Set}$ we read $\mathcal{U}$ as $\mathsf{Set}$ throughout this appendix. The signature functor on $\mathsf{Set}$ is therefore

$$\Sigma(X) \;=\; \coprod_{e \in E} X^{R(e)} \;=\; E \times X,$$

the polynomial functor whose container is the constant family $E$ with arity one at every position. The trivial response is a metaBuilder-specific simplification of the underlying nix-effects kernel, which supports arbitrary $R$; for the present design we exploit it freely.

The free monad on $\Sigma$ is given pointwise, with $\mathsf{Free}(\Sigma)(A)$ the initial algebra of $Y \mapsto A + \Sigma(Y)$ in $\mathsf{Set}$ for each object $A$. For the trivial-response signature this collapses to

$$\mathsf{Free}(\Sigma)(A) \;=\; \mu Y.\, A + E \times Y \;\cong\; E^{\ast} \times A,$$

where $E^{\ast}$ is the free monoid on $E$. Under this isomorphism, the kernel constructors are:

$K.\mathsf{pure}_A : A \to \mathsf{Free}(\Sigma)(A)$, the monad unit, sending $a \mapsto (\varepsilon, a)$ with $\varepsilon$ the empty word.
$K.\mathsf{bind}_{A,B} : \mathsf{Free}(\Sigma)(A) \to (A \to \mathsf{Free}(\Sigma)(B)) \to \mathsf{Free}(\Sigma)(B)$, the Kleisli extension $({>}{>}{=})$, which acts on the monoid factor by concatenation.
$K.\mathsf{send}_E : E \to \mathsf{Free}(\Sigma)(\mathbf{1})$, sending $e \mapsto ([e], \mathrm{tt})$.

The auxiliary sequence $: E^{\ast} \to \mathsf{Free}(\Sigma)(\mathbf{1})$ is the canonical Kleisli extension of $K.\mathsf{send}_E$ along list concatenation; under the isomorphism with $E^{\ast}$ it is the identity. Equivalently, sequence is the unique monoid homomorphism $E^{\ast} \to \mathsf{Free}(\Sigma)(\mathbf{1})$ extending the singleton-injection, which is the universal property of the free monoid applied at the level of programs.

A $\Sigma$-algebra on a carrier $C$ is a map $\alpha : E \times C \to C$. Currying gives $\alpha^{\sharp} : E \to \mathrm{End}(C)$. The universal property of $E^{\ast}$ extends this uniquely to a monoid homomorphism into the opposite endomorphism monoid

$$\widetilde{\alpha^{\sharp}} : E^{\ast} \to \mathrm{End}(C)^{\mathrm{op}}, \quad [e_1, \ldots, e_n] \;\mapsto\; \alpha^{\sharp}(e_n) \circ \cdots \circ \alpha^{\sharp}(e_1),$$

so that earlier operations are applied to state first. The trampoline $T.\mathsf{handle}$ realizes this homomorphism by left-folding $\alpha$ over the operation list. For a program $p$ corresponding to $[e_1, \ldots, e_n]$ with initial state $s_0$, the trampoline computes

$$s_n \;=\; \alpha(e_n,\, \alpha(e_{n-1},\, \ldots\, \alpha(e_1, s_0))) \;=\; (\widetilde{\alpha^{\sharp}}(p))(s_0).$$

Equivalently, $T.\mathsf{handle}$ is the value at $s_0$ of the unique monad map $\mathsf{Free}(\Sigma) \to S_C$ into the state monad $S_C(A) = C \to C \times A$, induced by treating $\alpha$ as a state-transformer at each operation. The two presentations agree by the universal property.

The bundled interpreters supply $\Sigma$-algebras on distinct carriers, namely a diagnostic list (validate), a dependency graph (deps), a step list (dry-run), a PlanState (materialize), and a documentation model (describe). The plan-view result is not a fold in its own right but the post-composition $\mathrm{render} \circ \mathrm{materialize}$, rendering the materialized plan to a shell string. The plan-export result is likewise a projection of the materialize fold's final state rather than a fold of its own, taken before the forget map so the boxed positions are still visible to drop. The composed meta-interpreter introspect is likewise not a $\Sigma$-algebra in its own right but the product-tupling of these post-composed with a markdown renderer; for the bundled family it factors through the diagonal $\mathsf{Free}(\Sigma)(\mathbf{1}) \to \mathsf{Free}(\Sigma)(\mathbf{1})^k$ with $k = 5$, the genuine folds being validate, deps, dry-run, materialize, and describe. plan-view adds no sixth leg; it rides on the materialize fold's result through $\mathrm{render} \circ \mathrm{materialize}$ rather than dispatching the program again.

A.2 Ornaments and the forget layer

BuilderSpec is a single-constructor record $B = \prod_{i \in I} B_i$ with finitely many typed fields. For single-constructor products, the Dagand–McBride ^[8], ^[9] ornament algebra reduces to a clean special case. An H-level ornament $\Omega$ over $B$ is given by a finite set $J$ disjoint from $I$ together with an insert-type family $(T_j)_{j \in J}$; it produces the refined product

$$\widetilde{B} \;=\; B \times \prod_{j \in J} T_j$$

and the forget map

$$\mathrm{forget}_{\Omega} \;=\; \pi_B : \widetilde{B} \to B,$$

the structural projection onto the base factor. The product is written with the inserts gathered after the base factor for readability, but the field spec interleaves $\mathsf{keep}$ and $\mathsf{insert}$ markers in any order, so $\widetilde{B}$ is determined only up to reordering of factors, and the forget map is the projection regardless of that order. For any $\Sigma$-algebra $\alpha$ on carrier $C$, write $\widehat{\alpha} : \mathsf{Spec} \to C$ for the induced spec interpreter $\widehat{\alpha}(\sigma) = \big(\widetilde{\alpha^{\sharp}}(\mathsf{sequence}\,(\mathsf{compile}\,\sigma))\big)(s_0)$, where $\mathsf{compile}\,\sigma$ is $\sigma.\mathsf{operations}$ extended with one trailing $\mathsf{declareEvidence}$ operation per entry of $\sigma.\mathsf{evidence}$, so the program is the compiled spec rather than the raw operation list, and the A.1 fold runs from the initial state $s_0$. The precomposition $\widehat{\alpha} \circ \mathrm{forget}_{\Omega} : \widetilde{B} \to C$ then uniformly extends $\widehat{\alpha}$ to every ornament, and the kernel checks for each shipped kind both that $\mathrm{forget}$ computes the base builder spec and that program compilation factors through $\mathrm{forget}$. The bundled interpreters lift to every ornamented builder kind without modification, which is the categorical source of the additive $N + k$ implementation-site count, where $k$ is the size of the interpreter family.

For ornaments of multi-constructor or recursive datatypes the general Dagand–McBride construction applies. An ornament is a code $u : \mathsf{Desc}(I) \to \mathsf{Desc}(J)$ with reindexing $r : J \to I$; the refined type is $\mu(u(\Phi))$ for the base description $\Phi : \mathsf{Desc}(I)$; and the forget map is the algebra morphism induced by the natural transformation $u(\Phi) \Rightarrow \Phi \circ r^{\ast}$. The HOAS kernel implements this construction in full generality; the metaBuilder shipping ornaments exercise only the single-constructor product case above.

Define the endofunctor $\mathsf{Thunk}(X) = \mathbf{1} \to X$ on $\mathsf{Set}$, with unit

$$\eta_X : X \to \mathsf{Thunk}(X), \quad \eta_X(x) = \lambda().\, x,$$

and counit

$$\epsilon_X : \mathsf{Thunk}(X) \to X, \quad \epsilon_X(t) = t().$$

These satisfy $\epsilon_X \circ \eta_X = \mathrm{id}_X$, so $(\eta_X, \epsilon_X)$ form a retraction with $\eta_X$ a section of $\epsilon_X$. (Indeed $\mathsf{Thunk}(-) \cong (-)^{\mathbf{1}}$ is the right adjoint of $- \times \mathbf{1}$ and hence preserves limits.) The combinator $H.\mathsf{thunkOrnament}\, X$ packages $(\eta_X, \epsilon_X)$ as a sub-ornament morphism that the G-level forget algebra applies pointwise at marked field positions.

The operational role of $\mathsf{Thunk}$ in PlanState is as a barrier against cycle-blind structural walks. The semantic fact in play is that a structural traversal does not enter function bodies, so a walk leaves $\eta_X(v)$ untouched regardless of whether $v$ contains cycles, including the self-referential Nix derivation $d$ whose output attributes point back at it, $d.\mathsf{out} = d$ and $d.\mathsf{all} \ni d$. We note for precision that $\mathsf{deepSeq}$ itself detects cycles on every evaluator we probed back to Nix 2.3 and is therefore not the operator the barrier defends against. The operators that fail are $\mathsf{toJSON}$, which overflows the stack on a cyclic attrset without an $\mathsf{outPath}$ in a way $\mathsf{tryEval}$ cannot catch, and any user-written recursive walk. A walk that treats a function as a leaf, such as $\mathsf{deepSeq}$ or the fuel-bounded walker, leaves $\eta_X(v)$ inert. $\mathsf{toJSON}$ is the exception. It throws uncatchably on the boxing closure itself, so a boxed field is not inert under $\mathsf{toJSON}$, and the safety at the serialization boundary comes instead from dropping the boxed positions before $\mathsf{toJSON}$ ever runs. The serializing consumer is plan-export, which walks the pre-finalize state with the boxed positions dropped, and a fuel-bounded walker test witnesses necessity, since the boxed state completes within finite fuel while a hand-unboxed twin exhausts any fuel on the derivation self-cycle. After interpretation $\mathrm{forget}_{\mathrm{Plan}}$ applies the marked sub-ornaments' forget maps pointwise, each of which composes $\epsilon_X$ with a store-path projection, so the forgotten $\mathsf{BuildPlan}$ carries store-path strings rather than derivation values.

The G-level kernel handles ornaments of generic datatype descriptions in $G.\mathsf{Desc}$. For $\mathsf{BuildPlan}$ (a single-constructor product of finitely many typed fields), a G-ornament is a family of markers

$$\{\mathsf{keep}(i)\}_{i \in I_0} \;\cup\; \{\mathsf{keep\text{-}sub}(i, \Omega'_i)\}_{i \in I_1} \;\cup\; \{\mathsf{insert}(j, T_j)\}_{j \in J},$$

where $I_0, I_1 \subseteq I$ are disjoint with $I_0 \sqcup I_1 = I$ (every base field is kept, possibly through a sub-ornament), $J$ is disjoint from $I$, and each $\Omega'_i$ is a sub-ornament acting on field $i$'s type. The refined description has fields $\{B_i\}_{i \in I_0} \cup \{\widetilde{B_i}\}_{i \in I_1} \cup \{T_j\}_{j \in J}$, and the forget natural transformation $\mathrm{forget}_{\Omega}$ acts as identity on $I_0$, as $\mathrm{forget}_{\Omega'_i}$ on $I_1$, and drops every $j \in J$.

For $\mathsf{PlanState}$ the markers are:

$\mathsf{keep}$ on $\{\mathsf{name},\, \mathsf{builder},\, \mathsf{outputs},\, \mathsf{pathMap},\, \mathsf{steps},\, \mathsf{declaredCapabilities},\, \mathsf{declaredProtocols}\}$.
$\mathsf{keep\text{-}sub}$ with $\mathsf{lift}_{[-]}(\Omega')$ on $\{\mathsf{tools},\, \mathsf{nativePaths},\, \mathsf{declaredServices},\, \mathsf{runtimeArtifacts}\}$, where $\mathsf{lift}_{[-]}$ is the list endofunctor and the respective $\Omega'$ are the named sub-ornaments $\mathsf{ToolStateForget}$, $\mathsf{libRootOrnament}$, $\mathsf{ServiceStateForget}$, and $\mathsf{RuntimeArtifactStateForget}$, each of whose forget maps composes the counit $\epsilon_X$ with a store-path projection.
$\mathsf{insert}$ on $\{\mathsf{declaredTools},\, \mathsf{serviceIndex}\}$ with type $\mathsf{Attrs}$, used as $O(1)$ sideband indices during interpretation.

Every sub-ornament here is named. The record-shaped ones are ornaments of named product types, each forgetting one boxed field through a leaf ornament. $\mathsf{ToolStateForget}$ and $\mathsf{ServiceStateForget}$ box a $\mathsf{package}$ field and forget it to a dev-output and a plain store path respectively, while $\mathsf{RuntimeArtifactStateForget}$ boxes a $\mathsf{storePath}$ field and forgets that. $\mathsf{libRootOrnament}$ is itself a leaf ornament on a single boxed-derivation position, projecting the lib-output root. Their forget maps land in store-path strings rather than derivations, which is what makes the forgotten $\mathsf{BuildPlan}$ substrate-neutral, and their sections throw by design, since a store path does not determine a derivation.

A.3 The pipeline and its position

The metaBuilder pipeline factors as

$$\widetilde{\mathsf{Spec}} \xrightarrow{\mathrm{forget}_{\Omega}} \mathsf{Spec} \xrightarrow{\mathrm{sequence}\, \circ\, \mathrm{compile}} \mathsf{Free}(\Sigma)(\mathbf{1}) \xrightarrow{T.\mathsf{handle}(\alpha_{\mathrm{mat}},\, s_0)} \mathsf{PlanState} \xrightarrow{\mathrm{forget}_{\mathrm{Plan}}} \mathsf{BuildPlan} \xrightarrow{\mathrm{toDerivation}} \mathsf{Derivation}$$

in which:

$\mathrm{forget}_{\Omega}$ is the H-level structural projection of A.2, which preserves the entire operations list because that list is the unified $[\mathsf{BuilderEff}]$ carrier and ornaments add only data fields.
$\mathrm{sequence} \circ \mathrm{compile}$ first lowers the spec to its operation list ($\sigma.\mathsf{operations}$ extended with one trailing $\mathsf{declareEvidence}$ per evidence entry, the same $\mathsf{compile}$ A.2 defines on its spec interpreter $\widehat{\alpha}$), and then applies the monoid homomorphism $\mathrm{sequence}$ of A.1.
$T.\mathsf{handle}(\alpha_{\mathrm{mat}}, s_0)$ is the catamorphism of A.1 into the $\Sigma$-algebra on $\mathsf{PlanState}$ supplied by the materialize interpreter.
$\mathrm{forget}_{\mathrm{Plan}}$ is the G-level forget natural transformation of A.2.
$\mathrm{toDerivation}$ is the injection into the Nix store layer.

Arrows 1–4 are typed structural maps between objects of $\mathsf{Set}$, and the composition $\mathrm{forget}_{\mathrm{Plan}} \circ T.\mathsf{handle}(\alpha_{\mathrm{mat}}, s_0) \circ \mathrm{sequence} \circ \mathrm{compile} \circ \mathrm{forget}_{\Omega}$ is a well-defined map of sets. Only arrow 5 touches pkgs.runCommand, and is therefore the Mokhov-à-la-Carte ^[1] task boundary into the Nix scheduler.

The coherence story splits in two at exactly the store boundary. At the plan level the kernel proves a theorem. The materialize step fold and the dry-run tag fold are internalized as kernel catamorphisms, and mattagfoldk-agrees-with-drytagfoldk-forall-p establishes by induction over the program that the two folds produce the same constructor-tag sequence for every program $p$, with the step payloads, the shell renderer $\rho$ (axiomatised at $\mathsf{String}$), and the store external to the statement. An extraction drift gate ties the proven folds to the running interpreter, checking that the extracted fold reproduces the live materialize steps on all four shipped examples (node, C, OCI, and IDL) and the live rendered shells on node and C (the two kinds whose builders run a shell), and that $\rho$ is the live renderStep itself. We chose not to run the kernel fold on the live path, since measurement put the cost at roughly an order of magnitude of evaluation overhead, so the drift gate is the designed trust boundary between proof and production. Below the plan level the old caveat stands. That $\mathrm{plan\text{-}view}$'s rendered shell (itself the rendering of A.1's materialize catamorphism) produces the same outputs as $\mathrm{materialize}$'s derivation when executed is asserted by test rather than proved, a proof would require a denotational semantics for the Nix store layer that the kernel does not supply, and we view that as the principal piece of formal work outstanding.

Dagand and McBride ^[8] and Williams and Rémy ^[16] use ornaments in the forward direction. Given $\Omega : D \rightrightarrows D'$, they construct liftings of programs at $\mu D$ to programs at $\mu D'$, with the forget map as the coherence witness. Dagand, Tabareau, and Tanter ^[10] sit on the type-interoperability side of this lineage, relating types across an interoperability boundary via partial type equivalences and partial Galois connections rather than ornaments. metaBuilder uses only the backward direction, in which ornamented builder kinds project to the base $\mathsf{BuilderSpec}$ via $\mathrm{forget}_{\Omega}$ and base interpreters lift by precomposition. The bidirectional case in which a base operation's behavior must be extended at the refined type (which is where most of the technical depth of ^[8], ^[9] lives) does not arise here, because metaBuilder's refinements add ornament fields without altering the operation algebra. The runtime operations are part of that fixed base algebra, living in the signature $\mathsf{BuilderEff}$ rather than in any ornament, so refinement never extends the operation alphabet and the backward-direction-only account stays honest. The base alphabet itself may grow by a base-level decision, as it did when $\mathsf{declareEvidence}$ joined $\mathsf{BuilderOp}$, and that is a change to $E$ rather than a refinement, so the invariant is untouched.

This backward-only discipline is a strictly weaker use of the ornament algebra than the full McBride 2011 ^[7] machinery. We view that machinery as still available if cross-language schema refinement or behavior-altering refinements ever demand it; the question is open, and constitutes one of the natural extension axes for this design.

References

A. Mokhov, N. Mitchell, S. Peyton Jones. Build Systems à la Carte. ICFP 2018. https://www.microsoft.com/en-us/research/publication/build-systems-la-carte/ ↩
N. Mitchell. Shake Before Building: Replacing Make with Haskell. ICFP 2012. ↩
DavHau. drv-parts (now part of dream2nix). https://github.com/DavHau/drv-parts ↩
I. Petkov. Introducing Crane: Composable and Cacheable Builds with Cargo and Nix, 2022. https://ipetkov.dev/blog/introducing-crane/ ↩
Buf Technologies. Buf CLI. https://buf.build/docs/cli/ ↩
AWS. Smithy 2.0. https://smithy.io/2.0/ ↩
C. McBride. Ornamental Algebras, Algebraic Ornaments. 2011. ↩
P.-É. Dagand, C. McBride. Transporting Functions across Ornaments. ICFP 2012 / JFP 2014. arXiv:1201.4801 ↩
P.-É. Dagand, C. McBride. A Categorical Treatment of Ornaments. LICS 2013. arXiv:1212.3806 ↩
P.-É. Dagand, N. Tabareau, É. Tanter. Foundations of Dependent Interoperability. JFP 2018. ↩
W. Swierstra. Data Types à la Carte. JFP 18(4), 2008. ↩
J. Carette, O. Kiselyov, C.-C. Shan. Finally Tagless, Partially Evaluated: Tagless Staged Interpreters for Simpler Typed Languages. JFP 19(5), 2009. ↩
S. Erdweg, M. Lichter, M. Weiel. A Sound and Optimal Incremental Build System with Dynamic Dependencies. OOPSLA 2015. ↩
E. Dolstra. The Purely Functional Software Deployment Model. PhD thesis, Utrecht, 2006. ↩
H.-S. Ko, J. Gibbons. Programming with Ornaments. JFP 27:e2, 2017. ↩
T. Williams, D. Rémy. A Principled Approach to Ornamentation in ML. POPL 2018. ↩
G. D. Plotkin, M. Pretnar. Handling Algebraic Effects. LMCS 9(4:23), 2013. arXiv:1312.1399 ↩
W. Taha, T. Sheard. MetaML and Multi-Stage Programming with Explicit Annotations. TCS 248, 2000. ↩
R. P. Pieters, E. Rivas, T. Schrijvers. Generalized Monoidal Effects and Handlers. JFP 30, e23, 2020. https://doi.org/10.1017/S0956796820000106 ↩
Y. Parès, J.-P. Bernardy, R. A. Eisenberg. Composing Effects into Tasks and Workflows. Haskell Symposium 2020. https://doi.org/10.1145/3406088.3409023 ↩
Microsoft. TypeSpec. https://typespec.io ↩
J. N. Foster, M. B. Greenwald, J. T. Moore, B. C. Pierce, A. Schmitt. Combinators for Bidirectional Tree Transformations: A Linguistic Approach to the View-Update Problem. ACM TOPLAS 29(3), 2007. https://doi.org/10.1145/1232420.1232424 ↩
L. Xia, Y. Zakowski, P. He, C.-K. Hur, G. Malecha, B. C. Pierce, S. Zdancewic. Interaction Trees: Representing Recursive and Impure Programs in Coq. Proc. ACM Program. Lang. 4, POPL, Article 51, 2020. https://doi.org/10.1145/3371119 ↩