John Alexander Mobley & Claude MobCorp / MASCOM — February 2026
The dominant paradigm in prompt engineering is runtime interpretation: when a task arrives, a prompt is assembled on-the-fly from templates, context windows, and heuristic rules. This approach has two fundamental limitations: (1) latency — each compilation requires traversal of the fragment library and scoring, and (2) cold-start — novel situations receive no benefit from prior compilations in similar situations.
We observe that the space of situations a system encounters is not infinite. For a given domain with C categories, E error types, and K keyword clusters, the meaningful situation space is bounded by O(C × E × K) — and in practice is much smaller, since most cross-products are sparse (no fragments exist for many combinations). This observation motivates a shift from interpretation to ahead-of-time compilation: enumerate the situation space, pre-generate optimal prompts for every populated region, and at runtime simply retrieve the nearest pre-compiled result.
The analogy to software compilation is precise:
| Stage | Software | Prompt Engineering |
|---|---|---|
| Source | Code files | Fragment library (1,167 fragments) |
| Compilation | gcc/clang → object files | PreCompiler → pre-compiled prompts (322) |
| Linking | Object files → binary | Fragment assembly → final prompt |
| Execution | Binary runs directly | Router retrieves pre-compiled prompt (~5ms) |
| Fallback | Interpreter (Python) | Live compilation (~50ms) |
This paper introduces Anticipatory Prompt Compilation (APC), implements it in the Lacuna Engine, and identifies a novel mechanism — constructive fragment interference — that produces prompts no single pre-compilation contains.
APC is most directly descended from Case-Based Reasoning (Aamodt & Plaza, 1994; Kolodner, 1993). In CBR, a system maintains a case base of previously solved problems. When a new problem arrives, the system retrieves the most similar case, adapts its solution, and optionally retains the new solution for future use. The CBR cycle is: Retrieve → Reuse → Revise → Retain.
APC maps cleanly onto CBR: - Case base = precompiled_prompts table - Retrieve = FuzzyRouter.route() with multi-signal scoring - Reuse = direct prompt return (strong collapse) or fragment blending (interference) - Revise = Bayesian effectiveness updates from outcome learning - Retain = new fragments discovered by OutcomeLearner.sweep_recent_sessions()
The key departure from classical CBR is granularity: CBR retrieves and adapts whole cases. APC decomposes cases into fragments and can recombine fragments across multiple cases. This is closer to compositional case-based reasoning (Plaza & McGinty, 2005), but applied to prompt engineering rather than design or planning.
RAG systems (Lewis et al., 2020) retrieve documents from an external store and inject them into the language model’s context. APC inverts this: instead of retrieving input context for the model, APC retrieves the entire compiled prompt — the instructions themselves, not the data.
This distinction matters. In RAG, the quality of generation depends on the model’s ability to synthesize retrieved documents. In APC, the quality depends on the pre-compilation — the prompt was already optimized offline, and retrieval simply selects among optimized options.
DSPy (Khattab et al., 2023) introduced the “prompt compilation” metaphor explicitly: developers write declarative programs specifying what the LLM should do, and DSPy compiles these into optimized prompt chains through bootstrapping and search. APC differs in two ways: (1) DSPy compiles programs into prompts; APC compiles situations into prompts, and (2) DSPy’s compilation is demand-driven (compile when needed); APC’s compilation is anticipatory (compile everything ahead of time).
The FuzzyRouter’s multi-signal scoring function is structurally isomorphic to a sparse MoE gating function (Shazeer et al., 2017). Each pre-compiled prompt is an “expert,” and the router is the “gate” that selects which expert(s) to activate. The “constructive interference” blending mechanism is analogous to soft MoE routing where multiple experts contribute to the output — but applied at the fragment level rather than the hidden-state level.
To our knowledge, no prior work combines all of: 1. Ahead-of-time enumeration of the situation space for prompt pre-compilation 2. Multi-signal fuzzy routing (category × token similarity × error affinity) 3. Fragment-level constructive interference — blending sub-prompt components from multiple pre-compiled solutions when the routing signal is ambiguous
The fragment interference mechanism is the primary novel contribution.
The foundation is a library of 1,167 prompt fragments extracted from 11 sources across the MASCOM system. Each fragment has:
(text, category, fragment_type, priority, effectiveness, trigger_conditions)
Fragment types follow a primary assembly ordering:
setup → constraint → domain → recovery → meta, with 12
additional types assembled after the primary set: insight,
discovery, decision, milestone,
knowledge, pattern, signal,
specification, context, taxonomy,
concept, and design — 17 types total.
Categories span 42 domains from alignment to
vision. Effectiveness is tracked via Bayesian updating:
eff = (successes + 1) / (successes + failures + 2).
The 11 sources include: institutional memory (context.db key_facts), behavioral patterns (refractive_will.db), quality gaps (ouroboros_results.json), system constraints (CLAUDE.md NEVER rules), error recovery patterns (hardcoded from operational experience), and three tiers of legacy knowledge mining (lacuna_mined, lacuna_legacy, lacuna_legacy_broad).
The PreCompiler enumerates the situation space by:
mps_stall → training, import_error →
debuggingThis produces ~360 candidate situations, of which 322 yield prompts with sufficient signal (≥3 matching fragments and ≥50 characters).
For each enumerated situation, the PreCompiler:
Token vectors are initially term-frequency normalized, then updated with IDF weights after all compilations complete:
w(t, d) = tf(t, d) × log(N / (1 + df(t)))
Where N is total pre-compiled prompts and df(t) is the number of prompts containing term t.
The FuzzyRouter receives an incoming task and routes it to the best pre-compiled prompt via three orthogonal signals:
Signal 1: Category Overlap (weight 0.35)
score = 0.7 × I(primary_match) + 0.3 × I(secondary_overlap)Primary category exact match contributes 0.7. Any overlap between incoming secondary categories and the pre-compiled situation’s category set contributes 0.3.
Signal 2: Token Cosine Similarity (weight 0.40)
cos(a, b) = Σ(a_i × b_i) / (||a|| × ||b||)Sparse cosine similarity between the task’s token vector and the pre-compiled prompt’s TF-IDF vector. Keywords appearing in KEYWORD_MAP receive a 2× boost in the task vector.
Signal 3: Error Affinity (weight 0.25)
score = 1.0 if same_error_type
= 0.5 if both_none
= 0.2 if one_none
= 0.0 if different_errorsThe total routing score is:
S(task, precompiled) = 0.35 × category + 0.40 × cosine + 0.25 × error
The “quantum” metaphor: the incoming task exists in superposition across all 322 pre-compiled buckets simultaneously. Each bucket has an amplitude (the routing score). The router collapses the superposition by selecting the highest-amplitude match.
The key innovation. When the top 3 matches are within an interference band (score spread ≤ 0.15), the router does not simply select the best match. Instead, it performs constructive interference:
This produces prompts that no single pre-compilation contains. Fragments from a “deployment” pre-compilation combine with fragments from a “debugging” pre-compilation to produce a hybrid prompt optimized for “debug a deployment failure” — a situation that may not have been explicitly enumerated.
The analogy to quantum mechanics is deliberate: when the measurement (routing) is ambiguous between states, the system exhibits interference between those states, and the resulting output contains contributions from all interfering states.
If no pre-compiled prompt scores above the minimum threshold (0.30), the router returns None and the system falls back to live compilation via the existing PromptCompiler.compile() path. This ensures graceful degradation for truly novel situations.
From 1,167 fragments across 42 categories (17 fragment types, 38 sources), APC produces 322 pre-compiled prompts covering 27 categories:
| Category | Pre-compiled | Source Fragments |
|---|---|---|
| implementation | 24 | 92 |
| debugging | 24 | 20 |
| infrastructure | 23 | 38 |
| architecture | 20 | 378 |
| training | 20 | 28 |
| deployment | 19 | 9 |
| research | 17 | 119 |
| cognition | 17 | 100 |
| security | 15 | 10 |
| beings | 13 | 35 |
| creative | 12 | 24 |
| capability | 12 | 50 |
| prompt_engineering | 11 | 27 |
| operations | 11 | 57 |
| safety | 7 | 12 |
| philosophy | 7 | 9 |
| game_design | 7 | 9 |
| evolution | 7 | 26 |
| design | 7 | 33 |
| business | 7 | 37 |
| (7 others) | 42 | 33 |
Two implementation insights emerged during full-coverage development:
Fragment type completeness: The initial assembly
pass only included fragments with types in the primary ordering
(setup, constraint, domain,
recovery, meta). Legacy-mined fragments
carrying types like insight, discovery,
knowledge, milestone, and pattern
were gathered as candidates but silently dropped during assembly,
producing prompts below the 50-character minimum. Adding a second
assembly pass for remaining types resolved the silent dropout and
enabled coverage of 5 previously-unreachable categories.
Text-mined keyword extraction: Categories
sourced entirely from legacy mining (e.g., beings,
evolution, philosophy) had no entries in the
handcrafted KEYWORD_MAP, receiving zero keyword clusters and thus no
keyword-variant pre-compilations. Mining top terms from fragment text
via tokenization (frequency ≥ 2, top 10 terms per category) bootstrapped
keyword clusters for sparse categories, enabling full coverage.
| Test Case | Router Time | Match Type | Prompt Size |
|---|---|---|---|
| “deploy authfor worker” | 21.9ms | Fuzzy (blended) | 1,913 chars |
| “fix MPS training stall” | 14.9ms | Fuzzy (direct) | 2,807 chars |
| “quantum teleportation for neural weights” | 12.7ms | Fuzzy (weak) | 2,559 chars |
| “debug deployment failure with permission denied” | 11.8ms | Fuzzy (blended) | 2,876 chars |
| “evolve being cognition through self-play” | 11.9ms | Fuzzy (cross-category) | 2,559 chars |
| “coordinate beings across layers for evolution” | 11.7ms | Fuzzy (cross-category) | 2,559 chars |
All router invocations complete in under 25ms across 322 pre-compiled candidates. Live compilation (the fallback) requires 50-100ms. The cross-category queries (“evolve being cognition,” “coordinate beings”) demonstrate coverage of categories that were previously unreachable before the text-mined keyword and fragment type completeness fixes.
For the cross-category query “debug deployment failure on cloudflare worker with permission denied error”:
Top 3 matches:
score=0.538 cat=deployment err=None kw=[]
score=0.515 cat=deployment err=None kw=[cloudflare]
score=0.513 cat=deployment err=None kw=[worker]
Spread: 0.025 (< 0.15 threshold)
→ BLENDING TRIGGERED: constructive interference
The blended prompt contains fragments from all three pre-compilations — general deployment knowledge, Cloudflare-specific fragments, and worker-specific fragments — producing a result more specific than any individual pre-compilation.
Let P* be the oracle prompt that would maximize task success probability for a given situation s. Let P_APC(s) be the prompt returned by APC. The gap ||P* - P_APC(s)|| is bounded by:
||P* - P_APC(s)|| ≤ ε_enum + ε_select + ε_assemble
Where: - ε_enum is the enumeration error: how far the nearest pre-compiled situation is from the actual situation. Bounded by the granularity of the category × error × keyword grid. - ε_select is the selection error: how well the fuzzy router identifies the correct pre-compiled prompt. Bounded by the discriminative power of the three scoring signals. - ε_assemble is the assembly error: how well fragment ordering and truncation preserve the optimal prompt structure. Bounded by MAX_FRAGMENTS and MAX_PROMPT_CHARS.
As the fragment library grows and more situations are enumerated, ε_enum → 0. As the Bayesian effectiveness feedback loop accumulates data, ε_select → 0. The system converges on P*.
Formally, let |ψ⟩ be the incoming task state and |φ_i⟩ be the pre-compiled prompt states. The router computes amplitudes:
α_i = ⟨φ_i|ψ⟩ = w_cat · sim_cat(ψ, φ_i) + w_tok · sim_tok(ψ, φ_i) + w_err · sim_err(ψ, φ_i)
In the strong-collapse regime (max α_i > 0.80), the system returns |φ_best⟩ directly. In the interference regime (spread(α_top3) ≤ 0.15), the output is:
|output⟩ = Blend(fragments(φ_1) ∪ fragments(φ_2) ∪ fragments(φ_3), ψ)
This is analogous to quantum measurement in the degenerate case: when multiple eigenstates have nearly equal probability, the system remains in a superposition, and the output reflects contributions from all degenerate states.
The system has three feedback loops that drive convergence:
These loops form a fixed-point iteration: the prompt library improves → pre-compilations improve → task outcomes improve → effectiveness feedback improves the library → cycle repeats.
APC is not a replacement for the Lacuna Engine’s core concept — it is its culmination. The Lacuna Engine encodes what the system doesn’t know (lacunae) as fragments, then compiles those fragments into prompts that prevent future failures. APC takes this one step further: it pre-compiles prompts for the entire space of possible not-knowing, so that when a lacuna manifests at runtime, the response is instantaneous.
The metaphor from the user who designed this system: “Rumpelstiltskin spinning ALL the hay into gold in advance, so the queen just picks the right skein.”
This is the progression: 1. Interpreter mode: compile on every request (original Lacuna Engine) 2. Cached mode: decision tree cache for exact situation repeats (DecisionTreeCache) 3. Compiled mode: pre-generate for all situations, route instantly (APC)
The compiled mode subsumes the other two: exact matches are found instantly via situation_hash lookup, fuzzy matches via the router, and truly novel situations fall back to live compilation.
We propose the following terminology:
| Term | Definition |
|---|---|
| Anticipatory Prompt Compilation (APC) | The overall framework: enumerate situations, pre-compile prompts, route at runtime |
| Situation vector | The structured representation of a task context: {category, keywords, error_type, …} |
| Fuzzy case retrieval | Multi-signal routing to find the nearest pre-compiled prompt |
| Fragment interference | Blending fragments from multiple pre-compiled prompts when routing is ambiguous |
| Strong collapse | Direct retrieval of a single pre-compiled prompt (score > 0.80) |
| Constructive interference | Fragment blending that produces prompts no single pre-compilation contains |
| Prompt convergence | The fixed-point iteration where effectiveness feedback improves future compilations |
APC sits at the intersection of: - Case-Based Reasoning (retrieve and adapt solved problems) - Retrieval-Augmented Generation (retrieve context for generation) - Mixture of Experts (route to specialized sub-systems) - Compiler theory (ahead-of-time optimization of runtime behavior)
Anticipatory Prompt Compilation transforms the Lacuna Engine from a runtime interpreter into a pre-computed expertise database with instant retrieval. By enumerating the situation space ahead of time and pre-generating optimal prompts for each region, APC achieves sub-25ms prompt retrieval across 322 candidates compared to 50-100ms for live compilation. The fuzzy quantum router’s multi-signal scoring provides robust matching even for situations not explicitly enumerated, and constructive fragment interference produces novel prompt combinations that no single pre-compilation contains.
The system is self-improving: outcome feedback updates fragment effectiveness, which improves future pre-compilations, which improves future outcomes. This fixed-point iteration drives convergence toward the oracle prompt P* for each situation.
APC demonstrates that prompt engineering need not be a runtime activity. Like software compilation, it can be performed ahead of time, with the runtime cost reduced to a simple lookup. The hay is already gold. The queen just picks.