Anticipatory Prompt Compilation: Pre-Computing Situation-Optimal Prompts with Fuzzy Case Retrieval and Fragment Interference

John Alexander Mobley & Claude MobCorp / MASCOM — February 2026

1. Introduction

The dominant paradigm in prompt engineering is runtime interpretation: when a task arrives, a prompt is assembled on-the-fly from templates, context windows, and heuristic rules. This approach has two fundamental limitations: (1) latency — each compilation requires traversal of the fragment library and scoring, and (2) cold-start — novel situations receive no benefit from prior compilations in similar situations.

We observe that the space of situations a system encounters is not infinite. For a given domain with C categories, E error types, and K keyword clusters, the meaningful situation space is bounded by O(C × E × K) — and in practice is much smaller, since most cross-products are sparse (no fragments exist for many combinations). This observation motivates a shift from interpretation to ahead-of-time compilation: enumerate the situation space, pre-generate optimal prompts for every populated region, and at runtime simply retrieve the nearest pre-compiled result.

The analogy to software compilation is precise:

Stage	Software	Prompt Engineering
Source	Code files	Fragment library (1,167 fragments)
Compilation	gcc/clang → object files	PreCompiler → pre-compiled prompts (322)
Linking	Object files → binary	Fragment assembly → final prompt
Execution	Binary runs directly	Router retrieves pre-compiled prompt (~5ms)
Fallback	Interpreter (Python)	Live compilation (~50ms)

This paper introduces Anticipatory Prompt Compilation (APC), implements it in the Lacuna Engine, and identifies a novel mechanism — constructive fragment interference — that produces prompts no single pre-compilation contains.

2.1 Case-Based Reasoning (CBR)

APC is most directly descended from Case-Based Reasoning (Aamodt & Plaza, 1994; Kolodner, 1993). In CBR, a system maintains a case base of previously solved problems. When a new problem arrives, the system retrieves the most similar case, adapts its solution, and optionally retains the new solution for future use. The CBR cycle is: Retrieve → Reuse → Revise → Retain.

APC maps cleanly onto CBR: - Case base = precompiled_prompts table - Retrieve = FuzzyRouter.route() with multi-signal scoring - Reuse = direct prompt return (strong collapse) or fragment blending (interference) - Revise = Bayesian effectiveness updates from outcome learning - Retain = new fragments discovered by OutcomeLearner.sweep_recent_sessions()

The key departure from classical CBR is granularity: CBR retrieves and adapts whole cases. APC decomposes cases into fragments and can recombine fragments across multiple cases. This is closer to compositional case-based reasoning (Plaza & McGinty, 2005), but applied to prompt engineering rather than design or planning.

2.2 Retrieval-Augmented Generation (RAG)

RAG systems (Lewis et al., 2020) retrieve documents from an external store and inject them into the language model’s context. APC inverts this: instead of retrieving input context for the model, APC retrieves the entire compiled prompt — the instructions themselves, not the data.

This distinction matters. In RAG, the quality of generation depends on the model’s ability to synthesize retrieved documents. In APC, the quality depends on the pre-compilation — the prompt was already optimized offline, and retrieval simply selects among optimized options.

2.3 DSPy and Prompt Compilation

DSPy (Khattab et al., 2023) introduced the “prompt compilation” metaphor explicitly: developers write declarative programs specifying what the LLM should do, and DSPy compiles these into optimized prompt chains through bootstrapping and search. APC differs in two ways: (1) DSPy compiles programs into prompts; APC compiles situations into prompts, and (2) DSPy’s compilation is demand-driven (compile when needed); APC’s compilation is anticipatory (compile everything ahead of time).

2.4 Mixture of Experts (MoE)

The FuzzyRouter’s multi-signal scoring function is structurally isomorphic to a sparse MoE gating function (Shazeer et al., 2017). Each pre-compiled prompt is an “expert,” and the router is the “gate” that selects which expert(s) to activate. The “constructive interference” blending mechanism is analogous to soft MoE routing where multiple experts contribute to the output — but applied at the fragment level rather than the hidden-state level.

2.5 What Is Novel

To our knowledge, no prior work combines all of: 1. Ahead-of-time enumeration of the situation space for prompt pre-compilation 2. Multi-signal fuzzy routing (category × token similarity × error affinity) 3. Fragment-level constructive interference — blending sub-prompt components from multiple pre-compiled solutions when the routing signal is ambiguous

The fragment interference mechanism is the primary novel contribution.

3. System Architecture

3.1 Fragment Library

The foundation is a library of 1,167 prompt fragments extracted from 11 sources across the MASCOM system. Each fragment has:

(text, category, fragment_type, priority, effectiveness, trigger_conditions)

Fragment types follow a primary assembly ordering: setup → constraint → domain → recovery → meta, with 12 additional types assembled after the primary set: insight, discovery, decision, milestone, knowledge, pattern, signal, specification, context, taxonomy, concept, and design — 17 types total. Categories span 42 domains from alignment to vision. Effectiveness is tracked via Bayesian updating: eff = (successes + 1) / (successes + failures + 2).

The 11 sources include: institutional memory (context.db key_facts), behavioral patterns (refractive_will.db), quality gaps (ouroboros_results.json), system constraints (CLAUDE.md NEVER rules), error recovery patterns (hardcoded from operational experience), and three tiers of legacy knowledge mining (lacuna_mined, lacuna_legacy, lacuna_legacy_broad).

3.2 Situation Space Enumeration

The PreCompiler enumerates the situation space by:

Querying active categories: categories with ≥3 fragments (27 of 42 in practice)
Mapping error types to categories: e.g., mps_stall → training, import_error → debugging
Extracting keyword clusters: from KEYWORD_MAP, fragment trigger_conditions, and text mining — for categories with sparse or missing KEYWORD_MAP entries, the PreCompiler extracts top terms from fragment text via tokenization, taking terms with frequency ≥ 2
Scaling cluster depth: keyword cluster count scales with category fragment size — 5 clusters for small categories (<20 fragments), 10 for medium (20-100), 15 for large (>100)
Generating combinations:
- Base situations: one per category (27)
- Category + error: each relevant error type per category (~50)
- Category + keywords: scaled keyword clusters per category (~120)
- Category + error + keywords: top 3 × top 3 per category (~40)
- Cross-category pairs: 20 common pairings (deployment+debugging, training+debugging, beings+evolution, architecture+cognition, etc.)

This produces ~360 candidate situations, of which 322 yield prompts with sufficient signal (≥3 matching fragments and ≥50 characters).

3.3 Pre-Compilation

For each enumerated situation, the PreCompiler:

Constructs a synthetic task description from the situation vector
Classifies via SituationClassifier (to populate secondary categories and confidence)
Overrides the classification with the precise situation parameters
Gathers candidate fragments via the existing library (category match, trigger match, constraint inclusion)
Scores and selects top-12 fragments (trigger relevance × effectiveness × recency)
Assembles fragments in type order (5 primary types first, then 12 additional types) with situation header
Computes a TF-IDF token vector for fuzzy matching

Token vectors are initially term-frequency normalized, then updated with IDF weights after all compilations complete:

w(t, d) = tf(t, d) × log(N / (1 + df(t)))

Where N is total pre-compiled prompts and df(t) is the number of prompts containing term t.

3.4 The Fuzzy Quantum Router

The FuzzyRouter receives an incoming task and routes it to the best pre-compiled prompt via three orthogonal signals:

Signal 1: Category Overlap (weight 0.35)

score = 0.7 × I(primary_match) + 0.3 × I(secondary_overlap)

Primary category exact match contributes 0.7. Any overlap between incoming secondary categories and the pre-compiled situation’s category set contributes 0.3.

Signal 2: Token Cosine Similarity (weight 0.40)

cos(a, b) = Σ(a_i × b_i) / (||a|| × ||b||)

Sparse cosine similarity between the task’s token vector and the pre-compiled prompt’s TF-IDF vector. Keywords appearing in KEYWORD_MAP receive a 2× boost in the task vector.

Signal 3: Error Affinity (weight 0.25)

score = 1.0 if same_error_type
       = 0.5 if both_none
       = 0.2 if one_none
       = 0.0 if different_errors

The total routing score is:

S(task, precompiled) = 0.35 × category + 0.40 × cosine + 0.25 × error

The “quantum” metaphor: the incoming task exists in superposition across all 322 pre-compiled buckets simultaneously. Each bucket has an amplitude (the routing score). The router collapses the superposition by selecting the highest-amplitude match.

3.5 Constructive Fragment Interference

The key innovation. When the top 3 matches are within an interference band (score spread ≤ 0.15), the router does not simply select the best match. Instead, it performs constructive interference:

Collect all fragment IDs from the top 3 pre-compiled prompts
Union the fragment sets (typically 25-36 unique fragments from 3 × 12)
Re-score all fragments against the incoming situation
Select the top 12 by the standard scoring function
Assemble into a new prompt

This produces prompts that no single pre-compilation contains. Fragments from a “deployment” pre-compilation combine with fragments from a “debugging” pre-compilation to produce a hybrid prompt optimized for “debug a deployment failure” — a situation that may not have been explicitly enumerated.

The analogy to quantum mechanics is deliberate: when the measurement (routing) is ambiguous between states, the system exhibits interference between those states, and the resulting output contains contributions from all interfering states.

3.6 Fallback to Live Compilation

If no pre-compiled prompt scores above the minimum threshold (0.30), the router returns None and the system falls back to live compilation via the existing PromptCompiler.compile() path. This ensures graceful degradation for truly novel situations.

4. Empirical Results

4.1 Pre-Compilation Coverage

From 1,167 fragments across 42 categories (17 fragment types, 38 sources), APC produces 322 pre-compiled prompts covering 27 categories:

Category	Pre-compiled	Source Fragments
implementation	24	92
debugging	24	20
infrastructure	23	38
architecture	20	378
training	20	28
deployment	19	9
research	17	119
cognition	17	100
security	15	10
beings	13	35
creative	12	24
capability	12	50
prompt_engineering	11	27
operations	11	57
safety	7	12
philosophy	7	9
game_design	7	9
evolution	7	26
design	7	33
business	7	37
(7 others)	42	33

Two implementation insights emerged during full-coverage development:

Fragment type completeness: The initial assembly pass only included fragments with types in the primary ordering (setup, constraint, domain, recovery, meta). Legacy-mined fragments carrying types like insight, discovery, knowledge, milestone, and pattern were gathered as candidates but silently dropped during assembly, producing prompts below the 50-character minimum. Adding a second assembly pass for remaining types resolved the silent dropout and enabled coverage of 5 previously-unreachable categories.
Text-mined keyword extraction: Categories sourced entirely from legacy mining (e.g., beings, evolution, philosophy) had no entries in the handcrafted KEYWORD_MAP, receiving zero keyword clusters and thus no keyword-variant pre-compilations. Mining top terms from fragment text via tokenization (frequency ≥ 2, top 10 terms per category) bootstrapped keyword clusters for sparse categories, enabling full coverage.

4.2 Router Performance

Test Case	Router Time	Match Type	Prompt Size
“deploy authfor worker”	21.9ms	Fuzzy (blended)	1,913 chars
“fix MPS training stall”	14.9ms	Fuzzy (direct)	2,807 chars
“quantum teleportation for neural weights”	12.7ms	Fuzzy (weak)	2,559 chars
“debug deployment failure with permission denied”	11.8ms	Fuzzy (blended)	2,876 chars
“evolve being cognition through self-play”	11.9ms	Fuzzy (cross-category)	2,559 chars
“coordinate beings across layers for evolution”	11.7ms	Fuzzy (cross-category)	2,559 chars

All router invocations complete in under 25ms across 322 pre-compiled candidates. Live compilation (the fallback) requires 50-100ms. The cross-category queries (“evolve being cognition,” “coordinate beings”) demonstrate coverage of categories that were previously unreachable before the text-mined keyword and fragment type completeness fixes.

4.3 Interference Behavior

For the cross-category query “debug deployment failure on cloudflare worker with permission denied error”:

Top 3 matches:
  score=0.538  cat=deployment  err=None    kw=[]
  score=0.515  cat=deployment  err=None    kw=[cloudflare]
  score=0.513  cat=deployment  err=None    kw=[worker]

Spread: 0.025 (< 0.15 threshold)
→ BLENDING TRIGGERED: constructive interference

The blended prompt contains fragments from all three pre-compilations — general deployment knowledge, Cloudflare-specific fragments, and worker-specific fragments — producing a result more specific than any individual pre-compilation.

5. Theoretical Analysis

5.1 Optimality Bound

Let P* be the oracle prompt that would maximize task success probability for a given situation s. Let P_APC(s) be the prompt returned by APC. The gap ||P* - P_APC(s)|| is bounded by:

||P* - P_APC(s)|| ≤ ε_enum + ε_select + ε_assemble

Where: - ε_enum is the enumeration error: how far the nearest pre-compiled situation is from the actual situation. Bounded by the granularity of the category × error × keyword grid. - ε_select is the selection error: how well the fuzzy router identifies the correct pre-compiled prompt. Bounded by the discriminative power of the three scoring signals. - ε_assemble is the assembly error: how well fragment ordering and truncation preserve the optimal prompt structure. Bounded by MAX_FRAGMENTS and MAX_PROMPT_CHARS.

As the fragment library grows and more situations are enumerated, ε_enum → 0. As the Bayesian effectiveness feedback loop accumulates data, ε_select → 0. The system converges on P*.

5.2 Interference as Superposition

Formally, let |ψ⟩ be the incoming task state and |φ_i⟩ be the pre-compiled prompt states. The router computes amplitudes:

α_i = ⟨φ_i|ψ⟩ = w_cat · sim_cat(ψ, φ_i) + w_tok · sim_tok(ψ, φ_i) + w_err · sim_err(ψ, φ_i)

In the strong-collapse regime (max α_i > 0.80), the system returns |φ_best⟩ directly. In the interference regime (spread(α_top3) ≤ 0.15), the output is:

|output⟩ = Blend(fragments(φ_1) ∪ fragments(φ_2) ∪ fragments(φ_3), ψ)

This is analogous to quantum measurement in the degenerate case: when multiple eigenstates have nearly equal probability, the system remains in a superposition, and the output reflects contributions from all degenerate states.

5.3 Convergence

The system has three feedback loops that drive convergence:

Fragment effectiveness: Bayesian updates from outcome scoring (OutcomeLearner)
Pre-compilation freshness: periodic re-compilation incorporates new fragments (db_keeper 30-min cycle)
Hit-count weighting: future versions can weight pre-compiled prompts by empirical hit rates

These loops form a fixed-point iteration: the prompt library improves → pre-compilations improve → task outcomes improve → effectiveness feedback improves the library → cycle repeats.

6. Relationship to the Lacuna Engine

APC is not a replacement for the Lacuna Engine’s core concept — it is its culmination. The Lacuna Engine encodes what the system doesn’t know (lacunae) as fragments, then compiles those fragments into prompts that prevent future failures. APC takes this one step further: it pre-compiles prompts for the entire space of possible not-knowing, so that when a lacuna manifests at runtime, the response is instantaneous.

The metaphor from the user who designed this system: “Rumpelstiltskin spinning ALL the hay into gold in advance, so the queen just picks the right skein.”

This is the progression: 1. Interpreter mode: compile on every request (original Lacuna Engine) 2. Cached mode: decision tree cache for exact situation repeats (DecisionTreeCache) 3. Compiled mode: pre-generate for all situations, route instantly (APC)

The compiled mode subsumes the other two: exact matches are found instantly via situation_hash lookup, fuzzy matches via the router, and truly novel situations fall back to live compilation.

7. Naming and Taxonomy

We propose the following terminology:

Term	Definition
Anticipatory Prompt Compilation (APC)	The overall framework: enumerate situations, pre-compile prompts, route at runtime
Situation vector	The structured representation of a task context: {category, keywords, error_type, …}
Fuzzy case retrieval	Multi-signal routing to find the nearest pre-compiled prompt
Fragment interference	Blending fragments from multiple pre-compiled prompts when routing is ambiguous
Strong collapse	Direct retrieval of a single pre-compiled prompt (score > 0.80)
Constructive interference	Fragment blending that produces prompts no single pre-compilation contains
Prompt convergence	The fixed-point iteration where effectiveness feedback improves future compilations

APC sits at the intersection of: - Case-Based Reasoning (retrieve and adapt solved problems) - Retrieval-Augmented Generation (retrieve context for generation) - Mixture of Experts (route to specialized sub-systems) - Compiler theory (ahead-of-time optimization of runtime behavior)

8. Future Work

Dense embeddings: Replace TF-IDF with learned embeddings from PhotonicMind’s TextGenCore for richer similarity computation
Adaptive interference band: Learn the optimal interference threshold from outcome data rather than fixing at 0.15
Hierarchical pre-compilation: Pre-compile at multiple granularities (domain → category → subcategory) for faster coarse-to-fine retrieval
Cross-system transfer: Export pre-compiled prompts as portable knowledge that bootstraps new MASCOM instances
Effectiveness-weighted routing: Incorporate empirical hit rates and outcome scores into the routing function as a fourth signal
Negative interference: Detect when blending produces contradictory fragments and suppress destructive interference

9. Conclusion

Anticipatory Prompt Compilation transforms the Lacuna Engine from a runtime interpreter into a pre-computed expertise database with instant retrieval. By enumerating the situation space ahead of time and pre-generating optimal prompts for each region, APC achieves sub-25ms prompt retrieval across 322 candidates compared to 50-100ms for live compilation. The fuzzy quantum router’s multi-signal scoring provides robust matching even for situations not explicitly enumerated, and constructive fragment interference produces novel prompt combinations that no single pre-compilation contains.

The system is self-improving: outcome feedback updates fragment effectiveness, which improves future pre-compilations, which improves future outcomes. This fixed-point iteration drives convergence toward the oracle prompt P* for each situation.

APC demonstrates that prompt engineering need not be a runtime activity. Like software compilation, it can be performed ahead of time, with the runtime cost reduced to a simple lookup. The hay is already gold. The queen just picks.

References

Aamodt, A., & Plaza, E. (1994). Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches. AI Communications, 7(1), 39-59.
Khattab, O., et al. (2023). DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines. arXiv:2310.03714.
Kolodner, J. (1993). Case-Based Reasoning. Morgan Kaufmann.
Lewis, P., et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS 2020.
Plaza, E., & McGinty, L. (2005). Distributed Case-Based Reasoning. The Knowledge Engineering Review, 20(3), 261-265.
Shazeer, N., et al. (2017). Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer. ICLR 2017.