John Mobley Jr. MASCOM Foundation Model Company March 2026
The dominant paradigm in generative modeling assumes a fixed target distribution p(x) learned from data through gradient descent over millions of parameters. Diffusion models (Ho et al. 2020, Song et al. 2021) add Gaussian noise in a forward process and learn to reverse it, operating in continuous space over real-valued vectors. This paper takes a fundamentally different approach.
The core insight: A 16x24 pixel sprite with 16 colors is 384 discrete values — equivalent to a paragraph of text. If a language model can generate coherent paragraphs, the same architecture can generate coherent sprites. Words = Code = SVG = Art. The language model IS the image generator.
But we go further. Rather than using a pretrained language model as a black box, we build the generative process from first principles: discrete masked diffusion over palette indices, with the energy landscape (target distribution) inferred recursively from the system’s own outputs. The model discovers what “good” means while generating, rather than learning it from a fixed dataset.
This addresses the open problem identified in the theoretical seed: “No mainstream architecture [performs] target inference during sampling” (Section 5 of the preliminary analysis). MobleyDiffusion does exactly this.
Discrete Masked Diffusion — Forward process masks positions with a MASK token (index 17); reverse process predicts original palette indices. No Gaussian noise, no continuous space, no reparameterization trick.
Sieve-Based Complexity Collapse — Five structural sieves (silhouette, anatomy, symmetry, palette coherence, connectivity) eliminate entire equivalence classes from the search space, reducing n! to O(n^k).
Recursive Manifold Updates — Generated samples update the energy landscape E(x) = -log p(x). The manifold reshapes itself based on output quality: E_{t+1}(x) = H[E_t(x), {x_s}]. Convergence occurs when manifold geometry stabilizes.
Inductive Diffusion — Reversed-gradient exploration pushes samples INTO high-energy anti-solution space. Positions that resist disruption (Hawking radiation at the event horizon of structure) become hard constraints for recovery.
Holographic Generation — Multiple diffusion trajectories run in parallel, averaged via quality-weighted voting. Coherent structure reinforces across trajectories; noise cancels. This is the Feynman path integral applied to discrete generation.
Nested Attractor Hierarchy — Three-level basin structure (humanoid → archetype → individual) with cross-character transfer learning. Anatomical region improvements transfer at 10% across all archetypes.
Temporal 4D Manifold — Animation frames modeled as slices of a 4D energy manifold E(x, y, color, t), producing temporally coherent animation sequences without frame-by-frame independence assumptions.
Zero-Dependency Implementation — 3,027 lines of pure Python. No PyTorch, no TensorFlow, no external APIs. The system is fully sovereign.
A 16x24 sprite with 16 colors has 16^384 ≈ 10^462 possible configurations. This is vastly larger than the number of atoms in the observable universe (~10^80). Brute enumeration is impossible. Even sampling uniformly would never produce structure.
The key observation from the preliminary analysis: permutations grow as n!, combinations as n!/(k!(n-k)!). But structured visual output is neither — it occupies a tiny manifold embedded in the full combinatorial space. The challenge is finding that manifold without enumerating the space.
The theoretical framework established that diffusion models operate analogously to gravitational field dynamics:
| System | Field | Flow follows |
|---|---|---|
| Gravity | Potential Φ | F = -∇Φ |
| Diffusion | Energy E(x) | dx/dt = -∇E(x) |
| MobleyDiffusion | Discrete energy E(pos, color) | Geodesic unmasking order |
Training data curves probability space the way mass curves spacetime. Generated samples “fall” into attractor basins — faces, sprites, text — the way matter falls into gravity wells.
The preliminary analysis identified the key missing piece in existing work: “model infers target distribution while generating it.” This is the Einstein field equation analogy:
Mass tells spacetime how to curve. Spacetime tells mass how to move.
Becomes:
Samples tell the manifold how to curve. The manifold tells samples where to go.
MobleyDiffusion implements this literally. Each generation cycle:
generate candidate → evaluate quality → update energy landscape → generate again
The energy landscape E(x) is not fixed. It evolves. The system learns what “good sprite” means through its own outputs.
The Schrodinger equation and the diffusion equation are related by Wick rotation (t → iτ). MobleyDiffusion operates in discrete imaginary time — each reverse step is a discrete tick of the Wick-rotated process. The MASK token plays the role of the vacuum state. Unmasking is particle creation from the probability vacuum.
The energy manifold stores E(pos, color) for each position-color pair:
E: {0..383} x {0..15} → R
Lower energy = higher probability. The Boltzmann distribution gives:
p(color | pos) = exp(-E(pos, color)) / Z(pos)
The manifold is sparse — only positions with observed data have non-zero entries. Unobserved positions default to uniform distribution.
Curvature at each position measures energy variance:
κ(pos) = Var_color[E(pos, color)]
High curvature means the manifold has strong opinions about what goes there. Low curvature means uncertainty.
The forward process masks positions randomly. This is discrete corruption — no Gaussian noise, no variance schedule. A position is either known or unknown.
Each reverse step:
Five sieves enforce structural constraints, each eliminating entire equivalence classes. Five sieves reduce a 10^462 space to ~10^40 — a complexity reduction of 10^422.
Standard diffusion unmasks uniformly or by schedule. MobleyDiffusion unmasks geodesically — following the steepest path on the energy manifold. This is the discrete analogue of following geodesics on a Riemannian manifold.
Standard diffusion moves from noise to structure. Inductive diffusion does the opposite: it pushes a structured output INTO high-energy anti-solution space. Positions at the boundary between structure and chaos resist disruption most strongly. These boundary positions emit “Hawking radiation” — residual constraints that survive even as the rest of the sprite dissolves. The key insight: you learn more about structure by trying to destroy it than by trying to build it.
Multiple diffusion trajectories run in parallel with different noise levels and generation strategies. Results are combined via quality-weighted voting. Holographic generation achieves 0.832 quality vs. 0.794 for single-trajectory generation — a 4.8% improvement from path averaging alone.
After each generation, good outputs lower the energy of their configurations. Bad outputs raise it. The manifold develops topography with attractor basins, ridges, and voids. This is the paper’s central claim realized: the target distribution is not given — it is inferred recursively from the system’s own outputs.
All results on Claudine (mage archetype), south_idle frame:
| Mode | Overall | Silhouette | Coherence | Symmetry | Density | Structure |
|---|---|---|---|---|---|---|
| Seeded (baseline) | 0.794 | 0.97 | 0.31 | 0.78 | 0.91 | 1.00 |
| Holographic (4 paths) | 0.832 | 0.98 | 0.36 | 0.80 | 0.92 | 1.00 |
MobleyDiffusion is the first implementation of recursive permutative diffusion with self-inferring target distributions. The system generates structured 16-color pixel art sprites at 0.832 quality using discrete masked diffusion, five structural sieves, geodesic unmasking, hyperdiffusion, inductive exploration, holographic path integral averaging, nested attractor hierarchies, cross-character manifold transfer, and temporal 4D animation modeling — all in 3,027 lines of pure Python with zero external dependencies.
The key result is not the quality score. It is the demonstration that a generative system can infer its own target distribution while generating — that the energy landscape and the samples it produces can co-evolve toward a stable fixed point.