{"slug": "introducing-drm-language-emitter", "title": "Introducing DRM Language Emitter", "summary": "A researcher released DRM Language Emitter, an experimental language model that generates text through learned geometric motion instead of Transformer attention. The project tests whether language can be modeled as controlled latent trajectories on a relational manifold, using no self-attention or KV cache. It is not a production system but a research scaffold for exploring geometry-first generative AI.", "body_md": "I’m sharing an experimental research project called **DRM Language Emitter**.\n\nIt is a geometry-first language model lab for exploring generative AI without Transformer blocks, without self-attention, without Q/K/V attention, and without KV cache inside the DRM model. The central idea is to treat language generation not as attention over a token window, but as controlled motion through a learned relational manifold.\n\nIn DRM, generation follows a different path:\n\n```\ntoken\n-> latent state\n-> active directions\n-> learned relational metric\n-> controlled latent motion\n-> next latent state\n-> token logits\n```\n\nThe working hypothesis is simple:\n\nMaybe language can be generated as motion through a learned relational state space.\n\nThis is not a production model.\n\nThis is not a claim that DRM is better than Transformers in general.\n\nThis is not a claim that DRM is better than world models in general.\n\nIt is a research scaffold for testing whether explicit geometry, active directions, latent dynamics, and learned metrics can become useful components for small language models and symbolic dynamics. The README describes the model’s central computation as a latent trajectory through state, directions, gates, metric, velocity, state update, and logits.\n\nMost current language modeling research is organized around the Transformer paradigm.\n\nThat makes sense. 
Transformers work extremely well.\n\nBut I wanted to test a different question:\n\nWhat happens if the model does not attend backward over a context window, but instead carries an evolving latent state through a learned geometry?\n\nDRM Language Emitter is my attempt to explore that question in code.\n\nThe model has:\n\n`diag + U U^T`\n\nThe model is autoregressive, but its memory is the evolving latent state rather than attention over a token sequence.\n\nDRM Language Emitter does not use:\n\n`nn.MultiheadAttention`\n\nInstead, it tries to make the geometry of generation explicit and measurable.\n\nThe project logs diagnostics such as cross-entropy, approximate perplexity, metric action, active dimension, gate entropy, metric norm, condition proxy, recurrence, stability, low-action path diagnostics, and symbolic world-modeling metrics.\n\nThat matters because I do not want the model to be only a black box that outputs tokens.\n\nI want to inspect how it moves.\n\nI want to measure whether the learned geometry collapses, stabilizes, expands, or forms useful trajectories.\n\nThe repository includes:\n\n```\nsrc/drm_language_emitter/   DRM model package\ntransformer/                tiny Transformer baseline\nworld_model/                tiny symbolic world-model baseline\nscripts/                    training, generation, evaluation, sweeps, dashboards\nconfigs/                    DRM and benchmark configs\ndocs/                       math, limitations, competition notes, benchmark artifacts\ntests/                      smoke and invariant tests\n```\n\nIt is CPU-runnable, with CUDA optional. 
The latest local benchmark reported CPU-only execution, so stronger CUDA and time-matched comparisons are still future work.\n\nThe long-term research question is:\n\nCan learned geometry become a useful primitive for language generation?\n\nMore specifically:\n\nI do not know the final answer yet.\n\nThat is why the repo exists.\n\nInstall:\n\n```\npip install -e .\n```\n\nTrain a tiny DRM model:\n\n```\npython scripts/train_tiny.py --config configs/tiny.yaml --text data/tiny.txt\n```\n\nGenerate text:\n\n```\npython scripts/generate.py --checkpoint runs/tiny/drm_tiny.pt --prompt \"DRM \"\n```\n\nRun geometry diagnostics:\n\n```\npython scripts/eval_geometry.py --checkpoint runs/tiny/drm_tiny.pt\npython scripts/eval_geodesic_paths.py --checkpoint runs/tiny/drm_tiny.pt\n```\n\nAt the end of the current README, I also include benchmark artifacts comparing:\n\nThe latest tiny symbolic world-model benchmark used a deterministic gridworld serialized as text. It produced:\n\n```\nruns: 72\naggregate rows: 24\n```\n\nThe top result by next-state exact match was:\n\n```\ndrm_tiny @ 2000 steps\nnext_state_exact_match = 0.0751\n```\n\nIn the same benchmark, `transformer_tiny_220k @ 3000` had a lower invalid-state rate of `0.0026`, and the tiny supervised world model reached low CE but weak exact-match and rollout metrics.\n\nThe honest interpretation is:\n\nDRM showed an early signal on symbolic next-state prediction, but the absolute accuracy is still low. 
This is diagnostic, not decisive.\n\nThe benchmark does not prove that DRM is better than Transformers.\n\nIt does not prove that DRM is better than world models.\n\nIt does not say anything about large multimodal world models.\n\nIt only shows that this geometry-first emitter is now testable against baselines in a controlled tiny symbolic environment.\n\nAllowed claim:\n\nDRM Language Emitter is a functional non-Transformer language model prototype with explicit, measurable geometry and controlled tiny comparisons against Transformer and symbolic world-model baselines.\n\nNot allowed:\n\nDRM is broadly better than Transformers or world models.\n\nThat distinction matters.\n\nI am currently working on:\n\nRepository:\n\n```\nhttps://github.com/gnai-creator/drm-language-emitter\n```\n\nFeedback, criticism, reproduction attempts, and benchmark suggestions are very welcome.", "url": "https://wpnews.pro/news/introducing-drm-language-emitter", "canonical_source": "https://discuss.huggingface.co/t/introducing-drm-language-emitter/176947#post_1", "published_at": "2026-06-18 07:02:51+00:00", "updated_at": "2026-06-18 07:29:53.798491+00:00", "lang": "en", "topics": ["generative-ai", "large-language-models", "ai-research", "machine-learning"], "entities": ["DRM Language Emitter"], "alternates": {"html": "https://wpnews.pro/news/introducing-drm-language-emitter", "markdown": "https://wpnews.pro/news/introducing-drm-language-emitter.md", "text": "https://wpnews.pro/news/introducing-drm-language-emitter.txt", "jsonld": "https://wpnews.pro/news/introducing-drm-language-emitter.jsonld"}}