Prompt Physics: Building a Cognitive Steering Layer for Gemma 4

**Explaind** is a "cognitive steering layer" for Google's Gemma 4 model that functions as a structured prompt-assembly harness. Rather than being a chatbot or agent system, it is designed to explicitly engineer and control *how* Gemma 4 reasons by targeting the model's documented failure modes, such as weak system prompt adherence and overconfidence. The system demonstrates its effectiveness by producing distinct reasoning trajectories (e.g., skeptical, causal, adversarial) and genuine self-critique from the same model and question.

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

explaind is a local-first cognitive steering layer for Gemma 4. It is not a chatbot wrapper, not an agent system, and not a RAG tool. It is a structured prompt-assembly harness that shapes how Gemma 4 reasons, not just what it says.

The core thesis: for instruction-tuned models like Gemma 4, prompt structure is part of the system design. The harness, the template, and the injection positions are not presentation; they are the engineering. explaind makes that engineering explicit, inspectable, and testable.

The system is built around Gemma 4's documented failure modes rather than pretending they do not exist. Weak system prompt adherence, overconfidence, preference for parametric knowledge over injected context, and stochastic output variance are not problems to work around; they are the design brief.

What it does:

Every design decision maps to a documented Gemma 4 failure mode:

Run `explaind --full-demo` to see the full walkthrough live.

Demo 1: Same question, three reasoning trajectories

"Was the 2008 financial crisis preventable?"

Skeptical interrogates the question's framing before engaging its content:

    Surfaced Assumptions Embedded in the Question's Framing:
    1. Assumption of Actionability: The term "preventable" implies that there exists a clear, identifiable intervention point...
    2. Assumption of Linear Causality: The framing suggests a simple cause-and-effect relationship...
    3. Assumption of Moral Culpability: The question implicitly seeks a judgment on whether actors should have acted differently...

    Null Hypothesis Test:
    The null hypothesis which the skeptical analysis must test is that the crisis was inevitable...

Causal traces the mechanism backward from outcome to root condition:

    Chain Trace (Working Backward):
    Proximate Cause <- Failure of Liquidity/Solvency
    Failure of Liquidity <- Excessive Leverage and Under-Capitalization
    Excessive Leverage <- Lax Risk Management and Regulatory Arbitrage
    Lax Risk Management <- Structural Flaws (Root Conditions)

    Trigger vs. Root Separation:
    Root Conditions: deregulation, complex financial instruments, failure of regulatory bodies to enforce adequate capital requirements
    Triggering Condition: collapse in the U.S. subprime mortgage market

Devil constructs the strongest opposing case:

    The strongest genuine counterargument is that the 2008 crisis was not preventable: an inevitable systemic consequence arising from inherent structural flaws, complexity, and interconnectedness of the global financial architecture.

    The strongest version of the opposing case: The 2008 crisis was not preventable because it was an emergent property of a highly complex, interconnected, and inadequately regulated system.

Three clearly distinct reasoning trajectories. Same model. Same question. Different prompt physics.

Demo 2: Self-critique (honest mode)

"AI will eliminate most jobs within 10 years."

The initial response (balanced ability) acknowledges the claim and preserves uncertainty. The self-critique (skeptical audit) then interrogates it:

    Surfaced Assumptions Embedded in the Claim:
    1. Linearity of Technological Trajectory: The claim assumes the current pace of AI will continue without inflection points...
    2. Negligible Adaptation Rate: It assumes workforce capacity for reskilling will be insufficient...
    3. Stable Definition of "Job": The claim implicitly assumes the concept of a "job" remains relatively stable...

    Evidence Gap Analysis:
    - No evidence detailing mechanisms by which AI leads to elimination rather than transformation
    - No longitudinal data on workforce adaptation programs
    - "Most jobs" is undefined, so the claim cannot be empirically tested

The self-critique is substantively different from the initial response: it is not a restatement but a genuine adversarial audit.

Demo 3: Calibrated epistemic reasoning

"Is the scientific consensus on climate change settled?"

    HIGH confidence: A broad scientific consensus exists regarding the fundamental physics of the greenhouse effect and the role of anthropogenic emissions in driving current global warming trends.

    MEDIUM confidence: The consensus is robust regarding the existence of human-caused warming, but not entirely "settled" regarding all future projections or the precise magnitude of future impacts.

    Falsification Conditions:
    1. The consensus would be overturned if high-quality independent research definitively demonstrated that primary drivers of warming are not anthropogenic...

    Unknown Inventory (What is NOT known):
    1. The precise, non-linear tipping points for climate feedback loops
    2. The exact socio-economic consequences of various climate scenarios
    3. The precise weighting of uncertainty across scientific disciplines

Explicit confidence markers. Named assumptions. Falsification conditions. Unknown inventory. This is calibrated reasoning, not performative hedging.

Full source code and README: brendanddev/explaind

    python3.11 -m venv .venv && source .venv/bin/activate
    pip install -e .
    ollama pull gemma4-e2b_q4_k_m:latest
    explaind --full-demo    # full narrative walkthrough
    explaind --demo         # three curated live demos

A `--consensus N` flag is also available.

Model choice: gemma4-e2b_q4_k_m (quantized E2B)

I chose the E2B variant specifically for the edge deployment story: it runs on 8GB unified memory, which means the entire system works on a MacBook Air with no cloud dependency. The E2B is small enough to iterate with but capable enough to show real reasoning differentiation across abilities.

More importantly, E2B's sensitivity to structured prompts is what makes the prompt physics approach work. A less instruction-sensitive model would ignore the BIAS FIELD. A model with perfect instruction following wouldn't need it.

The architecture:

The assembled prompt follows a strict layer order:

    SYSTEM PROMPT        <- primacy anchor injected here
    GEMMA.md             <- universal invariant layer
                         <- periodic refresh 1
    ABILITY              <- structured bias vector
                         <- periodic refresh 2
    CONTEXT WINDOW       <- scratchpad + context injection
    COGNITIVE SCAFFOLD   <- optional, --chain --scaffold only
    BIAS FIELD           <- recency position, strongest signal
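
To make the layer order concrete, here is a minimal Python sketch of how such an assembly step could look. It is not explaind's actual code: the `PromptLayers` class, its field names, and the `assemble_prompt` function are hypothetical, and it assumes the two periodic refreshes repeat the same short reminder string and that layers are joined with blank lines. Only the ordering itself comes from the diagram above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PromptLayers:
    """Hypothetical container; field names are illustrative, not explaind's API."""
    system_prompt: str
    gemma_md: str
    ability: str
    context_window: str
    bias_field: str
    primacy_anchor: str = ""                  # injected at the system prompt position
    refresh: str = ""                         # assumed: one reminder reused for both refreshes
    cognitive_scaffold: Optional[str] = None  # optional layer


def assemble_prompt(layers: PromptLayers) -> str:
    """Join the layers in the fixed order from the diagram, skipping empty ones."""
    system = "\n".join(
        part for part in (layers.primacy_anchor, layers.system_prompt) if part
    )
    ordered = [
        system,                     # SYSTEM PROMPT (with primacy anchor)
        layers.gemma_md,            # GEMMA.md
        layers.refresh,             # periodic refresh 1
        layers.ability,             # ABILITY
        layers.refresh,             # periodic refresh 2
        layers.context_window,      # CONTEXT WINDOW
        layers.cognitive_scaffold,  # COGNITIVE SCAFFOLD (optional)
        layers.bias_field,          # BIAS FIELD, placed last for recency
    ]
    return "\n\n".join(part for part in ordered if part)


if __name__ == "__main__":
    demo = PromptLayers(
        system_prompt="You are a careful analyst.",
        gemma_md="Universal invariants go here.",
        ability="Ability: skeptical.",
        context_window="Question: Was the 2008 financial crisis preventable?",
        bias_field="Surface assumptions before answering.",
        primacy_anchor="Follow the ability strictly.",
        refresh="Reminder: stay in the selected ability.",
    )
    print(assemble_prompt(demo))
```

Keeping the order in one function means the injection positions can be inspected and tested directly, which fits the claim that the positions are part of the engineering rather than presentation.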