OLMo-core + Engram graft: small-scale debug comparison


Source: discuss.huggingface.co · Published: 2026-06-21T19:13Z

A debug comparison between a base OLMo3 600M model and an Engram memory variant showed the grafted model achieved lower training and evaluation cross-entropy loss and faster gradient norm stabilization, indicating successful integration and improved early learning behavior.

I ran a 200-step debug comparison (global_batch_size=32) between a base OLMo3 600M model and the same dense backbone with a DeepSeek-style Engram memory graft.

The goal was to check whether the custom module was wired correctly, whether FSDP/HSDP wrapping and optimizer handling were stable, and whether the training/eval curves looked coherent.

Base model:

Engram variant:

~1.7B trainable parameters

Engram injected into layers 1 and 5

Most added parameters come from sparse/hash-memory capacity, so total parameter count is not an apples-to-apples proxy for dense active compute.
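To make the total-versus-active distinction concrete, here is a minimal sketch in Python. It assumes a memory that stores a table of rows and reads only a few of them per token. The class `HashedMemory`, its fields, and all numbers are hypothetical illustrations, not the Engram implementation or the configuration used in this run.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class HashedMemory:
    """Toy description of a sparse lookup table (hypothetical, not the Engram API)."""

    num_slots: int          # rows stored in the table
    dim: int                # width of each row
    lookups_per_token: int  # rows actually read for one token

    @property
    def total_params(self) -> int:
        # Every row is trainable, so every row counts toward the total.
        return self.num_slots * self.dim

    @property
    def active_params_per_token(self) -> int:
        # Only the rows that are looked up take part in one token's forward pass.
        return self.lookups_per_token * self.dim


def summarize(dense_params: int, memories: list[HashedMemory]) -> dict[str, int]:
    """Return total trainable parameters and parameters touched per token."""
    total = dense_params + sum(m.total_params for m in memories)
    active = dense_params + sum(m.active_params_per_token for m in memories)
    return {"total": total, "active_per_token": active}


if __name__ == "__main__":
    # Illustrative numbers only; they are NOT the configuration from this post.
    memories = [HashedMemory(num_slots=1_000_000, dim=512, lookups_per_token=4)] * 2
    print(summarize(dense_params=600_000_000, memories=memories))
```

Under these assumptions the total grows with the table size while the per-token figure barely moves, which is why comparing the two models by total parameter count alone can mislead.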

Both were trained with the Dion optimizer. Under the same short debug setup, the Engram variant showed:

lower train CE loss

lower eval CE loss / PPL

slightly faster grad-norm stabilization

The early signal is encouraging: the Engram graft is training-shaped, stable, and appears to improve early learning behavior in this setup.
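The curve comparisons above can be made a little more precise with two small helpers, sketched here in Python. The first uses the standard relation that perplexity is the exponential of the mean cross-entropy in nats. The second is one possible heuristic for "grad-norm stabilization"; the function name, window size, and tolerance are assumptions for illustration, not the criterion used for the runs in this post.

```python
import math
from typing import Optional, Sequence


def perplexity(mean_ce_nats: float) -> float:
    """Perplexity is exp of the mean per-token cross-entropy (natural log)."""
    return math.exp(mean_ce_nats)


def first_stable_step(
    grad_norms: Sequence[float],
    window: int = 20,
    rel_tol: float = 0.1,
) -> Optional[int]:
    """Return the first index where a window of grad norms is 'flat'.

    A window counts as flat when (max - min) / mean <= rel_tol.
    This is a hypothetical heuristic; returns None if no window qualifies.
    """
    for start in range(len(grad_norms) - window + 1):
        chunk = grad_norms[start:start + window]
        mean = sum(chunk) / window
        if mean > 0 and (max(chunk) - min(chunk)) / mean <= rel_tol:
            return start
    return None


if __name__ == "__main__":
    # Made-up values to show usage; these are not logged results.
    print(perplexity(2.0))
    print(first_stable_step([10, 5, 2, 1, 1, 1, 1], window=3))
```

Running `first_stable_step` on the logged grad-norm series of both runs with the same `window` and `rel_tol` would turn "slightly faster" into a pair of step numbers that can be compared directly.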

Custom architecture work is not just “does the forward pass run?”

For this integration, the parameter hierarchy, wrapping policy, optimizer handling, memory profile, and training curves all had to line up. Earlier versions trained correctly in the mathematical sense but had poor memory behavior, because the custom modules were not placed inside the wrapped block hierarchy.
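One cheap way to catch the placement problem described above is to scan parameter names before training. The sketch below assumes names follow the module hierarchy (as with PyTorch's `named_parameters()`) and that wrapped blocks live under a `blocks.<index>.` prefix. Both the prefix and the example names are hypothetical and may not match OLMo-core's actual module layout.

```python
import re
from typing import Iterable, List

# Hypothetical naming scheme: each wrapped transformer block is "blocks.<index>."
BLOCK_PATTERN = re.compile(r"^blocks\.\d+\.")


def find_unwrapped(param_names: Iterable[str], keyword: str = "engram") -> List[str]:
    """Return custom-module parameters that sit outside any wrapped block.

    A parameter is flagged when its name contains `keyword` but does not
    start with the block prefix, meaning a per-block wrapping policy
    would not cover it.
    """
    return [
        name
        for name in param_names
        if keyword in name and not BLOCK_PATTERN.match(name)
    ]


if __name__ == "__main__":
    # Example names are invented for illustration.
    names = [
        "blocks.0.attention.w_q.weight",
        "blocks.1.engram.table.weight",  # inside a block: covered by wrapping
        "engram_5.table.weight",         # top-level: would be missed
    ]
    print(find_unwrapped(names))
```

In a real run the list would come from the model's own parameter names, and a non-empty result would be a reason to move the module under the block it belongs to before looking at memory profiles.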

W&B logs: [Weights & Biases](https://wandb.ai/jenwei0312/olmo3-engram-experiments)
