{"slug": "released-larkos-0-3", "title": "Released larkos 0.3", "summary": "Larkos 0.3 introduces a GAT-based neuron graph reasoner, a temporal attention encoder, and a refactored fusion head. The update replaces mean-pooling with learned-query attention and increases memory capacity, while the C-side fusion pipeline is simplified. These changes aim to improve reasoning over neuron states and temporal dynamics.", "body_md": "Larkos 0.3: GAT neuron reasoning, temporal encoder, refactored fusion head\n\nCore architecture changes:\n\nAdd _NeuronGraphReasoner: two-layer GAT over the live neuron graph, producing per-neuron token embeddings (MAX_NEURONS x FUSE_GRAPH_DMODEL). Node features include state, output, layer one-hot, connection degree, state velocity, output magnitude, and mean edge weight (D_NODE=8). Learned per-neuron embedding ensures distinct tokens for symmetric nodes.\n\nAdd _GATLayer: hand-rolled multi-head GAT with edge-weight-modulated scores (tanh-squashed gain), dense [N,N] adjacency mask, and self-loops.\n\nAdd _TemporalAttentionEncoder: two-layer transformer over the [TEMPORAL_WINDOW, FOURIER_OUT_DIM] input history with learned positional embedding, replacing flat concatenation of window frames.\n\nRefactor _FusionTransformerHead: attends over MAX_NEURONS+3 token sequence (GAT tokens + band_q + band_m + driver). Token-type embeddings (4 types) distinguish token kinds. Replaces mean-pool with learned-query attention pool (single query, softmax over sequence). Head capacity increased: 3 layers, d_model=64, dim_ff=128.\n\nC-side fusion (fusion_mechanism.c):\n\nRemove BAND_N and the neuron_flat projection pipeline; neuron reasoning is now handled end-to-end by the Python-side GAT.\n\nBAND_Q=32, BAND_M=32, FUSION_DIM=BAND_Q+BAND_M=64.\n\nMEM_TOP_K: 8→32, MAX_MEM_ENTRIES: 300→1200.\n\nTraining loop:\n\nFreeze cache extended to cover driver embedding (_cached_driver) and GAT inputs (_cached_graph_inputs). All three caches invalidated together on target refresh. GAT runs forward_from_inputs in-graph every step (pinned inputs, live gradient).\n\nx_temporal detached before MAML inner loop to prevent double-backward through the temporal encoder graph.\n\ngraph_reasoner and temporal_encoder added to optimizer and checkpoint.\n\nVerifier re-runs temporal_encoder on cached raw sequence to avoid reusing a consumed autograd graph.\n\nLR sensitivity check uses relative threshold (15% of current loss) instead of fixed absolute delta.\n\nRunner:\n\nAdd _advance_backend(): runs C-side decision/context/neuron/attractor/affective updates before each step() so multi-step inference sees evolving state.\n\nalpha and mem_weight_ratio derived from live backend context by default, matching the training loop's per-epoch derivation.\n\nTemporal encoder and graph reasoner included in forward path.\n\nCheckpoint:\n\nSaves/loads graph_reasoner and temporal_encoder (strict=False for backward compatibility with pre-0.3 checkpoints).\n\nFUSION_DIM and fused_cog_norm dimension mismatch detection with safe fallback to fresh init.\n\ncached_driver persisted alongside cached_fused_cog.", "url": "https://wpnews.pro/news/released-larkos-0-3", "canonical_source": "https://dev.to/okerew/released-larkos-03-19be", "published_at": "2026-06-13 15:56:54+00:00", "updated_at": "2026-06-13 16:14:50.483334+00:00", "lang": "en", "topics": ["artificial-intelligence", "machine-learning", "neural-networks", "ai-research", "developer-tools"], "entities": ["Larkos"], "alternates": {"html": "https://wpnews.pro/news/released-larkos-0-3", "markdown": "https://wpnews.pro/news/released-larkos-0-3.md", "text": "https://wpnews.pro/news/released-larkos-0-3.txt", "jsonld": "https://wpnews.pro/news/released-larkos-0-3.jsonld"}}