Released larkos 0.3

Larkos 0.3 introduces a GAT-based neuron graph reasoner, a temporal attention encoder, and a refactored fusion head. The fusion head now pools with learned-query attention instead of mean-pooling, memory limits are raised (MEM_TOP_K 8→32, MAX_MEM_ENTRIES 300→1200), and the C-side fusion pipeline drops its neuron projection path. These changes aim to improve reasoning over neuron states and temporal dynamics.
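
To make the pooling change concrete, here is a minimal pure-Python sketch contrasting mean-pooling with single-query attention pooling. It is not the FusionTransformerHead code: the function names, the toy tokens, the fixed query vector standing in for a learned parameter, and the 1/sqrt(d) score scaling are all assumptions made for illustration.

```python
import math

def mean_pool(tokens):
    """Average the token vectors elementwise."""
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]

def attention_pool(tokens, query):
    """Pool tokens with one query vector: softmax over the sequence, then a weighted sum.

    The 1/sqrt(d) scaling is an assumption borrowed from standard
    scaled dot-product attention, not taken from the release notes.
    """
    d = len(query)
    scores = [sum(q * t for q, t in zip(query, tok)) / math.sqrt(d) for tok in tokens]
    top = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(w * tok[i] for w, tok in zip(weights, tokens)) for i in range(d)]
    return pooled, weights

# Toy sequence of three 2-dimensional tokens (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]  # stand-in for the learned query parameter

pooled, weights = attention_pool(tokens, query)
```

With a zero query every score is equal, so the attention pool reduces to the mean pool; a non-zero query lets the head weight some tokens above others, which a plain mean cannot do.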

Larkos 0.3: GAT neuron reasoning, temporal encoder, refactored fusion head

Core architecture changes:
- Add NeuronGraphReasoner: two-layer GAT over the live neuron graph, producing per-neuron token embeddings (MAX_NEURONS x FUSE_GRAPH_DMODEL). Node features include state, output, layer one-hot, connection degree, state velocity, output magnitude, and mean edge weight (D_NODE=8). Learned per-neuron embedding ensures distinct tokens for symmetric nodes.
- Add GATLayer: hand-rolled multi-head GAT with edge-weight-modulated scores (tanh-squashed gain), dense (N,N) adjacency mask, and self-loops.
- Add TemporalAttentionEncoder: two-layer transformer over the (TEMPORAL_WINDOW, FOURIER_OUT_DIM) input history with learned positional embedding, replacing flat concatenation of window frames.
- Refactor FusionTransformerHead: attends over MAX_NEURONS+3 token sequence (GAT tokens + band_q + band_m + driver). Token-type embeddings (4 types) distinguish token kinds. Replaces mean-pool with learned-query attention pool (single query, softmax over sequence). Head capacity increased: 3 layers, d_model=64, dim_ff=128.

C-side fusion (fusion_mechanism.c):
- Remove BAND_N and the neuron flat projection pipeline; neuron reasoning is now handled end-to-end by the Python-side GAT.
- BAND_Q=32, BAND_M=32, FUSION_DIM=BAND_Q+BAND_M=64.
- MEM_TOP_K: 8→32, MAX_MEM_ENTRIES: 300→1200.

Training loop:
- Freeze cache extended to cover driver embedding (cached_driver) and GAT inputs (cached_graph_inputs). All three caches invalidated together on target refresh.
- GAT runs forward from inputs in-graph every step (pinned inputs, live gradient).
- x_temporal detached before MAML inner loop to prevent double-backward through the temporal encoder graph.
- graph_reasoner and temporal_encoder added to optimizer and checkpoint.
- Verifier re-runs temporal encoder on cached raw sequence to avoid reusing a consumed autograd graph.
- LR sensitivity check uses relative threshold (15% of current loss) instead of fixed absolute delta.
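
The relative threshold in the LR sensitivity check can be sketched as follows. This is only an illustration of why a relative threshold behaves differently from a fixed delta: the function names, the idea of comparing a "probe" loss against the current loss, and the example numbers are assumptions, and the notes do not say what the training loop does with the result.

```python
REL_THRESHOLD = 0.15  # 15% of current loss, as stated in the release notes

def exceeds_relative_threshold(current_loss, probe_loss, rel=REL_THRESHOLD):
    """True when the loss change is larger than rel * |current_loss|."""
    return abs(probe_loss - current_loss) > rel * abs(current_loss)

def exceeds_absolute_threshold(current_loss, probe_loss, delta):
    """Fixed-delta comparison, shown only for contrast (delta is hypothetical)."""
    return abs(probe_loss - current_loss) > delta

# Early in training: loss 2.0, a 0.2 change is only 10% of the loss.
early = exceeds_relative_threshold(2.0, 2.2)

# Late in training: loss 0.02, a 0.004 change is 20% of the loss.
late = exceeds_relative_threshold(0.02, 0.024)

# A fixed delta of 0.1 (hypothetical) flags the early case and misses the late one.
early_abs = exceeds_absolute_threshold(2.0, 2.2, delta=0.1)
late_abs = exceeds_absolute_threshold(0.02, 0.024, delta=0.1)
```

The relative form scales with the loss, so the same check stays meaningful as the loss shrinks over training.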

Runner:
- Add advance_backend(): runs C-side decision/context/neuron/attractor/affective updates before each step so multi-step inference sees evolving state.
- alpha and mem_weight_ratio derived from live backend context by default, matching the training loop's per-epoch derivation.
- Temporal encoder and graph reasoner included in forward path.

Checkpoint:
- Saves/loads graph_reasoner and temporal_encoder (strict=False for backward compatibility with pre-0.3 checkpoints).
- FUSION_DIM and fused_cog_norm dimension mismatch detection with safe fallback to fresh init.
- cached_driver persisted alongside cached_fused_cog.
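
The dimension-mismatch fallback on checkpoint load could look roughly like the sketch below. Everything here is hypothetical: the checkpoint is modeled as a plain dict, fused_cog_norm is treated as a flat list with one entry per fused dimension, the key names mirror the identifiers above but may not match the real file layout, and a zero vector stands in for whatever "fresh init" means in the actual code.

```python
FUSION_DIM = 64  # BAND_Q + BAND_M in 0.3

def restore_fused_cog_norm(checkpoint, expected_dim=FUSION_DIM):
    """Return (vector, restored). Falls back to a fresh vector on any mismatch.

    `checkpoint` is a hypothetical dict; the real checkpoint format is not
    described in the release notes.
    """
    saved_dim = checkpoint.get("FUSION_DIM")
    saved_vec = checkpoint.get("fused_cog_norm")
    if saved_dim == expected_dim and saved_vec is not None and len(saved_vec) == expected_dim:
        return list(saved_vec), True
    # Mismatch or missing entry: ignore the saved state and start fresh.
    return [0.0] * expected_dim, False

# A checkpoint whose dimensions match the current build.
matching = {"FUSION_DIM": 64, "fused_cog_norm": [1.0] * 64}
vec_ok, restored_ok = restore_fused_cog_norm(matching)

# A checkpoint saved with a different fused dimension (48 is an arbitrary example).
stale = {"FUSION_DIM": 48, "fused_cog_norm": [1.0] * 48}
vec_stale, restored_stale = restore_fused_cog_norm(stale)
```

Checking both the recorded FUSION_DIM and the stored vector length guards against a checkpoint whose metadata and payload disagree.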