
ITNet: A Learnable Integral Transform That Subsumes Convolution, Attention, and Recurrence

Researchers introduced the Integral Transform Network (ITNet), a unified architecture that subsumes convolution, attention, and recurrence as special cases of a learnable integral transform. ITNet matches or exceeds specialized baselines on ImageNet-1K, GLUE, ModelNet40, VQA v2, and NLVR2, demonstrating that a single learned interaction mechanism can recover the behavior of all three architectural families from data.

Published Jun 19, 2026

arXiv:2606.19538v1. Abstract: Convolutional networks, recurrent networks, and transformers each encode different inductive biases -- locality, sequential memory, and content-dependent pairwise interaction -- and have remained mathematically distinct since their inception. We show that this fragmentation reflects not a fundamental diversity in how signals should be processed, but rather incomplete views of a single underlying mathematical object: a learnable integral transform. We introduce the Integral Transform Network (ITNet), a unified architecture built around a learnable kernel that depends jointly on positions and features. This kernel is implemented as a small neural network, specifically an MLP, that models pairwise interactions, enabling the model to adapt its behavior from data. We show that convolution, self-attention (including multi-head), and autoregressive recurrence (including LSTM, GRU, S4, and Mamba) arise as special cases under appropriate parameterizations, and that ITNet is a universal approximator of continuous operators. To make this practical, we develop tiled kernel fusion, importance-weighted Monte Carlo integration, and learned low-rank factorization, enabling efficient and scalable computation. A single ITNet architecture with a shared operator and lightweight modality-specific encoders matches or exceeds specialized baselines on ImageNet-1K, GLUE, ModelNet40, VQA v2, and NLVR2. The results demonstrate that a single learned interaction mechanism can recover the behavior of all three architectural families from data.
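
To make the "special cases" claim concrete, here is a minimal NumPy sketch of a discretized integral transform, y_i = sum_j k(p_i, p_j, x_i, x_j) x_j, where the kernel k sees both positions p and features x. It is not the paper's implementation: the abstract gives no formulas, so the discretization, the `normalize` flag (row normalization, assumed here as the way to recover softmax attention), the identity query/key/value projections, and every function name below are illustrative assumptions. The MLP kernel uses random, untrained weights and only shows the shape of a kernel that depends jointly on positions and features.

```python
import numpy as np

def integral_transform(x, pos, kernel, normalize=False):
    """Discretized transform: y_i = sum_j k(p_i, p_j, x_i, x_j) * x_j."""
    n = len(pos)
    K = np.array([[kernel(pos[i], pos[j], x[i], x[j]) for j in range(n)]
                  for i in range(n)])
    if normalize:
        # Assumption: dividing each row by its sum yields softmax-style weights.
        K = K / K.sum(axis=1, keepdims=True)
    return K @ x

def conv_kernel(taps):
    """Position-only kernel: depends on the offset p_i - p_j, ignores features."""
    half = len(taps) // 2
    def k(p_i, p_j, x_i, x_j):
        off = int(p_i - p_j) + half
        return taps[off] if 0 <= off < len(taps) else 0.0
    return k

def attention_kernel(d):
    """Feature-only kernel: exp of the scaled dot product (identity projections)."""
    def k(p_i, p_j, x_i, x_j):
        return np.exp(x_i @ x_j / np.sqrt(d))
    return k

def mlp_kernel(d, hidden=16, seed=0):
    """Hypothetical joint kernel: a tiny untrained MLP over [p_i, p_j, x_i, x_j]."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(size=(2 + 2 * d, hidden)) / np.sqrt(2 + 2 * d)
    w2 = rng.normal(size=hidden) / np.sqrt(hidden)
    def k(p_i, p_j, x_i, x_j):
        z = np.concatenate([[p_i, p_j], x_i, x_j])
        return float(np.tanh(z @ W1) @ w2)
    return k

# 1-D signal with scalar features: the position-only kernel acts as a convolution.
signal = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
taps = np.array([0.25, 0.5, 0.25])
y_conv = integral_transform(signal, np.arange(5), conv_kernel(taps))

# Token features of width d: the feature-only kernel acts as self-attention.
d = 4
tokens = np.random.default_rng(1).normal(size=(6, d))
y_attn = integral_transform(tokens, np.arange(6), attention_kernel(d), normalize=True)

# Joint position-and-feature kernel, as described in the abstract (untrained here).
y_mlp = integral_transform(tokens, np.arange(6), mlp_kernel(d))
```

In this sketch, `y_conv` agrees with `np.convolve(signal, taps, mode="same")` and `y_attn` agrees with softmax self-attention computed directly on `tokens`, which is the sense in which a single kernel interface can express both. The dense n-by-n kernel matrix costs O(n^2) kernel evaluations; the abstract's tiled kernel fusion, importance-weighted Monte Carlo integration, and learned low-rank factorization are aimed at that cost and are not reproduced here.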
