News

31277 articles · page 623 of 1564 · 0 sources · 30 min sync cycle

04:00
2026-06-15
arxiv.org
large-language-models · 1m read · neu

The Culture Funnel: You Can't Align What isn't in the Data

Researchers at Cohere Labs argue that large language models suffer from a 'cultural data funnel,' in which cultural signals decline sharply during post-training while geographically concentrated data dominates. They release …

04:00
2026-06-15
arxiv.org
neural-networks · 1m read · neu

The Weight Norm Sets the Grokking Timescale: A Causal Delay Law

An arXiv preprint demonstrates causally that the weight norm sets the grokking timescale in neural networks, settling a dispute over whether weight norm causes the delayed generalization. By intervening on the no…

04:00
2026-06-15
arxiv.org
machine-learning · 1m read ↑ pos

Diffusion Policy Optimization without Drifting Apart

Researchers identified the double-drift phenomenon causing instability in diffusion policy-gradient methods and proposed DiPOD, a framework that interleaves self-distillation with policy-improving gradient updates to mai…

04:00
2026-06-15
arxiv.org
neural-networks · 1m read ↑ pos

Neural Slack Variables for Shape Constraints

Researchers introduced neural slack variables, a primal-side approach that converts constraint enforcement into a regression problem by coupling a primary network with an auxiliary network. The method achieved zero measu…

04:00
2026-06-15
arxiv.org
machine-learning · 1m read · neu

Uncertainty Estimation and Generalization Bounds for Modern Deep Learning

A new thesis investigates how Bayesian principles can improve understanding of modern deep learning systems, introducing the Deep Variational Implicit Process (DVIP) and post-hoc methods VaLLA and FMGP for uncertainty es…

04:00
2026-06-15
arxiv.org
machine-learning · 1m read ↑ pos

Gefen: Optimized Stochastic Optimizer

Researchers propose Gefen, a memory-efficient optimizer that reduces AdamW's memory footprint by ~8x while maintaining performance, enabling larger microbatches and improved throughput in deep learning training. Gefen au…
