HyperMLP: An Integrated Perspective for Sequence Modeling

Published in ICML 2026

HyperMLP develops an integrated architectural view of sequence modeling by interpreting autoregressive attention heads as dynamic MLPs whose weights are instantiated from the context history: at each step, the keys and values of the prefix play the role of layer weights applied to the current query. Building on this view, the method introduces learnable mixing along the sequence dimension, aligned with autoregressive (causal) semantics, to improve expressive routing under practical compute constraints.
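The attention-as-dynamic-MLP reading can be made concrete with a small numerical check. This is a generic sketch of that equivalence (not the paper's implementation): for a single causal head, the prefix keys act as the first-layer weights, softmax as the nonlinearity, and the prefix values as the second-layer weights of an MLP applied to the current query. All names and shapes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4  # sequence length and head dimension (illustrative)
Q, K, V = (rng.standard_normal((T, d)) for _ in range(3))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Standard causal attention: position t attends only to its prefix.
attn_out = np.stack([
    softmax(K[: t + 1] @ Q[t] / np.sqrt(d)) @ V[: t + 1]
    for t in range(T)
])

# Dynamic-MLP view: at step t, the prefix keys K[:t+1] are the
# first-layer weights, softmax is the nonlinearity, and the prefix
# values V[:t+1] are the second-layer weights, all instantiated
# from the context history rather than learned as static parameters.
def dynamic_mlp(q, W1, W2):
    hidden = softmax(W1 @ q / np.sqrt(d))  # first layer + nonlinearity
    return W2.T @ hidden                   # second layer

mlp_out = np.stack([
    dynamic_mlp(Q[t], K[: t + 1], V[: t + 1]) for t in range(T)
])

assert np.allclose(attn_out, mlp_out)  # the two views coincide
```

The equivalence holds per head and per step; the "MLP" simply grows its hidden width with the prefix length, which is what makes its weights context-dependent.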

Links: arXiv · ICML poster page