M.H. Stewart Fellowship
Fellowship awarded by Georgia Tech.
Fellowship for research excellence awarded by Georgia Tech.
Best Poster Award (1st Place) at AASF AIX Summit 2026 for the Free Energy Mixer poster.
Published in ICLR 2024 (Poster), 2024
Presents ARM with AUEL, Random Dropping, and multi‑kernel local smoothing to better capture series‑wise patterns and inter‑series dependencies for long‑term multivariate time series forecasting (TSF).
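As a toy illustration of one named component, multi‑kernel local smoothing, the sketch below (my own illustration, not the ARM implementation; the function name and kernel widths are hypothetical) blends moving averages of several widths over a lookback window.

```python
# Toy sketch of multi-kernel local smoothing (illustrative only):
# smooth the lookback window with moving-average kernels of several
# widths and mix the results, so both sharp and slow-moving local
# patterns are retained.
import numpy as np

def multi_kernel_smooth(x, widths=(3, 7, 15), mix=None):
    """x: (L,) series window. Returns a blend of moving averages."""
    mix = np.ones(len(widths)) / len(widths) if mix is None else mix
    out = np.zeros_like(x, dtype=float)
    for w, m in zip(widths, mix):
        kernel = np.ones(w) / w
        # mode="same" keeps the original length (np.convolve zero-pads the edges)
        out += m * np.convolve(x, kernel, mode="same")
    return out

x = np.sin(np.arange(96) / 6.0) + 0.3 * np.random.default_rng(0).normal(size=96)
print(multi_kernel_smooth(x).shape)  # (96,)
```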
Published in ICML 2024 (Poster), PMLR 235: 32990–33006, 2024
Constructs Auxiliary Time Series (ATS) as exogenous inputs to capture inter‑series relations; identifies continuity, sparsity, and variability principles; improves multivariate TSF even with simple predictors.
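The summary notes that even simple predictors improve once auxiliary series enter as exogenous inputs. The sketch below is my own minimal illustration of that setup, not the paper's ATS construction: a plain least-squares forecaster whose input is the target lookback concatenated with auxiliary lookbacks (all names, shapes, and the toy data are hypothetical).

```python
# Minimal sketch: a linear forecaster consuming [target lookback | auxiliary
# lookbacks] as exogenous features (illustrative only, not the ATS method).
import numpy as np

rng = np.random.default_rng(0)
L, H, A = 96, 24, 4          # lookback length, horizon, number of auxiliary series

def fit_linear_forecaster(X, Y):
    # X: (n_samples, L*(1+A)) flattened inputs; Y: (n_samples, H) future targets.
    # Ordinary least squares via lstsq.
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

# toy data: the target depends weakly on the auxiliary channels
n = 512
aux = rng.normal(size=(n, A, L))
target = aux.mean(axis=1) + 0.1 * rng.normal(size=(n, L))
future = target[:, -1:] + np.cumsum(0.05 * rng.normal(size=(n, H)), axis=1)

X = np.concatenate([target, aux.reshape(n, A * L)], axis=1)
W = fit_linear_forecaster(X, future)
pred = X @ W
print("train MSE:", np.mean((pred - future) ** 2))
```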
Published in ICLR 2025 (Poster), 2025
Reformulates TSF as in‑context learning by constructing tokens of (lookback, future) task pairs, enabling Transformers to adapt predictors from context without parameter updates.
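A minimal sketch of the token construction as I read the summary (not the paper's code; the function name, window sizes, and zero-padding of the query's future slot are my assumptions): each context token packs a (lookback, future) pair from earlier history, and a final query token carries only the current lookback.

```python
# Build in-context tokens of (lookback, future) task pairs plus a query
# token whose future part is left empty (illustrative sketch only).
import numpy as np

def build_in_context_tokens(series, L=32, H=8, n_context=4):
    """series: 1-D array. Returns (tokens, query); each token is the
    concatenation [lookback, future], the query is [lookback, zeros]."""
    tokens = []
    step = L + H
    for i in range(n_context):
        start = i * step
        lookback = series[start : start + L]
        future = series[start + L : start + L + H]
        tokens.append(np.concatenate([lookback, future]))
    q_start = n_context * step
    query = np.concatenate([series[q_start : q_start + L], np.zeros(H)])
    return np.stack(tokens), query

series = np.sin(np.arange(300) / 10.0)
context, query = build_in_context_tokens(series)
print(context.shape, query.shape)   # (4, 40) (40,)
```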
Published in ICML 2025 (Poster), PMLR 267: 40464–40490, 2025
Adds ARMA structure to autoregressive attention via a weighted varying gate, decoupling long‑range and local effects and improving TSF quality without increasing asymptotic complexity.
Published in ICML 2025 (Poster), PMLR 267: 40848–40867, 2025
Shows that a linear attention layer can be interpreted as a dynamic VAR; proposes SAMoVAR to realign multi‑layer Transformers with autoregressive forecasting for improved interpretability and accuracy.
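The algebraic core of that reading can be checked numerically: with causal, unnormalized linear attention, the output at time t is a weighted sum of past value vectors with time-varying coefficients a_{t,s} = q_t · k_s, i.e. a dynamic VAR over the values. The toy check below verifies only this identity; SAMoVAR itself is not implemented here.

```python
# Causal linear attention vs. an explicit dynamic-VAR style expansion.
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4
Q, K, V = (rng.normal(size=(T, d)) for _ in range(3))

# causal linear attention: y_t = sum_{s<=t} (q_t . k_s) v_s
S = np.zeros((d, d))                 # running sum of k_s v_s^T
y_linear = np.zeros((T, d))
for t in range(T):
    S += np.outer(K[t], V[t])
    y_linear[t] = Q[t] @ S

# the same output written as a VAR with time-varying coefficients
y_var = np.zeros((T, d))
for t in range(T):
    for s in range(t + 1):
        a_ts = Q[t] @ K[s]           # time-varying autoregressive weight
        y_var[t] += a_ts * V[s]

print(np.allclose(y_linear, y_var))  # True
```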
Published in NeurIPS 2025 (Spotlight), 2025
Introduces Zero‑Sum Linear Attention (ZeroS), which removes the uniform zero‑order term and reweights residuals to enable stable positive/negative attention weights, allowing contrastive operations within a single layer while retaining O(N) complexity.
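Purely as an illustration of the general idea of zero-sum, signed attention weights at O(N) cost, and not ZeroS itself, the sketch below centers the raw scores so the weights over the context sum to zero, using running sums; every name and the centering scheme are my assumptions.

```python
# Illustration only: zero-sum (signed) attention weights via mean-centering
# of scores, computed with O(N) running sums. Not the ZeroS construction.
import numpy as np

rng = np.random.default_rng(0)
T, d = 8, 4
Q, K, V = (rng.normal(size=(T, d)) for _ in range(3))

S_kv = np.zeros((d, d))   # running sum of k_s v_s^T
k_sum = np.zeros(d)       # running sum of k_s
v_sum = np.zeros(d)       # running sum of v_s
out = np.zeros((T, d))
for t in range(T):
    S_kv += np.outer(K[t], V[t])
    k_sum += K[t]
    v_sum += V[t]
    n = t + 1
    # y_t = sum_s (q_t.k_s) v_s - mean_s(q_t.k_s) * sum_s v_s
    out[t] = Q[t] @ S_kv - (Q[t] @ k_sum / n) * v_sum

# weights w_{t,s} = q_t.k_s - mean_s(q_t.k_s) sum to zero by construction
t = 5
w = Q[t] @ K[: t + 1].T
w -= w.mean()
print(np.isclose(w.sum(), 0.0), np.allclose(out[t], w @ V[: t + 1]))  # True True
```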
Published in ICLR 2026, 2026
Introduces Free Energy Mixer (FEM), which interprets (q,k) attention scores as a prior and performs a log-sum-exp free-energy readout to reweight values at the channel level, enabling a smooth transition from mean aggregation to selective channel-wise retrieval without increasing asymptotic complexity.
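The "mean to selective retrieval" behavior can be seen in a generic channel-wise free-energy readout: treating the softmax of the (q,k) scores as a prior p over positions, each value channel c is read out as F_c = (1/β) log Σ_s p_s exp(β v_{s,c}), which recovers the mean under p as β → 0 and approaches a per-channel max for large β. The sketch below is my illustration of that identity, not the FEM layer; the temperature β and all names are assumptions.

```python
# Channel-wise log-sum-exp (free-energy) readout over values, with the
# softmax of q.k scores as the prior (illustrative sketch only).
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)
T, d = 16, 8
q = rng.normal(size=d)
K, V = rng.normal(size=(T, d)), rng.normal(size=(T, d))

def free_energy_readout(q, K, V, beta):
    log_p = (K @ q) - logsumexp(K @ q)            # log prior from (q, k) scores
    # per-channel free energy: (1/beta) * log sum_s p_s * exp(beta * v_{s,c})
    return logsumexp(log_p[:, None] + beta * V, axis=0) / beta

p = np.exp((K @ q) - logsumexp(K @ q))
print(np.allclose(free_energy_readout(q, K, V, 1e-4), p @ V, atol=1e-3))       # ~ mean under p
print(np.allclose(free_energy_readout(q, K, V, 200.0), V.max(axis=0), atol=0.15))  # ~ channel-wise max
```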
Published in ICML 2026, 2026
Introduces adaptive time series forecasting via symplectic attention, developed through mentored undergraduate research with Jiecheng Lu as corresponding author.
Published in ICML 2026, 2026
Presents an integrated dynamic-MLP perspective on sequence modeling, reinterpreting attention heads through context-instantiated MLP computation and learnable sequence-space mixing.
PhD seminar talk at Georgia Tech ISyE on scaling laws, expressivity-efficiency tradeoffs, and the role of architecture in sequence modeling.
Invited online talk hosted by Tsinghua University on HyperMLP and an integrated view of sequence modeling.
ML PhD seminar talk at Georgia Tech on scaling laws, expressivity-efficiency tradeoffs, and the role of architecture in sequence modeling.
Independent Instructor, Georgia Institute of Technology, 2026
Independent instructor for ISyE 4031 Regression and Forecasting at Georgia Tech in Summer 2026.