CV
Jiecheng Lu
Summary
Ph.D. student in Machine Learning at Georgia Tech, advised by Shihao Yang. My research focuses on sequence-model architectures that improve expressivity under practical compute constraints, spanning attention mechanisms, linear attention, dynamic-MLP views of sequence modeling, and time series foundation models. From 2024 to 2026, I have published 9 papers as the main contributor at top machine learning venues, including ICLR, ICML, and NeurIPS.
Research Interests
Foundation models; efficient and expressive sequence modeling; linear attention; long-context reasoning; time series analysis; sequence modeling across NLP, vision, and scientific data.
Education
- Ph.D. in Machine Learning, Georgia Institute of Technology, Atlanta, GA. Aug. 2023 - present. Advisor: Shihao Yang.
- M.S. in Analytics, Georgia Institute of Technology, Atlanta, GA. Aug. 2021 - Aug. 2023.
- B.S. in Logistics Engineering, Tianjin University, Tianjin, China. Sep. 2016 - Jun. 2020.
Work Experience
- Data Scientist Intern, Tencent Medical AI Lab (JARVIS Lab), Shenzhen, China. Jul. 2021 - Jul. 2022.
- Led development of deep learning and statistical models for medical insurance forecasting and epidemic monitoring; co-authored 2 medical time-series papers, filed 6 patents as first inventor, and delivered technical solutions for medical facilities across 4 major cities handling up to 10 million data points daily.
- Business Analyst Intern, Amazon Private Brands (Global Sourcing Team), Shenzhen, China. Jan. 2021 - Jul. 2021.
- Applied machine learning and causal inference to cost analysis and sourcing decisions; built AWS-based dashboard web apps that informed decision-making and reported savings of over USD 100,000 monthly.
- Research Assistant, Peking University, Guanghua School of Management, Beijing, China. Sep. 2020 - Jan. 2021.
- Strategic and Data Analytics Intern, Country Garden Group, Foshan, China. Jul. 2020 - Sep. 2020.
Publications
First-Authored Papers
- HyperMLP: An Integrated Perspective for Sequence Modeling, 2026.
- Free Energy Mixer. Jiecheng Lu, Shihao Yang. ICLR 2026. Introduces the Free Energy Mixer (FEM), which interprets attention scores as a prior and performs a free-energy readout for channel-wise selective retrieval without increasing asymptotic complexity.
- ZeroS: Zero-Sum Linear Attention for Efficient Transformers. Jiecheng Lu, Xu Han, Yan Sun, Viresh Pati, Yubin Kim, Siddhartha Somani, Shihao Yang. NeurIPS 2025 (Spotlight).
- Linear Transformers as VAR Models: Aligning Autoregressive Attention Mechanisms with Autoregressive Forecasting, 2025.
- WAVE: Weighted Autoregressive Varying Gate for Time Series Forecasting, 2025.
- In-context Time Series Predictor, 2025.
- CATS: Enhancing Multivariate Time Series Forecasting by Constructing Auxiliary Time Series as Exogenous Variables, 2024.
- ARM: Refining Multivariate Forecasting with Adaptive Temporal-Contextual Learning, 2024.
Corresponding-Author Publications (Mentored Undergraduate Research)
- StretchTime: Adaptive Time Series Forecasting via Symplectic Attention. Yubin Kim, Viresh Pati, Jevon Twitty, Vinh Pham, Shihao Yang, Jiecheng Lu*. ICML 2026. (*Corresponding author; mentored undergraduate first authors.)
- CAPS: Unifying Attention, Recurrence, and Alignment in Transformer-based Time Series Forecasting. Viresh Pati, Yubin Kim, Vinh Pham, Jevon Twitty, Shihao Yang, Jiecheng Lu*. Under review. (*Corresponding author; mentored undergraduate first authors.)
Talks
- Rethinking Sequence Modeling with HyperMLP: An Integrated Architectural Perspective. Invited talk, Knowledge Engineering Group, Tsinghua University (hosted by Prof. Jie Tang), online. Mar. 2026.
- Rethinking Sequence Modeling: LLM Scaling Laws, Expressivity-Efficiency Tradeoffs, and the Role of Architecture. Georgia Tech Machine Learning Student Seminar, Atlanta, GA. Apr. 2026.
- Rethinking Sequence Modeling: LLM Scaling Laws, Expressivity-Efficiency Tradeoffs, and the Role of Architecture. Georgia Tech ISyE PhD Student Seminar, Atlanta, GA. Feb. 2026.
Awards
- AASF AIX Summit 2026. Poster: Free Energy Mixer.
- Georgia Tech
Computing Grants
- Subquadratic HyperMLP via Convolution-Based Sequence Mixing. Lambda Research Grant (PI). Mar. 2026 - Mar. 2027.
- Scaling Zero-Sum Linear Attention for Cross-Modal Foundation Models. NVIDIA Academic Grant (awarded project). Apr. 2026 - Sep. 2026.
- Advanced Transformer-based Models for Long-term Time Series Forecasting. NSF ACCESS Program (awarded project). Sep. 2025 - Sep. 2026.
Leadership, Teaching, and Service
- Research leadership and mentoring: Since Spring 2025, I have organized and led weekly research discussions for a five-student Georgia Tech undergraduate Sequential AI team, mentoring students on experiment design and paper writing. As of Spring 2026, the team has completed three manuscripts: one published paper (ZeroS, NeurIPS 2025 Spotlight) and two ICML 2026 submissions with undergraduate first authors and me as corresponding/final author.
- Teaching preparation: Independent instructor for ISyE 4031 Regression and Forecasting, Georgia Tech, Summer 2026. Participant in Georgia Tech's Tech to Teaching program; completed CETL 8717 Course Design for Higher Education.
- Reviewing service: NeurIPS 2024-2026, ICLR 2025-2026, ICML 2025-2026, and IEEE Internet of Things Journal.
Teaching
- ISyE 4031 Regression and Forecasting, Georgia Institute of Technology. Independent instructor, Summer 2026.
Research Translation
- CDC FluSight forecasting hub, Fall 2024 - present. Submit weekly influenza hospitalization forecasts for all U.S. states to the CDC FluSight forecasting hub (entry name: Gatech-ensemble), translating time-series forecasting research into public-health decision support.
Patents
- CN115114345B: Feature Representation Extraction for Time-Series Data
- CN114358186B: Data Processing for Attention-Based Time-Series Forecasting
- CN115130656A: Training an Anomaly Detection Model for Time-Series Data
- CN117010542A: Multi-Source Multimodal Data Prediction
- CN117012347A: Medical Data Generation from Unordered Inputs
- CN114676176A: Interpretable Time-Series Forecasting