About
I am a CS Ph.D. student at the University of Illinois Urbana-Champaign in the PL/FM/SE group, advised by Prof. Lingming Zhang. You can find my CV here.
My research fields are Software Engineering and Machine Learning. I focus on building large language models to solve software engineering tasks, and I am particularly interested in improving LLMs’ reasoning and planning capabilities for code generation and repair through pre-training and post-training.
I obtained two bachelor’s degrees from Tsinghua University: one in Software Engineering from the School of Software and one in Business Administration from the School of Economics and Management. During my undergraduate years, I was a research assistant in the Software System Security Assurance Group, advised by Prof. Yu Jiang.
Selected Publications (See full list on Google Scholar)
- Horizon-Length Prediction: Advancing Fill-in-the-Middle Capabilities for Code Generation with Lookahead Planning
Yifeng Ding, Hantian Ding, Shiqi Wang, Qing Sun, Varun Kumar, and Zijian Wang
arXiv; NeurIPS 2024 System 2 Reasoning Workshop. [preprint] [summary]
- XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts
Yifeng Ding, Jiawei Liu, Yuxiang Wei, and Lingming Zhang
62nd Annual Meeting of the Association for Computational Linguistics
(ACL 2024), Pages 12941–12955, August 2024. [paper] [summary] [code]
- SelfCodeAlign: Self-Alignment for Code Generation
Yuxiang Wei, Federico Cassano, Jiawei Liu, Yifeng Ding, Naman Jain, Zachary Mueller, Harm de Vries, Leandro Von Werra, Arjun Guha, and Lingming Zhang
Thirty-eighth Conference on Neural Information Processing Systems
(NeurIPS 2024), To appear, December 2024. [preprint] [code] [model] [dataset]
- Magicoder: Empowering Code Generation with OSS-Instruct
Yuxiang Wei, Zhe Wang, Jiawei Liu, Yifeng Ding, and Lingming Zhang
Forty-first International Conference on Machine Learning
(ICML 2024), Pages 52632–52657, July 2024. [paper] [code] [model] [dataset]
- Evaluating Language Models for Efficient Code Generation
Jiawei Liu, Songrun Xie, Junhao Wang, Yuxiang Wei, Yifeng Ding, and Lingming Zhang
First Conference on Language Modeling
(COLM 2024), October 2024. [paper] [code]
- RepoQA: Evaluating Long Context Code Understanding
Jiawei Liu, Jia Le Tian, Vijay Daita, Yuxiang Wei, Yifeng Ding, Yuhan Katherine Wang, Jun Yang, and Lingming Zhang
First Workshop on Long-Context Foundation Models @ ICML 2024
(LCFM @ ICML 2024), July 2024. [paper] [code]
- The Plastic Surgery Hypothesis in the Era of Large Language Models
Chunqiu Steven Xia, Yifeng Ding, and Lingming Zhang
38th IEEE/ACM International Conference on Automated Software Engineering
(ASE 2023), Pages 522–534, September 2023. [paper]
Academic Services
- Reviewer: AISTATS’25, ICLR’25, NeurIPS’24, ACL’24, EMNLP’24, NAACL’25
- Organizing Committee: LLM4Code@ICSE’25 (International Workshop on Large Language Models for Code, co-organized with ICSE’25), LLM4Code@ICSE’24
Invited Talks
- CAMEL-AI.org (11/2024): “Horizon-Length Prediction: Advancing Fill-in-the-Middle Capabilities for Code Generation with Lookahead Planning”
- UCLA AGI Lab (11/2024): “Improving Code Language Modeling via Horizon-Length Prediction”
- UIUC FM/SE Seminar (10/2024): “Horizon-Length Prediction: Advancing Fill-in-the-Middle Capabilities for Code Generation with Lookahead Planning”
- Amazon Comprehend Team (07/2024): “XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts”
- Uber Programming Systems Team (04/2023): “Equipping Large Language Models with Domain-Specific Knowledge for Automated Program Repair”
Contacts
- Email: yifeng6@illinois.edu
- Address: 2107 Thomas M. Siebel Center for Computer Science