About

I am a CS Ph.D. student at the University of Illinois Urbana-Champaign in the PL/FM/SE group, advised by Prof. Lingming Zhang. You can find my CV here.

My research fields are Software Engineering and Machine Learning. Specifically, I focus on building large language models to solve software engineering tasks, with a particular interest in improving LLMs’ reasoning and planning capabilities for code generation and repair through pre-training and post-training.

I obtained my bachelor’s degrees from Tsinghua University, including one in Software Engineering from the School of Software and one in Business Administration from the School of Economics and Management. During my undergraduate years, I was a research assistant in the Software System Security Assurance Group, advised by Prof. Yu Jiang.

Selected Publications (See full list on Google Scholar)

Planning-Aware Code Infilling via Horizon-Length Prediction
Yifeng Ding, Hantian Ding, Shiqi Wang, Qing Sun, Varun Kumar, and Zijian Wang
arXiv; NeurIPS 2024 System 2 Reasoning Workshop. [preprint] [summary]
XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts
Yifeng Ding, Jiawei Liu, Yuxiang Wei, and Lingming Zhang
62nd Annual Meeting of the Association for Computational Linguistics
(ACL 2024), Pages 12941–12955, August 2024. [paper] [summary] [code]
SelfCodeAlign: Self-Alignment for Code Generation
Yuxiang Wei, Federico Cassano, Jiawei Liu, Yifeng Ding, Naman Jain, Zachary Mueller, Harm de Vries, Leandro Von Werra, Arjun Guha, and Lingming Zhang
Thirty-eighth Conference on Neural Information Processing Systems
(NeurIPS 2024), To appear, December 2024. [preprint] [code] [model] [dataset]
Magicoder: Empowering Code Generation with OSS-Instruct
Yuxiang Wei, Zhe Wang, Jiawei Liu, Yifeng Ding, and Lingming Zhang
Forty-first International Conference on Machine Learning
(ICML 2024), Pages 52632–52657, July 2024. [paper] [code] [model] [dataset]
Evaluating Language Models for Efficient Code Generation
Jiawei Liu, Songrun Xie, Junhao Wang, Yuxiang Wei, Yifeng Ding, and Lingming Zhang
First Conference on Language Modeling
(COLM 2024), October 2024. [paper] [code]
RepoQA: Evaluating Long Context Code Understanding
Jiawei Liu, Jia Le Tian, Vijay Daita, Yuxiang Wei, Yifeng Ding, Yuhan Katherine Wang, Jun Yang, and Lingming Zhang
First Workshop on Long-Context Foundation Models @ ICML 2024
(LCFM @ ICML 2024), July 2024. [paper] [code]
The Plastic Surgery Hypothesis in the Era of Large Language Models
Chunqiu Steven Xia, Yifeng Ding, and Lingming Zhang
38th IEEE/ACM International Conference on Automated Software Engineering
(ASE 2023), Pages 522–534, September 2023. [paper]

Academic Services

Reviewer: ICML, NeurIPS, ICLR, AISTATS, ACL/ARR, TOSEM (2024–2025)
Organizing Committee: LLM4Code@ICSE’25 (International Workshop on Large Language Models for Code, co-organized with ICSE’25), LLM4Code@ICSE’24

Teaching

UIUC CS 521: ML and Compilers (2025 Spring) [website]
with Prof. Sasa Misailovic and Prof. Charith Mendis

Invited Talks

CAMEL-AI.org (11/2024): “Horizon-Length Prediction: Advancing Fill-in-the-Middle Capabilities for Code Generation with Lookahead Planning”
UCLA AGI Lab (11/2024): “Improving Code Language Modeling via Horizon-Length Prediction”
UIUC FM/SE Seminar (10/2024): “Horizon-Length Prediction: Advancing Fill-in-the-Middle Capabilities for Code Generation with Lookahead Planning”
Amazon Comprehend Team (07/2024): “XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts”
Uber Programming Systems Team (04/2023): “Equipping Large Language Models with Domain-Specific Knowledge for Automated Program Repair”

Contacts

Email: yifeng6@illinois.edu
Address: 2107 Thomas M. Siebel Center for Computer Science

Yifeng DING