Bio

I am an Assistant Professor in the Cheriton School of Computer Science at the University of Waterloo. My research interest is improving developers' productivity during software development, testing, and maintenance. Specific topics include execution-guided machine learning models for testing and verification, learning to evolve code and comments, and frameworks for executable comments and specifications.
I obtained my Ph.D. in 2023 and M.Sc. in 2020 from The University of Texas at Austin, advised by Milos Gligoric. I received my B.Sc. from the University of Science and Technology of China (School of the Gifted Young) in 2017.

Teaching

CS846 Advanced Topics in Software Engineering: Machine Learning for Software Engineering: Spring 2026, Fall 2024

CS446/CS646/ECE452 Software Design and Architecture: Winter 2026, Winter 2025, Winter 2024

Publications

  1. TokDrift: When LLM Speaks in Subwords but Code Speaks in Grammar.
    Yixin Li, Yuntian Deng, and Pengyu Nie.
    In Annual Meeting of the Association for Computational Linguistics
    (ACL'26), to appear. 2026. San Diego, USA.
  2. Energy-Efficient Software Development: A Multi-dimensional Empirical Analysis of Stack Overflow.
    Bihui Jin, Heng Li, Pengyu Nie, and Ying Zou.
    In International Conference on Software Engineering
    (ICSE'26). 2026.
  3. When AI Coding Assistants Leak Training Data: A Study of LLM Memorization in Code Generation.
    Xiaoyu Cheng, Kundi Yao, Pengyu Nie, and Weiyi Shang.
    In International Conference on AI-powered Software @ FSE
    (AIWare'26). 2026.
  4. Learning Multi-step Reasoning via Persistent Latent State Propagation.
    Yinxi Li, Jiaao Chen, Fang Wu, Jiakai Yu, Heli Qi, Weihao Xuan, Haokai Zhao, Pengyu Nie, Di Jin, and Xiangru Tang.
    In Workshop on Latent & Implicit Thinking---Going Beyond CoT Reasoning @ ICLR
    (LIT'26). 2026.
  5. World of Logs: A Dataset of Logs from Online Documents.
    Xiaohui Wang, Kundi Yao, Lizhi Liao, Pengyu Nie, and Weiyi Shang.
    In International Conference on Mining Software Repositories, Data and Tool Showcase Track
    (MSR'26 DataTool). 2026.
  6. NL in the Middle: Code Translation with LLMs and Intermediate Representations.
    Chi-en Amy Tai, Pengyu Nie, Lukasz Golab, and Alexander Wong.
    In International Conference on Collaborative Advances in Software and Computing
    (CASCON'25). November 2025. Toronto, Canada.
  7. Learning to Edit Interactive Machine Learning Notebooks.
    Bihui Jin*, Jiayue Wang*, and Pengyu Nie.
    In International Conference on the Foundations of Software Engineering, Ideas, Visions and Reflections Track
    (FSE'25 IVR). June 2025. Trondheim, Norway.
  8. A Tool for Generating Exceptional Behavior Tests With Large Language Models.
    Linghan Zhong, Samuel Yuan, Jiyang Zhang, Yu Liu, Pengyu Nie, Junyi Jessy Li, and Milos Gligoric.
    In International Conference on the Foundations of Software Engineering, Demonstrations Track
    (FSE'25 Demo). June 2025. Trondheim, Norway.
  9. Mix-of-Language-Experts Architecture for Multilingual Programming.
    Yifan Zong, Yuntian Deng, and Pengyu Nie.
    In International Workshop on Large Language Models for Code
    (LLM4Code'25). April 2025. Ottawa, Canada.
  10. CoUpJava: A Dataset of Code Upgrade Histories in Open-Source Java Repositories.
    Kaihang Jiang, Bihui Jin, and Pengyu Nie.
    In International Conference on Mining Software Repositories, Data and Tool Showcase Track
    (MSR'25 DataTool). April 2025. Ottawa, Canada.
  11. exLong: Generating Exceptional Behavior Tests with Large Language Models.
    Jiyang Zhang, Yu Liu, Pengyu Nie, Junyi Jessy Li, and Milos Gligoric.
    In International Conference on Software Engineering
    (ICSE'25). April 2025. Ottawa, Canada.
  12. InterTrans: Leveraging Transitive Intermediate Translations to Enhance LLM-based Code Translation.
    Marcos Macedo, Yuan Tian, Pengyu Nie, Filipe R. Cogo, and Bram Adams.
    In International Conference on Software Engineering
    (ICSE'25). April 2025. Ottawa, Canada.
  13. What Inputs Drive Effective LLM-based Unit Test Generation?
    Saarang Agarwal, Pengyu Nie, and Meiyappan Nagappan.
    IEEE Software, Special Issue on AIware in the FM Era
    (IEEE Softw.'25 AIware). 2025.
  14. Detecting DTC Requirement-Implementation Inconsistencies Using LLMs: An Experience Report.
    Tongwei Zhang, Kundi Yao, Hanyang Hu, Pengyu Nie, Krishna Koravadi, and Weiyi Shang.
    IEEE Software
    (IEEE Softw.'25). 2025.
  15. Efficient Incremental Code Coverage Analysis for Regression Test Suites.
    Jiale Amber Wang*, Kaiyuan Wang*, and Pengyu Nie.
    In International Conference on Automated Software Engineering
    (ASE'24), 1882-1894. October 2024. Sacramento, USA.
  16. Multilingual Code Co-evolution using Large Language Models.
    Jiyang Zhang, Pengyu Nie, Junyi Jessy Li, and Milos Gligoric.
    In Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
    (ESEC/FSE'23), 695-707. December 2023. San Francisco, USA.
  17. Machine Learning for Executable Code in Software Testing and Verification.
    Pengyu Nie.
    PhD Thesis, The University of Texas at Austin. August 2023. Austin, USA.
    This dissertation won a Margarida Jacome Dissertation Award.
  18. Extracting Inline Tests from Unit Tests.
    Yu Liu, Pengyu Nie, Anna Guo, Milos Gligoric, and Owolabi Legunsen.
    In International Symposium on Software Testing and Analysis
    (ISSTA'23), 1458-1470. July 2023. Seattle, USA.
  19. More Precise Regression Test Selection via Reasoning about Semantics-Modifying Changes.
    Yu Liu, Jiyang Zhang, Pengyu Nie, Milos Gligoric, and Owolabi Legunsen.
    In International Symposium on Software Testing and Analysis
    (ISSTA'23), 664-676. July 2023. Seattle, USA.
    This paper won an ACM Distinguished Paper Award.
  20. Learning Deep Semantics for Test Completion.
    Pengyu Nie, Rahul Banerjee, Junyi Jessy Li, Raymond J. Mooney, and Milos Gligoric.
    In International Conference on Software Engineering
    (ICSE'23), 2111-2123. May 2023. Melbourne, Australia.
  21. pytest-inline: An Inline Testing Tool for Python.
    Yu Liu, Zachary Thurston, Alan Han, Pengyu Nie, Milos Gligoric, and Owolabi Legunsen.
    In International Conference on Software Engineering, Tool Demonstrations Track
    (ICSE'23 Demo), 161-164. May 2023. Melbourne, Australia.
  22. Inline Tests.
    Yu Liu, Pengyu Nie, Owolabi Legunsen, and Milos Gligoric.
    In International Conference on Automated Software Engineering
    (ASE'22), 57:1-13. October 2022. Oakland Center, USA.
  23. CoditT5: Pretraining for Source Code and Natural Language Editing.
    Jiyang Zhang, Sheena Panthaplackel, Pengyu Nie, Junyi Jessy Li, and Milos Gligoric.
    In International Conference on Automated Software Engineering
    (ASE'22), 22:1-12. October 2022. Oakland Center, USA.
  24. Impact of Evaluation Methodologies on Code Summarization.
    Pengyu Nie, Jiyang Zhang, Junyi Jessy Li, Raymond J. Mooney, and Milos Gligoric.
    In Annual Meeting of the Association for Computational Linguistics
    (ACL'22), 4936-4960. May 2022. Dublin, Ireland.
  25. Roosterize: Suggesting Lemma Names for Coq Verification Projects using Deep Learning.
    Pengyu Nie, Karl Palmskog, Junyi Jessy Li, and Milos Gligoric.
    In International Conference on Software Engineering, Tool Demonstrations Track
    (ICSE'21 Demo), 21-24. May 2021. Madrid, Spain.
  26. Leveraging Class Hierarchy for Code Comprehension.
    Jiyang Zhang, Sheena Panthaplackel, Pengyu Nie, Junyi Jessy Li, Raymond J. Mooney, and Milos Gligoric.
    In Workshop on Computer Assisted Programming
    (CAP'20). December 2020. Vancouver, Canada.
  27. Unifying Execution of Imperative Generators and Declarative Specifications.
    Pengyu Nie, Marinela Parovic, Zhiqiang Zang, Sarfraz Khurshid, Aleksandar Milicevic, and Milos Gligoric.
    In Conference on Object-Oriented Programming Systems, Languages and Applications
    (OOPSLA'20), 217:1-217:26. November 2020. Chicago, USA.
  28. On the Naturalness of Hardware Descriptions.
    Jaeseong Lee*, Pengyu Nie*, Junyi Jessy Li, and Milos Gligoric.
    In Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
    (ESEC/FSE'20), 530-542. November 2020. Sacramento, USA.
  29. Debugging the Performance of Maven's Test Isolation: Experience Report.
    Pengyu Nie, Ahmet Celik, Matthew Coley, Aleksandar Milicevic, Jonathan Bell, and Milos Gligoric.
    In International Symposium on Software Testing and Analysis
    (ISSTA'20), 249-259. July 2020. Los Angeles, USA.
  30. Learning to Update Natural Language Comments Based on Code Changes.
    Sheena Panthaplackel, Pengyu Nie, Milos Gligoric, Junyi Jessy Li, and Raymond J. Mooney.
    In Annual Meeting of the Association for Computational Linguistics
    (ACL'20), 1853-1868. July 2020. Seattle, USA.
  31. Deep Generation of Coq Lemma Names using Elaborated Terms.
    Pengyu Nie, Karl Palmskog, Junyi Jessy Li, and Milos Gligoric.
    In International Joint Conference on Automated Reasoning
    (IJCAR'20), 97-118. June 2020. Paris, France.
  32. Learning to Format Coq Code using Language Models.
    Pengyu Nie, Karl Palmskog, Junyi Jessy Li, and Milos Gligoric.
    In The Coq Workshop
    (Coq'20). June 2020. Paris, France.
  33. Design, Implementation, and Application of GPU-based Java Bytecode Interpreters.
    Ahmet Celik, Pengyu Nie, Christopher J. Rossbach, and Milos Gligoric.
    In Conference on Object-Oriented Programming Systems, Languages and Applications
    (OOPSLA'19), 177:1-177:28. October 2019. Athens, Greece.
  34. A Framework for Writing Trigger-Action Todo Comments in Executable Format.
    Pengyu Nie, Rishabh Rai, Junyi Jessy Li, Sarfraz Khurshid, Raymond J. Mooney, and Milos Gligoric.
    In Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
    (ESEC/FSE'19), 385-396. August 2019. Tallinn, Estonia.
    This paper won an ACM SIGSOFT Distinguished Paper Award.
  35. Natural Language Processing and Program Analysis for Supporting Todo Comments as Software Evolves.
    Pengyu Nie, Junyi Jessy Li, Sarfraz Khurshid, Raymond J. Mooney, and Milos Gligoric.
    In Workshop on Natural Language Processing for Software Engineering
    (NL4SE'18), 775-778. February 2018. New Orleans, USA.

Service

2027: PC member of ICSE, FSE.

2026: PC member of ICSE, ASE, ISSTA.

2025: PC member of ISSTA, LLM4Code. Reviewer for ICLR, ACL Rolling Review February (ACL'25).

2024: PC member of ASE, ISSTA, LLM4Code, ASE-SRC. Reviewer for ACL Rolling Review February (ACL'24), June (EMNLP'24), and December (ACL'25); Emergency Area Chair of ARR December.

2021: PC member of AAAI, NLP4Prog, AIST.

Journal reviewing: TOSEM, TSE, EMSE, TOPLAS, TACL, EAAI, IST, JSS.

  • 2022–2023: Co-organizer of Joint UT-Cornell Software Engineering Seminar.
  • 2018–2022: Co-organizer of NLP+Programming Reading Group at UT Austin.
  • 2022: Committee of Graduate and Industry Networking (GAIN) at UT Austin.
  • 2022: Mentor of ECE Partner Program at UT Austin.