Pengyu Nie

Assistant Professor
University of Waterloo

Address: 200 University Avenue West
Waterloo, Ontario
Canada, N2L 3G1
Email: pynie@uwaterloo.ca

Bio

I am an Assistant Professor in the Cheriton School of Computer Science at the University of Waterloo. My research focuses on improving developers' productivity during software development, testing, and maintenance. Specific topics include execution-guided models for test completion and lemma naming, learning to evolve code and comments, and frameworks for maintaining executable comments and specifications.
I obtained my Ph.D. in 2023 and M.Sc. in 2020 from The University of Texas at Austin, advised by Milos Gligoric. I received my B.Sc. from the University of Science and Technology of China (School of the Gifted Young) in 2017.

GitHub (pengyunie) | Google Scholar | DBLP

I am looking for self-motivated students with a background in software engineering, formal methods, machine learning, and/or natural language processing.

Teaching

CS846 Advanced Topics in Software Engineering: Machine Learning for Software Engineering: Fall 2024

CS446/CS646/ECE452 Software Design and Architecture: Winter 2025 (current), Winter 2024

Publications

26. Suggesting Code Edits in Interactive Machine Learning Notebooks Using Large Language Models.
Bihui Jin*, Jiayue Wang*, and Pengyu Nie.
In International Conference on the Foundations of Software Engineering, Ideas, Visions and Reflections Track
(FSE'25 IVR), to appear. Trondheim, Norway, Jun 2025.

25. Mix-of-Language-Experts Architecture for Multilingual Programming.
Yifan Zong, Yuntian Deng, and Pengyu Nie.
In International Workshop on Large Language Models for Code
(LLM4Code'25), to appear. Ottawa, Canada, May 2025.

24. CoUpJava: A Dataset of Code Upgrade Histories in Open-Source Java Repositories.
Kaihang Jiang, Bihui Jin, and Pengyu Nie.
In the Mining Software Repositories Conference, Data and Tool Showcase Track
(MSR'25 DataTool), to appear. Ottawa, Canada, Apr 2025.

23. exLong: Generating Exceptional Behavior Tests with Large Language Models.
Jiyang Zhang, Yu Liu, Pengyu Nie, Junyi Jessy Li, and Milos Gligoric.
In International Conference on Software Engineering
(ICSE'25), to appear. Ottawa, Canada, Apr 2025.

22. InterTrans: Leveraging Transitive Intermediate Translations to Enhance LLM-based Code Translation.
Marcos Macedo, Yuan Tian, Pengyu Nie, Filipe R. Cogo, and Bram Adams.
In International Conference on Software Engineering
(ICSE'25), to appear. Ottawa, Canada, Apr 2025.

21. Efficient Incremental Code Coverage Analysis for Regression Test Suites.
Jiale Amber Wang, Kaiyuan Wang, and Pengyu Nie.
In International Conference on Automated Software Engineering
(ASE'24), to appear. Sacramento, USA, October 2024.

20. Multilingual Code Co-evolution using Large Language Models.
Jiyang Zhang, Pengyu Nie, Junyi Jessy Li, and Milos Gligoric.
In Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
(FSE'23), to appear. San Francisco, USA, December 2023.

19. Machine Learning for Executable Code in Software Testing and Verification.
Pengyu Nie.
PhD Thesis, The University of Texas at Austin.
Austin, USA, August 2023.
This dissertation won a Margarida Jacome Dissertation Award.

18. Extracting Inline Tests from Unit Tests.
Yu Liu, Pengyu Nie, Anna Guo, Milos Gligoric, and Owolabi Legunsen.
In International Symposium on Software Testing and Analysis
(ISSTA'23), 1458-1470. Seattle, USA, July 2023.

17. More Precise Regression Test Selection via Reasoning about Semantics-Modifying Changes.
Yu Liu, Jiyang Zhang, Pengyu Nie, Milos Gligoric, and Owolabi Legunsen.
In International Symposium on Software Testing and Analysis
(ISSTA'23), 664-676. Seattle, USA, July 2023.
This paper won an ACM SIGSOFT Distinguished Paper Award.

16. Learning Deep Semantics for Test Completion.
Pengyu Nie, Rahul Banerjee, Junyi Jessy Li, Raymond J. Mooney, and Milos Gligoric.
In International Conference on Software Engineering
(ICSE'23), 2111-2123. Melbourne, Australia, May 2023.

15. pytest-inline: An Inline Testing Tool for Python.
Yu Liu, Zachary Thurston, Alan Han, Pengyu Nie, Milos Gligoric, and Owolabi Legunsen.
In International Conference on Software Engineering, Tool Demonstrations Track
(ICSEDemo'23), 161-164. Melbourne, Australia, May 2023.

14. Impact of Evaluation Methodologies on Code Summarization.
Pengyu Nie, Jiyang Zhang, Junyi Jessy Li, Raymond J. Mooney, and Milos Gligoric.
In Annual Meeting of the Association for Computational Linguistics
(ACL'22), 4936-4960. Dublin, Ireland, May 2022.

13. Inline Tests.
Yu Liu, Pengyu Nie, Owolabi Legunsen, and Milos Gligoric.
In International Conference on Automated Software Engineering
(ASE'22), 1-13. Oakland Center, Michigan, USA, October 2022.

12. CoditT5: Pretraining for Source Code and Natural Language Editing.
Jiyang Zhang, Sheena Panthaplackel, Pengyu Nie, Junyi Jessy Li, and Milos Gligoric.
In International Conference on Automated Software Engineering
(ASE'22), 1-12. Oakland Center, Michigan, USA, October 2022.

11. Roosterize: Suggesting Lemma Names for Coq Verification Projects using Deep Learning.
Pengyu Nie, Karl Palmskog, Junyi Jessy Li, and Milos Gligoric.
In International Conference on Software Engineering, Tool Demonstrations Track
(ICSEDemo'21), 21-24. Virtual, May 2021.

10. Leveraging Class Hierarchy for Code Comprehension.
Jiyang Zhang, Sheena Panthaplackel, Pengyu Nie, Junyi Jessy Li, Raymond J. Mooney, and Milos Gligoric.
In Workshop on Computer Assisted Programming
(CAP'20). Virtual, December 2020.

9. Unifying Execution of Imperative Generators and Declarative Specifications. [slides] [talk]
Pengyu Nie, Marinela Parovic, Zhiqiang Zang, Sarfraz Khurshid, Aleksandar Milicevic, and Milos Gligoric.
In Conference on Object-Oriented Programming Systems, Languages and Applications
(OOPSLA'20), 217:1-217:26. Chicago, Illinois, USA, November 2020.

8. On the Naturalness of Hardware Descriptions. [slides] [talk]
Jaeseong Lee, Pengyu Nie, Junyi Jessy Li, and Milos Gligoric.
In Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
(FSE'20), 530-542. Sacramento, California, USA, November 2020.

7. Learning to Format Coq Code using Language Models. [slides]
Pengyu Nie, Karl Palmskog, Junyi Jessy Li, and Milos Gligoric.
In The Coq Workshop
(Coq'20). Paris, France, July 2020.

6. Debugging the Performance of Maven's Test Isolation: Experience Report. [slides]
Pengyu Nie, Ahmet Celik, Matthew Coley, Aleksandar Milicevic, Jonathan Bell, and Milos Gligoric.
In International Symposium on Software Testing and Analysis
(ISSTA'20), 249-259. Los Angeles, California, USA, July 2020.

5. Learning to Update Natural Language Comments based on Code Changes.
Sheena Panthaplackel, Pengyu Nie, Milos Gligoric, Junyi Jessy Li, and Raymond J. Mooney.
In Annual Meeting of the Association for Computational Linguistics
(ACL'20), 1853-1868. Seattle, Washington, USA, July 2020.

4. Deep Generation of Coq Lemma Names using Elaborated Terms. [slides] [talk]
Pengyu Nie, Karl Palmskog, Junyi Jessy Li, and Milos Gligoric.
In International Joint Conference on Automated Reasoning
(IJCAR'20), 97-118. Paris, France, June 2020.

3. Design, Implementation, and Application of GPU-based Java Bytecode Interpreters.
Ahmet Celik, Pengyu Nie, Christopher J. Rossbach, and Milos Gligoric.
In Conference on Object-Oriented Programming Systems, Languages and Applications
(OOPSLA'19), 177:1-177:28. Athens, Greece, October 2019.

2. A Framework for Writing Trigger-Action Todo Comments in Executable Format. [slides]
Pengyu Nie, Rishabh Rai, Junyi Jessy Li, Sarfraz Khurshid, Raymond J. Mooney, and Milos Gligoric.
In Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
(FSE'19), 385-396. Tallinn, Estonia, August 2019.
This paper won an ACM SIGSOFT Distinguished Paper Award.

1. Natural Language Processing and Program Analysis for Supporting Todo Comments as Software Evolves.
Pengyu Nie, Junyi Jessy Li, Sarfraz Khurshid, Raymond J. Mooney, and Milos Gligoric.
In Workshop on Natural Language Processing for Software Engineering
(NL4SE'18), 775-778. New Orleans, Louisiana, USA, February 2018. Long presentation.

Service

2025: PC member of ISSTA.
2024: PC member of ISSTA, ASE, ARR June / EMNLP, LLM4Code, ASE SRC. Reviewer for TOSEM, TSE, EMSE.
2023: Reviewer for TOSEM, JSS, TSE, EAAI, TACL. Sub-reviewer for ICSE.
2022: Reviewer for TSE, TOPLAS. Sub-reviewer for ICSE.
2021: PC member of AAAI, NLP4Prog, AIST. Sub-reviewer for ASE, TSE.
2020: Sub-reviewer for ICSE, ISSTA, COLING, ISSRE.
2019: Sub-reviewer for ICSE, ISSTA, IJCAI.
2018: Sub-reviewer for ASE, FSE.

2022-2023: Co-organizer of the Joint UT-Cornell Software Engineering Seminar.
2018-2022: Co-organizer of the NLP+Programming Reading Group at UT Austin.
2022: Committee member of Graduate and Industry Networking (GAIN) at UT Austin.
2022: Mentor of the ECE Partner Program at UT Austin.