1.
Balreet Grewal; Wentao Lu; Sarah Nadi; Cor-Paul Bezemer
Analyzing Developer Use of ChatGPT Generated Code in Open Source GitHub Projects (Inproceedings)
International Conference on Mining Software Repositories (MSR), 2024.
Tags: Code reuse, LLM, SE4AI
@inproceedings{GrewalMSR2024,
title = {Analyzing Developer Use of ChatGPT Generated Code in Open Source GitHub Projects},
author = {Balreet Grewal and Wentao Lu and Sarah Nadi and Cor-Paul Bezemer},
year = {2024},
date = {2024-04-14},
urldate = {2024-04-14},
booktitle = {International Conference on Mining Software Repositories (MSR)},
abstract = {The rapid development of large language models such as ChatGPT
have made them particularly useful to developers in generating
code snippets for their projects. To understand how ChatGPT’s
generated code is leveraged by developers, we conducted an
empirical study of 3,044 ChatGPT-generated code snippets integrated
within GitHub projects. A median of 54% of the generated lines of
code is found in the project’s code and this code typically remains
unchanged once added. The modifications of the 76 code snippets
that changed in a subsequent commit, consisted of minor
functionality changes and code reorganizations that were made within a
day. Our findings offer insights that help drive the development
of AI-assisted programming tools. We highlight the importance
of making changes in ChatGPT code before integrating it into a
project.},
keywords = {Code reuse, LLM, SE4AI},
pubstate = {published},
tppubtype = {inproceedings}
}