1.
Md Saeed Siddik; Cor-Paul Bezemer
Do Code Quality and Style Issues Differ Across (Non-)Machine Learning Notebooks? Yes! Inproceedings
23rd IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM), pp. 1–12, IEEE, 2023.
Abstract | BibTeX | Tags: Computational notebooks, Empirical software engineering, Mining software repositories
@inproceedings{SiddikSCAM2023,
title = {Do Code Quality and Style Issues Differ Across (Non-)Machine Learning Notebooks? Yes!},
author = {Md Saeed Siddik and Cor-Paul Bezemer},
year = {2023},
date = {2023-10-03},
urldate = {2023-10-03},
booktitle = {23rd IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM)},
pages = {1--12},
publisher = {IEEE},
abstract = {The popularity of computational notebooks is rapidly increasing because of their interactive code-output visualization and on-demand non-sequential code block execution. These notebook features have made notebooks especially popular with machine learning developers and data scientists. However, as prior work shows, notebooks generally contain low quality code. In this paper, we investigate whether the low quality code is inherent to the programming style in notebooks, or whether it is correlated with the use of machine learning techniques. We present a large-scale empirical analysis of 246,599 open-source notebooks to explore how machine learning code quality in Jupyter Notebooks differs from non-machine learning code, thereby focusing on code style issues. We explored code style issues across the Error, Convention, Warning, and Refactoring categories. We found that machine learning notebooks are of lower quality regarding PEP-8 code standards than non-machine learning notebooks, and their code quality distributions significantly differ with a small effect size. We identified several code style issues with large differences in occurrences between machine learning and non-machine learning notebooks. For example, package and import-related issues are more prevalent in machine learning notebooks. Our study shows that code quality and code style issues differ significantly across machine learning and non-machine learning notebooks.},
keywords = {Computational notebooks, Empirical software engineering, Mining software repositories},
pubstate = {published},
tppubtype = {inproceedings}
}
The popularity of computational notebooks is rapidly increasing because of their interactive code-output visualization and on-demand non-sequential code block execution. These notebook features have made notebooks especially popular with machine learning developers and data scientists. However, as prior work shows, notebooks generally contain low quality code. In this paper, we investigate whether the low quality code is inherent to the programming style in notebooks, or whether it is correlated with the use of machine learning techniques. We present a large-scale empirical analysis of 246,599 open-source notebooks to explore how machine learning code quality in Jupyter Notebooks differs from non-machine learning code, thereby focusing on code style issues. We explored code style issues across the Error, Convention, Warning, and Refactoring categories. We found that machine learning notebooks are of lower quality regarding PEP-8 code standards than non-machine learning notebooks, and their code quality distributions significantly differ with a small effect size. We identified several code style issues with large differences in occurrences between machine learning and non-machine learning notebooks. For example, package and import-related issues are more prevalent in machine learning notebooks. Our study shows that code quality and code style issues differ significantly across machine learning and non-machine learning notebooks.
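The Error, Convention, Warning, and Refactoring categories mentioned in the abstract correspond to Pylint's message categories. The following is a minimal sketch, under the assumption that a Pylint-style checker is applied to the concatenated code cells of a notebook, of how such per-category issue counts could be gathered; the notebook file name in the usage comment is hypothetical and this is not the paper's actual analysis pipeline.

import json
import subprocess
import tempfile
from collections import Counter

def lint_notebook(path):
    # Concatenate the notebook's code cells into a temporary .py file.
    with open(path) as f:
        nb = json.load(f)
    code = "\n".join(
        "".join(cell["source"])
        for cell in nb.get("cells", [])
        if cell.get("cell_type") == "code"
    )
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
        tmp.write(code)
        tmp_path = tmp.name

    # Run Pylint with JSON output and tally messages by category
    # ("error", "convention", "warning", "refactor").
    result = subprocess.run(
        ["pylint", "--output-format=json", tmp_path],
        capture_output=True, text=True,
    )
    messages = json.loads(result.stdout or "[]")
    return Counter(msg["type"] for msg in messages)

# Example usage (hypothetical file): lint_notebook("example_notebook.ipynb")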