1.
Md Saeed Siddik; Hao Li; Cor-Paul Bezemer
A Systematic Literature Review of Software Engineering Research on Jupyter Notebook (journal article)
Journal of Systems and Software, 2025.
Tags: Jupyter Notebook, SLR
@article{siddik_slr_notebook,
title = {A Systematic Literature Review of Software Engineering Research on Jupyter Notebook},
author = {Md Saeed Siddik and Hao Li and Cor-Paul Bezemer},
year = {2025},
date = {2025-12-27},
urldate = {2025-12-27},
journal = {Journal of Systems and Software},
abstract = {Context: Jupyter Notebook has emerged as a versatile tool that transforms
how researchers, developers, and data scientists conduct and communicate
their work. As the adoption of Jupyter notebooks continues to rise, so does
the interest from the software engineering research community in improving
the software engineering practices for Jupyter notebooks.
Objective: The purpose of this study is to analyze trends, gaps, and
methodologies used in software engineering research on Jupyter notebooks.
Method: We selected 199 relevant publications up to September 2025,
following established systematic literature review guidelines. We explored
publication trends, categorized them based on software engineering topics,
and reported findings based on those topics.
Results: The most popular venues for publishing software engineering
research on Jupyter notebooks are related to human-computer interaction
instead of traditional software engineering venues. Researchers have addressed
a wide range of software engineering topics on notebooks, such as
code reuse, readability, and execution environment. Although reusability is
one of the research topics for Jupyter notebooks, only 82 of the 199 studies
can be reused based on their provided URLs. Additionally, most replication
packages are not hosted on permanent repositories for long-term availability
and adherence to open science principles.
Conclusion: Solutions specific to notebooks for software engineering issues,
including testing, refactoring, and documentation, are underexplored.
Future research opportunities exist in automatic testing frameworks,
refactoring clones between notebooks, and generating group documentation for
coherent code cells.},
keywords = {Jupyter Notebook, SLR},
pubstate = {published},
tppubtype = {article}
}
Context: Jupyter Notebook has emerged as a versatile tool that transforms
how researchers, developers, and data scientists conduct and communicate
their work. As the adoption of Jupyter notebooks continues to rise, so does
the interest from the software engineering research community in improving
the software engineering practices for Jupyter notebooks.
Objective: The purpose of this study is to analyze trends, gaps, and
methodologies used in software engineering research on Jupyter notebooks.
Method: We selected 199 relevant publications up to September 2025,
following established systematic literature review guidelines. We explored
publication trends, categorized them based on software engineering topics,
and reported findings based on those topics.
Results: The most popular venues for publishing software engineering
research on Jupyter notebooks are related to human-computer interaction
instead of traditional software engineering venues. Researchers have addressed
a wide range of software engineering topics on notebooks, such as
code reuse, readability, and execution environment. Although reusability is
one of the research topics for Jupyter notebooks, only 82 of the 199 studies
can be reused based on their provided URLs. Additionally, most replication
packages are not hosted on permanent repositories for long-term availability
and adherence to open science principles.
Conclusion: Solutions specific to notebooks for software engineering issues,
including testing, refactoring, and documentation, are underexplored.
Future research opportunities exist in automatic testing frameworks,
refactoring clones between notebooks, and generating group documentation for
coherent code cells.
