Mohammad Reza Taesiri; Tianjun Feng; Anh Nguyen; Cor-Paul Bezemer
GlitchBench: Can large multimodal models detect video game glitches? Inproceedings
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
Tags: Computer games, Foundation models, Game development, Gameplay videos, LLM
@inproceedings{TaesiriCVPR2024,
title = {GlitchBench: Can large multimodal models detect video game glitches?},
author = {Mohammad Reza Taesiri and Tianjun Feng and Anh Nguyen and Cor-Paul Bezemer},
year = {2024},
date = {2024-06-15},
urldate = {2024-03-15},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
abstract = {Large multimodal models (LMMs) have evolved from large language models (LLMs) to integrate multiple input modalities, such as visual inputs. This integration augments the capacity of LLMs for tasks requiring visual comprehension and reasoning. However, the extent and limitations of their enhanced abilities are not fully understood, especially when it comes to real-world tasks. To address this gap, we introduce GlitchBench, a novel benchmark derived from video game quality assurance tasks, to test and evaluate the reasoning capabilities of LMMs. Our benchmark is curated from a variety of unusual and glitched scenarios from video games and aims to challenge both the visual and linguistic reasoning powers of LMMs in detecting and interpreting out-of-the-ordinary events. We evaluate multiple state-of-the-art LMMs, and we show that GlitchBench presents a new challenge for these models. Code and data are available at: https://glitchbench.github.io/},
keywords = {Computer games, Foundation models, Game development, Gameplay videos, LLM},
pubstate = {published},
tppubtype = {inproceedings}
}
Mohammad Reza Taesiri; Finlay Macklon; Sarra Habchi; Cor-Paul Bezemer
Searching bug instances in gameplay video repositories Journal Article
IEEE Transactions on Games, 2024.
Tags: Bug report, Computer games, Game development, Gameplay videos, Gaming
@article{TaesiriTG2024,
title = {Searching bug instances in gameplay video repositories},
author = {Mohammad Reza Taesiri and Finlay Macklon and Sarra Habchi and Cor-Paul Bezemer},
year = {2024},
date = {2024-01-17},
urldate = {2024-01-17},
journal = {IEEE Transactions on Games},
abstract = {Gameplay videos offer valuable insights into player interactions and game responses, particularly data about game bugs. Despite the abundance of gameplay videos online, extracting useful information remains a challenge. This paper introduces a method for searching and extracting relevant videos from extensive video repositories using English text queries. Our approach requires no external information, like video metadata; it solely depends on video content. Leveraging the zero-shot transfer capabilities of the Contrastive Language-Image Pre-Training (CLIP) model, our approach does not require any data labeling or training. To evaluate our approach, we present the GamePhysics dataset, comprising 26,954 videos from 1,873 games that were collected from the GamePhysics section on the Reddit website. Our approach shows promising results in our extensive analysis of simple and compound queries, indicating that our method is useful for detecting objects and events in gameplay videos. Moreover, we assess the effectiveness of our method by analyzing a carefully annotated dataset of 220 gameplay videos. The results of our study demonstrate the potential of our approach for applications such as the creation of a video search tool tailored to identifying video game bugs, which could greatly benefit Quality Assurance (QA) teams in finding and reproducing bugs. The code and data used in this paper can be found at https://zenodo.org/records/10211390},
keywords = {Bug report, Computer games, Game development, Gameplay videos, Gaming},
pubstate = {published},
tppubtype = {article}
}
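The retrieval idea described in this paper — embed the text query and sampled video frames, then rank videos by their best-matching frame — can be sketched with mock embeddings. This is an illustrative sketch only: the function names and toy 3-D vectors are hypothetical stand-ins for CLIP's text and image encoders, which are not reproduced here.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank_videos(query_emb, video_frame_embs):
    """Rank video ids by their best-matching frame.

    query_emb: stand-in for the CLIP text embedding of the query.
    video_frame_embs: dict of video_id -> list of frame embeddings
    (stand-ins for CLIP image embeddings of sampled frames).
    """
    best = {
        vid: max(cosine(query_emb, frame) for frame in frames)
        for vid, frames in video_frame_embs.items()
    }
    return sorted(best, key=best.get, reverse=True)

# Toy 3-D mock embeddings (real CLIP embeddings are much higher-dimensional).
query = [1.0, 0.0, 0.0]                             # e.g. "a car flying in the air"
videos = {
    "video_a": [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0]],  # contains one matching frame
    "video_b": [[0.0, 0.0, 1.0]],                   # no matching frames
}
print(rank_videos(query, videos))  # → ['video_a', 'video_b']
```

Ranking by the single best frame (rather than an average) lets a short bug moment in a long video still surface near the top of the results.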
Markos Viggiato; Dale Paas; Cor-Paul Bezemer
Prioritizing Natural Language Test Cases Based on Highly-Used Game Features Inproceedings
Proceedings of the 31st Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), pp. 1–12, 2023.
Tags: Computer games, Game development, Natural language processing, Testing
@inproceedings{ViggiatoFSE2023,
title = {Prioritizing Natural Language Test Cases Based on Highly-Used Game Features},
author = {Markos Viggiato and Dale Paas and Cor-Paul Bezemer },
year = {2023},
date = {2023-12-01},
urldate = {2023-12-01},
booktitle = {Proceedings of the 31st Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE)},
pages = {1--12},
abstract = {Software testing is still a manual activity in many industries, such as the gaming industry. But manually executing tests becomes impractical as the system grows and resources are restricted, mainly in a scenario with short release cycles. Test case prioritization is a commonly used technique to optimize the test execution. However, most prioritization approaches do not work for manual test cases as they require source code information or test execution history, which is often not available in a manual testing scenario. In this paper, we propose a prioritization approach for manual test cases written in natural language based on the tested application features (in particular, highly-used application features). Our approach consists of (1) identifying the tested features from natural language test cases (with zero-shot classification techniques) and (2) prioritizing test cases based on the features that they test. We leveraged the NSGA-II genetic algorithm for the multi-objective optimization of the test case ordering to maximize the coverage of highly-used features while minimizing the cumulative execution time. Our findings show that we can successfully identify the application features covered by test cases using an ensemble of pre-trained models with strong zero-shot capabilities (an F-score of 76.1%). Also, our prioritization approaches can find test case orderings that cover highly-used application features early in the test execution while keeping the time required to execute test cases short. QA engineers can use our approach to focus the test execution on test cases that cover features that are relevant to users.},
keywords = {Computer games, Game development, Natural language processing, Testing},
pubstate = {published},
tppubtype = {inproceedings}
}
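The trade-off this paper optimizes — cover highly-used features early while keeping cumulative execution time low — can be sketched with a simple greedy ordering. Note the hedge: the paper itself uses NSGA-II for the multi-objective search; the greedy value-per-cost rule below is only a baseline illustration of the same objective, and all test ids, feature names, and usage weights are hypothetical.

```python
def prioritize(test_cases, feature_usage):
    """Greedy sketch: order test cases by usage value gained per second.

    test_cases: dict of test_id -> (set of covered features, execution time in s)
    feature_usage: dict of feature -> usage weight (how heavily players use it)
    """
    remaining = dict(test_cases)
    covered, order = set(), []
    while remaining:
        def gain(tid):
            feats, time = remaining[tid]
            # Value of the not-yet-covered features, normalized by cost.
            new = feats - covered
            return sum(feature_usage.get(f, 0) for f in new) / time
        best = max(remaining, key=gain)
        order.append(best)
        covered |= remaining.pop(best)[0]
    return order

# Hypothetical test suite and player-usage weights.
tests = {
    "t1": ({"inventory"}, 30.0),
    "t2": ({"combat", "inventory"}, 60.0),
    "t3": ({"settings"}, 10.0),
}
usage = {"combat": 100, "inventory": 40, "settings": 5}
print(prioritize(tests, usage))  # → ['t2', 't3', 't1']
```

A greedy rule yields a single ordering; NSGA-II instead searches for a Pareto front of orderings, letting QA engineers pick their preferred balance between early feature coverage and total execution time.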
Markos Viggiato
Leveraging Natural Language Processing Techniques to Improve Manual Game Testing PhD Thesis
2023.
Tags: Computer games, Game development, Natural language processing, Testing
@phdthesis{ViggiatoPhD,
title = {Leveraging Natural Language Processing Techniques to Improve Manual Game Testing},
author = {Markos Viggiato },
year = {2023},
date = {2023-01-17},
urldate = {2023-01-17},
abstract = {The gaming industry has experienced a sharp growth in recent years, surpassing other popular entertainment segments, such as the film industry. With the ever-increasing scale of the gaming industry and the fact that players are extremely difficult to satisfy, it has become extremely challenging to develop a successful game. In this context, the quality of games has become a critical issue. Game testing is a widely-performed activity to ensure that games meet the desired quality criteria. However, despite recent advancements in test automation, manual game testing is still prevalent in the gaming industry, with test cases often described in natural language only and consisting of one or more test steps that must be manually performed by the Quality Assurance (QA) engineer (i.e., the tester). This makes game testing challenging and costly. Issues such as redundancy (i.e., when different test cases have the same testing objective) and incompleteness (i.e., when test cases miss one or more steps) become a bigger concern in a manual game testing scenario. In addition, as games become bigger and the number of required test cases increases, it becomes impractical to execute all test cases in a scenario with short game release cycles, for example.
Prior work proposed several approaches to analyze and improve test cases with associated source code. However, there is little research on improving manual game testing. Having higher-quality test cases and optimizing test execution help to reduce wasted developer time and allow testers to use testing resources more effectively, which makes game testing more efficient and effective. In addition, even though players are extremely difficult to satisfy, their priorities are not considered during game testing. In this thesis, we investigate how to improve manual game testing from different perspectives.
In the first part of the thesis, we investigated how we can reduce redundancy in the test suite by identifying similar natural language test cases. We evaluated several unsupervised approaches using text embedding, text similarity, and clustering techniques and showed that we can successfully identify similar test cases with a high performance. We also investigated how we can improve test case descriptions to reduce the number of unclear, ambiguous, and incomplete test cases. We proposed and evaluated an automated framework that leverages statistical and neural language models and (1) provides recommendations to improve test case descriptions, (2) recommends potentially missing steps, and (3) suggests existing similar test cases.
In the second part of the thesis, we investigated how player priorities can be included in the game testing process. We first proposed an approach to prioritize test cases that cover the game features that players use the most, which helps to avoid bugs that could affect a very large number of players. Our approach (1) identifies the game features covered by test cases using an ensemble of zero-shot techniques with a high performance and (2) optimizes the test execution based on highly-used game features covered by test cases. Finally, we investigated how sentiment classifiers perform on game reviews and what issues affect those classifiers. High-performing classifiers can be used to obtain players' sentiments about games and guide testing based on the game features that players like or dislike. We show that, while traditional sentiment classifiers do not perform well, a modern classifier (the OPT-175B Large Language Model) presents a (far) better performance. The research work presented in this thesis provides deep insights, actionable recommendations, and effective and thoroughly evaluated approaches to support QA engineers and developers to improve manual game testing.},
keywords = {Computer games, Game development, Natural language processing, Testing},
pubstate = {published},
tppubtype = {phdthesis}
}
Finlay Macklon; Mohammad Reza Taesiri; Markos Viggiato; Stefan Antoszko; Natalia Romanova; Dale Paas; Cor-Paul Bezemer
Automatically Detecting Visual Bugs in HTML5 <canvas> Games Inproceedings
37th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2022.
Tags: Computer games, Game development, Gaming, Regression testing, Testing, Web applications
@inproceedings{finlay_ase2022,
title = {Automatically Detecting Visual Bugs in HTML5 <canvas> Games},
author = {Finlay Macklon and Mohammad Reza Taesiri and Markos Viggiato and Stefan Antoszko and Natalia Romanova and Dale Paas and Cor-Paul Bezemer},
year = {2022},
booktitle = {37th IEEE/ACM International Conference on Automated Software Engineering (ASE)},
keywords = {Computer games, Game development, Gaming, Regression testing, Testing, Web applications},
pubstate = {published},
tppubtype = {inproceedings}
}
Markos Viggiato; Dale Paas; Chris Buzon; Cor-Paul Bezemer
Identifying Similar Test Cases That Are Specified in Natural Language Journal Article
Transactions of Software Engineering (TSE), 2022.
Tags: Game development, Testing
@article{ViggiatoTSE2022,
title = {Identifying Similar Test Cases That Are Specified in Natural Language},
author = {Markos Viggiato and Dale Paas and Chris Buzon and Cor-Paul Bezemer},
year = {2022},
date = {2022-04-21},
urldate = {2022-04-21},
journal = {Transactions of Software Engineering (TSE)},
abstract = {Software testing is still a manual process in many industries, despite the recent improvements in automated testing techniques. As a result, test cases (which consist of one or more test steps that need to be executed manually by the tester) are often specified in natural language by different employees and many redundant test cases might exist in the test suite. This increases the (already high) cost of test execution. Manually identifying similar test cases is a time-consuming and error-prone task. Therefore, in this paper, we propose an unsupervised approach to identify similar test cases. Our approach uses a combination of text embedding, text similarity and clustering techniques to identify similar test cases. We evaluate five different text embedding techniques, two text similarity metrics, and two clustering techniques to cluster similar test steps and three techniques to identify similar test cases from the test step clusters. Through an evaluation in an industrial setting, we showed that our approach achieves a high performance to cluster test steps (an F-score of 87.39%) and identify similar test cases (an F-score of 83.47%). Furthermore, a validation with developers indicates several different practical usages of our approach (such as identifying redundant test cases), which help to reduce the testing manual effort and time.},
keywords = {Game development, Testing},
pubstate = {published},
tppubtype = {article}
}
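The embedding-plus-clustering pipeline this paper describes can be sketched in miniature: embed each test case, compute pairwise cosine similarity, and group cases whose similarity exceeds a threshold. This toy uses a bag-of-words "embedding" and greedy threshold clustering as stand-ins for the five embedding techniques and two clustering techniques the paper actually evaluates; all test ids and descriptions are hypothetical.

```python
def bow(text):
    # Toy bag-of-words "embedding": word -> count.
    words = text.lower().split()
    return {w: words.count(w) for w in set(words)}

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = sum(v * v for v in a.values()) ** 0.5
    nb = sum(v * v for v in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def cluster_similar(test_cases, threshold=0.6):
    """Greedy threshold clustering of test cases by text similarity.

    test_cases: dict of test_id -> test description (steps joined into one string).
    Returns a list of clusters; clusters with more than one member are
    candidate redundant test cases for a QA engineer to review.
    """
    embs = {tid: bow(text) for tid, text in test_cases.items()}
    clusters = []
    for tid, emb in embs.items():
        for cl in clusters:
            # Join the first cluster whose representative is similar enough.
            if cosine(emb, embs[cl[0]]) >= threshold:
                cl.append(tid)
                break
        else:
            clusters.append([tid])
    return clusters

cases = {
    "TC1": "open the shop menu and buy a sword",
    "TC2": "open the shop menu and buy a shield",
    "TC3": "restart the game after a crash",
}
print(cluster_similar(cases))  # → [['TC1', 'TC2'], ['TC3']]
```

TC1 and TC2 share most of their words, so they land in one cluster as likely redundant cases, while TC3 stays on its own; richer embeddings (e.g. the sentence embeddings evaluated in the paper) would also catch paraphrased steps that share no words.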
Mohammad Reza Taesiri; Finlay Macklon; Cor-Paul Bezemer
CLIP meets GamePhysics: Towards bug identification in gameplay videos using zero-shot transfer learning Inproceedings
International Conference on Mining Software Repositories (MSR), 2022.
Tags: Bug report, Computer games, Game development, Gameplay videos, Gaming
@inproceedings{TaesiriMSR2022,
title = {CLIP meets GamePhysics: Towards bug identification in gameplay videos using zero-shot transfer learning},
author = {Mohammad Reza Taesiri and Finlay Macklon and Cor-Paul Bezemer},
year = {2022},
date = {2022-03-24},
urldate = {2022-03-24},
booktitle = {International Conference on Mining Software Repositories (MSR)},
abstract = {Gameplay videos contain rich information about how players interact with the game and how the game responds. Sharing gameplay videos on social media platforms, such as Reddit, has become a common practice for many players. Often, players will share gameplay videos that showcase video game bugs. Such gameplay videos are software artifacts that can be utilized for game testing, as they provide insight for bug analysis. Although large repositories of gameplay videos exist, parsing and mining them in an effective and structured fashion has still remained a big challenge. In this paper, we propose a search method that accepts any English text query as input to retrieve relevant videos from large repositories of gameplay videos. Our approach does not rely on any external information (such as video metadata); it works solely based on the content of the video. By leveraging the zero-shot transfer capabilities of the Contrastive Language-Image Pre-Training (CLIP) model, our approach does not require any data labeling or training. To evaluate our approach, we present the GamePhysics dataset consisting of 26,954 videos from 1,873 games, that were collected from the GamePhysics section on the Reddit website. Our approach shows promising results in our extensive analysis of simple queries, compound queries, and bug queries, indicating that our approach is useful for object and event detection in gameplay videos. An example application of our approach is as a gameplay video search engine to aid in reproducing video game bugs. Please visit the following link for the code and the data: https://asgaardlab.github.io/CLIPxGamePhysics/},
keywords = {Bug report, Computer games, Game development, Gameplay videos, Gaming},
pubstate = {published},
tppubtype = {inproceedings}
}
Arthur V. Kamienski; Cor-Paul Bezemer
An Empirical Study of Q&A Websites for Game Developers Journal Article
Empirical Software Engineering Journal (EMSE), 2021.
Tags: Game development, Q&A communities
@article{arthur2021,
title = {An Empirical Study of Q&A Websites for Game Developers},
author = {Arthur V. Kamienski and Cor-Paul Bezemer},
year = {2021},
date = {2021-07-07},
urldate = {2021-07-07},
journal = {Empirical Software Engineering Journal (EMSE)},
abstract = {The game development industry is growing, and training new developers in game development-specific abilities is essential to satisfying its need for skilled game developers. These developers require effective learning resources to acquire the information they need and improve their game development skills. Question and Answer (Q&A) websites stand out as some of the most used online learning resources in software development. Many studies have investigated how Q&A websites help software developers become more experienced. However, no studies have explored Q&A websites aimed at game development, and there is little information about how game developers use and interact with these websites. In this paper, we study four Q&A communities by analyzing game development data we collected from their websites and the 347 responses received on a survey we ran with game developers. We observe that the communities have declined over the past few years and identify factors that correlate to these changes. Using a Latent Dirichlet Allocation (LDA) model, we characterize the topics discussed in the communities. We also analyze how topics differ across communities and identify the most discussed topics. Furthermore, we find that survey respondents have a mostly negative view of the communities and tended to stop using the websites once they became more experienced. Finally, we provide recommendations on where game developers should post their questions, which can help mitigate the websites’ declines and improve their effectiveness.},
keywords = {Game development, Q&A communities},
pubstate = {published},
tppubtype = {article}
}
Quang N. Vu; Cor-Paul Bezemer
An Empirical Study of the Characteristics of Popular Game Jams and Their High-ranking Submissions on itch.io Inproceedings
International Conference on the Foundations of Digital Games (FDG), pp. 1–12, 2020.
Tags: Empirical software engineering, Game development, Game jams, itch.io, Mining software repositories
@inproceedings{Quang20,
title = {An Empirical Study of the Characteristics of Popular Game Jams and Their High-ranking Submissions on itch.io},
author = {Quang N. Vu and Cor-Paul Bezemer},
year = {2020},
date = {2020-04-14},
urldate = {2020-04-14},
booktitle = {International Conference on the Foundations of Digital Games (FDG)},
pages = {1--12},
abstract = {Game jams are hackathon-like events that allow participants to develop a playable game prototype within a time limit. They foster creativity and the exchange of ideas by letting developers with different skill sets collaborate. Having a high-ranking game is a great bonus to a beginning game developer’s résumé and their pursuit of a career in the game industry. However, participants often face time constraints set by jam hosts while balancing what aspects of their games should be emphasized to have the highest chance of winning. Similarly, hosts need to understand what to emphasize when organizing online jams so that their jams are more popular, in terms of submission rate. In this paper, we study 1,290 past game jams and their 3,752 submissions on itch.io to understand better what makes popular jams and high-ranking games perceived well by the audience. We find that a quality description has a positive contribution to both a jam’s popularity and a game’s ranking. Additionally, more manpower organizing a jam or developing a game increases a jam’s popularity and a game’s high-ranking likelihood. High-ranking games tend to support Windows or macOS, and belong to the "Puzzle", "Platformer", "Interactive Fiction", or "Action" genres. Also, shorter competitive jams tend to be more popular. Based on our findings, we suggest jam hosts and participants improve the description of their products and consider co-organizing or co-participating in a jam. Furthermore, jam participants should develop multi-platform multi-genre games. Finally, jam hosts should introduce a tighter time limit to increase their jam’s popularity.},
keywords = {Empirical software engineering, Game development, Game jams, itch.io, Mining software repositories},
pubstate = {published},
tppubtype = {inproceedings}
}
Daniel Lee; Dayi Lin; Cor-Paul Bezemer; Ahmed E. Hassan
Building the Perfect Game - An Empirical Study of Game Modifications Journal Article
Empirical Software Engineering Journal (EMSE), 2019.
Tags: Game development, Gaming, Nexus mod
@article{Daniel2019nexusmods,
title = {Building the Perfect Game - An Empirical Study of Game Modifications},
author = {Daniel Lee and Dayi Lin and Cor-Paul Bezemer and Ahmed E. Hassan},
year = {2019},
date = {2019-10-17},
urldate = {2019-10-17},
journal = {Empirical Software Engineering Journal (EMSE)},
abstract = {Prior work has shown that gamer loyalty is important for the sales of a developer’s future games. Therefore, it is important for game developers to increase the longevity of their games. However, game developers cannot always meet the growing and changing needs of the gaming community, due to the often already overloaded schedules of developers. So-called modders can potentially assist game developers with addressing gamers’ needs. Modders are enthusiasts who provide modifications or completely new content for a game. By supporting modders, game developers can meet the rapidly growing and varying needs of their gamer base. Modders have the potential to play a role in extending the life expectancy of a game, thereby saving game developers time and money, and leading to a better overall gaming experience for their gamer base.
In this paper, we empirically study the metadata of 9,521 mods that were extracted from the Nexus Mods distribution platform. The Nexus Mods distribution platform is one of the largest mod distribution platforms for PC games at the time of our study. The goal of our paper is to provide useful insights about mods on the Nexus Mods distribution platform from a quantitative perspective, and to provide researchers a solid foundation to further explore game mods. To better understand the potential of mods to extend the longevity of a game we study their characteristics, and we study their release schedules and post-release support (in terms of bug reports) as a proxy for the willingness of the modding community to contribute to a game. We find that providing official support for mods can be beneficial for the perceived quality of the mods of a game: games for which a modding tool is provided by the original game developer have a higher median endorsement ratio than mods for games that do not have such a tool. In addition, mod users are willing to submit bug reports for a mod. However, they often fail to do this in a systematic manner using the bug reporting tool of the Nexus Mods platform, resulting in low-quality bug reports which are difficult to resolve.
Our findings give the first insights into the characteristics, release schedule and post-release support of game mods. Our findings show that some games have a very active modding community, which contributes to those games through mods. Based on our findings, we recommend that game developers who desire an active modding community for their own games provide the modding community with an officially supported modding tool. In addition, we recommend that mod distribution platforms,
such as Nexus Mods, improve their bug reporting system to receive higher quality bug reports.},
keywords = {Game development, Gaming, Nexus mod},
pubstate = {published},
tppubtype = {article}
}