“A Taxonomy of Testable HTML5 Canvas Issues” accepted in TSE!

Finlay’s paper “A Taxonomy of Testable HTML5 Canvas Issues” was accepted for publication in the Transactions on Software Engineering (TSE) journal! Super congrats Finlay and co-author Markos! This paper was a collaboration with Natalia Romanova, Chris Buzon and Dale Paas from our industry partner Prodigy Education.

Abstract:
“The HTML5 canvas is widely used to display high quality graphics in web applications. However, the combination of web, GUI, and visual techniques that are required to build canvas applications, together with the lack of testing and debugging tools, makes developing such applications very challenging. To help direct future research on testing canvas applications, in this paper we present a taxonomy of testable canvas issues. First, we extracted 2,403 canvas-related issue reports from 123 open source GitHub projects that use the HTML5 canvas. Second, we constructed our taxonomy by manually classifying a random sample of 332 issue reports. Our manual classification identified five broad categories of testable canvas issues, such as Visual and Performance issues. We found that Visual issues are the most frequent (35%), while Performance issues are relatively infrequent (5%). We also found that many testable canvas issues that present themselves visually on the canvas are actually caused by other components of the web application. Our taxonomy of testable canvas issues can be used to steer future research into canvas issues and testing.”

See our Publications for the full paper.

“Analyzing Techniques for Duplicate Question Detection on Q&A Websites for Game Developers” accepted for publication in EMSE!

Arthur’s paper “Analyzing Techniques for Duplicate Question Detection on Q&A Websites for Game Developers” was accepted for publication in the Empirical Software Engineering journal! Super congrats Arthur! This was a collaboration with Dr. Abram Hindle.

Abstract:
Game development is currently the largest industry in the entertainment segment and has a high demand for skilled game developers that can produce high-quality games. To satiate this demand, game developers need resources that can provide them with the knowledge they need to learn and improve their skills. Question and Answer (Q&A) websites are one such resource, providing a valuable source of knowledge about game development practices. However, the presence of duplicate questions on Q&A websites hinders their ability to effectively provide information for their users. While several researchers created and analyzed techniques for duplicate question detection on websites such as Stack Overflow, so far no studies have explored how well those techniques work on Q&A websites for game development. With that in mind, in this paper we analyze how we can use pre-trained and unsupervised techniques to detect duplicate questions on Q&A websites focused on game development using data extracted from the Game Development Stack Exchange and Stack Overflow. We also explore how we can leverage a small set of labelled data to improve the performance of those techniques. The pre-trained technique based on MPNet achieved the highest results in identifying duplicate questions about game development, and we could achieve a better performance when combining multiple unsupervised techniques into a single supervised model. Furthermore, the supervised models could identify duplicate questions on websites different from those they were trained on with little to no decrease in performance. Our results lay the groundwork for building better duplicate question detection systems in Q&A websites for game developers and ultimately providing game developers with a more effective Q&A community.
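
For readers curious what an MPNet-based duplicate detector can look like in practice, here is a minimal sketch (not the paper’s implementation) using the sentence-transformers package; the model name, example questions, and similarity threshold are assumptions for illustration.

```python
# Minimal sketch: flag candidate duplicate questions with a pre-trained
# MPNet sentence-embedding model. Model name and threshold are illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-mpnet-base-v2")

questions = [
    "How do I detect collisions between 2D sprites?",
    "What is the best way to check if two 2D sprites overlap?",
    "How can I serialize a game save file to JSON?",
]
new_question = "How to test whether two sprites collide in 2D?"

# Embed the existing questions and the new question in the same vector space.
corpus_emb = model.encode(questions, convert_to_tensor=True, normalize_embeddings=True)
query_emb = model.encode(new_question, convert_to_tensor=True, normalize_embeddings=True)

# Cosine similarity; pairs above a (hypothetical) threshold are duplicate candidates.
scores = util.cos_sim(query_emb, corpus_emb)[0].tolist()
for question, score in zip(questions, scores):
    flag = "DUPLICATE?" if score > 0.8 else ""
    print(f"{score:.2f}  {question}  {flag}")
```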

See our Publications for the full paper, or access the preprint directly.

“Identifying Similar Test Cases That Are Specified in Natural Language” accepted in TSE!

Markos’ paper “Identifying Similar Test Cases That Are Specified in Natural Language” was accepted for publication in the Transactions on Software Engineering (TSE) journal! Super congrats Markos! This paper was a collaboration with Dale Paas and Chris Buzon from our industry partner Prodigy Education.

Abstract:
“Software testing is still a manual process in many industries, despite the recent improvements in automated testing techniques. As a result, test cases (which consist of one or more test steps that need to be executed manually by the tester) are often specified in natural language by different employees and many redundant test cases might exist in the test suite. This increases the (already high) cost of test execution. Manually identifying similar test cases is a time-consuming and error-prone task. Therefore, in this paper, we propose an unsupervised approach to identify similar test cases. Our approach uses a combination of text embedding, text similarity and clustering techniques to identify similar test cases. We evaluate five different text embedding techniques, two text similarity metrics, and two clustering techniques to cluster similar test steps, and three techniques to identify similar test cases from the test step clusters. Through an evaluation in an industrial setting, we showed that our approach achieves a high performance in clustering test steps (an F-score of 87.39%) and identifying similar test cases (an F-score of 83.47%). Furthermore, a validation with developers indicates several different practical usages of our approach (such as identifying redundant test cases), which help to reduce the manual testing effort and time.”
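
As a rough illustration of the embed-then-cluster idea described in the abstract (not the paper’s exact pipeline), the sketch below embeds natural-language test steps with a pre-trained model and groups them by cosine distance; the model name, the sentence-transformers/scikit-learn choices, and the distance threshold are assumptions.

```python
# Minimal sketch: cluster similar natural-language test steps.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import AgglomerativeClustering

steps = [
    "Open the login page",
    "Navigate to the login screen",
    "Enter a valid username and password",
    "Type valid credentials into the login form",
    "Click the checkout button",
]

model = SentenceTransformer("all-mpnet-base-v2")
embeddings = model.encode(steps, normalize_embeddings=True)

# Agglomerative clustering on cosine distance; the threshold is hypothetical
# (requires scikit-learn >= 1.2 for the `metric` parameter).
clustering = AgglomerativeClustering(
    n_clusters=None,
    metric="cosine",
    linkage="average",
    distance_threshold=0.4,
)
labels = clustering.fit_predict(embeddings)

# Steps sharing a label are candidates for the same test-step cluster.
for label, step in sorted(zip(labels, steps)):
    print(label, step)
```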

See our Publications for the full paper.

“CLIP meets GamePhysics: Towards bug identification in gameplay videos using zero-shot transfer learning” accepted at MSR 2022!

Mohammad Reza’s paper “CLIP meets GamePhysics: Towards bug identification in gameplay videos using zero-shot transfer learning” was accepted for publication at the Mining Software Repositories (MSR) conference 2022! Super congrats Mohammad Reza and co-author Finlay!

Abstract:
“Gameplay videos contain rich information about how players interact with the game and how the game responds. Sharing gameplay videos on social media platforms, such as Reddit, has become a common practice for many players. Often, players will share gameplay videos that showcase video game bugs. Such gameplay videos are software artifacts that can be utilized for game testing, as they provide insight for bug analysis. Although large repositories of gameplay videos exist, parsing and mining them in an effective and structured fashion still remains a big challenge. In this paper, we propose a search method that accepts any English text query as input to retrieve relevant videos from large repositories of gameplay videos. Our approach does not rely on any external information (such as video metadata); it works solely based on the content of the video. By leveraging the zero-shot transfer capabilities of the Contrastive Language-Image Pre-Training (CLIP) model, our approach does not require any data labeling or training. To evaluate our approach, we present the GamePhysics dataset consisting of 26,954 videos from 1,873 games, collected from the GamePhysics section on the Reddit website. Our approach shows promising results in our extensive analysis of simple queries, compound queries, and bug queries, indicating that our approach is useful for object and event detection in gameplay videos. An example application of our approach is as a gameplay video search engine to aid in reproducing video game bugs. Please visit the following link for the code and the data: https://asgaardlab.github.io/CLIPxGamePhysics/.”
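
To give a flavour of the zero-shot retrieval idea, here is a minimal sketch (not the paper’s implementation) that ranks pre-extracted gameplay frames against a free-form English query with a CLIP model via sentence-transformers; the model name, file paths, and query are illustrative assumptions.

```python
# Minimal sketch: zero-shot text-to-frame retrieval with a CLIP model.
from PIL import Image
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("clip-ViT-B-32")

frame_paths = ["frame_000.jpg", "frame_030.jpg", "frame_060.jpg"]  # hypothetical files
frames = [Image.open(path) for path in frame_paths]

# Embed video frames and the free-form English query in the same space.
frame_emb = model.encode(frames, convert_to_tensor=True)
query_emb = model.encode("a car flying in the air", convert_to_tensor=True)

# Rank frames by similarity to the query; top frames point to candidate videos.
scores = util.cos_sim(query_emb, frame_emb)[0].tolist()
for path, score in sorted(zip(frame_paths, scores), key=lambda pair: -pair[1]):
    print(f"{score:.3f}  {path}")
```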

See our Publications or arXiv for the full paper.

“A Case Study on the Stability of Performance Tests for Serverless Applications” accepted in JSS!

Simon’s paper “A Case Study on the Stability of Performance Tests for Serverless Applications” was accepted for publication in the Journal of Systems and Software (JSS)! This paper was a collaboration with Diego Costa, Lizhi Liao, Weiyi Shang, Andre van Hoorn and Samuel Kounev through the SPEC RG DevOps Performance Working Group.

Abstract:
“Context. While in serverless computing, application resource management and operational concerns are generally delegated to the cloud provider, ensuring that serverless applications meet their performance requirements is still a responsibility of the developers. Performance testing is a commonly used performance assessment practice; however, it traditionally requires visibility of the resource environment.
Objective. In this study, we investigate whether performance tests of serverless applications are stable, that is, if their results are reproducible, and what implications the serverless paradigm has for performance tests.
Method. We conduct a case study where we collect two datasets of performance test results: (a) repetitions of performance tests for varying memory size and load intensities and (b) three repetitions of the same performance test every day for ten months.
Results. We find that performance tests of serverless applications are comparatively stable if conducted on the same day. However, we also observe short-term performance variations and frequent long-term performance changes.
Conclusion. Performance tests for serverless applications can be stable; however, the serverless model impacts the planning, execution, and analysis of performance tests.”
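
As a simple illustration of what “stable” can mean here, the sketch below computes the coefficient of variation of a performance metric across repeated test runs; the numbers and the 5% threshold are invented for illustration and are not taken from the paper.

```python
# Minimal sketch: quantify performance-test stability across repetitions
# with the coefficient of variation (CV) of mean response time.
import statistics

# Hypothetical mean response times (ms) from repeated runs of the same test.
repetitions_same_day = [212.0, 208.5, 215.2]
repetitions_across_months = [212.0, 250.4, 188.9, 301.7]

def coefficient_of_variation(samples):
    """Relative dispersion: standard deviation divided by the mean."""
    return statistics.stdev(samples) / statistics.mean(samples)

for name, samples in [("same day", repetitions_same_day),
                      ("long term", repetitions_across_months)]:
    cv = coefficient_of_variation(samples)
    verdict = "stable" if cv < 0.05 else "unstable"
    print(f"{name}: CV = {cv:.1%} ({verdict} at an illustrative 5% threshold)")
```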

See our Publications for the full paper.

“How are Solidity smart contracts tested in open source projects? An exploratory study” accepted at AST 2022!

Luisa’s paper “How are Solidity smart contracts tested in open source projects? An exploratory study” was accepted for publication at AST 2022! Super congrats Luisa!

Abstract:
“Smart contracts are self-executing programs that are stored on the blockchain. Once a smart contract is compiled and deployed on the blockchain, it cannot be modified. Therefore, having a bug-free smart contract is vital. To ensure a bug-free smart contract, it must be tested thoroughly. However, little is known about how developers test smart contracts in practice. Our study explores 139 open source smart contract projects that are written in Solidity to investigate the state of smart contract testing from three dimensions: (1) the developers working on the tests, (2) the used testing frameworks and testnets and (3) the type of tests that are conducted. We found that mostly core developers of a project are responsible for testing the contracts. Second, developers typically use only functional testing frameworks to test a smart contract, with Truffle being the most popular one. Finally, our results show that functional testing is conducted in most of the studied projects (93%), security testing is only performed in a few projects (9.4%) and traditional performance testing is conducted in none. In addition, we found 25 projects that mentioned or published external audit reports.”

See our Publications for the full paper.

“Studying the Performance Risks of Upgrading Docker Hub Images: A Case Study of WordPress” accepted at ICPE 2022!

Mikael’s paper “Studying the Performance Risks of Upgrading Docker Hub Images: A Case Study of WordPress” was accepted for publication at ICPE 2022! Super congrats Mikael!

Abstract:
“The Docker Hub repository contains Docker images of applications, which allow users to do in-place upgrades to benefit from the latest released features and security patches. However, prior work showed that upgrading a Docker image not only changes the main application, but can also change many dependencies. In this paper, we present a methodology to study the performance impact of upgrading the Docker Hub image of an application, thereby focusing on changes to dependencies. We demonstrate our methodology through a case study of 90 official images of the WordPress application. Our study shows that Docker image users should be cautious and conduct a performance test before upgrading to a newer Docker image in most cases. Our methodology can assist them to better understand the performance risks of such upgrades, and helps them to decide how thorough such a performance test should be.”
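
One simple way such a pre-upgrade performance test could be analyzed (not necessarily the paper’s method) is to compare response-time distributions measured against the old and new image with a non-parametric test, as sketched below; the measurements are hypothetical.

```python
# Minimal sketch: compare load-test results for two Docker image versions.
from scipy.stats import mannwhitneyu

# Hypothetical response times (ms) measured against two image versions.
old_image = [120, 118, 125, 122, 119, 121, 124]
new_image = [131, 129, 135, 133, 130, 128, 134]

stat, p_value = mannwhitneyu(old_image, new_image, alternative="two-sided")
print(f"U = {stat}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Statistically significant performance difference between image versions.")
else:
    print("No significant difference detected; the upgrade looks performance-neutral.")
```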

See our Publications for the full paper.

“Using Natural Language Processing Techniques to Improve Manual Test Case Descriptions” accepted at ICSE-SEIP 2022!

Markos’ paper “Using Natural Language Processing Techniques to Improve Manual Test Case Descriptions” was accepted for publication at the Software Engineering in Practice (SEIP) track of ICSE 2022! Super congrats Markos!

Abstract:
“Despite the recent advancements in test automation, software testing often remains a manual, and costly, activity in many industries. Manual test cases, often described only in natural language, consist of one or more test steps, which are instructions that must be performed to achieve the testing objective. Having different employees specifying test cases might result in redundant, unclear, or incomplete test cases. Manually reviewing and validating newly-specified test cases is time-consuming and becomes impractical in a scenario with a large test suite. Therefore, in this paper, we propose a framework to automatically analyze test cases that are specified in natural language and provide actionable recommendations on how to improve the test cases. Our framework consists of configurable components and modules for analysis, which are capable of recommending the following: (1) improvements to the terminology of a new test case through language modeling, (2) potentially missing test steps for a new test case through frequent itemset and association rule mining, and (3) similar test cases that already exist in the test suite through text embedding and clustering. We thoroughly evaluated the three modules on data from our industry partner. Our framework can provide actionable recommendations, which is an important challenge given the widespread occurrence of test cases that are described only in natural language in the software industry (in particular, the game industry).”
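
As a toy illustration of the association-rule-mining module (item 2 in the abstract), the following sketch mines rules from existing test cases and flags steps a new test case may be missing; it assumes the mlxtend package, and the test steps and thresholds are invented for illustration.

```python
# Minimal sketch: flag potentially missing test steps with association rules.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Each existing test case is a "transaction" of (normalized) test steps.
test_cases = [
    ["open login page", "enter credentials", "click login", "verify dashboard"],
    ["open login page", "enter credentials", "click login", "verify dashboard"],
    ["open login page", "enter credentials", "click login"],
    ["open settings page", "change password", "click save"],
]

encoder = TransactionEncoder()
onehot = pd.DataFrame(encoder.fit(test_cases).transform(test_cases),
                      columns=encoder.columns_)

itemsets = apriori(onehot, min_support=0.5, use_colnames=True)
rules = association_rules(itemsets, metric="confidence", min_threshold=0.6)

# A new test case that contains a rule's antecedents but not its consequents
# may be missing a step (e.g. "verify dashboard" after "click login").
new_case = {"open login page", "enter credentials", "click login"}
for _, rule in rules.iterrows():
    if rule["antecedents"] <= new_case and not rule["consequents"] <= new_case:
        print("Possibly missing step(s):", set(rule["consequents"]))
```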

See our Publications for the full paper.

“Applications of Generative Adversarial Networks in Anomaly Detection: A Systematic Literature Review” accepted in IEEE Access!

Mikael and Chloe’s paper “Applications of Generative Adversarial Networks in Anomaly Detection: A Systematic Literature Review” was accepted for publication in the IEEE Access journal! Super congrats Mikael and Chloe!

Abstract:
“Anomaly detection has become an indispensable tool for modern society, applied in a wide range of applications, from detecting fraudulent transactions to detecting malignant brain tumors. Over time, many anomaly detection techniques have been introduced. However, in general, they all suffer from the same problem: lack of data that represents anomalous behaviour. As anomalous behaviour is usually costly (or dangerous) for a system, it is difficult to gather enough data that represents such behaviour. This, in turn, makes it difficult to develop and evaluate anomaly detection techniques. Recently, generative adversarial networks (GANs) have attracted much attention in anomaly detection research, due to their unique ability to generate new data. In this paper, we present a systematic review of the literature in this area, covering 128 papers. The goal of this review paper is to analyze the relation between anomaly detection techniques and types of GANs, to identify the most common application domains for GAN-assisted and GAN-based anomaly detection, and to assemble information on datasets and performance metrics used to assess them. Our study helps researchers and practitioners to find the most suitable GAN-assisted anomaly detection technique for their application. In addition, we present a research roadmap for future studies in this area. In summary, GANs are used in anomaly detection to address the problem of insufficient amount of data for the anomalous behaviour, either through data augmentation or representation learning. The most commonly used GAN architectures are DCGANs, standard GANs, and cGANs. The primary application domains include medicine, surveillance and intrusion detection.”
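
For readers unfamiliar with how a GAN can score anomalies, here is a minimal reconstruction-based (AnoGAN-style) sketch in PyTorch; the generator below is only a stand-in for a model trained on normal data, and the latent size, sample, and threshold are hypothetical.

```python
# Minimal sketch: reconstruction-based anomaly scoring with a GAN generator.
import torch

latent_dim = 64
G = torch.nn.Sequential(  # stand-in for a generator trained on normal data
    torch.nn.Linear(latent_dim, 128), torch.nn.ReLU(), torch.nn.Linear(128, 784)
)

def anomaly_score(x, steps=200, lr=0.01):
    """Search the latent space for the closest reconstruction of x;
    a high residual error suggests x is anomalous."""
    z = torch.randn(1, latent_dim, requires_grad=True)
    optimizer = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = torch.nn.functional.l1_loss(G(z), x)
        loss.backward()
        optimizer.step()
    return loss.item()

sample = torch.rand(1, 784)  # e.g. a flattened 28x28 image
score = anomaly_score(sample)
print("anomalous" if score > 0.25 else "normal", f"(score={score:.3f})")
```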

See our Publications for the full paper.

“An Empirical Study of Q&A Websites for Game Developers” accepted for publication in the EMSE journal!

Arthur’s paper “An Empirical Study of Q&A Websites for Game Developers” was accepted for publication in the Empirical Software Engineering journal! Super congrats Arthur!

Abstract:
The game development industry is growing, and training new developers in game development-specific abilities is essential to satisfying its need for skilled game developers. These developers require effective learning resources to acquire the information they need and improve their game development skills. Question and Answer (Q&A) websites stand out as some of the most used online learning resources in software development. Many studies have investigated how Q&A websites help software developers become more experienced. However, no studies have explored Q&A websites aimed at game development, and there is little information about how game developers use and interact with these websites. In this paper, we study four Q&A communities by analyzing game development data we collected from their websites and the 347 responses received on a survey we ran with game developers. We observe that the communities have declined over the past few years and identify factors that correlate to these changes. Using a Latent Dirichlet Allocation (LDA) model, we characterize the topics discussed in the communities. We also analyze how topics differ across communities and identify the most discussed topics. Furthermore, we find that survey respondents have a mostly negative view of the communities and tended to stop using the websites once they became more experienced. Finally, we provide recommendations on where game developers should post their questions, which can help mitigate the websites’ declines and improve their effectiveness.
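
As a small illustration of the LDA step mentioned in the abstract (not the paper’s exact setup), the sketch below fits a topic model to a handful of toy Q&A posts using gensim; the documents and the number of topics are invented for illustration.

```python
# Minimal sketch: characterize discussion topics in Q&A posts with LDA.
from gensim import corpora
from gensim.models import LdaModel

posts = [
    "how to implement collision detection in unity",
    "unity rigidbody collision not detected",
    "best way to write a shader for water reflections",
    "opengl shader compilation error on linux",
]
texts = [post.split() for post in posts]  # real pipelines would clean/lemmatize

dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]

lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
               random_state=0, passes=10)
for topic_id, words in lda.print_topics(num_words=4):
    print(topic_id, words)
```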

See our Publications for the full paper.