Mohamed Sami Rakha; Cor-Paul Bezemer; Ahmed E. Hassan
Revisiting the Performance Evaluation of Automated Approaches for the Identification of Duplicate Issue Reports Journal Article
IEEE Transactions on Software Engineering (TSE), 44 (12), pp. 1245–1268, 2017.
@article{sami16tse,
title = {Revisiting the Performance Evaluation of Automated Approaches for the Identification of Duplicate Issue Reports},
author = {Mohamed Sami Rakha and Cor-Paul Bezemer and Ahmed E. Hassan},
year = {2017},
date = {2017-09-21},
urldate = {2017-09-21},
journal = {IEEE Transactions on Software Engineering (TSE)},
volume = {44},
number = {12},
pages = {1245--1268},
publisher = {IEEE},
abstract = {Issue tracking systems (ITSs), such as Bugzilla, are commonly used to track reported bugs, improvements and change requests for a software project. To avoid wasting developer resources on previously-reported (i.e., duplicate) issues, it is necessary to identify such duplicates as soon as they are reported. Several automated approaches have been proposed for retrieving duplicate reports, i.e., identifying the duplicate of a new issue report in a list of n candidates. These approaches rely on leveraging the textual, categorical, and contextual information in previously-reported issues to decide whether a newly-reported issue has previously been reported. In general, these approaches are evaluated using data that spans a relatively short period of time (i.e., the classical evaluation). However, in this paper, we show that the classical evaluation tends to overestimate the performance of automated approaches for retrieving duplicate issue reports. Instead, we propose a realistic evaluation using all the reports that are available in the ITS of a software project. We conduct experiments in which we evaluate two popular approaches for retrieving duplicate issues (BM25F and REP) using the classical and realistic evaluations. We find that for the issue tracking data of the Mozilla foundation, the Eclipse foundation and OpenOffice, the realistic evaluation shows that previously proposed approaches perform considerably lower than previously reported using the classical evaluation. As a result, we conclude that the reported performance of approaches for retrieving duplicate issue reports is significantly overestimated in literature. In order to improve the performance of the automated retrieval of duplicate issue reports, we propose to leverage the resolution field of issue reports. 
Our experiments show that a relative improvement in the performance of a median of 7-21.5% and a maximum of 19-60% can be achieved by leveraging the resolution field of issue reports for the automated retrieval of duplicates.},
keywords = {Performance evaluation, Software engineering, Text analysis},
pubstate = {published},
tppubtype = {article}
}
Dayi Lin; Cor-Paul Bezemer; Ahmed E. Hassan
Studying the Urgent Updates of Popular Games on the Steam Platform Journal Article
Empirical Software Engineering (EMSE), 22 (4), pp. 2095–2126, 2017.
@article{Lin16urgent,
title = {Studying the Urgent Updates of Popular Games on the Steam Platform},
author = {Dayi Lin and Cor-Paul Bezemer and Ahmed E. Hassan},
year = {2017},
date = {2017-08-01},
urldate = {2017-08-01},
journal = {Empirical Software Engineering (EMSE)},
volume = {22},
number = {4},
pages = {2095--2126},
publisher = {Springer},
abstract = {The steadily increasing popularity of computer games has led to the rise of a multi-billion dollar industry. This increasing popularity is partly enabled by online digital distribution platforms for games, such as Steam. These platforms offer an insight into the development and test processes of game developers. In particular, we can extract the update cycle of a game and study what makes developers deviate from that cycle by releasing so-called urgent updates.
An urgent update is a software update that fixes problems that are deemed critical enough to not be left unfixed until a regular-cycle update. Urgent updates are made in a state of emergency and outside the regular development and test timelines which causes unnecessary stress on the development team. Hence, avoiding the need for an urgent update is important for game developers. We define urgent updates as 0-day updates (updates that are released on the same day), updates that are released faster than the regular cycle, or self-admitted hotfixes.
We conduct an empirical study of the urgent updates of the 50 most popular games from Steam, the dominant digital game delivery platform. As urgent updates are reflections of mistakes in the development and test processes, a better understanding of urgent updates can in turn stimulate the improvement of these processes, and eventually save resources for game developers. In this paper, we argue that the update strategy that is chosen by a game developer affects the number of urgent updates that are released. Although the choice of update strategy does not appear to have an impact on the percentage of updates that are released faster than the regular cycle or self-admitted hotfixes, games that use a frequent update strategy tend to have a higher proportion of 0-day updates than games that use a traditional update strategy.},
keywords = {Computer games, Steam, Update cycle, Update strategy, Urgent updates},
pubstate = {published},
tppubtype = {article}
}
Philipp Leitner; Cor-Paul Bezemer
An Exploratory Study of the State of Practice of Performance Testing in Java-based Open Source Projects Inproceedings
The International Conference on Performance Engineering (ICPE), pp. 373–384, ACM/SPEC, 2017.
@inproceedings{leitner16oss,
title = {An Exploratory Study of the State of Practice of Performance Testing in Java-based Open Source Projects},
author = {Philipp Leitner and Cor-Paul Bezemer},
year = {2017},
date = {2017-04-22},
urldate = {2017-04-22},
booktitle = {The International Conference on Performance Engineering (ICPE)},
pages = {373--384},
publisher = {ACM/SPEC},
abstract = {The usage of open source (OS) software is nowadays widespread across many industries and domains. While the functional quality of OS projects is considered to be up to par with that of closed-source software, much is unknown about the quality in terms of non-functional attributes, such as performance. One challenge for OS developers is that, unlike for functional testing, there is a lack of accepted best practices for performance testing.
To reveal the state of practice of performance testing in OS projects, we conduct an exploratory study on 111 Java-based OS projects from GitHub. We study the performance tests of these projects from five perspectives: (1) the developers, (2) the size, (3) the organization, (4) the types of performance tests and (5) the tooling used for performance testing.
First, in a quantitative study we show that writing performance tests is not a popular task in OS projects: performance tests form only a small portion of the test suite, are rarely updated, and are usually maintained by a small group of core project developers. Second, we show through a qualitative study that even though many projects are aware that they need performance tests, developers appear to struggle implementing them. We argue that future performance testing frameworks should provide better support for low-friction testing, for instance via non-parameterized methods or performance test generation, as well as focus on a tight integration with standard continuous integration tooling.},
keywords = {Empirical software engineering, Mining software repositories, Open source, Performance engineering, Performance testing},
pubstate = {published},
tppubtype = {inproceedings}
}
Suhas Kabinna; Cor-Paul Bezemer; Weiyi Shang; Ahmed E. Hassan
Logging Library Migrations: A Case Study for the Apache Software Foundation Projects Inproceedings
International Conference on Mining Software Repositories (MSR), pp. 154–164, ACM, 2016.
@inproceedings{Kabinna16msr,
title = {Logging Library Migrations: A Case Study for the Apache Software Foundation Projects},
author = {Suhas Kabinna and Cor-Paul Bezemer and Weiyi Shang and Ahmed E. Hassan},
year = {2016},
date = {2016-05-14},
urldate = {2016-05-14},
booktitle = {International Conference on Mining Software Repositories (MSR)},
pages = {154--164},
publisher = {ACM},
abstract = {Developers leverage logs for debugging, performance monitoring and load testing. The increased dependence on logs has led to the development of numerous logging libraries which help developers in logging their code. As new libraries emerge and current ones evolve, projects often migrate from an older library to a newer one.
In this paper we study logging library migrations within Apache Software Foundation (ASF) projects. From our manual analysis of JIRA issues, we find that 33 out of 223 (i.e., 14%) ASF projects have undergone at least one logging library migration. We find that the five main drivers for logging library migration are: 1) to increase flexibility (i.e., the ability to use different logging libraries within a project), 2) to improve performance, 3) to reduce effort spent on code maintenance, 4) to reduce dependence on other libraries and 5) to obtain specific features from the new logging library. We find that over 70% of the migrated projects encounter on average two post-migration bugs due to the new logging library. Furthermore, our findings suggest that performance (traditionally one of the primary drivers for migrations) is rarely improved after a migration.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Tarek M. Ahmed; Cor-Paul Bezemer; Tse-Hsun Chen; Ahmed E. Hassan; Weiyi Shang
Studying the Effectiveness of Application Performance Management (APM) Tools for Detecting Performance Regressions for Web Applications: An Experience Report Inproceedings
International Conference on Mining Software Repositories (MSR), pp. 1–12, ACM, 2016.
@inproceedings{Ahmed16msr,
title = {Studying the Effectiveness of Application Performance Management (APM) Tools for Detecting Performance Regressions for Web Applications: An Experience Report},
author = {Tarek M. Ahmed and Cor-Paul Bezemer and Tse-Hsun Chen and Ahmed E. Hassan and Weiyi Shang},
year = {2016},
date = {2016-05-14},
urldate = {2016-05-14},
booktitle = {International Conference on Mining Software Repositories (MSR)},
pages = {1--12},
publisher = {ACM},
abstract = {Performance regressions, such as a higher CPU utilization than in the previous version of an application, are caused by software application updates that negatively affect the performance of an application. Although a plethora of mining software repository research has been done to detect such regressions, research tools are generally not readily available to practitioners. Application Performance Management (APM) tools are commonly used in practice for detecting performance issues in the field by mining operational data.
In contrast to performance regression detection tools that assume a changing code base and a stable workload, APM tools mine operational data to detect performance anomalies caused by a changing workload in an otherwise stable code base. Although APM tools are widely used in practice, no research has been done to understand 1) whether APM tools can identify performance regressions caused by code changes and 2) how well these APM tools support diagnosing the root-cause of these regressions.
In this paper, we explore if the readily accessible APM tools can help practitioners detect performance regressions. We perform a case study using three commercial (AppDynamics, New Relic and Dynatrace) and one open source (Pinpoint) APM tools. In particular, we examine the effectiveness of leveraging these APM tools in detecting and diagnosing injected performance regressions (excessive memory usage, high CPU utilization and inefficient database queries) in three open source applications. We find that APM tools can detect most of the injected performance regressions, making them good candidates to detect performance regressions in practice. However, there is a gap between mining approaches that are proposed in state-of-the-art performance regression detection research and the ones used by APM tools. In addition, APM tools lack the ability to be extended, which makes it hard to enhance them when exploring novel mining approaches for detecting performance regressions.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Suhas Kabinna; Cor-Paul Bezemer; Weiyi Shang; Ahmed E. Hassan
Examining the Stability of Logging Statements Inproceedings
IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER), pp. 326-337, 2016.
@inproceedings{Kabinna16,
title = {Examining the Stability of Logging Statements},
author = {Suhas Kabinna and Cor-Paul Bezemer and Weiyi Shang and Ahmed E. Hassan},
year = {2016},
date = {2016-03-14},
urldate = {2016-03-14},
booktitle = {IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER)},
pages = {326-337},
abstract = {Logging statements (embedded in the source code) produce logs that assist in understanding system behavior, monitoring choke-points and debugging. Prior work showcases the importance of logging statements in operating, understanding and improving software systems. The wide dependence on logs has led to a new market of log processing and management tools. However, logs are often unstable, i.e., the logging statements that generate logs are often changed without the consideration of other stakeholders, causing sudden failures of log processing tools and increasing the maintenance costs of such tools. We examine the stability of logging statements in four open source applications namely: Liferay, ActiveMQ, Camel and CloudStack. We find that 20–45% of their logging statements change throughout their lifetime. The median number of days between the introduction of a logging statement and the first change to that statement is between 1 and 17 in our studied applications. These numbers show that in order to reduce maintenance effort, developers of log processing tools must be careful when selecting the logging statements on which their tools depend. In order to effectively mitigate the issues that are caused by unstable logging statements, we make an important first step towards determining whether a logging statement is likely to remain unchanged in the future. First, we use a random forest classifier to determine whether a just-introduced logging statement will change in the future, based solely on metrics that are calculated when it is introduced. Second, we examine whether a long-lived logging statement is likely to change based on its change history. We leverage Cox proportional hazards models (Cox models) to determine the change risk of long-lived logging statements in the source code. Through our case study on four open source applications, we show that our random forest classifier achieves an 83–91% precision, a 65–85% recall and a 0.95–0.96 AUC.
We find that file ownership, developer experience, log density and SLOC are important metrics in our studied projects for determining the stability of logging statements in both our random forest classifiers and Cox models. Developers can use our approach to determine the risk of a logging statement changing in their own projects, and to construct more robust log processing tools by ensuring that these tools depend on logs that are generated by more stable logging statements.},
keywords = {Log file stability, Log processing tools, Logging statements},
pubstate = {published},
tppubtype = {inproceedings}
}
Ravjot Singh; Cor-Paul Bezemer; Weiyi Shang; Ahmed E. Hassan
Optimizing the Performance Configuration of Object-Relational Mapping Frameworks Using a Multi-Objective Genetic Algorithm Inproceedings
ACM/SPEC International Conference on Performance Engineering (ICPE), pp. 309–320, 2016.
@inproceedings{Singh16,
title = {Optimizing the Performance Configuration of Object-Relational Mapping Frameworks Using a Multi-Objective Genetic Algorithm},
author = {Ravjot Singh and Cor-Paul Bezemer and Weiyi Shang and Ahmed E. Hassan},
year = {2016},
date = {2016-03-12},
urldate = {2016-03-12},
booktitle = {ACM/SPEC International Conference on Performance Engineering (ICPE)},
pages = {309--320},
abstract = {Object-relational mapping (ORM) frameworks map low-level database operations onto a high-level programming API that can be accessed from within object-oriented source code. ORM frameworks often provide configuration options to optimize the performance of such database operations. However, determining the set of optimal configuration options is a challenging task.
Through an exploratory study on two open source applications (Spring PetClinic and ZK), we find that the difference in execution time between two configurations can be large. In addition, both applications are not shipped with an ORM configuration that is related to performance: instead, they use the default values provided by the ORM framework. We show that in 89% of the 9 analyzed test cases for PetClinic and in 96% of the 54 analyzed test cases for ZK, the default configuration values supplied by the ORM framework performed significantly slower than the optimal configuration for that test case. Based on these observations, this paper proposes an approach for automatically finding an optimal ORM configuration using a multi-objective genetic algorithm. We evaluate our approach by conducting a case study of Spring PetClinic and ZK. We find that our approach finds near-optimal configurations in 360-450 seconds for PetClinic and in 9-12 hours for ZK. These execution times allow our approach to be executed to find an optimal configuration before each new release of an application.},
keywords = {Object-relational mapping performance, Performance configuration optimization},
pubstate = {published},
tppubtype = {inproceedings}
}
Cor-Paul Bezemer; Johan Pouwelse; Brendan Gregg
Understanding Software Performance Regressions Using Differential Flame Graphs Inproceedings
IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER), pp. 535–539, 2015.
@inproceedings{Bezemer15,
title = {Understanding Software Performance Regressions Using Differential Flame Graphs},
author = {Cor-Paul Bezemer and Johan Pouwelse and Brendan Gregg},
year = {2015},
date = {2015-03-02},
urldate = {2015-03-02},
booktitle = {IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER)},
pages = {535--539},
abstract = {Flame graphs are gaining rapidly in popularity in industry to visualize performance profiles collected by stack-trace based profilers. In some cases, for example, during performance regression detection, profiles of different software versions have to be compared. Doing this manually using two or more flame graphs or textual profiles is tedious and error-prone.
In this ‘Early Research Achievements’-track paper, we present our preliminary results on using differential flame graphs instead. Differential flame graphs visualize the differences between two performance profiles. In addition, we discuss which research fields we expect to benefit from using differential flame graphs. We have implemented our approach in an open source prototype called FLAMEGRAPHDIFF, which is available on GitHub. FLAMEGRAPHDIFF makes it easy to generate interactive differential flame graphs from two existing performance profiles. These graphs facilitate easy tracing of elements in the different graphs to ease the understanding of the (d)evolution of the performance of an application.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Jaap Kabbedijk; Cor-Paul Bezemer; Andy Zaidman; Slinger Jansen
Defining Multi-Tenancy: A Structured Mapping Study on the Academic and Industrial Perspective Journal Article
Journal of Systems and Software (JSS), 100 , pp. 139-148, 2015.
@article{KabbedijkJSS14,
title = {Defining Multi-Tenancy: A Structured Mapping Study on the Academic and Industrial Perspective},
author = {Jaap Kabbedijk and Cor-Paul Bezemer and Andy Zaidman and Slinger Jansen},
year = {2015},
date = {2015-01-01},
urldate = {2015-01-01},
journal = {Journal of Systems and Software (JSS)},
volume = {100},
pages = {139-148},
publisher = {Elsevier},
abstract = {Software as a service is frequently offered in a multi-tenant style, where customers of the application and their end-users share resources such as software and hardware among all users, without necessarily sharing data. It is surprising that, with such a popular paradigm, little agreement exists with regard to the definition, domain, and challenges of multi-tenancy. This absence is detrimental to the research community and the industry, as it hampers progress in the domain of multi-tenancy and enables organizations and academics to wield their own definitions to further their commercial or research agendas.
In this article, a systematic mapping study on multi-tenancy is described in which 761 academic papers and 371 industrial blogs are analysed. Both the industrial and academic perspective are assessed, in order to get a complete overview. The definition and topic maps provide a comprehensive overview of the domain, while the research agenda, listing four important research topics, provides a roadmap for future research efforts.},
keywords = {Academic perspective, Definition, Industrial perspective, Multi-tenancy, Systematic mapping study},
pubstate = {published},
tppubtype = {article}
}
Cor-Paul Bezemer; Elric Milon; Andy Zaidman; Johan Pouwelse
Detecting and Analyzing I/O Performance Regressions Journal Article
Journal of Software: Evolution and Process (JSEP), 26 (12), pp. 1193–1212, 2014.
@article{Bezemer14jsep,
title = {Detecting and Analyzing I/O Performance Regressions},
author = {Cor-Paul Bezemer and Elric Milon and Andy Zaidman and Johan Pouwelse},
year = {2014},
date = {2014-07-17},
urldate = {2014-07-17},
journal = {Journal of Software: Evolution and Process (JSEP)},
volume = {26},
number = {12},
pages = {1193--1212},
publisher = {John Wiley & Sons, Ltd},
abstract = {Regression testing can be done by re-executing a test suite on different software versions and comparing the outcome. For functional testing, the outcome of such tests is either pass (correct behaviour) or fail (incorrect behaviour). For non-functional testing, such as performance testing, this is more challenging as correct and incorrect are not clearly defined concepts for these types of testing.
In this paper, we present an approach for detecting and analyzing I/O performance regressions. Our method is supplemental to existing profilers and its goal is to analyze the effect of source code changes on the performance of a system. In this paper, we focus on analyzing the amount of I/O writes being done. The open source implementation of our approach, SPECTRAPERF, is available for download.
We evaluate our approach in a field user study on Tribler, an open source peer-to-peer client and its decentralized solution for synchronizing messages, Dispersy. In this evaluation, we show that our approach can guide the performance optimization process, as it helps developers to find performance bottlenecks on the one hand, and on the other allows them to validate the effect of performance optimizations. In addition, we perform a feasibility study on Django, the most popular Python project on Github, to demonstrate our applicability on other projects. Copyright (c) 2013 John Wiley & Sons, Ltd.},
keywords = {Performance analysis, Performance optimization, Performance regressions},
pubstate = {published},
tppubtype = {article}
}
Cor-Paul Bezemer
Performance Optimization of Multi-Tenant Software Applications PhD Thesis
Delft University of Technology, 2014.
BibTeX | Tags:
@phdthesis{phd_cp,
title = {Performance Optimization of Multi-Tenant Software Applications},
author = {Cor-Paul Bezemer},
year = {2014},
date = {2014-04-14},
urldate = {2014-04-14},
school = {Delft University of Technology},
howpublished = {Delft University of Technology},
keywords = {},
pubstate = {published},
tppubtype = {phdthesis}
}
Cor-Paul Bezemer; Andy Zaidman
Performance Optimization of Deployed Software-as-a-service Applications Journal Article
Journal of Systems and Software (JSS), 87, pp. 87-103, 2014.
Abstract | BibTeX | Tags: Performance analysis, Performance maintenance
@article{BezemerJSS13,
title = {Performance Optimization of Deployed Software-as-a-service Applications},
author = {Cor-Paul Bezemer and Andy Zaidman},
year = {2014},
date = {2014-01-01},
urldate = {2014-01-01},
journal = {Journal of Systems and Software (JSS)},
volume = {87},
pages = {87-103},
publisher = {Elsevier},
abstract = {The goal of performance maintenance is to improve the performance of a software system after delivery. As the performance of a system is often characterized by unexpected combinations of metric values, manual analysis of performance is hard in complex systems. In this paper, we propose an approach that helps performance experts locate and analyze spots – so called performance improvement opportunities (PIOs) –, for possible performance improvements. PIOs give performance experts a starting point for performance improvements, e.g., by pinpointing the bottleneck component. The technique uses a combination of association rules and performance counters to generate the rule coverage matrix, a matrix which assists with the bottleneck detection.
In this paper, we evaluate our technique in two case studies. In the first, we show that our technique is accurate in detecting the timeframe during which a PIO occurs. In the second, we show that the starting point given by our approach is indeed useful and assists a performance expert in diagnosing the bottleneck component in a system with high precision.},
keywords = {Performance analysis, Performance maintenance},
pubstate = {published},
tppubtype = {article}
}
Riccardo Petrocco; Cor-Paul Bezemer; Johan Pouwelse; Dick Epema
Libswift: the PPSPP Reference Implementation Technical Report
Delft University of Technology (PDS-2014-004), 2014.
BibTeX | Tags:
@techreport{Petrocco2014,
title = {Libswift: the PPSPP Reference Implementation},
author = {Riccardo Petrocco and Cor-Paul Bezemer and Johan Pouwelse and Dick Epema},
year = {2014},
date = {2014-01-01},
urldate = {2014-01-01},
number = {PDS-2014-004},
institution = {Delft University of Technology},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Cor-Paul Bezemer; Andy Zaidman
Improving the Diagnostic Capabilities of a Performance Optimization Approach Technical Report
Delft University of Technology (TUD-SERG-2013-015), 2013.
@techreport{Bezemer13tech,
title = {Improving the Diagnostic Capabilities of a Performance Optimization Approach},
author = {Cor-Paul Bezemer and Andy Zaidman},
year = {2013},
date = {2013-01-01},
urldate = {2013-01-01},
number = {TUD-SERG-2013-015},
institution = {Delft University of Technology},
abstract = {Understanding the performance of a system is difficult because it is affected by every aspect of the design, code and execution environment. Performance maintenance tools can assist in getting a better understanding of the system by monitoring and analyzing performance data. In previous work, we have presented an approach which assists the performance expert in obtaining insight into and subsequently optimizing the performance of a deployed application. This approach is based on the classification results made by a single classifier. Following results from literature, we have extended this approach with the possibility of using a set (ensemble) of classifiers, in order to improve the classification results. While this ensemble is maintained with the goal of optimizing its accuracy, the completeness (or coverage) is neglected. In this paper, we present a method for improving both the coverage and accuracy of an ensemble. By doing so, we improve the diagnostic capabilities of our existing approach, i.e., the range of possible causes it is able to identify as the bottleneck of a performance issue. We present several metrics for measuring coverage and comparing two classifiers. We evaluate our approach on real performance data from a large industrial application. From our evaluation we get a strong indication that our approach is capable of improving the diagnostic capabilities of an ensemble, while maintaining at least the same degree of accuracy.},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Cor-Paul Bezemer; Andy Zaidman; Ad van Hoeven; Andre de Graaf; Maarten Wiertz; Remko Weijers
Locating Performance Improvement Opportunities in an Industrial Software-as-a-Service Application Inproceedings
International Conference on Software Maintenance (ICSM), pp. 547-556, 2012.
@inproceedings{Bezemer12,
title = {Locating Performance Improvement Opportunities in an Industrial Software-as-a-Service Application},
author = {Cor-Paul Bezemer and Andy Zaidman and Ad van Hoeven and Andre de Graaf and Maarten Wiertz and Remko Weijers},
year = {2012},
date = {2012-09-23},
urldate = {2012-09-23},
booktitle = {International Conference on Software Maintenance (ICSM)},
pages = {547-556},
abstract = {The goal of performance maintenance is to improve the performance of a software system after delivery. As the performance of a system is often characterized by unexpected combinations of metric values, manual analysis of performance is hard in complex systems. In this paper, we extend our previous work on performance anomaly detection with a technique that helps performance experts locate spots — so-called performance improvement opportunities (PIOs) —, for possible performance improvements. PIOs give performance experts a starting point for performance improvements, e.g., by pinpointing the bottleneck component. The technique uses a combination of association rules and several visualizations, such as heat maps, which were implemented in an open source tool called WEDJAT.
In this paper, we evaluate our technique and WEDJAT in a field user study with three performance experts from industry using data from a large-scale industrial application. From our field study we conclude that our technique is useful for speeding up the performance maintenance process and that heat maps are a valuable way of visualizing performance data.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Cor-Paul Bezemer; Andy Zaidman
Server Overload Detection and Prediction Using Pattern Classification Inproceedings
International Conference on Autonomic Computing (ICAC), pp. 163-164, 2011.
BibTeX | Tags: Performance
@inproceedings{Bezemer2011,
title = {Server Overload Detection and Prediction Using Pattern Classification},
author = {Cor-Paul Bezemer and Andy Zaidman},
year = {2011},
date = {2011-06-14},
urldate = {2011-06-14},
booktitle = {International Conference on Autonomic Computing (ICAC)},
pages = {163-164},
keywords = {Performance},
pubstate = {published},
tppubtype = {inproceedings}
}
Cor-Paul Bezemer; Andy Zaidman
Multi-tenant SaaS applications: maintenance dream or nightmare? Inproceedings
Joint ERCIM Workshop on Software Evolution (EVOL) and International Workshop on Principles of Software Evolution (IWPSE), pp. 88–92, ACM, 2010, ISBN: 978-1-4503-0128-2.
@inproceedings{Bezemer10iwpse,
title = {Multi-tenant SaaS applications: maintenance dream or nightmare?},
author = {Cor-Paul Bezemer and Andy Zaidman},
isbn = {978-1-4503-0128-2},
year = {2010},
date = {2010-09-20},
urldate = {2010-09-20},
booktitle = {Joint ERCIM Workshop on Software Evolution (EVOL) and International Workshop on Principles of Software Evolution (IWPSE)},
pages = {88--92},
publisher = {ACM},
abstract = {Multi-tenancy is a relatively new software architecture principle in the realm of the Software as a Service (SaaS) business model. It allows to make full use of the economy of scale, as multiple customers – “tenants” – share the same application and database instance. All the while, the tenants enjoy a highly configurable application, making it appear that the application is deployed on a dedicated server. The major benefits of multi-tenancy are increased utilization of hardware resources and improved ease of maintenance, in particular on the deployment side. These benefits should result in lower overall application costs, making the technology attractive for service providers targeting small and medium enterprises (SME). However, as this paper advocates, a wrong architectural choice might entail that multi-tenancy becomes a maintenance nightmare.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Cor-Paul Bezemer; Andy Zaidman; Bart Platzbeecker; Toine Hurkmans; Aad 't Hart
Enabling Multi-tenancy: An Industrial Experience Report Inproceedings
International Conference on Software Maintenance (ICSM), pp. 1-8, 2010.
@inproceedings{Bezemer10ICSM,
title = {Enabling Multi-tenancy: An Industrial Experience Report},
author = {Cor-Paul Bezemer and Andy Zaidman and Bart Platzbeecker and Toine Hurkmans and Aad 't Hart},
year = {2010},
date = {2010-09-12},
urldate = {2010-09-12},
booktitle = {International Conference on Software Maintenance (ICSM)},
pages = {1-8},
abstract = {Multi-tenancy is a relatively new software architecture principle in the realm of the Software as a Service (SaaS) business model. It allows to make full use of the economy of scale, as multiple customers – “tenants” – share the same application and database instance. All the while, the tenants enjoy a highly configurable application, making it appear that the application is deployed on a dedicated server. The major benefits of multi-tenancy are increased utilization of hardware resources and improved ease of maintenance, resulting in lower overall application costs, making the technology attractive for service providers targeting small and medium enterprises (SME). Therefore, migrating existing single-tenant to multi-tenant applications can be interesting for SaaS software companies. In this paper we report on our experiences with reengineering an existing industrial, single-tenant software system into a multi-tenant one using a lightweight reengineering approach.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Cor-Paul Bezemer; Ali Mesbah; Arie van Deursen
Automated Security Testing of Web Widget Interactions Inproceedings
European Software Engineering Conference/ACM SIGSOFT International Symposium on the Foundations of Software Engineering (ESEC/FSE), pp. 81-90, 2009.
Abstract | BibTeX | Tags: Security testing, Web applications
@inproceedings{cp_fse,
title = {Automated Security Testing of Web Widget Interactions},
author = {Cor-Paul Bezemer and Ali Mesbah and Arie van Deursen},
year = {2009},
date = {2009-08-24},
urldate = {2009-08-24},
booktitle = {European Software Engineering Conference/ACM SIGSOFT International Symposium on the Foundations of Software Engineering (ESEC/FSE)},
pages = {81-90},
abstract = {We present a technique for automatically detecting security vulnerabilities in client-side self-contained components, called web widgets, that can co-exist independently on a single web page. In this paper we focus on two security scenarios, namely the case in which (1) a malicious widget changes the content (DOM) of another widget, and (2) a widget steals data from another widget and sends it to the server via an HTTP request. We propose a dynamic analysis approach for automatically executing the web application and analyzing the runtime changes in the user interface, as well as the outgoing HTTP calls, to detect inter-widget interaction violations.
Our approach, implemented in a number of open source Atusa plugins, called Diva, requires no modification of application code, and has few false positives. We discuss the results of an empirical evaluation of the violation revealing capabilities, performance, and scalability of our approach, by means of two case studies, on the Exact Widget Framework and Pageflakes, a commercial, widely used widget framework.},
keywords = {Security testing, Web applications},
pubstate = {published},
tppubtype = {inproceedings}
}
Cor-Paul Bezemer
Automated Security Testing of AJAX Web Widgets Masters Thesis
Delft University of Technology, 2009.
@mastersthesis{msc_cp,
title = {Automated Security Testing of AJAX Web Widgets},
author = {Cor-Paul Bezemer},
year = {2009},
date = {2009-03-27},
urldate = {2009-03-27},
school = {Delft University of Technology},
abstract = {Over the years AJAX, a technique for improving the responsiveness of web applications, has become increasingly popular. One of the results of AJAX is the development of a new type of web application component called web widget. Widgets are mini-applications which are placed next to each other on a web page. This has consequences for their security. In this report two security threats are explained. The first threat discussed is the case in which a widget changes the DOM of another widget. The second threat discussed is the case in which a widget steals data from another widget. We propose a dynamic approach for automatically detecting these issues. Our approach uses ATUSA, a testing framework capable of crawling AJAX applications, for which we have developed two security testing plugins. In this report we also evaluate our approach using three case studies. The first case study is conducted on test widgets, which we created for a simplified widget framework. The second case study is conducted on the Exact Widget Framework, a widget framework which is being prototyped by the Research and Innovation team of Exact Software. The final case study is performed on Pageflakes, an industrial, widely used widget framework. The results of these case studies show that our approach has high violation-detection capabilities with a low false positive detection rate.},
keywords = {},
pubstate = {published},
tppubtype = {mastersthesis}
}