Towards an Empirical Model to Identify When Bugs are Introduced


Doctoral Thesis, 2018

What is this thesis about?

Finding which changes introduced bugs into the source code is important because it is the first step towards understanding how code becomes buggy. Not all changes introduce bugs in the same way: for instance, some changes do not directly introduce bugs into the software product, and others introduce bugs after changing some specifications of the project.

This dissertation proposes a theoretical model to determine how changes introduce bugs in software products. The model is based on the concept of when bugs manifest themselves for the first time, and how that can be determined by running a test. The validity of the model has been explored with a careful, manual analysis of a number of bugs in two different open source systems.

The results of the analysis have demonstrated that bugs are not always introduced in the source code, and this phenomenon should be further investigated to improve other disciplines of software engineering.

An interesting specific result of the model is that it provides a clear condition for determining whether a given algorithm for identifying the bug-introducing change has made a correct identification. This enables (i) computing the “real” performance of algorithms based on backtracking the modified lines that fixed a bug, and (ii) a sound evaluation of those algorithms.

Contents and Resources

Thesis and Slides (PDF)

Data

Results

Publications


A list of all my publications can be found on:

Google Scholar, DBLP

Who am I?


My name is Gema Rodriguez Perez. I am a researcher in the LibreSoft group and a Ph.D. student at University Rey Juan Carlos, Spain. I am currently working as a visiting researcher at Delft University of Technology and Eindhoven University of Technology in the Netherlands. My research interests focus on open source systems, software evolution, mining software repositories, and empirical studies. More information about me and my publications can be found on my personal web page.
