Abstract

Intrinsic bugs are bugs for which a bug-introducing change (BIC) can be identified in the version control system of a software project. In contrast, extrinsic bugs are caused by external changes to a software system, such as errors in APIs or changes in requirements. Up to now, the research literature, especially the work on bug prediction, has assumed that all bugs are intrinsic. In this paper, we show an example of how considering extrinsic bugs can affect software engineering practices.

To that end, we study the impact of extrinsic bugs on Just-In-Time (JIT) bug prediction models. JIT models attempt to predict bugs before they are discovered in a software component. They are trained using properties of earlier BICs. We partially replicate a recent study by McIntosh and Kamei on JIT models. To do so, we manually curate their dataset to distinguish between intrinsic and extrinsic bugs. Then, we address the research questions of their original study, this time removing extrinsic bugs, to study (1) whether JIT models lose predictive power over time, (2) whether the relationship between code change properties and the likelihood of BICs evolves, and (3) how accurately current importance scores of code change properties represent future ones. Finally, we study whether the characteristics of intrinsic and extrinsic bugs differ.

Our results show that intrinsic and extrinsic bugs are of a different nature. When removing extrinsic bugs from JIT models, (1) the percentage of predictive power is better in each period, (2) the performance increases by up to 16 AUC (Area Under the Curve) percentage points, and (3) the fluctuations in six families of code change properties are up to 20 AUC percentage points more stable. We conclude that extrinsic bugs negatively impact JIT bug prediction models, so researchers and practitioners should remove them to obtain better results. We also offer evidence that extrinsic bugs should be further investigated, as they can have a significant impact on how we understand bugs.

Raw Data

OpenStack

OpenStack is an interesting and worthwhile project for studying the impact of extrinsic bugs on JIT bug prediction models because it has more than 10,300 contributors and significant industrial support from several major IT companies such as Red Hat, Google, Huawei and IBM. Currently, OpenStack has more than 330K commits, more than 48M lines of code, and around 8,400 active developers.

You can find the initial dataset from McIntosh and Kamei [1] here

You can find how to label extrinsic bugs here

You can find our final dataset here

You can find the classification of issues here

[1] S. McIntosh and Y. Kamei, “Are fix-inducing changes a moving target? A longitudinal case study of just-in-time defect prediction,” IEEE Transactions on Software Engineering, vol. 44, no. 5, pp. 412–428, 2018.

Scripts

To ensure that our model can be applied to the bug reports, we verified that each one described a real bug at the time it was reported, and not some other kind of issue. For that, we carefully read the description and comments in the issue tracking system and the code review system to decide whether the model could be applied to them.

During this analysis we used scripts to (1) create our clean dataset (cleaning.r), (2) identify intrinsic and extrinsic bugs in our dataset (polishingData.R), and (3) perform the statistical analysis (statistical_analysis.R). All of these scripts can be found here.
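
The dataset variants used in the results (without extrinsic bugs, without NotBug, and without both) amount to filtering labelled changes before training. The following is a minimal Python sketch of that filtering, not a port of the R scripts above. The record fields (`commit`, `label`) and the label strings (`intrinsic`, `extrinsic`, `NotBug`) are assumptions made for illustration and may not match the columns in the published dataset. Whether excluded changes should be dropped or relabelled as clean is a modelling decision this sketch does not settle; here they are simply dropped.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical label values; the published dataset may use different names.
VARIANTS: Dict[str, Set[str]] = {
    "without_extrinsic": {"extrinsic"},
    "without_notbug": {"NotBug"},
    "without_extrinsic_and_notbug": {"extrinsic", "NotBug"},
}


def drop_labels(records: Iterable[dict], excluded: Set[str]) -> List[dict]:
    """Keep only the records whose label is not in the excluded set."""
    return [r for r in records if r["label"] not in excluded]


def build_variants(records: Iterable[dict]) -> Dict[str, List[dict]]:
    """Build one filtered copy of the data per variant name."""
    records = list(records)  # allow the input to be traversed several times
    return {name: drop_labels(records, excluded)
            for name, excluded in VARIANTS.items()}


# Toy input, invented for this example.
sample = [
    {"commit": "c1", "label": "intrinsic"},
    {"commit": "c2", "label": "extrinsic"},
    {"commit": "c3", "label": "NotBug"},
    {"commit": "c4", "label": "intrinsic"},
]
variants = build_variants(sample)
```

Each variant can then be passed to the same training and evaluation procedure, so that any difference in the results is attributable to the removed labels.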

Furthermore, we also used the replication scripts from McIntosh and Kamei [1]. They can be found here

Results

RQ1: How does our manually curated dataset differ from the one by McIntosh and Kamei?

You can find the figures here

RQ2: Do JIT models lose predictive power over time when extrinsic bugs are removed?

You can find the figures without extrinsic bugs here

You can find the figures without NotBug here

You can find the figures without extrinsic bugs and NotBugs here

You can find the Precision and Recall of our models here
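
For readers unfamiliar with the metrics named in this section, the sketch below shows in Python how precision, recall, and AUC can be computed from a model's output. It is a generic illustration and not the evaluation code used in the study; the function names and the toy values are made up. AUC is computed here as the fraction of (bug-inducing, clean) pairs that the model ranks correctly, with ties counted as half. On this 0 to 1 scale, a gain of 16 AUC percentage points corresponds to a difference of 0.16.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for binary labels (1 = bug-inducing, 0 = clean)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values, invented for this example.
y_true = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold chosen arbitrarily

precision, recall = precision_recall(y_true, y_pred)
area = auc(y_true, scores)
```

The pairwise formulation is quadratic in the number of examples, which is fine for illustration; a library implementation would be preferable for a dataset of realistic size.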

RQ3: Does the relationship between code change properties and the likelihood of BICs evolve when extrinsic bugs are removed?

You can find the figures without extrinsic bugs here

You can find the figures without NotBug here

You can find the figures without extrinsic bugs and NotBugs here

RQ4: How accurately do current importance scores of code change properties represent future ones when extrinsic bugs are removed?

You can find the figures without extrinsic bugs here

You can find the figures without NotBug here

You can find the figures without extrinsic bugs and NotBugs here

RQ5: How do mislabeled bugs affect JIT models?

You can find the figures of RQ2 for RQ5 here

You can find the figures of RQ3 for RQ5 here

You can find the figures of RQ4 for RQ5 here

RQ6: Are the properties of BFCs and BICs linked to extrinsic, intrinsic, and mislabeled bugs different?

You can find the figures here

Authors and Contributors

Gema Rodríguez-Pérez (@gerope90), Gregorio Robles (@gregoriorobles), and Meiyappan Nagappan (@MeiNagappan).