Data-Driven Search-Based Software Engineering

NII Shonan Meeting:

Shonan Village Center, December 11-14, 2017

Organizers

  • Markus Wagner, University of Adelaide, Australia
  • John Clark, University of Sheffield, United Kingdom
  • Ahmed Hassan, Queen’s University, Canada
  • Leandro Minku, University of Leicester, United Kingdom

Overview

Description of the meeting

Search-Based Software Engineering (SBSE) is a well-established research area that not only involves the
application of search and optimisation techniques in software engineering, but also promotes rethinking
and reformulation of classical software engineering problems in different ways. By doing so, it has provided
many inspirations for improving software engineering, in terms of both the engineering process and
the software product. In particular, it has shown how difficult software engineering problems can be solved
more effectively using search and optimisation algorithms [1], and how re-formulating software
engineering problems as multi-objective optimisation problems can lead to better solutions as well as
richer information that can be provided to software engineers [2,3].
In recent years, there has been a renewed interest in this area, driven by the need to cope with increased
software size and complexity. Moreover, advances in search-based algorithms, especially in genetic
programming, are now enabling the successful application of search-based software engineering to automatic
programming, an area that SBSE had previously struggled to address. For example, there
have been recent advances in terms of bug-fixing [4,7], speeding up software execution [5] and making
software more energy efficient [6]. These advances show that SBSE is able to improve non-trivial real-world
programs.
Meanwhile, software processes and products have been generating a wealth of various data, e.g., source
change history, test cases, bug reports, operation logs, field crashes, etc. Hidden in these data is rich and
valuable information about the quality of software and services and the dynamics of software
development.
The software data analytics community has been achieving promising results in using such data to gain
insights into several tasks such as identifying which software modules are most likely to contain bugs [10,
11], estimating the amount of effort likely to be required to develop new software projects or Web
applications [12, 13], determining which software changes are most likely to induce bugs [14, 15], and
tracking how the productivity of a company changes over time [16].
The availability of software data and the promising results being achieved have resulted in a steep growth
of the software data analytics community. This is very well illustrated by the Working Conference on
Mining Software Repositories (MSR), whose number of submissions grew from around 40 in 2004 to
more than 180 in 2015. Companies (Microsoft, Google, Facebook, Cisco, Yahoo,
IBM, RIM, etc.) are also increasingly giving analytics an important role in their organizations, leveraging
the wealth of various data produced around their software or services.
Such a wealth of data also has the potential to guide the search and optimization process in SBSE towards
promising solutions considering the specific environment where the software process or product operates.
It has the potential to take SBSE to yet another level – that of creating context-aware solutions.
Nevertheless, very few data-driven SBSE approaches have been proposed so far, e.g., [8,9]. Moreover, the
software data analytics and SBSE communities are considerably disjoint. With that in mind, this proposed
NII Shonan Meeting aims at bringing these two communities together in order to discuss software
engineering problems that can benefit from the integration of software data analytics and SBSE, and
potential ways to combine these two areas. We expect this meeting to identify the key challenges and
opportunities in integrating software data analytics with SBSE, and to form new research collaborations
among members of these communities. Ultimately, this will push the boundaries of research in both
software data analytics and SBSE, helping to consolidate the area of data-driven SBSE.

References
[1] M. Harman and B. F. Jones, Search-based software engineering. Information and Software Technology,
43:833-839, 2001.
[2] K. Praditwong, M. Harman and X. Yao, Software Module Clustering as a Multi-Objective Search Problem.
IEEE Transactions on Software Engineering, 37(2):264-282, 2011.
[3] Z. Wang, K. Tang and X. Yao, Multi-objective Approaches to Optimal Testing Resource Allocation in
Modular Software Systems. IEEE Transactions on Reliability, 59(3):563-575, 2010.
[4] A. Arcuri and X. Yao, “A novel co-evolutionary approach to automatic software bug fixing,” Proceedings
of the 2008 IEEE Congress on Evolutionary Computation (CEC2008), (Piscataway, NJ), pp. 162-168, IEEE
Press, 2008.
[5] W. B. Langdon and M. Harman. Optimising existing software with genetic programming. IEEE
Transactions on Evolutionary Computation (TEVC), 2014.
[6] Mario Linares-Vásquez, Gabriele Bavota, Carlos Bernal-Cárdenas, Rocco Oliveto, Massimiliano Di Penta,
Denys Poshyvanyk. Optimizing Energy Consumption of GUIs in Android Apps: A Multi-objective Approach.
Proceedings of the 10th Joint Meeting on Foundations of Software Engineering, pp. 143-154, 2015.
[7] C. Le Goues, T. Nguyen, S. Forrest, and W. Weimer. GenProg: A generic method for automatic software
repair. IEEE Transactions on Software Engineering, 38(1):54–72, 2012.
[8] Shin Yoo, Amortised Optimisation of Non-Functional Property in Production Environment. Proceedings
of the Symposium on Search-Based Software Engineering (SSBSE), 2015.
[9] Harman et al. Genetic Improvement for Adaptive Software Engineering. 9th International Symposium on
Software Engineering for Adaptive and Self-Managing Systems, 2014.
[10] Hall, T., Beecham, S., Bowes, D., Gray, D., Counsell, S.: A systematic literature review on fault
prediction performance in software engineering. IEEE Transactions on Software Engineering 38(6),
1276–1304, 2012.
[11] Menzies, T., Milton, Z., Turhan, B., Cukic, B., Jiang, Y., Bener, A.: Defect prediction from static code
features: current results, limitations, new approaches. Automated Software Engineering 17(4), 375–407,
2010.
[12] Dejaeger, K., Verbeke, W., Martens, D., Baesens, B.: Data mining techniques for software effort
estimation: a comparative study. IEEE Transactions on Software Engineering, 38(2), 375–397, 2012.
[13] Mendes, E., Mosley, N.: Web Engineering. Springer Science & Business Media, New York, 2006.
[14] An, L., Khomh, F.: An empirical study of crash-inducing commits in Mozilla Firefox. In: Proceedings of
the 11th International Conference on Predictive Models and Data Analytics in Software Engineering
(PROMISE), pp. 5.1–5.10, 2015.
[15] Kamei, Y., Shihab, E., Adams, B., Hassan, A., Mockus, A., Sinha, A., Ubayashi, N.: A large-scale empirical
study of just-in-time quality assurance. IEEE Transactions on Software Engineering, 39(6), 757–773, 2013.
[16] Minku, L., Yao, X.: How to make best use of cross-company data in software effort estimation? In:
Proceedings of the 36th International Conference on Software Engineering, pp. 446–456, 2014.

 
