PhD Position: Using Machine-Learning Techniques to Enhance Test Suite Effectiveness

The Software Engineering & Information Systems Group at the University of Tartu is calling for applications from aspiring computer science researchers to conduct doctoral research at the crossroads of Artificial Intelligence (AI) and Software Testing. System-level testing has become a highly automated practice in the software industry. The growing size and configurability of software, which must be tested in ever-shorter cycles, have created high demand for optimizing test suites with regard to both efficiency and effectiveness.

The goal of the PhD project is to develop, with the help of machine learning/AI, three approaches with prototypical tool support for test suite optimization. The approaches shall provide (i) a method to build a model that predicts, in a mutation testing context, which mutants will not be killed, thus informing about missing tests in the test suite, (ii) a method to support semi-automatic test oracle generation, thus complementing automatically generated test data, and (iii) a method to detect usage profile differences between test execution logs and end-user execution logs, thus informing about gaps in the test suite. The three approaches will be integrated and evaluated in industrial case studies. Part of the PhD will be conducted at the site of our cooperation partner, the Software Competence Center Hagenberg (SCCH) in Austria.

As a PhD student, you will be embedded within the group, which consists of 8 staff members and 12 PhD students. The group conducts world-leading research in the fields of software analytics and business process management. In the past 5 years, the group has earned 8 best or distinguished paper awards at international conferences. During your PhD, you will receive mentorship from at least two senior staff members and will be given the opportunity to undertake research visits to leading research teams in Europe and worldwide.

The PhD position comes with a scholarship of 1200 euros/month (net) for a period of four years. Additional income can be earned through teaching tasks and, based on performance, through special agreements with SCCH.

To apply, you must have a Master's degree (or be close to obtaining one) in Computer Science, Software Engineering, Information Systems, or a related discipline. You will need a good foundation in machine learning, a basic understanding of software testing techniques, and openness to cross-disciplinary research.

Preference will be given to applicants with a high GPA and to applicants who can demonstrate ...

  • Strong background in AI (machine learning)
  • Strong interest in software testing/QA
  • Strong interest in learning how to do proper research
  • Good programming skills
  • Good communication skills
  • Ability to work autonomously
  • Ability to adapt to a new work environment
  • Curiosity and willingness to travel (temporary relocation to Austria)

Interested candidates may send an expression of interest to Prof. Dietmar Pfahl (firstname.lastname at ut.ee). The deadline for applications is June 10. The official application procedure is available at: https://www.ut.ee/en/phd-computer-science. The position has been announced at: https://reaalteadused.ut.ee/en/admissions/phd-projects

--

More details about the individual topics (incl. literature) can be found below:

Topic 1:
Mutation testing has gained new interest in industry due to advances in automatic mutant generation [1]. However, growing test suite sizes and large numbers of auto-generated mutants make it impractical to run all tests on all mutants in order to identify the mutants that are not killed. Experiments at SCCH have shown that running all tests on all mutants takes weeks of calendar time even on high-performance hardware [2]. Surviving (undetected) mutants reveal the actual strength of a test suite and indicate what kinds of tests to add for optimization. Building upon previous research [3], an ML-based baseline approach will be designed to build a model that predicts which mutants will be killed, making the execution of the full test suite on all mutants obsolete. A minimal sketch of such a predictive model is given after the references below.

  1. Goran Petrovic and Marko Ivankovic (2018) State of Mutation Testing at Google. In ICSE-SEIP ’18: 40th International Conference on Software Engineering: Software Engineering in Practice Track, May 27-June 3, 2018, Gothenburg, Sweden. ACM, New York, NY, USA, 9 pages.
  2. Rudolf Ramler, Thomas Wetzlmaier, and Claus Klammer (2017) An empirical study on the application of mutation testing for a safety-critical industrial software system. In Proceedings of the Symposium on Applied Computing (SAC '17). ACM, New York, NY, USA, 1401-1408.
  3. Jie Zhang, Ziyi Wang, Lingming Zhang, Dan Hao, Lei Zang, Shiyang Cheng, and Lu Zhang (2016) Predictive Mutation Testing. In Proceedings of the 25th International Symposium on Software Testing and Analysis (ISSTA 2016), July 18-20, 2016, Saarbrücken, Germany.
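
To make the idea concrete, here is a minimal sketch (in Python, using scikit-learn) of a predictive-mutation-testing classifier in the spirit of [3]. The feature set, the file name, and the CSV layout are illustrative assumptions, not part of the project description:

    # Sketch: train a classifier on features of already-executed mutants to
    # predict whether a new mutant will be killed, without running the test
    # suite on it. The features and CSV layout below are hypothetical.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import classification_report

    df = pd.read_csv("mutants.csv")  # one row per mutant (hypothetical file)
    # Hypothetical per-mutant features: mutation operator, number of tests
    # covering the mutated statement, statement coverage, method complexity.
    X = pd.get_dummies(df[["operator", "num_covering_tests",
                           "stmt_coverage", "method_complexity"]])
    y = df["killed"]  # 1 if some test killed the mutant, else 0

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)
    print(classification_report(y_test, model.predict(X_test)))

In practice, such features would be extracted from coverage data and static analysis of the mutated code, as in [3]; mutants the model predicts as "not killed" point to weaknesses in the test suite.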

Topic 2:
A popular approach to increasing the effectiveness of test suites is the use of automatically generated test data, e.g., in the context of random testing and combinatorial testing [1]. To address the problem of the missing test oracle, machine-learning techniques can be used to support the (semi-)automatic generation of test oracles [2]. A small example of the oracle problem and one way around it is sketched after the references below.

  1. J. D. Hagar, T. L. Wissink, D. R. Kuhn, and R. N. Kacker (2015) Introducing Combinatorial Testing in a Large Organization. IEEE Computer 48(4), 64-72.
  2. Huai Liu, Fei-Ching Kuo, Dave Towey, and Tsong Yueh Chen (2014) How Effectively Does Metamorphic Testing Alleviate the Oracle Problem?. IEEE Trans. Softw. Eng. 40, 1 (January 2014), 4-22.
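
One established technique for alleviating the oracle problem, studied in [2], is metamorphic testing: instead of specifying the exact expected output for each input, one checks relations that must hold between the outputs of related inputs. A minimal, self-contained sketch in Python (the function under test, math.sin, is only a stand-in example):

    # Sketch: a metamorphic test needs no exact expected output; it checks
    # the relation sin(pi - x) == sin(x) over many randomly generated inputs.
    import math
    import random

    def metamorphic_test_sin(trials=1000):
        for _ in range(trials):
            x = random.uniform(-math.pi, math.pi)
            assert math.isclose(math.sin(math.pi - x), math.sin(x),
                                abs_tol=1e-9), f"relation violated at x={x}"

    metamorphic_test_sin()
    print("All metamorphic checks passed.")

The research question is then to what extent ML can learn or suggest such relations (or other partial oracles) for a given system under test.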

Topic 3:
Model-based approaches have the potential to facilitate the automatic generation of test suites [1]. Moreover, the mining of execution logs created during software use in the field and during system testing is making advances [2, 3]. Combining these two approaches could be used to analyse the overlap between the usage profiles. High overlap indicates that the test suite used during development of the software adequately anticipates the actual usage in the field; low overlap indicates potential for improvement. ML approaches will be applied to identify under-specified areas in the models that were used to generate the test suites. This information helps close the gaps in the test suite such that the usage profiles observed during testing become similar to those observed during field usage. A minimal sketch of comparing two usage profiles is given after the references below.

  1. Mark Utting, Alexander Pretschner, and Bruno Legeard (2012) A taxonomy of model-based testing approaches. Softw. Test. Verif. Reliab. 22, 5 (August 2012), 297-312.
  2. Zhen Ming Jiang, Alberto Avritzer, Emad Shihab, Ahmed E. Hassan, and Parminder Flora (2010) An Industrial Case Study on Speeding Up User Acceptance Testing by Mining Execution Logs. In Proceedings of the 2010 Fourth International Conference on Secure Software Integration and Reliability Improvement (SSIRI '10). IEEE Computer Society, Washington, DC, USA, 131-140.
  3. Aichernig B.K., Mostowski W., Mousavi M.R., Tappler M., Taromirad M. (2018) Model Learning and Model-Based Testing. In: Bennaceur A., Hähnle R., Meinke K. (eds) Machine Learning for Dynamic Software Analysis: Potentials and Limits. Lecture Notes in Computer Science, vol 11026. Springer, Cham
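
To illustrate the intended profile comparison, here is a minimal Python sketch that builds event-bigram frequency distributions from a test-time log and a field-time log and measures their divergence. The one-event-per-line log format and the choice of Jensen-Shannon divergence are illustrative assumptions:

    # Sketch: mine a usage profile (event-bigram distribution) from each log
    # and quantify the gap with the Jensen-Shannon divergence (in bits).
    from collections import Counter
    from math import log2

    def bigram_profile(events):
        pairs = Counter(zip(events, events[1:]))
        total = sum(pairs.values())
        return {k: v / total for k, v in pairs.items()}

    def js_divergence(p, q):
        keys = set(p) | set(q)
        m = {k: (p.get(k, 0) + q.get(k, 0)) / 2 for k in keys}
        def kl(a):
            return sum(a[k] * log2(a[k] / m[k]) for k in a if a[k] > 0)
        return (kl(p) + kl(q)) / 2

    test_log = ["login", "search", "view", "logout"] * 50    # hypothetical
    field_log = ["login", "search", "buy", "view", "logout"] * 50

    d = js_divergence(bigram_profile(test_log), bigram_profile(field_log))
    print(f"Usage-profile divergence (0 = identical): {d:.3f}")

Bigrams that carry most of the divergence (here, the "buy" transitions that appear only in the field log) point to under-tested behaviour; in the project, such signals would feed back into the models used for test generation.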