Student Projects (MSc/BSc Theses), Academic Year 2023-2024
Below is a list of project topics for Masters and Bachelors theses offered by the Software Engineering & Information Systems Research Group for students who intend to defend in June 2024.
- Master's theses topics offered by:
  - Marlon Dumas, Professor of Information Systems
  - Dietmar Pfahl, Professor of Software Engineering
    - Literature survey on approaches to generate synthetic population data - co-supervised by one of my PhD students
    - Literature survey on digital twins for dynamically changing data sets - co-supervised by one of my PhD students
    - A tool for generating digital data twins using generative AI, e.g., based on large language models like GPT (generative pre-trained transformer) - co-supervised by one of my PhD students
    - Exploring the MATLAB Autonomous Driving Toolbox to create simulation environments for testing toy automatic driving systems (Donkey Cars).
    - A tool for mapping data collected from an ADS (Automated Driving System) with safety driver on board to test scenarios in simulation-based safety testing of ADS. The data used for the mapping would be collected during a time interval before an (unplanned) disengagement happened (i.e., the safety driver took over control because he/she thought a dangerous situation was emerging that might not be adequately handled by the ADS) - co-supervised by one of my PhD students
    - “Umbrella” topic for students working in industry: Case Study in Software Testing or Software Analytics (focus on software quality)
    - For bachelor students: Lab package for the course "Software Testing" on the topic: Generative AI for generation of software test code (with focus on unit testing) - this topic could be extended to an MSc thesis if additional teaching material (lecture, video, demos, etc.) is added.
  - Kristiina Rahkema, Junior Research Fellow of Software Engineering
  - Alexander Nolte, Associate Professor of Information Systems
  - Fredrik Milani, Associate Professor of Information Systems (0.5)
  - Ezequiel Scott, Lecturer (Assistant Professor) of Software Engineering (0.25)
  - Hina Anwar, Lecturer (Assistant Professor) of Software Engineering
  - Anastasija Nikiforova, Lecturer (Assistant Professor) of Information Systems -- anastasija (dot) nikiforova (at) ut (dot) ee
    - Towards automating data quality specification by extracting data quality requirements from data features
    - Market analysis of Systems Modelling tools (for "Systems Modelling" course), incl. Generative AI implementation tools (for UML modelling)
    - A recommender system for an improved data findability in open government data portals (with/without the use of LLM)
    - Automated classification of open datasets to improve data findability on open government data portals (with/without the use of LLM)
    - Aligning Categories from OGD Portals to a “Comprehensive Set of Categories”
    - Chatbot for open government data portals: towards making open data user-friendly to users regardless of their level of (open) data literacy (with/without the use of LLM)
    - Emerging technologies use for improved user-system interaction in public data systems
    - Data Quality or (Big) Data Analytics or DataOps
    - Integrating artificial intelligence (AI) technologies into customer service
    - The role of Social media-based crisis management
    - Technology acceptance-driven analysis of ChatGPT acceptance in academia - 2 perspectives (students vs teaching staff)
    - Barriers to Openly Sharing Government Data: towards Open Data-adapted Innovation Resistance Theory
    - Multi-perspective framework towards HVD determination (country-specific study)
    - Student's own topic is also possible if related to Data Quality management, or supportive tools or features such as chatbots, recommender systems, or (preferably, in combination with the previous) public data systems
  - Faiz Ali Shah, Lecturer (Assistant Professor) of Software Engineering -- faizalishah (at) gmail (dot) com
    - Evaluating Generative AI for Generating Synthetic Test Data
    - Revisiting App Feature Extraction in the Era of Large Language Models
    - Leveraging LLMs for Annotating Data for Automatically Detecting Software Requirements
    - Evaluating Continuous Prompting and Tuning-free Prompting to Find Developer-relevant Information
    - Evaluating Data Augmentation Techniques for Review Classification
    - Identifying App Functionalities that Violate Quality Aspects
    - Detecting Sarcasm in App Reviews
    - A Survey on Agent-based Approaches to Automate Software Testing
  - Vimal Kumar Dwivedi, Junior Lecturer of Software Engineering
  - Baseer Ahmad Baheer, Junior Lecturer of Software Engineering
  - Alejandra Duque-Torres, Junior Research Fellow of Software Engineering -- alejandra [dot] duque [dot] torres [at] ut [dot] ee
    - Building a Metamorphic Testing Tool: Investigating Existing Approaches and Developing a Tool Prototype
    - Developing a User-Friendly Visualization Plugin for Metamorphic Testing Logs
    - Metamorphic Mate: Building a Tool for Automated Metamorphic Relation Discovery, Querying, and Visualization
    - Translating Metamorphic Rules from Natural Language to Test Code using AI APIs like ChatGPT in the Form of Templates
  - Ilia Bider, Adjunct Professor
  - Mohamad Gharib, Lecturer (Assistant Professor) of Information Systems -- mohamad (dot) gharib (at) ut (dot) ee
    - An Approach for Deriving “Integration Requirements” for Enterprise Systems
To trust or not to trust? Uncertainty-aware prescriptive monitoring of business processes
Supervisor: Marlon Dumas (firstname [dot] lastname [ät] ut [dot] ee) and Mahmoud Shoush
Prescriptive process monitoring is a family of methods that use historical data about business process executions to learn if, when, and how to trigger runtime actions (e.g. giving a discount to a customer) in order to prevent negative case outcomes (e.g. customer complaints or churn).
A common technique to tackle this question is to train machine learning models to estimate the probability that a given business process instance (a case) will finish in a negative outcome [1]. When the predicted probability of a negative case outcome is above a given threshold, the action (also called the "intervention") is performed.
Oftentimes, though, the predictions that machine learning classifiers produce have a high level of uncertainty. If the prediction uncertainty is too high, there is little point in triggering interventions based on this prediction.
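To make the idea concrete, here is a minimal sketch of an uncertainty-aware intervention rule. It is an illustration under stated assumptions, not the method used in [1] or the one to be developed in the thesis: the function name `should_intervene`, the two thresholds, and the use of the spread of an ensemble's predictions as the uncertainty estimate are all hypothetical choices.

```python
from statistics import mean, pstdev


def should_intervene(ensemble_probs, prob_threshold=0.7, max_uncertainty=0.1):
    """Decide whether to trigger an intervention for one ongoing case.

    ensemble_probs: predicted probabilities of a negative outcome,
    one per model in a (hypothetical) ensemble of classifiers.
    """
    # Point prediction: average probability of a negative outcome.
    probability = mean(ensemble_probs)
    # Assumed uncertainty estimate: disagreement between ensemble members.
    uncertainty = pstdev(ensemble_probs)
    # Intervene only if the risk is high AND the prediction is trustworthy.
    return probability > prob_threshold and uncertainty <= max_uncertainty


# Illustrative values only (not real data).
confident_case = [0.82, 0.85, 0.80, 0.84]  # high risk, models agree
uncertain_case = [0.95, 0.55, 0.99, 0.60]  # high average risk, models disagree

print(should_intervene(confident_case))
print(should_intervene(uncertain_case))
```

With these illustrative values, the first case triggers an intervention, while the second does not: its average probability exceeds the threshold, but the ensemble members disagree too much for the prediction to be trusted.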
In this Masters thesis project, you will address this problem by predicting how an ongoing case will unfold and by estimating the level of uncertainty of this prediction. You will use real-life data and you will learn about a range of techniques for predictive modeling and uncertainty estimation. You will conduct benchmarks to compare different techniques for predictive modeling under uncertainty.
[1] Stephan A. Fahrenkrog-Petersen, Niek Tax, Irene Teinemaa, Marlon Dumas, Massimiliano de Leoni, Fabrizio Maria Maggi, Matthias Weidlich. Fire now, fire later: alarm-based systems for prescriptive process monitoring. Knowledge and Information Systems 64(2):559-587, 2022.
Emerging Tech & Financial Industries
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee)
New technologies provide value when used to improve processes or products. However, how new technologies can innovate, enhance, or significantly improve existing processes and products is not always clear. This thesis topic explores one emerging technology to understand better how it can deliver value for the financial sector. The work required for this thesis predominantly includes (1) research on the technology (what it is, how it works, its capabilities, use cases, etc.) and (2) conducting 8-12 interviews with people within the financial sector to learn about potential use cases within the financial sector. Finally, analyze and overlay the results with a framework. IoT, Web3, Quantum Computing, Digital Twins, and Metaverse have been covered. If you have another emerging tech, we can discuss.
Benchmark Study of Log Augmentation Tools
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee)
For reasons such as privacy or lack of data, it can be useful to create more data from existing data. In this approach, synthetic data is generated from the existing data; in other words, the existing data is augmented. This thesis topic is about selecting several existing tools, benchmarking them on different data sets, including event logs, and assessing their accuracy. The contribution is some form of framework that aids in selecting the right log augmentation tool. This thesis is in collaboration with a bank.
Securing Data Quality in Software Development Processes – A Case Study [Reserved]
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee)
Quality in software development processes helps reduce waste and costs. Furthermore, it can help speed up delivery without compromising quality. However, it is not clear how quality can be measured in a software development process. It is also not clear where in the process one should use what kind of metrics to assess quality. This thesis is about exploring these questions to provide a set of suggestions on how the software development process of a bank can be assessed and monitored. This thesis is in collaboration with a bank.
UX-Driven Privacy for Financial Product [Reserved]
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee)
Financial products offered by banks are heavily regulated. Privacy requirements can often prolong the onboarding process and reduce customer experience. This thesis topic is about exploring how privacy can be embedded in the UX. To this end, we begin with a review of privacy by design approaches and examples, take an existing financial product that is heavily regulated and analyze its onboarding, propose a new onboarding process that is easier, more customer friendly, and compliant, and evaluate the new design. This thesis is in collaboration with a bank.
Data-Driven Process Analysis – A Case Study
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee)
Most information systems can log their transactions. These logs can be used to conduct process analysis using process mining tools. Although there are some automated tools, most analysis is still conducted manually. As such, this topic is about taking an event log for a financial business process and analyzing it using Apromore (a process mining tool). The objective is to provide a descriptive analysis of the business process, highlight strengths and weaknesses in the business process, and propose a set of changes that, if implemented, can improve process performance. This thesis is in collaboration with a bank.
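For readers unfamiliar with event logs, the following minimal sketch shows what such a log looks like and two descriptive statistics a process mining tool typically reports: process variants and case durations. It is plain Python, not Apromore, and the log contents, the column layout, and the function name `summarize` are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical event log: (case id, activity, timestamp).
EVENTS = [
    ("c1", "Receive application", "2024-01-02 09:00"),
    ("c1", "Check credit", "2024-01-02 11:00"),
    ("c1", "Approve", "2024-01-03 09:00"),
    ("c2", "Receive application", "2024-01-02 10:00"),
    ("c2", "Check credit", "2024-01-04 10:00"),
    ("c2", "Reject", "2024-01-04 12:00"),
    ("c3", "Receive application", "2024-01-05 08:00"),
    ("c3", "Check credit", "2024-01-05 09:00"),
    ("c3", "Approve", "2024-01-05 10:00"),
]


def summarize(events):
    """Return (variant frequencies, case durations in hours)."""
    # Group events by case, parsing timestamps along the way.
    traces = defaultdict(list)
    for case_id, activity, timestamp in events:
        when = datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
        traces[case_id].append((when, activity))

    variants = Counter()
    durations = {}
    for case_id, trace in traces.items():
        trace.sort()  # order events of the case by time
        # A variant is the sequence of activities followed by a case.
        variants[tuple(activity for _, activity in trace)] += 1
        # Case duration: time between first and last event, in hours.
        durations[case_id] = (trace[-1][0] - trace[0][0]).total_seconds() / 3600
    return variants, durations


variants, durations = summarize(EVENTS)
for variant, count in variants.most_common():
    print(count, " -> ".join(variant))
print(durations)
```

In this toy log, two of the three cases follow the same variant, and one case takes far longer than the others, which is the kind of observation a descriptive analysis would then investigate further.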
Defining a Data Mesh Architecture for Financial Industry – A Case Study [Reserved]
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee)
Companies with legacy systems struggle with making necessary data available for advanced data analytics. While moving to the cloud is one strategy being pursued, there are other options, at least as an intermediate step. One option is to introduce the concept of data mesh architecture, where the focus is on making needed data accessible. Such a concept strives to define domains that determine what data is accessed. Defining the domain is not straightforward. Domains, or data hubs, can be defined based on product area, services, processes, or other principles. The objective of this thesis is to investigate how such domains (data hubs) should be defined for a financial institution so as to enable data access for business development. This thesis is in collaboration with a research group in Portugal and a bank.
Customer-Centric Business Process Improvement and Redesign
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee)
Business processes have, historically, been improved with efficiency as the main focus. However, customer-centricity is becoming more important and a competitive advantage. Although some work has been published on this matter, the field is still open for further investigation. This thesis is about using different research methods (literature review and interviews) to elicit what customer centricity is, how processes can be improved with customer centricity in focus without compromising efficiency, what technology to use, what redesign patterns to apply etc. For this thesis, we focus on the financial service domain.
Digital Twin for Financial Services – A Case Study [Reserved]
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee) and Alexander Nolte
Digital twins are used for optimizing manufacturing processes. In such use cases, IoT devices are used to track, predict, prescribe, and optimize manufacturing processes. However, digital twins are not as clearly understood for non-manufacturing processes such as those behind financial service products. The aim of this thesis is to outline a conceptual framework for digital twins using financial service processes as an example. To this end, this thesis requires conducting a literature review and interviews. This thesis is in collaboration with a bank.
Technology-Driven Redesign of Business Processes – SLR Study
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee) and Kateryna Kubrak
Substituting a technology with a newer one does not necessarily yield significant benefits. It is when technology is used to redesign processes that new value can be added. However, it is not easy to know what technology can enable what kind of redesigns. Previous work has examined capabilities of technologies and mapped them to process redesign patterns. However, that work examined case studies in academic publications. In this thesis, we explore grey literature, i.e., non-academic literature, to elicit a framework for how digital technologies can help deliver value through process redesign.
Discovery of Potential Interventions for Prescriptive Process Monitoring
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee) and Mahmoud Shoush
When process workers are dealing with an ongoing case, they benefit from receiving recommendations on how they should process the case. This is called prescriptive process monitoring. Using different techniques, we can tap into previously executed cases, identify what made them conclude successfully given a specific metric, and recommend (prescribe) an action (intervention) to the process worker when dealing with an ongoing case. However, most existing techniques focus on a specific prescription. Currently, there is no solution for discovering potential recommendations from an event log. This thesis topic is about first developing a conceptual framework for discovering candidate prescriptions and then developing a solution that can detect, assess, and produce a list of candidate interventions.
Analysis Templates for Process Mining using Apromore
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee) and Katsiaryna Lashkevich
Process analysts often use process mining tools, like Apromore, to identify how they can improve the business process. This is a challenging task, as there is a large amount of data and many different views and filters one can use. Initially, we developed 21 templates for identifying improvement opportunities using the Apromore process mining tool. These templates provide step-by-step instructions on what to do in Apromore to identify improvement opportunities. However, there are other aspects that can be analyzed using process mining tools. The topic of this thesis is to develop analysis templates for additional use cases such as variant analysis, conformance checking, etc.
Self-Driving Process Automation based on Prescriptive Monitoring
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee) and Marlon Dumas
Prescriptive monitoring helps optimize execution of a process case by providing recommendations to process workers. These recommendations can be, for instance, for what activity to execute next. Using different techniques, we use historical data to derive such recommendations. However, optimizing one ongoing case can negatively impact another ongoing case. Thus, it is important to ensure that a recommendation improves the overall process performance. In this thesis, the objective is to use an event log to measure the performance of the process, detect possible recommendations, and assess if the recommendation will improve the overall process performance. To this end, we capitalize on existing methods for performance assessment, prescriptive monitoring, prediction, and simulation.
Data-Driven Process Analysis using Apromore – A Case Study
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee)
Most information systems can log their transactions. These logs can be used to conduct process analysis using process mining tools. Although there are some automated tools, most analysis is still conducted manually. As such, this topic is about taking an event log and analyzing it using Apromore (a process mining tool). The objective is to provide a descriptive analysis of the business process, highlight strengths and weaknesses in the business process, and propose a set of changes that, if implemented, can improve process performance. Access to an event log is a prerequisite for this topic, i.e., it is for those who already have a log to work with.
How to Explain Outputs of Prescriptive Process Monitoring to End Users
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee) and Kateryna Kubrak
Prescriptive monitoring helps optimize execution of a process case by providing suggestions to process workers. These suggestions can be, for instance, for what activity to execute next. Such suggestions can also be shared with end users. However, little work has been done on how to communicate the suggestions to end users so that they understand them and can act on them. In this thesis, the aim is to explore how suggestions in other fields are communicated, contextualize that to prescriptive monitoring, and design a set of examples that are evaluated with real users.
Gamification for Teaching Courses – Case Study of SPM Course [Reserved]
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee) and Kateryna Kubrak
Gamification is a phenomenon that has gained traction in recent years. The idea is to use game-inspired elements to increase engagement. Within education, gamification has also become more prevalent. At the same time, tools for gamification have become more available. However, it is not clear how gamification can or should be used to enhance the learning experience. In this thesis, we explore the theory of gamification for teaching Master-level courses. Then, we review the gamification strategy of one course, propose how it can be improved, and evaluate it. The course in question is the Software Product Management course. It is highly recommended that you have taken this course.
Tertiary Study of Process Mining
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee)
Process mining has increasingly matured in the past two decades and many systematic literature reviews (SLR) have been conducted and published on various aspects of process mining. In software engineering, a tertiary study is a systematic literature review of SLR studies within a field. This thesis topic is about conducting an SLR study on process mining SLRs.
Review Study of CBDC [Reserved]
Supervisor: Fredrik Milani (fredrik [dot] milani [ät] ut [dot] ee)
In recent years, interest among central banks in exploring the use of blockchain or DLT has increased. While some central banks have begun with Central Bank Digital Currency (CBDC), others are examining the idea. However, there are a variety of ways to do CBDC from perspectives such as technology, processes, governance, etc. This thesis is about conducting a literature review (predominantly grey literature) to map and analyse existing approaches to CBDC.
Feasibility of Binary Analysis for iOS
Supervisor: Kristiina Rahkema (kristiina [dot] milani [ät] ut [dot] ee)
Static code analysis tools exist for Android, as it is possible to extract the source code (or at least an approximation of it) from the application APK file. This is not as easily possible for iOS applications. Such a tool would, however, be very beneficial, as it would allow the analysis of closed source applications. The aim of this project is to investigate how this kind of analysis could be conducted for iOS applications and to determine how feasible such a tool would be. If possible, the student should try to implement a prototype that can analyze either the structure of or the library dependencies used in a closed source iOS application. It is expected that such a prototype would not work on all iOS applications; any limitations would need to be discussed in the thesis.
Analysis on the Use of Vulnerable Library Dependencies
Supervisor: Kristiina Rahkema (kristiina [dot] milani [ät] ut [dot] ee)
Through the analysis of manifest files it is fairly easy to collect data on library dependencies on a large scale. We have used such an approach to find projects that have dependencies on vulnerable library versions in the iOS ecosystem. A dependency on a vulnerable library version is, however, not sufficient to determine that the project itself is really vulnerable. The aim of this project is to investigate different approaches to identifying whether a library dependency is really used in a project. Depending on the chosen approach, a study on the use of library dependencies can either be conducted on a specific library dependency ecosystem, for example the iOS library dependency ecosystem, or on a larger scale covering multiple ecosystems.
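As a starting point for discussion, one naive approach is to compare the dependencies declared in a manifest with the modules actually imported in the source code. The sketch below illustrates this for Swift-style `import` statements. It is only an assumption about what a first baseline could look like, not the approach expected in the thesis: the function name `unused_dependencies` is hypothetical, it assumes the module name equals the declared dependency name, it does not handle forms such as `import struct Module.Type`, and an import alone does not show that the vulnerable code is ever called.

```python
import re

# Matches lines such as "import Alamofire" or "@testable import Kingfisher".
IMPORT_RE = re.compile(r"^\s*(?:@testable\s+)?import\s+(\w+)", re.MULTILINE)


def unused_dependencies(declared, sources):
    """Return declared dependencies that no source file imports.

    declared: dependency names taken from a manifest file.
    sources: contents of the project's source files, as strings.
    """
    imported = set()
    for text in sources:
        imported.update(IMPORT_RE.findall(text))
    return sorted(set(declared) - imported)


# Illustrative input only: library names used as examples.
declared = ["Alamofire", "SwiftyJSON", "Kingfisher"]
sources = [
    "import Foundation\nimport Alamofire\n",
    "@testable import Kingfisher\nlet x = 1\n",
]
print(unused_dependencies(declared, sources))
```

A dependency reported here is declared but never imported, which suggests (but does not prove) that a vulnerability in it does not affect the project.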
Case Study in Using Fractal Enterprise Model (FEM) for Practical Tasks in an Organization
Supervisor: Ilia Bider (firstname [dot] lastname [ät] ut [dot] ee)
This is a "placeholder" Masters project topic, which needs to be negotiated individually. To engage in this Masters thesis, you need to cooperate closely with a company (preferably a company where you are currently working).
The master’s thesis would include understanding and modeling a (business) organization or a part of it (e.g. a department, service) with a modelling technique called Fractal Enterprise Model. Fractal Enterprise Model (FEM) is a relatively new advanced modeling technique that competes with other techniques used in the Enterprise Architecture/Modeling world. It shows connections between different components (processes and assets) in an organization and can be used for business analysis and design on various levels, including the strategic one, like Business Model Innovation (BMI). The topics of your project can range from figuring out the ways FEM can be used in a case organization to using it for a specific task, e.g. finding a cause for a problem, suggesting alternative solutions for a known problem, finding where new IT systems are needed and for what, developing a new Business Model for the organization, or creating a capability map of the organization. The choice of the task depends on the needs of the organization and on the student’s priorities. Ideally, your project should be connected to some problem/challenge/task that is already understood by the managers in a case organization, as besides your own time you might need to ask other people in the organization to get involved, e.g. for conducting interviews. A successfully completed project may result in a published paper later. Students who have full-time or part-time jobs and who can find a topic connected to their workplace will particularly benefit from taking this topic.
Note: FEM is taught in the spring course called Enterprise Modeling. However, going through this course is not a prerequisite for taking this Masters thesis topic.
If you want to have a look at a thesis related to this topic completed at Tartu University, let me know, and I will send you an example.
References
- https://www.fractalmodel.org/ - a site that has many resources related to FEM, including video recordings
- Bider I., Chalak A. (2019) Evaluating Usefulness of a Fractal Enterprise Model Experience Report – an example of a published paper resulting from an MS thesis project
- Bider, I., Lodhi, A. Moving from Manufacturing to Software Business: A Business Model Transformation Pattern – an example related to Business Model Innovation
Adding Annotation Capabilities to the FEM viewer
Supervisor: Ilia Bider (firstname [dot] lastname [ät] ut [dot] ee)
The thesis belongs to the “Applied research type”; more exactly, it belongs to the category “Thesis based on a software solution created by the author”. The task is to add new capabilities to a previously developed tool. The tool in question is called FEM viewer. It provides user-friendly access to enterprise models created using a “heavy-weight” tool. The viewer is aimed at business people who want to view and navigate through a package of interconnected enterprise models created by a modeling expert who uses a heavy-weight modeling tool.
The FEM viewer was developed in a BS project by a student at Tartu University using a number of standard graphical libraries available as open source packages. Currently, it only provides the possibility to view the models. The main objective of the new project is to extend the tool with capabilities for the viewer to provide feedback and annotate the models. The annotation includes textual annotation (unstructured feedback), as well as more formal logical annotation, like highlighting elements that need special attention, e.g. from the security point of view.
The FEM viewer is aimed at viewing a special kind of enterprise models called Fractal Enterprise Models. The models are created using a specific tool – FEM toolkit. The latter was developed based on the ADOxx modeling environment. ADOxx has been used by different research and professional groups for creating tools for other modeling techniques, which makes the topic general, as tools similar to the FEM viewer could be developed for other modeling techniques supported by the tools created with ADOxx. The thesis work would first consist of clarifying the requirements for the new functionality of the FEM viewer, making engineering decisions on how to implement them in the existing FEM viewer, and then implementing them. The analysis and implementation process, as well as the resulting software product, would have to be described in the thesis.
References and pointers
You can investigate the current FEM viewer by going to https://femviewerserver.cloud.ut.ee and using FEMguest/FEMviewer to log in. Please do not change the password, so that other students can access this account. This account only allows viewing models, not adding models or administering accounts.
The FEM viewer is installed on an Ubuntu Linux server. The following components/packages were used when developing the FEM viewer:
- React: https://reactjs.org/
- Node: https://nodejs.org/en/
- Express - Node.js web application framework. https://expressjs.com/
- Passport: https://www.passportjs.org/
- Certbot: https://certbot.eff.org/
- MySQL: https://dev.mysql.com
More information on the FEM viewer is available at https://github.com/siimlangel/FEM
Some ideas on annotating Enterprise Models can be found at https://hal.archives-ouvertes.fr/hal-00232842/document
For an overview of FEM and the FEM toolkit, see https://www.fractalmodel.org/
For information on ADOxx, see https://www.adoxx.org