Student Projects 2012/2013

Below is a list of project topics for Masters and Bachelors theses offered by the software engineering research group in 2012-2013.

If you're interested in any of these projects, please contact the corresponding supervisor.

Note that the number of projects for Bachelors students is limited. For Bachelors students we're open to student-proposed projects. So if you have an idea for your Bachelors project and your idea falls in the area of software engineering (broadly defined), please contact the group leader: Marlon . Dumas ät ut.ee


Masters projects

Case Study on Exploratory Testing

Supervisor: Dietmar Pfahl (firstname dot lastname at ut dot ee)

Exploratory software testing (ET) is a powerful and fun approach to testing. The plainest definition of ET is that it comprises test design and test execution at the same time. This is the opposite of scripted testing (having test plans and predefined test procedures, whether manual or automated). Exploratory tests, unlike scripted tests, are not defined in advance and carried out precisely according to plan.

Testing experts like Cem Kaner and James Bach claim that - in some situations - ET can be orders of magnitude more productive than scripted testing, and a few empirical studies exist supporting this claim to some degree. Nevertheless, ET is often confused with (unsystematic) ad-hoc testing and is thus not always well regarded in either academia or industrial practice.

The objective of this project will be to conduct a case study in a software company investigating the following research questions:

  • To what extent is ET currently applied in the company?
  • What are the advantages/disadvantages of ET as compared to other testing approaches (e.g., scripted testing)?
  • How can the current practice of ET be improved?
  • If ET is currently not used at all, what guidance can be provided to introduce ET in the company?

This project requires that the student has (or is able to establish) access to a suitable software company to conduct the study.

Note that one such thesis project is currently ongoing in one Estonian company and thus, your target company must be a different one.

Exploring the Software Release Planning Problem with Constraint Solving

Supervisor: Dietmar Pfahl (firstname dot lastname at ut dot ee)

Decision-making is central to Software Product Management (SPM) and includes deciding on requirements priorities and the content of coming releases. Several algorithms for prioritization and release planning have been proposed, where humans with or without machine support enact a series of steps to produce a decision outcome. Instead of applying some specific algorithm to find an acceptable solution to a decision problem, in this thesis we propose to model SPM decision-making as a Constraint Satisfaction Problem (CSP), where relative and absolute priorities, inter-dependencies, and other constraints are expressed as relations among variables representing entities such as feature priorities, stakeholder preferences, and resource constraints. The solution space is then explored with the help of a constraint solver without humans needing to care about specific algorithms.

The goal of this thesis project is to discuss advantages and limitations of CSP modeling in SPM and to give principal examples as a proof-of-concept of CSP modeling in requirements prioritization and release planning. If time permits, an evaluation of the CSP-based models via comparison with established tools such as ReleasePlanner will be part of the project.

The project will consist of the following steps:

  • Formulation of the release planning problem as a CSP
  • Familiarisation with JaCoP – Java Constraint Solver or an equivalent tool
  • Development of a constraint solver for the release planning problem with JaCoP
  • Application of the constraint solver to a set of open source feature models available from the SPLOT Feature Model Repository, maintained at the University of Waterloo, Canada
  • Performance evaluation of the constraint solver
  • Optional: Comparison with the performance of existing release planning tools, e.g., ReleasePlanner
  • Summary of the findings, discussion, outline of recommended follow-up research
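
As a rough illustration of the first step, the sketch below formulates a toy release planning problem as a CSP in plain Python. The feature names, effort values, capacities and the two constraint types (precedence and coupling) are hypothetical, and the exhaustive enumeration only stands in for a real solver such as JaCoP.

```python
from itertools import product

# Hypothetical toy data: effort per feature and capacity per release.
EFFORT = {"login": 3, "search": 5, "export": 4, "reports": 6}
CAPACITY = {1: 9, 2: 9}            # release number -> available effort
POSTPONED = 3                      # pseudo-release for features left out
PRECEDES = [("login", "reports")]  # login must ship no later than reports
COUPLED = [("search", "export")]   # both must ship in the same release


def is_feasible(plan):
    """Check all constraints for a complete assignment feature -> release."""
    for release, cap in CAPACITY.items():
        used = sum(EFFORT[f] for f, r in plan.items() if r == release)
        if used > cap:
            return False
    if any(plan[a] > plan[b] for a, b in PRECEDES):
        return False
    if any(plan[a] != plan[b] for a, b in COUPLED):
        return False
    return True


def solve():
    """Enumerate the whole solution space (viable for a toy instance only)."""
    features = sorted(EFFORT)
    domain = [1, 2, POSTPONED]
    for releases in product(domain, repeat=len(features)):
        plan = dict(zip(features, releases))
        if is_feasible(plan):
            yield plan


solutions = list(solve())
```

Each variable is a feature and its domain is the set of releases; a constraint solver would explore the same space with propagation and search instead of brute force.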

Gamification for Software Engineering Education

Supervisor: Dietmar Pfahl (firstname dot lastname at ut dot ee)

Gamification as a business practice has exploded over the past years. Organizations are applying it in areas such as marketing, human resources, productivity enhancement, sustainability, training, health and wellness, innovation, and customer engagement.

The objective of this project will be to apply gamification in the context of higher education in software engineering. To this end, the following steps will be taken in this project:

  • Selection of a suitable university course (whole course or course elements, including lab sessions). Suggested examples are courses on software testing or software engineering management. The final decision on the course selection and scope will be agreed between the student and supervisor before the start of the project.
  • Study into the nature and techniques of gamification (i.e., gamification design framework).
  • Application of gamification to the selected course.
  • Evaluation of the gamified course.

This project requires that the student either has taken the course that will be subject to gamification or that he/she accepts one of the course/lab proposals made by the supervisor.

Web-Based Single or Multi-Player Project Simulation Game

Supervisor: Dietmar Pfahl (firstname dot lastname at ut dot ee)

Software development is a dynamic and complex process, as there are many interacting factors throughout the life-cycle that impact the cost and schedule of the development project and the quality of the developed software product. In addition, the software industry constantly faces increasing demands for quality, productivity, and time-to-market, thus making the management of software development projects one of the most difficult and challenging tasks in any software organization.

The potential of simulation models for the training of managers has long been recognized: flight-simulator-type environments (or microworlds) confront managers with realistic situations that they may encounter in practice, and allow them to develop experience without the risks incurred in the real world.

The objective of this project is to develop a simulation-based software project management game for two to four players, comprising the following elements:

  • Development of a process simulation model (based on existing work), integrated into a web-based application suitable for single or multi-player gaming sessions
  • Didactic gaming scenarios
  • Proof of concept, i.e., at least one successful game played by students with lessons learnt recorded

A Simulator for Analysing the Robustness of Software Release Plans

Supervisor: Dietmar Pfahl (firstname dot lastname at ut dot ee)

With ReleasePlanner(TM), developed at the University of Calgary, Canada, a tool exists that proposes optimised alternatives of requirements allocations to software releases, i.e., release plans. However, the quality (optimality) of a release plan depends highly on assumptions made about the cost and benefit of requirements as well as their dependencies. It is generally unclear to what extent small errors in the underlying assumptions impact the configurations of proposed optimised release plans. If the impacts are small (i.e., feature allocations don’t change dramatically), a proposed release plan might be considered robust against such kinds of errors.

The objective of this thesis project is to develop a systematic approach for the robustness analysis of automatically generated software release plans, using existing tools for software release planning such as ReleasePlanner.
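
One possible shape of such an analysis is sketched below, purely as an assumption about how it might be set up: the benefit and cost estimates are perturbed at random, the plan is recomputed, and the share of features that move to a different release is recorded. The feature data are hypothetical, and the greedy `plan` function is only a stand-in for a real planning tool; it does not reflect how ReleasePlanner works.

```python
import random

# Hypothetical features: name -> (estimated benefit, estimated cost).
FEATURES = {"A": (9, 4), "B": (7, 3), "C": (6, 5), "D": (4, 2), "E": (3, 3)}
CAPACITY = 7  # effort available in each of the two releases


def plan(features, capacity=CAPACITY, releases=2):
    """Greedy stand-in planner: fill releases in order of benefit/cost ratio."""
    order = sorted(features, key=lambda f: features[f][0] / features[f][1],
                   reverse=True)
    remaining = [capacity] * releases
    result = {}
    for name in order:
        cost = features[name][1]
        result[name] = 0  # 0 means "not scheduled"
        for r in range(releases):
            if cost <= remaining[r]:
                remaining[r] -= cost
                result[name] = r + 1
                break
    return result


def perturb(features, error, rng):
    """Scale each benefit and cost by a random factor in [1-error, 1+error]."""
    return {
        name: (b * rng.uniform(1 - error, 1 + error),
               c * rng.uniform(1 - error, 1 + error))
        for name, (b, c) in features.items()
    }


def instability(features, error, trials=200, seed=0):
    """Average share of features whose release changes under perturbation."""
    rng = random.Random(seed)
    baseline = plan(features)
    changed = 0
    for _ in range(trials):
        noisy = plan(perturb(features, error, rng))
        changed += sum(noisy[f] != baseline[f] for f in features)
    return changed / (trials * len(features))
```

A value of `instability(FEATURES, 0.1)` close to zero would suggest that the baseline plan is robust against estimation errors of up to 10%.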

This project could be conducted in collaboration with the University of Calgary and might offer the opportunity for a visit of Prof. Ruhe's Software Engineering Decision Support Laboratory (SEDSL) in Calgary, Canada.

Development and Application of a Process Simulator Family for Analysing Software Development Processes

Supervisor: Dietmar Pfahl (firstname dot lastname at ut dot ee)

Process simulation is a common practice in many engineering disciplines. The advantage of process simulation is that the capability of existing (and future) process designs can be evaluated without executing the actual process. However, in software development, process simulation is not a common practice to analyse (and improve) the processes according to which software is developed. There are many reasons for this situation. Two important reasons are the difficulty of calibrating process models with real-world data (due to the lack of such data) and the lack of stability of the processes used. Nevertheless, process simulation could be a useful research tool to analyse the capability of process paradigms (waterfall, iterative, incremental, various types of agile and lean processes, etc.) under varying assumptions about the application context (e.g., type of product, available resources, quality goals, project size).

The aims of this thesis project are the following:

  • Develop a family of process simulators representing process paradigms commonly used in industrial software development projects.
  • Systematically evaluate process paradigms in various contexts using the developed process simulators.
  • Discuss the advantages/disadvantages of process paradigms, based on the analysis of the evaluation results.

The project will consist of the following steps:

  • Selection of a process modeling tool (choices are to be determined; one possibility is to use a business process modeling tool which has simulation capability)
  • Selection of process paradigms to be analysed
  • Definition of contexts and application scenarios (to evaluate process paradigms)
  • Development of process simulators
  • Application of process simulators to evaluate process paradigms in various contexts

Note: This thesis topic can be worked on by several students. The task can be split with regard to the choice of the modeling tools and/or the choice of process paradigms. For students interested in a BSc thesis, this topic can be tailored to fit into the reduced time frame.

Mining Business Process Models with Branching Conditions

Supervisor: Luciano García-Bañuelos (luciano dot garcia ät ut dot ee)

Process mining is a family of techniques to discover business process models and other knowledge about business processes from event logs. Existing process mining techniques are able to discover models that capture the order in which tasks are executed in a business process (i.e. the "control-flow" relations between tasks) but not the conditions under which a given task or set of tasks are performed (also called branching conditions). As a result, process models discovered by existing process mining techniques are of limited use.

A few process mining techniques such as ProM's Decision Miner are able to discover branching conditions from event logs. However, these techniques rely on decision tree mining and can only mine predicates composed of terms of the form "x op c" where "x" is a variable, "op" is a comparison operator and "c" is a constant.
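
To make the restricted predicate form concrete, the sketch below represents terms of the form "x op c" and evaluates a conjunction of them on the attribute values of a single case. The attribute names and thresholds are hypothetical and are not taken from ProM.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List

OPS: Dict[str, Callable[[float, float], bool]] = {
    "<": operator.lt, "<=": operator.le, "==": operator.eq,
    ">=": operator.ge, ">": operator.gt,
}


@dataclass(frozen=True)
class Term:
    """An atomic predicate 'x op c': variable, comparison operator, constant."""
    variable: str
    op: str
    constant: float

    def holds(self, case: Dict[str, float]) -> bool:
        # Compare the case's value of the variable against the constant.
        return OPS[self.op](case[self.variable], self.constant)


def conjunction(terms: List[Term], case: Dict[str, float]) -> bool:
    """A branching condition built as a conjunction of 'x op c' terms."""
    return all(t.holds(case) for t in terms)


# Hypothetical branching condition and case attributes.
approve_manually = [Term("amount", ">", 5000), Term("risk", ">=", 0.3)]
case = {"amount": 8000, "risk": 0.4, "limit": 10000}
print(conjunction(approve_manually, case))
```

A condition that compares two variables, such as `amount > limit`, is one example (chosen here only for illustration) of a predicate that cannot be written as a single "x op c" term.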

In this project you will apply state-of-the-art predicate mining techniques in order to implement a tool for mining branching conditions of a more general form than those that ProM's Decision Miner is able to extract. The initial ideas and a partial implementation will be given to you at the start of the project. Your job will be to test these ideas in a systematic way and to refine the ideas based on the feedback from the tests. It is foreseen that this project will lead to a new plugin to be integrated into the ProM framework.

Mining Business Process Models with Exception Handlers

Supervisor: Luciano García-Bañuelos (luciano dot garcia ät ut dot ee)

Process mining is a family of techniques to discover business process models and other knowledge about business processes from event logs. Existing process mining techniques are able to automatically discover large and complex process models from event logs. However, the models produced by these techniques are often difficult to understand. One reason is that they only capture "local" control-flow relations such as sequential flow, conditional splits and parallel splits. For example, these techniques are not able to produce models with exception handlers, which in some situations make models easier to understand because they clearly separate normal behavior from exceptional or "secondary" behavior.

In this project you will design and implement new techniques to automatically discover process models with exception handlers. The models to be produced will be captured in the BPMN notation and will include both interrupting and non-interrupting boundary events, which are the constructs available in BPMN for capturing exceptional and secondary behavior. This project requires some background knowledge in BPM (for example having completed the BPM course).

Mining Business Process Models with Multi-Instance Activities

Supervisor: Viara Popova (firstname dot lastname at ut dot ee)

Process mining is a family of techniques to discover business process models and other knowledge about business processes from event logs. Existing process mining techniques are able to automatically discover large and complex process models from event logs. However, the models produced by these techniques are often difficult to understand. One reason is that they only capture "local" control-flow relations such as sequential flow, conditional splits and parallel splits. For example, these techniques misinterpret the case where multiple instances of a given activity are performed in parallel and then synchronized upon completion. This happens, for example, in a business process where raw materials need to be obtained from multiple suppliers in order to assemble a product.

In this project you will design and implement new techniques to automatically discover process models with so-called multi-instance activities as well as synchronization constraints attached to these multi-instance activities. This project requires some background knowledge in BPM (for example having completed the BPM course).

Aggregating Complexity Metrics to Predict Process Model Understandability

Supervisor: Marlon Dumas (firstname dot lastname at ut dot ee)

There exist several complexity metrics for business process models that attempt to determine (computationally) to what extent a given process model is likely to be understandable by process analysts. It has been shown that some of these complexity metrics are statistically correlated with perceived understandability and with error-proneness (i.e. number of modeling errors found in process models). However, none of these complexity metrics alone is able to predict whether or not a given model M will be perceived as easy to understand or whether or not it is error-prone.

The objective of this project will be to design and to evaluate aggregated complexity metrics that can accurately predict the perceived understandability and error-proneness of process models.

This project requires some background knowledge in BPM (for example having completed the BPM course) as well as background knowledge in data mining and machine learning.

Business Process-Focused Requirements Elicitation for Packaged System Evaluation

Supervisor: Payman Milani (milani ät ut dot ee)

It is not uncommon for a company to want to replace an existing custom-made legacy system supporting a given set of business processes with a "new" packaged system, offered by a vendor, that supports similar processes. In this context, analysts are given the task of evaluating one or more alternative systems in terms of their ability to support the business processes that the current system supports. This evaluation must go deeper than the typical "salesperson presentation", where everything works perfectly and is fully functional, but not too deep, since such evaluations are costly and time-consuming and analysts have limited time and resources.

In order to evaluate a given packaged system, a requirements specification is needed. But trying to gather all the requirements in a systematic way before the evaluation is too costly. Often organizations do not know exactly what they need or what they want. Furthermore, analysts need to resist the temptation of assuming that the new packaged system will support the same business processes supported by the old legacy system. First of all, it is unlikely that the new system supports the existing processes in an "as is" manner. Secondly, it is costly to adapt and maintain an adapted packaged system in order to fit the "as is" processes. Finally, bringing in a new packaged system is in fact an opportunity to improve the current business processes.

In this context, the analyst is left with the question of gathering "just enough requirements" to allow them to: (i) evaluate several software packages deeply enough to be able to narrow down these alternatives; and (ii) assess the effort needed to adapt the current business processes and the evaluated packaged systems so that they fit with one another.

The goal of this project will be to define a method for business process-focused requirements gathering that could be used in situations such as the one described above. To this end, you will start by reviewing existing methods for requirements elicitation from business process models and consolidate them into a clearly-defined and reproducible method. You will apply your proposed method on at least one (real) case study.

A Survey of Security Requirements Engineering Practice

Supervisor: Raimundas Matulevicius (firstname dot lastname ät ut dot ee)

Nowadays, security is recognized as an important concern when developing information and software systems. However, in many cases theoretical research and practice go in different directions. On the one hand, practitioners might not be aware of current research results; on the other hand, researchers might not be aware of the needs of practice when developing secure systems. The main goal of this thesis is to understand the link between these two perspectives. First, you will design and execute a systematic survey to understand the needs of practice. Second, you will define a framework (including the necessary and sufficient conditions) to facilitate the transfer of security research to practice.

Interoperability Between Business Process and Security Requirements Modelling

Supervisor: Raimundas Matulevicius (firstname dot lastname ät ut dot ee)

Security risk management is a part of understanding security requirements when analyzing security risks. In previous work, a number of modelling languages (e.g., BPMN, EPC, Secure Tropos, Misuse cases, mal-activities) were examined with respect to how they support the concepts of the domain model for information systems security risk management (ISSRM). Having defined this correspondence, the next step is to understand the means of model interoperability. In other words, the main goal of this thesis is to define how a model created using one of the mentioned languages could be translated to models in the other languages. The candidate would need to understand the modelling languages and how they are aligned to the ISSRM domain model. In the next step, s/he would need to define a set of transformation rules to translate between different modelling languages for security risk management. The contribution would need to be validated in empirical case studies or by developing a prototype tool (as a proof of concept). This thesis is a part of the ETF project "A Framework for Aligning Business Processes and Security Requirements".

Transforming Secure Tropos Models to Misuse Cases

Supervisor: Raimundas Matulevicius (firstname dot lastname ät ut dot ee)

Security modelling languages, such as Secure Tropos and Misuse cases, have proven their usefulness to elicit, negotiate and visualize security requirements and contribute to the thorough definition of secure information systems. Although these languages express different viewpoints on the modelled system (i.e., Secure Tropos helps to understand the security rationale, while Misuse cases help to relate security and functionality), little work has been done to define interoperable tools that would allow models created using these languages to be combined.

This thesis is expected to define a platform that would facilitate and support the transformation of Secure Tropos models to Misuse cases and vice versa. The candidate is expected to survey the existing work on the topic in order to understand the theoretical basis for translating between these languages. Next, the candidate will be asked to develop a prototype tool that allows translation between security modelling languages. Basically, this would include the development of plugins for the industrial CASE tool MagicDraw. This thesis is a part of the collaboration between the University of Tartu (Estonia) and the University of East London (United Kingdom).

Two-Staged Crowdsourcing Tool for Linking Data and Evaluating Linked Data Quality

Supervisor: Peep Küngas (peep.kungas ät ut.ee)

Linked data and crowdsourcing have emerged in recent years as mechanisms to enable effective and economical large-scale data collection, filtering, aggregation, and presentation on the Web. Recent initiatives such as Civil War Data 150 have had some success at combining linked data methods with crowdsourcing. However, this and other initiatives suffer from a major drawback, namely a lack of quality assurance (QA) during the crowd-sourcing process. At the same time QA has been successfully handled in reCAPTCHA, which is used to crowdsource transcription of text from digitized books in the Internet Archive.

The goal of this Master thesis work will be to develop a crowdsourcing tool that will support the creation and quality-checking of links between data objects originating from a variety of sources. The main contribution will be an implementation of a two-staged method, allowing first the creation of links between datasets at the metadata level, and second, the evaluation of the created links by means of simple closed-ended questions. The question-answering process will be facilitated through a specific widget, which can be easily placed in any Web site, similarly to reCAPTCHA. As a starting point, DBpedia will be used for experimentation.

Data Source Selection Strategies for Deep Web Surfacing

Supervisor: Peep Küngas (peep.kungas ät ut.ee)

Deep Web search aims at surfacing data that is not available to mainstream Web crawlers, from databases and other data sources, to the visible Web for further use. During the past years, several components of a Deep Web search engine have been developed at ATI, covering different aspects of Deep Web search. These components include a SOAP-JSON proxy for retrieving data from Web services, a data visualization engine and a SOAP cache solution for faster data retrieval. A simple prototype solution capable of handling a case consisting of several services has been designed and implemented. However, this solution lacks performance and scalability, which stems partly from an inefficient data source selection scheme. The aim of this project is to design a strategy for effective and efficient selection of Web services during the search session. The proposed strategy will be evaluated on a large collection of Web services providing hundreds of thousands of operations for data retrieval.

A Crawler for RESTful, SOAP Services and Web Forms

Supervisor: Peep Küngas (peep.kungas ät ut.ee)

The Deep Web, consisting of online databases hidden behind SOAP-based or RESTful Web services or Web forms, is estimated to contain about 500 times more data than the (visible) Web. Despite many advances in search technology, the full potential of the Deep Web remains largely underexploited. This is partially due to the lack of effective solutions for surfacing and visualizing the data. The Deep Web research initiative at the University of Tartu's Institute of Computer Science has developed an experimental platform to surface and visualize Deep Web data sources hidden behind SOAP Web service endpoints. However, currently this experimental platform only supports a limited set of SOAP endpoints, updated on an ad hoc basis.

The aim of this project is to build a crawler and an indexing engine capable of recognizing endpoints behind Web forms, RESTful services and SOAP-based services, together with their explicit descriptions (e.g. WSDL interface descriptions, when available). Furthermore, the crawler should identify examples of queries that can be forwarded to those endpoints, especially for endpoints with no explicit interface descriptions such as Web forms.

This project is available both for Master and for Bachelor students. The goal of the Masters project would be to build a crawler supporting endpoints with and without explicit interfaces. The goal of the Bachelor thesis will be to crawl WSDL interfaces only.

Transforming the Web into a Knowledge Base: Linking the Estonian Web

Supervisor: Peep Küngas (peep.kungas ät ut.ee)

The aim of the project is to study automated linking opportunities for Web content in the Estonian language. Recent advances in Web crawling and indexing have resulted in effective means for finding relevant content on the Web. However, answering queries that require aggregation of results is still in its infancy, since a better understanding of the content is required. At the same time there has been a fundamental shift in content linking: instead of linking Web pages, more and more Web content is tagged and annotated to facilitate linking of smaller fragments of Web pages by means of RDFa and microformat markup. Unfortunately, this technology has not been widely adopted yet and further efforts are required to advance the Web in this direction.

This project aims at providing a platform for automating this task by exploiting existing natural language technologies, such as named entity recognition for the Estonian language, in order to link the content of the entire Estonian Web. To do this, two Master students will work closely together, first setting up the conventional crawling and indexing infrastructure for the Estonian Web and then extending the indexing mechanism with a microtagging mechanism, which will enable linking of the crawled Web sites. The microtagging mechanism will take advantage of existing language technologies to extract names (such as names of persons, organizations and locations) from the crawled Web pages. In order to validate the approach, a portion of the Estonian Web will be processed and exposed in RDF form through a SPARQL query interface such as the one provided by the Virtuoso OpenSource Edition.

Automated Estimation of Company Reputation

Supervisor: Peep Küngas (peep.kungas ät ut.ee)

Reputation is recognized as a fundamental instrument of social order: a commodity that is accumulated over time, hard to gain and easy to lose. In the case of organizations, reputation is also linked to their identity, performance and the way others respond to their behaviour. There is an intuition that the reputation of a company affects the perception of its value by investors, and helps to attract new customers and to retain the existing ones. Therefore, organizations focusing on long-term operation care about their reputation.

Several frameworks, such as WMAC (http://money.cnn.com/magazines/fortune/most-admired/, http://www.haygroup.com/Fortune/research-and-findings/fortune-rankings.aspx), used by Fortune magazine, have been used to rank companies by their reputation. However, there are some serious issues associated with reputation evaluation in general. First, the existing evaluation frameworks are usually applicable to the evaluation of large companies only. Second, the costs of applying these frameworks are quite high in terms of the accumulated time of the professionals engaged. For example, in the case of WMAC, more than 10,000 senior executives, board directors, and expert analysts were engaged to fill in questionnaires to evaluate nine performance aspects of Fortune 1000 companies in 2009. Third, the evaluation is largely based on subjective opinions rather than objective criteria, which makes continuous evaluation cumbersome and increases the length of evaluation cycles.

This thesis project aims at finding a solution to these issues. More specifically, the project is expected to answer the following research question: to what degree is the reputation of a company determined by objective criteria such as its age, financial indicators, and the sentiment of news articles and comments on the Web? The more specific research questions are the following:

  1. What accuracy in reputation evaluation can be achieved by using solely objective criteria?
  2. Which objective criteria, and which combinations of them, best discriminate the reputation of organizations?
  3. To what extent does the reputation of an organization affect the reputation of another organization through people common to their management?
  4. How do temporal aspects (organization's age, related past events, etc.) bias reputation?

In order to answer these questions, network analysis and machine learning methods will be used and a number of experiments will be performed with a given dataset. The dataset to be used is an aggregation of data from the Estonian Business Registry, Registry of Buildings, Land Register, Estonian Tax and Customs Board, Register of Economic Activities, news articles from major Estonian newspapers and blogs, and some proprietary data sources.

Analyzing the Evolution of Formal Networks of Companies and Their Board Members for Bankruptcy Prediction

Supervisor: Peep Küngas (peep.kungas ät ut.ee)

There are certain symptoms that characterize companies that are likely to face bankruptcy in the near future. One of them, in addition to various financial ratios, is a high turnover in the management. Furthermore, in the case of fraudulent or strategic bankruptcies, there is often a closed group of people who are brought into the management to take over the responsibilities of the previous management.

The aim of this thesis is to design a set of heuristics able to estimate the likelihood that a company will go bankrupt in the near future. For doing this, first, the evolution of business networks around affected companies will be analysed. Based on the identified evolution patterns, suitable heuristics will be developed and, finally, validated on real-life datasets. The input data for this project will be provided by Inforegister.
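
As a minimal sketch of one candidate signal, assuming that board membership is available as one set of member identifiers per year, management turnover between consecutive snapshots could be measured as one minus the Jaccard similarity of the two member sets. The data layout and the function names are hypothetical; the actual heuristics are the subject of the thesis.

```python
def turnover(before, after):
    """Board turnover as 1 - Jaccard similarity of the two member sets."""
    before, after = set(before), set(after)
    union = before | after
    if not union:
        return 0.0  # two empty boards: nothing changed
    return 1 - len(before & after) / len(union)


def yearly_turnover(snapshots):
    """Map each year (except the first) to the turnover since the previous one."""
    years = sorted(snapshots)
    return {
        later: turnover(snapshots[earlier], snapshots[later])
        for earlier, later in zip(years, years[1:])
    }


# Hypothetical board snapshots for one company.
board = {
    2009: {"p1", "p2", "p3"},
    2010: {"p1", "p2", "p3"},
    2011: {"p4", "p5"},
}
print(yearly_turnover(board))
```

In this example the complete replacement of the board in 2011 yields the maximum turnover of 1.0, the kind of pattern a heuristic might flag for closer inspection.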

Tools for Software Project Data Collection and Integration

Supervisor: Siim Karus (siim04 ät ut.ee)

Data generated in software projects is usually distributed across different systems (e.g. CVS, SVN, Git, Trac, Bugzilla, Hudson, Wiki, Twitter). These systems have different purposes and use different data models, formats and semantics. In order to analyze software projects, one needs to collect and integrate data from multiple systems. This is a time-consuming task. In this project, you will design a unified data model for representing data about software development projects extracted for the purpose of analysis. You will also develop a set of adapters for extracting data from some of the above systems and storing it into a database structured according to the unified model.
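
The sketch below illustrates the idea with a deliberately small unified model: a single event record, two adapters and an SQLite table. All field names, the shape of the raw input records and the table layout are assumptions made for illustration; a real model would need to cover far more of each system's semantics.

```python
import sqlite3
from dataclasses import dataclass, astuple


@dataclass
class Event:
    """One record of the (hypothetical) unified model."""
    source: str     # originating system, e.g. "git" or "bugzilla"
    kind: str       # "commit", "issue", ...
    author: str
    timestamp: str  # ISO 8601 in UTC, so string order equals time order
    ref: str        # identifier within the source system
    summary: str


def from_commit(raw):
    """Adapter for a commit given as a dict (assumed input shape)."""
    return Event("git", "commit", raw["author"], raw["date"],
                 raw["sha"], raw["message"])


def from_issue(raw):
    """Adapter for a bug report given as a dict (assumed input shape)."""
    return Event("bugzilla", "issue", raw["reporter"], raw["created"],
                 str(raw["id"]), raw["title"])


def store(events, conn):
    """Persist events in a single table structured after the unified model."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS event (source TEXT, kind TEXT, "
        "author TEXT, timestamp TEXT, ref TEXT, summary TEXT)"
    )
    conn.executemany("INSERT INTO event VALUES (?, ?, ?, ?, ?, ?)",
                     [astuple(e) for e in events])
    conn.commit()


conn = sqlite3.connect(":memory:")
store([
    from_commit({"author": "ann", "date": "2012-09-02T10:00:00Z",
                 "sha": "abc123", "message": "Fix parser"}),
    from_issue({"reporter": "bob", "created": "2012-09-01T08:30:00Z",
                "id": 42, "title": "Parser crashes on empty input"}),
], conn)
```

Once the data share one schema, cross-system questions (for example, which commits followed a given bug report) become ordinary SQL queries.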

GPU-accelerated data analytics

Supervisor: Siim Karus (siim04 ät ut.ee)

In this project a set of GPU-accelerated data mining or analytics algorithms will be implemented as an extension to an analytical database solution. For this task, you will need to learn parallel processing optimisations specific to GPU programming (balancing between bandwidth and processing power), implement the analytics algorithms, and design a user interface to accompany them. As the aim is to provide an extension to analytical databases (preferably MSSQL, Oracle or PostgreSQL), you will also need to learn the extension interfaces of these databases and their native development and BI tools. Finally, you will assess the performance gains of your algorithms compared to comparable algorithms in existing analytical database tools.

Code clone detection using wavelets

Supervisor: Siim Karus (siim04 ät ut.ee)

Code clones have been identified as "bad smells" in software development, often leading to increased maintenance costs and increased code complexity. Thus, identification of such clones is a required step of code quality assurance. Wavelet analysis has been found to be extremely useful for clone detection in image processing and financial market analysis. Wavelets have the benefit of allowing comparisons that span different scales and strengths. Wavelet analysis also benefits a lot from parallelisation, which has become more affordable thanks to advances in GPU computing and cloud computing. Thus, it makes sense to evaluate wavelet analysis for solving problems in software engineering as well.

In this project you will evaluate the usefulness of wavelets for code clone detection. You will accomplish that by first designing/proposing a way to encode source code as multidimensional numeric series and then running a wavelet-based clone detection algorithm on the series. Finally, you need to assess the performance of your solution compared to alternative solutions.
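
To make the encode-then-transform idea concrete, here is a small sketch under simplifying assumptions: the encoding is one-dimensional (number of whitespace-separated tokens per line) rather than multidimensional, the wavelet is the orthonormal Haar wavelet, and fragments are compared by the Euclidean distance between their coefficient vectors. All function names are hypothetical, and the choice of encoding is exactly what the project is meant to investigate.

```python
import math


def encode(lines):
    """Assumed toy encoding: number of whitespace-separated tokens on each line."""
    return [float(len(line.split())) for line in lines]


def pad_pow2(series):
    """Pad with zeros so the length is a power of two, as the Haar transform below requires."""
    n = 1
    while n < len(series):
        n *= 2
    return list(series) + [0.0] * (n - len(series))


def haar(series):
    """Full orthonormal Haar decomposition: [overall average term, coarse details, ..., fine details]."""
    data = list(series)
    details = []
    while len(data) > 1:
        pairs = range(0, len(data), 2)
        avg = [(data[i] + data[i + 1]) / math.sqrt(2) for i in pairs]
        det = [(data[i] - data[i + 1]) / math.sqrt(2) for i in pairs]
        details = det + details  # coarser levels end up in front
        data = avg
    return data + details


def signature(lines):
    """Wavelet coefficients of an encoded code fragment."""
    return haar(pad_pow2(encode(lines)))


def distance(sig_a, sig_b):
    """Euclidean distance between two signatures of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(sig_a, sig_b)))
```

With this encoding, two fragments that differ only in identifier names have distance zero, while structurally different fragments do not. Comparing only the leading (coarse) coefficients would give a scale-tolerant comparison, which is the property the project sets out to evaluate.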

End-to-End Automated Validation of HPLC Analytical Procedures

Supervisor: Peep Küngas (peep.kungas ät ut.ee)

High-performance liquid chromatography is a technique used in analytical chemistry to separate compounds out of a given mixture.

Whenever an HPLC analysis is performed on a given substance, the results provided by the HPLC equipment need to be validated in order to ensure that they are reliable with respect to the purpose of the analysis. To this end, the laboratory personnel needs to gather a significant amount of data and analyze it using various procedures.

In a previous Master's thesis, a system was developed that allows a developer to define Web forms and report generators for validation of HPLC procedures. Also, one guideline (out of about 10 possible guidelines) was implemented from start to end, including all the required forms and reports.

The aim of this new Master's project is to extend this system with the ability to automate the transfer of data from HPLC equipment to the Web-based system for HPLC validation (CVG) and to automate validation steps that currently require manual intervention, specifically those related to the identification of spikes in histograms. The goal is to have a workflow for validation that is as automated as possible.

The project will also address the problem of reusing data collected during a validation using a given guideline, in order to perform a validation using a different guideline, so as to minimize the effort required to perform validations using multiple guidelines.

This Master's project is part of a broader project on automation of HPLC Validation involving UT's Institute of Computer Science and Institute of Chemistry. Funding is available to provide remuneration to Master's students who contribute to this project.

Bachelors projects

Lightning-Fast Multi-Level SOAP-JSON Caching Proxy (Bachelors topic)

Supervisor: Peep Küngas (peep.kungas ät ut.ee)

In a previous Master's thesis a solution was developed for proxying SOAP requests/responses to JavaScript widgets exchanging messages with JSON payload. Although this approach was shown to be useful for surfacing Deep Web data, it suffers from some performance bottlenecks, which arise when a SOAP endpoint is frequently used.

This Bachelors thesis aims to develop a cache component that reduces the runtime latency of dynamically created SOAP-JSON proxies. The resulting cache component will be evaluated in terms of performance.
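
One possible shape for a multi-level cache is sketched below, assuming two levels: a small in-memory LRU level in front of a larger second level, with the SOAP endpoint called only when both miss. The class name, the `backend` callable and the use of a plain dictionary for the second level are assumptions for illustration; the actual design, including expiry of stale responses, is left to the thesis.

```python
from collections import OrderedDict


class TwoLevelCache:
    """Small LRU level (L1) in front of a larger level (L2) in front of a slow backend."""

    def __init__(self, l1_capacity, backend):
        self.l1 = OrderedDict()   # most recently used entries, bounded in size
        self.l2 = {}              # stand-in for a larger, slower store (e.g. on disk)
        self.l1_capacity = l1_capacity
        self.backend = backend    # callable: request key -> response (e.g. a SOAP call)

    def get(self, key):
        if key in self.l1:                  # L1 hit: refresh recency and return
            self.l1.move_to_end(key)
            return self.l1[key]
        if key in self.l2:                  # L2 hit: no backend call needed
            value = self.l2[key]
        else:                               # miss on both levels: call the backend
            value = self.backend(key)
            self.l2[key] = value
        self._promote(key, value)
        return value

    def _promote(self, key, value):
        """Insert into L1 and evict the least recently used entry if L1 is full."""
        self.l1[key] = value
        self.l1.move_to_end(key)
        if len(self.l1) > self.l1_capacity:
            self.l1.popitem(last=False)
```

An entry evicted from the first level stays in the second one, so a repeated request costs a second-level lookup rather than another round trip to the SOAP endpoint.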

Complex Event Processing for OpenAjax Hub 2.0 Widgets

Supervisor: Peep Küngas (peep.kungas ät ut.ee)

Web Widgets provide a mechanism for exposing Web application components, which can be used to compose new Web applications. Currently the majority of widgets available on the Web are rather static (typical examples are clock widgets, weather widgets, stock ticker widgets, etc.) and do not facilitate interaction with other widgets, mainly due to constraints imposed by Web browsers. OpenAjax Hub 2.0 is a framework for inter-widget communication that allows widgets running in the same user agent (i.e. Web browser) to listen to the events thrown by one another. The main disadvantage of OpenAjax Hub is that it assumes that the event message structures used for communication between independent widgets are already known at design time.

The aim of this project is to transfer the main concepts of complex event processing to the OpenAjax Hub platform so that rules for run-time processing of events can be defined. These run-time event processing rules will then be the main means of implementing Web application logic while leaving the implementations of independent widgets untouched. To enable this vision, in this Bachelors project you will design and implement a rule execution engine on top of OpenAjax Hub, which will support the description of application logic at a high level of granularity.
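
The role of such a rule engine can be sketched with a language-neutral publish/subscribe model. The Python code below is a stand-in and does not use the OpenAjax Hub API (which is JavaScript); `Hub`, `RuleEngine`, `add_rule` and the topic names in the usage note are hypothetical. It shows only the simplest case, a condition-action rule over a single event, whereas complex event processing would also correlate several events over time.

```python
from collections import defaultdict


class Hub:
    """Minimal in-process publish/subscribe hub (stand-in for a widget hub)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        # Copy the list so callbacks may subscribe while we iterate.
        for callback in list(self._subscribers[topic]):
            callback(topic, payload)


class RuleEngine:
    """Registers run-time rules that turn matching events into new events."""

    def __init__(self, hub):
        self.hub = hub

    def add_rule(self, in_topic, condition, out_topic, transform):
        """When an event on in_topic satisfies condition, publish transform(payload) on out_topic."""
        def on_event(topic, payload):
            if condition(payload):
                self.hub.publish(out_topic, transform(payload))
        self.hub.subscribe(in_topic, on_event)
```

For example, a rule could listen on a topic such as "stock.price", check a price threshold, and publish a reshaped message on "alert.high" for a second widget. Neither widget needs to know the other's message structure, since the mapping lives in the rule.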

A Crawler for RESTful, SOAP Services and Web Forms

Supervisor: Peep Küngas (peep.kungas ät ut.ee)

The Deep Web, consisting of online databases hidden behind SOAP-based or RESTful Web services or Web forms, is estimated to contain about 500 times more data than the (visible) Web. Despite many advances in search technology, the full potential of the Deep Web has been left largely underexploited. This is partially due to the lack of effective solutions for surfacing and visualizing the data. The Deep Web research initiative at University of Tartu's Institute of Computer Science has developed an experimental platform to surface and visualize Deep Web data sources hidden behind SOAP Web service endpoints. However, currently this experimental platform only supports a limited set of SOAP endpoints, updated on an ad hoc basis.

The aim of this project is to build a crawler and an indexing engine capable of recognizing endpoints behind Web forms, RESTful services and SOAP-based services, together with their explicit descriptions (e.g. WSDL interface descriptions, when available). Furthermore, the crawler should identify examples of queries that can be forwarded to those endpoints, especially for endpoints with no explicit interface descriptions such as Web forms.
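
For endpoints that do come with an explicit description, one recognition step could look like the sketch below. It assumes the crawler has already downloaded a document as text and only handles WSDL 1.1; the function name `wsdl_operations` is hypothetical, and recognizing RESTful services or Web forms would need different techniques.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # namespace of WSDL 1.1 documents


def wsdl_operations(xml_text):
    """Return the sorted operation names if xml_text is a WSDL 1.1 document, else None."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None  # not well-formed XML, so not a WSDL description
    if root.tag != f"{{{WSDL_NS}}}definitions":
        return None  # XML, but not a WSDL 1.1 document
    names = set()
    for port_type in root.iter(f"{{{WSDL_NS}}}portType"):
        for operation in port_type.findall(f"{{{WSDL_NS}}}operation"):
            if operation.get("name"):
                names.add(operation.get("name"))
    return sorted(names)
```

The returned operation names could be stored in the index alongside the endpoint URL, giving the later query-example step a list of candidates to probe.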

This project is available to both Master's and Bachelor's students. The goal of the Master's project will be to build a crawler supporting endpoints with and without explicit interfaces. The goal of the Bachelor's thesis will be to crawl WSDL interfaces only.

Web Forms and Reports for Validation of HPLC Analytical Procedures

Supervisor: Peep Küngas (peep.kungas ät ut.ee)

High-performance liquid chromatography is a technique used in analytical chemistry to separate compounds out of a given mixture.

Whenever an HPLC analysis is performed on a given substance, the results provided by the HPLC equipment need to be validated in order to ensure that they are reliable with respect to the purpose of the analysis. To this end, the laboratory personnel needs to gather a significant amount of data and analyze it using various procedures.

In a previous Master's thesis, a system was developed that allows a developer to define Web forms and report generators for validation of HPLC procedures. In this thesis, one guideline (out of about 10 possible guidelines) was implemented from start to end, including all the required forms and reports.

The aim of this Bachelor's thesis is to implement additional guidelines, using the forms generator and the reports generator developed in the previous Master's project.

This project will be undertaken in collaboration with UT's Institute of Chemistry. Supervision and access to documentation and domain experts will be facilitated by the Institute of Chemistry.

Gamification for Software Engineering Education

Supervisor: Dietmar Pfahl (firstname dot lastname at gmail dot com)

Gamification as a business practice has exploded over the past two years. Organizations are applying it in areas such as marketing, human resources, productivity enhancement, sustainability, training, health and wellness, innovation, and customer engagement.

The objective of this project will be to apply gamification in the context of higher education in software engineering. To this end, the following steps will be taken in this project:

  • Selection of a suitable university course (preferably a part of a course rather than an entire one). Suggested examples are courses on software testing or software engineering management. The final decision on the course selection and scope will be agreed between the student and supervisor before the start of the project.
  • Study into the nature and techniques of gamification (i.e., gamification design framework).
  • Application of gamification to the selected course.
  • Evaluation of the gamified course.

This project requires that the student either has taken the course that will be subject to gamification or that he/she accepts one of the course/lab proposals made by the supervisor.

Web-Based Single-Player Project Simulation Game

Supervisor: Dietmar Pfahl (firstname dot lastname at ut dot ee)

Software development is a dynamic and complex process, as many interacting factors throughout the life-cycle affect the cost and schedule of the development project and the quality of the developed software product. In addition, the software industry constantly faces increasing demands for quality, productivity, and time-to-market, making the management of software development projects one of the most difficult and challenging tasks in any software organization.

The potential of simulation models for the training of managers has long been recognized: flight-simulator-type environments (or microworlds) confront managers with realistic situations that they may encounter in practice, and allow them to develop experience without the risks incurred in the real world.

The objective of this project is to develop a simulation-based software project management game for a single player, comprising the following elements:

  • Development of a process simulation model (based on existing work), integrated into a web-based application suitable for single-player gaming sessions
  • Didactic gaming scenarios
  • Proof of concept, i.e., at least one successful game played by students with lessons learnt recorded