Each year, we announce new topics that reflect the latest trends and advancements in the field. The topics are general and every student shall agree on a specific area of interest and research with their supervisor. Dissertation theses must be submitted 6 years at the latest after the day of enrollment in studies.
Benešová Vanda, prof. Ing., CSc.
Research of new methods of computer vision in medical applications using artificial intelligence
Computer vision is gaining an increasingly important position in the automatic processing of medical visual data, especially radiological and histological images. The most important current topics of computer vision research in medical applications are related to the diagnosis of various diseases and their goal is to provide the doctor with additional relevant information, or to relieve him of some tasks in an automatic or semi-automatic mode.
To meet these goals, the development of new, robust computer vision methods using the modern approaches of deep neural networks is necessary.
We see challenges not only in the research of new methods of computer vision using deep learning, research of their interpretability and explainability, in the generation of data for the data augmentation, but also in the optimization of the process of iterative development of a new medical application, including the efficiency of the annotation process.
Research during the doctoral studies will focus on one of the mentioned areas.
Čejka Tomáš, doc. Ing., Ph.D.
Anomaly detection and mitigation in computer and IoT networks
Specialist supervisor: doc. Ing. Tomáš Čejka, Ph.D.
The aim of this work will be research and development of algorithms for detection, identification, and mitigation of security threats and anomalies in computer networks, especially Internet of Things (IoT). From the perspective of network security, it is necessary to consider the IoT area as a threat not only for IoT devices but also for other devices and services in the Internet, since the IoT network can easily become a source of trouble (as it was observed in recent history). It is necessary to monitor IoT network traffic, derive meta information about the traffic based on events and device behavior, and use such meta information for identification and mitigation of threats, choosing a suitable mitigation strategy.
The goal of the dissertation thesis will be to find suitable ways to create long-term models of communication, which represent negative and positive events, and to use such models for anomaly detection, identification and mitigation of sources of troubles with low latency. Dissertability of this topic is based on non-trivial challenges such as processing and filtering huge volume of network traffic and creation of models of network traffic, discovering anomalies, identification of sources of trouble and choosing correct strategy of mitigation of malicious traffic. Contrary to classic IP networks, the IoT communication lays mainly in specific physical layers. This brings new potential attack vectors that are not detectable using ordinary IP traffic analysis. Therefore, it is needed to find new approaches to IoT traffic monitoring. The base of the work will be a research in the area of statistical methods, probabilistic models, and usage of algorithms of artificial intelligence.
Due to the modern networks bandwidth and on-line monitoring requirements, it is necessary to design and develop algorithms using a decomposition among hardware and software components and to use suitable technologies for hardware acceleration (e.g., FPGA).
Encrypted network traffic analysis and security threats detection in high-speed networks
The aim of this topic will be a research of algorithms for classification of encrypted network traffic, detection of security threats in high-speed computer networks, and the creation of automatically annotated data sets. Encrypted traffic currently represents a big challenge for standard monitoring tools, which use mainly unencrypted information extracted from packets. Thus it is an important topic for the scientific community in network security and the professional public. Most network traffic is already encrypted, so it is necessary to explore new sources of information about what is happening on the network. This information is essential for both network operators and security analysts. The aim of this general topic of the dissertation theses is research that uses mainly statistical properties of traffic, which can be calculated in real-time at a speed of above 100Gb/s (using hardware acceleration).
The goal is to research classification and detection algorithms based on machine learning for network processing of IP flows enriched with new statistics and the experimental evaluation of these developed algorithms on the long-term high-speed operation.
The dissertability of the topic is based on the fact that it is a solution to very non-trivial problems, such as processing and filtering large volumes of data, network traffic modeling, finding deviations, identifying attackers, and proper management of mitigation of the detected security incidents. In the field of encrypted traffic analysis, research results from the global scientific community are beginning to emerge, but a sufficiently feasible solution has not yet been published. The basis will be research into the possibilities of using statistical methods, probabilistic models, and artificial intelligence algorithms.
Due to the current speeds of network transmissions and requirements for online monitoring, it is necessary to design and implement algorithms using decomposition into hardware and software and using appropriate hardware acceleration technologies (e.g., FPGA).
Fišer Petr, doc. Ing., Ph.D.
Approximate Logic Circuits Testing
So called “approximate” logic circuits are one of contemporary mainstreams in logic design. Here the logic circuit needs not compute the desired function exactly, some error is tolerated. The main purpose of designing such circuits is a significantly reduced area and power consumption. Approximate circuits find application in image/audio processing and recognition, neural networks, data-mining, etc.
Testing of these circuits now becomes a new challenging task. Test generation for approximate circuits offers more degrees of freedom: the test needs not be complete, not all faults need to be tested in order to comply with the approximation demands. As a result, the test can be shorter.
The aim of the research will be to design novel Automated Test Patterns Generation (ATPG) algorithms for approximate circuits.
Artificial Intelligence in Logic Synthesis
Algorithms used to design digital circuits (EDA algorithms) are usually of a greedy nature. Local decisions are made randomly, or based on topological structure, or by the algorithm designer’s experience. These decisions need not be well chosen, resulting in inferior result quality. To resolve this problem, AI strategies can be incorporated into EDA algorithms. Here the AI learns in the EDA process and then helps in the subsequent processing.
The aim of the research is to analyze possibilities of application of AI in logic synthesis algorithms and devise new AI-guided logic synthesis algorithms.
Improvements in the Logic Synthesis and Optimization Process Control
Contemporary logic synthesis and optimization tools (both commercial and academic) mostly emphasize computational speed, at expense of result quality. Our recent research has shown that these tools tend to get stuck in deep local optima, and therefore they often produce very inferior results (in terms of area and/or delay). One of the reasons for it is a deterministic nature of the algorithms. Randomization of the algorithms has been found to be a viable, but just partial solution to this problem [1], [2]. The second, and more important reason of failure is the lack of a global algorithm control. Most of present logic synthesis and optimization algorithms are of an iterative nature, whereas their individual parts (elementary operations) are executed essentially ad-hoc and speculatively. Implementation of efficient top-level algorithm control means should significantly improve performance of logic synthesis and optimization.
The aim of the research is to investigate the behavior of individual elementary synthesis steps (e.g., in the ABC synthesis tool [3]), determine their dependence and orthogonality, and devise an improved overall algorithm control.
- [1] P. Fišer and J. Schmidt, “Improving the Iterative Power of Resynthesis,” in Proc. of 15th IEEE Symposium on Design and Diagnostics of Electronic Systems (DDECS), Tallinn (Estonia), April 18-20, 2012, pp. 30-33.
- [2] P. Fišer and J. Schmidt, “On Using Permutation of Variables to Improve the Iterative Power of Resynthesis,” in Proc. of 10th Int. Workshop on Boolean Problems (IWSBP), Freiberg (Germany), September 19-21, 2012, pp. 107-114.
- [3] Berkeley Logic Synthesis and Verification Group, “ABC: A System for Sequential Synthesis and Verification” [Online]. Available: http://www.eecs.berkeley.edu/alanmi/abc/.
Randomized Iterative Algorithms in Logic Synthesis
Contemporary logic synthesis and optimization tools (both commercial and academic) mostly emphasize computational speed, at expense of result quality. Our recent research has shown that these tools tend to get stuck in deep local optima, and therefore they often produce very inferior results (in terms of area and/or delay). Randomized iterative algorithms seem to efficiently solve this problem [1], [2] – they offer a trade-off between the run-time and result quality.
Moreover, present studies have shown that most of logic synthesis and optimization tools are very sensitive to randomness accidently introduced “from outside”, by the designer itself [3], [4]. Synthesis then produces results significantly differing in quality, when only slight changes in the source circuit description are made. Such a behavior is highly unwanted. Thus, it is required to analyze this behavior, determine its reasons, and to suggest more efficient algorithms.
The aim of the research is to analyze selected logic synthesis and optimization algorithms [5], identify the reasons of the above-mentioned behavior, and identify points, where randomness can be introduced. The influence of randomness will be then analyzed and algorithms exploiting the randomness in a positive way will be devised [3], [4]. Next, new algorithms minimizing the sensitivity on the external randomness will be developed.
- [1] P. Fišer and J. Schmidt, “Improving the Iterative Power of Resynthesis,” in Proc. of 15th IEEE Symposium on Design and Diagnostics of Electronic Systems (DDECS), Tallinn (Estonia), April 18-20, 2012, pp. 30-33.
- [2] P. Fišer and J. Schmidt, “On Using Permutation of Variables to Improve the Iterative Power of Resynthesis,” in Proc. of 10th Int. Workshop on Boolean Problems (IWSBP), Freiberg (Germany), September 19-21, 2012, pp. 107-114.
- [3] A. Puggelli, T. Welp, A. Kuehlmann, and A. Sangiovanni-Vincentelli, “Are Logic Synthesis Tools Robust?,” in Proc. of the 48th ACM/EDAC/IEEE Design Automation Conference (DAC), 5-9 June 2011, pp. 633-638.
- [4] P. Fišer, J. Schmidt, and J. Balcárek, “On Robustness of EDA Tools,” in Proc. of 17th Euromicro Conference on Digital Systems Design (DSD), Verona (Italy), August 27-29, 2014, pp. 427-434.
- [5] Berkeley Logic Synthesis and Verification Group, “ABC: A System for Sequential Synthesis and Verification” [Online]. Available: http://www.eecs.berkeley.edu/alanmi/abc/.
Test Compression for ASIC Circuits
With the increasing complexity of presently manufactured chips, increasingly more data must be delivered to individual chip cores to test them. Compression of this data becomes now inevitable, because of the high cost of the ATE memory and test time expenses. The ratio of expenses spent by chip testing and its development is increasing. Therefore, test compression is and will increasingly be a hot topic. It is very challenging to contribute to the research in this area and to try to overcome established industrial tools in performance.
Different test compression mechanisms have been proposed, some of them are used in industry [1], [2]. Most of them rely on a combination of pseudo-random testing (and possibly BIST), which can be implemented on-chip as whole and does not need any data to be transmitted, and deterministic test. The deterministic test is algorithmically compressed, stored in the ATE memory, and decompressed on-chip.
The aim of the research is to design test compression/decompression methods based on advanced design-for-testability (DFT) architectures [3]. This will comprise of a design of possibly novel decompression architectures, algorithms for test compression for these architectures, and design of the overall HW architecture, where test decompression, BIST and various DFT features will be implemented.
This research will (can) follow up a successfully completed Ph.D. thesis [4], [5].
- [1] J. Rajski et al. “Embedded Deterministic Test”, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 23, no. 5, pp. 776-792, 2004.
- [2] Y. Huang, S. Milewski, J. Rajski, J. Tyszer and C. Wang, “Low Cost Hypercompression of Test Data,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 39, no. 10, pp. 2964-2975, 2020.
- [3] R. Dorsch, H. Wunderlich, “Reusing Scan Chains for Test Pattern Decompression”, Journal of Electronic Testing: Theory and Applications, vol. 18, no. 2, pp. 231-240, 2002.
- [4] J. Balcárek, P. Fišer, and J. Schmidt, “Techniques for SAT-based Constrained Test Pattern Generation,” in Microprocessors and Microsystems, Elsevier, Vol. 37, Issue 2, March 2013, pp. 185-195. [5] J. Balcárek, „Implicit Representations in Testing and Dependability of Digital Circuits“, Ph.D. Thesis, CTU in Prague, 2016.
- [5] J. Balcárek, „Implicit Representations in Testing and Dependability of Digital Circuits“, Ph.D. Thesis, CTU in Prague, 2016.
Haindl Michal, prof. Ing., DrSc.
Analysis of visual materials properties
The aim of this work is to analyze the perception of surface materials under variable illumination and observation conditions. The work will have the task of finding suitable static and dynamic visual stimuli and psychophysical verification of their relative importance for human perception and recognition of different materials.
Automatic shape estimation from video
The work is focused on the research of methods for recognizing and modeling the shape of bodies from video for virtual reality applications. Propose and implement a suitable method of automatic estimation of a 3D model from measured data using a video camera. Verify the method on selected models of sculptures and buildings.
Charred Herculaneum Scrolls Text Detection
Current advances in computed tomography scanning make it possible to virtually reconstruct Greek parchment scrolls burned 2,000 years ago after the eruption of Mount Vesuvius. The existing 280 scrolls found represent an enormous historical value. Their reading is a real revolution in modern methods of archeology and an international competition is being launched to solve it. The individual layers of the scroll must first be virtually unfolded, and the individual fragments connected to each other. Finding individual Greek letters on burnt scrolls is very difficult from CT scans, a possible solution is to use texture analysis methods. So far, only a few isolated words have been successfully read. The topic of the work is the improvement of the current state of reading burnt scrolls.
- Haindl, M. “Bidirectional Texture Function Modeling,” In: Handbook of Mathematical Models and Algorithms in Computer Vision and Imaging, chapter 28, pp. 1023–1064, ISBN 978-3-030-03009-4, DOI 10.1007/978-3-030-98661-2 103, Springer
- International Publishing, 2023.
- Mikes, S. - Haindl, M. “Texture Segmentation Benchmark,” IEEE Transactions on Pattern Analysis and Machine Intelligen-ce ,vol. 44, no. 9, pp. 5647-5663, DOI 10.1109/TPAMI.2021.3075916, ISSN 0162-8828, September, 2022.
- Haindl, M. - Havlicek, V. “Transfer Learning of Mixture Texture Models,” 12th Int. Conf. on Computational Collective Intelli-gence, ICCCI 2020, ISBN 978-3-030-63006-5, DOI 10.1007/978-3-030-63007-2 65, pp. 825–837, Lecture Notes in Artificial Intelligence, vol. 12496, 2020, Springer Nature Switzerland AG.
Light field estimation from visual scene
The thesis objective is to study environmental light field modelling methods from measured visual scenes applied to model BTF illumination in virtual reality application. Propose and implement an appropriate method for realistic illumination model estimated from real measured visual scenes. Verify this method on interior and exterior virtual reality models with variable illumination.
Material appearance measuring on real objects
The thesis objective is to propose acquisition methods, capable to estimate advanced visual material representations (BTF, SVBRDF) directly from a real 3D object in natural lighting environment. Propose and implement an appropriate method for automatic visual surface material inference from hand held video camera sequences. Verify this method on the selected surface materials.
Modeling of spatial spread of viral infections
Design and implementation of a suitable statistical model that will allow modeling and prediction of the spread of viral infection between neigboring geographical areas. The model will also allow the evaluation of some selected epidemiological measures and their impact on the rate of spread of viral infection. The developed Markov model will be verified on COVID-19 coronavirus infection data.
Modelling of visual surface material properties
The appearance of real surface materials significantly changes with variations in lighting and viewing parameters. Therefore, today's most advanced texture representations (BTF) require modeling reflectivity over a wide range of lighting parameters and camera locations using complex mathematical models. The aim of the work will be to develop and verify a new BTF model based on the theory of Markov random fields, which will improve the current state of physically correct modeling of material surfaces.
Multispectral textural features
Design of suitable multispectral texture features for analysis of surface materials of visual scenes with variable observation conditions. The features will be derived from the multidimensional statistical models, their invariant modifications will be proposed, compared with the current best published alternative textural features, and applied to BTF data.
Skin Cancer Detection and Treatment Progress Monitoring
Malignant melanoma, as the most dangerous form of skin cancer, has been a rapidly growing threat over the past decades. The effective treatment requires its early diagnoses and surgical excision. The thesis objective is to find discriminative image features and a robust classifier which will allow to recognize skin cancer on colour skin images and to evaluate a treatment progress between several mutitemporal skin images.
Unsupervised dynamic image segmentation
The objective of this work is to critically evaluate the current state of unsupervised segmentation of image data and to develop an algorithm for unsupervised segmentation of dynamic color / multispectral / BTF images into individual homogeneous areas. The method will be based on the underlying multidimensional Markovian models and verified on the PTSB benchmark.
Visual quality measures
Verification of visual data modelling quality is a difficult and still unresolved problem due to the lack of existing mathematical criteria capable of approximating the human eye's perception. The work will aim at suggestion of an appropriate mathematical criterion which could quantify and rank simulation results according to their similarity to the measured visual template. The criterion will also be applicable to visual textures.
Hanzálek Zdeněk, prof. Dr. Ing.
Scheduling algorithms for energy-efficient production
The considered basic problem is RCPSP (Resource-Constrained Project Scheduling Problem). It is characterized by a set of activities to be scheduled. The order of these activities is partly defined by precedence constraints. Each activity requires some amount of renewable resources (e.g., machines, employees) while every resource has a planned capacity and some energy (i.e., a consumable resource with limits given by 15-minute contracts). The goal is to find a sequence of activities assigned to resources (i.e., a schedule) and capacity of resources with respect to two criteria. One objective is to minimize the earliness/tardiness penalty (costs incurred by completing some of the jobs before or after their due dates). The second objective is to minimize costs connected with the increasing capacity of resources (e.g. hiring more employees or working overtime). Then the solution of the problem might consider a linear combination or a Pareto front considering these two objectives.
We may further extend the problem while considering alternatives or modes of processes to be performed. Eventually, we might investigate the use of Machine Learning to speed up our algorithms, for example, to accelerate the computation of the objective function in the algorithm performing frequent swaps of tasks as it proved to be useful in our prior work dealing with a nurse rostering problem.
- Čapek, R. - Šůcha, P. - Hanzálek, Z.: Production Scheduling with Alternative Process Plans, European Journal of Operational Research, Volume 217, Issue 2, March 2012, Pages 300–311.
- Módos, I. - Šůcha, P. - Hanzálek, Z.: Algorithms for robust production scheduling with energy consumption limits, Computers & Industrial Engineering, Volume 112, October 2017, Pages 391-408.
- Václavík, R. - Šůcha, P. - Hanzálek, Z.: Roster evaluation based on classifiers for the nurse rostering problem, Journal of Heuristics, October 2016, Volume 22, Issue 5, Pages 667–697
Scheduling algorithms for large-scale applications
As the number of jobs and resources grows rapidly in many applications, calculating a Time-Triggered (TT) schedule is becoming computationally difficult. Even for very basic models, these problems are known to be NP-hard. Therefore, for large-scale problems, the first challenge is to obtain a reasonable solution in a reasonable time. The most common approach is to model the scheduling problem using some formalism, usually SMT or ILP, and to find a feasible solution using a general-purpose solver. The downside of this approach is its limited performance. To achieve sufficient scalability, a tailored algorithm must be developed.
We are interested to develop efficient and scalable algorithms that will be able to compute schedules (e.g. Time-Triggered schedules in IEEE 802.1Qbv Time-Sensitive Networks) with tens of thousands of jobs (i.e., messages) spanning hundreds of resources (i.e., communication channels). Besides, the algorithms will be expected to store some valuable information during the search, referred to as a context, which will then be exploited to speed up the process of evolving the schedule in response to changing requirements.
We may further extend our algorithms to handle jointly TT and Event-Triggered (ET) traffic classes, thereby improving the system timing properties. Eventually, we might investigate the use of Machine Learning to speed up our algorithms. Most recently we proposed a deep neural network combined with Lawler's decomposition for the single machine total tardiness problem. The results show that our data-driven algorithm outperformed the former state-of-the-art heuristics, and we see a great potential to extend it to other problems with the known decomposition rule.
- Vlk, M. - Brejchova, K. - Hanzalek, Z. - Tang, S.: Large-Scale Periodic Scheduling in Time-Sensitive Networks, Computers & Operations Research, Volume 137, January 2022.
- Minaeva, A. - Hanzalek, Z.: Survey on Periodic Scheduling for Time-Triggered Hard Real-Time Systems, ACM Computing Surveys, Volume 54, Issue 1, March 2021, Article 23, Pages 23:1-23:32.
- Bouška, M., Novák, A., Šůcha, P., Módos, I., Hanzálek, Z.: Data-driven Algorithm for Scheduling with Total Tardiness. In: Proceedings of the 9th International Conference on Operations Research and Enterprise Systems, ICORES 2020.
Holeňa Martin, prof. Ing. RNDr., CSc.
Advanced methods of evolutionary black-box optimization
Optimization tasks encountered in real-world applications more and more frequently optimize objectives that are not mathematically calculated functions, but results of computer simulations or experimental measurements. That kind of optimization, called black-box optimization, presents two great challenges: 1. it is possible to get only the values of such a black-box objective, not its gradient or higher derivatives, 2. the evaluation of the objective typically costs much time and/or money. As far as the first challenge is concerned, evolutionary algorithms have proven very successful for optimization using only the values of the objective during the last decades. However, they typically require a large number of evaluations, which conflicts with the second challenge. That conflict incited an intense reseaech into black-box evolutionary optimization during the last decade, bringing a broad spectrum of topics suitable for PhD theses.
Artificial neural networkks in black-box optimization
As black-box optimization, we refer to optimization in which the mathematical expression of the optimized function is not used (typically because no such expression is known), but the optimization algorithm only has its values at specific points available. These values are usually obtained empirically by measurement or through experiments, either physical or in the form of simulations. Black-box optimization uses algorithms that make almost no assumptions about the mathematical properties of the optimized function, most often evolutionary algorithms and other nature-inspired algorithms such as particle swarms. Since these algorithms work only with the functional values of the optimized function, they approach its optimum much more slowly than the optimization methods for smooth functions, which also use information about the gradient or second derivatives of the optimized function. This property is particularly disadvantageous in conjunction with the fact that empirically obtaining the value of the optimized function is usually quite expensive and time-consuming. However, evolutionary algorithms can be significantly accelerated by using the empirical black-box function only occasionally for evaluating the value of the optimized function, whereas mostly only its sufficiently accurate regression model is evaluated. Among the regression models used for this purpose, artificial neural networks have been for about 20 years, first multilayer perceptrons and later networks with radial basis functions. Due to the current popularity of modern types of neural networks, often referred to as deep neural networks, new approaches usable speed up black-box optimization based on modern neural networks have nevertheless been proposed in recent years, such as deep Gaussian processses, using Bayesian neural networks, optimization in a latent space of a lower dimension, mapped by a generative neural network into the spacein which the inputs of the optimized black-box function lie, or making use of GAN-type networks (generative adversarial network), the two components of which are used for the exploration and exploitation components of the optimization.
Explainability of graph neural networks
Graph data are used to keep knowledge about many important areas of the present world, such as computer networks, social networks, chemical molecules, communication systems, industrial processes or text documents. However, data-based methods for data analysis and modelling, such as classification, regression and clustering, have been developed primarily for vector data, and graph data need to be for using those methods first represented in some vector space. Most successful at such representation are artificial neural networks. Due to the need to learn graph representation, a specific kind of neural networks appeared, called graph neural networks. However, also graph neural networks have the propertzy of an overwhelming majority of neural networks that the rtransformation of network inputs to their outputs is a black-box mapping, which for a given network input does not enable to explain its output. In connetion with traditional neural networks, primarily multiayer perceptrons and radial basis function networks, attention has been paid, already since the 1990s, to methods enabling to describe the dependence of network output on its input by means of logical implications and equivalences, or to explain in some other way the value of its output for a given input. In the case of graph neural networks, however, research into explainability methods is only starting. Contribute to it should also the proposed thesis.
Making use of active learning in optimization
Evolutionary algorithms are, in the last 20 years, one of the most successful methods for solving non-traditional optimization problems, such as search for the most suitable documents containing required information, discovery of the most interesting knowledge in available data, or other kinds of optimization tasks in which the values of the objective function can be obtained only empirically. Because evolutionary algorithms employ only function values of the objective function, they approach its optimum much more slowly than optimization methods for smooth functions, which make use of information about the objective function gradients as well, possibly also about its second derivatives. This property of evolutionary algorithms is particularly disadvantageous in the context of costly and time-consuming empirical way of obtaining values of the objective function. However, evolutionary algorithms can be substantially speeded up if they employ the empirical objective function only sometimes when evaluating objective function values, whereas they mostly evaluate only a sufficiently accurate regression model of that function. It is the accuracy of the model that determines how successful surrogate of the original empirical function it will be. Therefore, the model is made more accurate after obtaining each new generation of points in which the empirical function has been evaluated, through re-learning including those points. However, it is possible to go even further and to consider, when choosing points for empirical evaluation, besides the value of the empirical function also how they contribute, during model re-learning, to making it more accurate. Such an approach is termed active learning. However, using active learning to accelerate evolutionary algorithms is only at a very beginning, and should be supported also by the proposed thesis.
Transfer learning in black-box optimization
Transfer learning is a method of algorithmic extraction of knowledge from a solved problem and its incorporation into the solution of another problem. It began to be studied more deeply about 20 years ago, in connection with the development of modern types of neural networks, often referred to as deep networks. It allows using a network trained on a large amount of data to train another network of similar quality on a much smaller amount of data.
In recent years, there have been also attempts to use transfer learning in black-box optimization. This is an optimization in which the mathematical expression of the optimized function is not used (typically because no such expression is known), but only its values at specific points are available to the optimization algorithm. These values are usually obtained empirically by measurement or by means of experiments, whether they take place physically or in the form of simulations. Black-box optimization uses algorithms that make almost no assumptions about the mathematical properties of the optimized function, most often evolutionary algorithms and other nature-inspired algorithms such as particle swarm optimization. As for transfer learning, it turns out that, similar to the case of neural networks, it allows to train a network of the same quality with a smaller amount of training data, it makes it possible to find the optimum based on a smaller number of values of the black-box function during black-box optimization. This is very promising because empirically obtaining the value of the optimized function is usually quite expensive and time-consuming.
However, research on methods for transfer learning in black-box optimization is still in a very early stage. A contribution to it should also be the proposed thesis.
Holub Jan, prof. Ing., Ph.D.
Data Indexing for Bioinformatics
Cheap technology for genome sequencing started a huge expansion of data to be stored and processed. Such a huge volume of data is needed for personalized medicine, for pangenome research, or for finding biomarkers for various diseases. Traditional indices allowing efficient search fail this time due to exceeding the available memory. They also do not exploit the important property of high similarity of human genome sequences.
In the last few years, new developments in stringology brought various compressed indices based on the Burrows-Wheeler transform. The goal of the project is to find more efficient methods for storing and indexing data for various tasks in Bioinformatics.
Hrabák Pavel, doc. Ing., Ph.D.
Agent heterogeneity in multi-agent models of pedestrian dynamics
Specialist supervisor: Ing. Hana Najmanová, Ph.D., prof. RNDr. Pavel Surynek, Ph.D.
Multi-agent (MA) models of pedestrian dynamics become a promising tool supporting an evacuation analysis in performance-based fire safety design. A pedestrian or evacuee is represented in such model by an agent, which parameters, dynamical and interaction rules are inspired by the physical, cognitive, and psychical processes of people in crowd. Recent studies show that very important but rarely considered aspect influencing the evacuation process is the heterogeneity of the crowd in various levels. Most challenging, and so far poorly discussed, is a simulation of a crowd composed of evacuees with essentially different roles and competences during the evacuation (following or giving instructions, keeping or controlling the formation, etc.) as e.g. evacuation of school (pupils vs. teachers) or cultural event (visitors vs. staff).
Goal of the thesis is to fill in this research gap in several of the following aspects:
- Thorough investigation of the influence of essential agents’ heterogeneity on important characteristics of pedestrian flow and evacuation process in most common MA models (cellular, social-force, or rule based).
- Development of new or modification of existing MA models enabling the above-mentioned hierarchical heterogeneity, supporting inter-agent bonds and crowd formations.
- Organization and analysis of evacuation experiments addressing the above-mentioned issues. Comparison of experiments and simulations, development of metrics quantifying the correspondence of heterogeneity in models and reality.
- Investigation of possible strategies of leading/controlling pedestrians/agents influencing the evacuation process by means of the multi-agent path finding or planning tools.
The work should be consulted with both, fire engineering expert and expert on multi-agent planning processes.
Hyniová Kateřina, doc. Ing., CSc.
A Comparative Study on Various Control Strategies for One-quarter-car Suspension Model Based on Numerical Simulations
The aim of the DP is to propose a method for the analysis of vibrations caused by road disturbances and to use it to compare the effectiveness of different car suspension control strategies. The strategies will be compared and evaluated using software-in-the-loop simulation on Matlab platform. The results of simulations of the quarter model suspension for different types of road irregularities will be compared . Car suspension strategy is a compromise between two conflicting criteria: car holding on the road and passenger comfort. These criteria will be taken into account in the study. The comparison will be made for both the active as well as the semi-active suspension group. The work will also include state-of-the-art.
Janoušek Jan, doc. Ing., Ph.D.
Arbology
Formal tree languages, formalisms for their generating and description, and models of computations for processing tree languages are one of the areas of the theory formal languages and automata. Practical related research topics can be effective algorithms for various problems of processing trees, such as pattern matching, indexing or computing repeats in trees. Various types of trees are considered, such as ordered and unordered trees, or ranked nad unranked trees. Classes of formal tree languages that are considered can be regular or context-free tree languages. Models of computation for processing trees can be various kinds of tree automata or pushdown automata reading linear notations of trees. The aim of the Ph.D. study is to develop the theory of formal tree languages and of effective algorithms for sequential and parallel processing of trees.
Kalina Jan, RNDr., Ph.D.
Reliability of deep learning
Recent research on reliability in the context of deep learning has shown that there is no unique understanding of reliability assessment for training of deep networks, i.e. there is no unity as to which steps should be performed to accompany the training. Statistical approaches to the newly arising field of diagnostics for deep learning start from the simplest ideas such as verification of the trained network on out-of-distribution data [5]. More powerful approaches include verifying the influence of adversarial examples in the data [1], evaluation of uncertainty within deep learning algorithms [3], or computational methods for error propagation throughout the network [2]. One of rare theoretical approaches standing on probabilistic reasoning is the hypothesis test of reliability for a multi-class classifier [4]. The thesis will propose novel tools for assessing reliability of trained deep networks. Theoretical investigations will evaluate the influence of possible errors (uncertainty, measurement errors, or outlying measurements) on the results of a trained deep network. Another aim is to derive diagnostic tools for checking the probability assumptions for deep networks. The tools used here may include bootstrapping and nonparametric combinations of tests.
- [1] Alshemali B., Kalita J. (2020). Improving the reliability of deep neural networks in NLP: A review. Knowledge-based systems 191, 105210.
- [2] Bosio A., Bernardi P., Ruospo A., Sanchez E. (2019). A reliability analysis of a deep neural network. IEEE Latin American Test Symposium LATS 2019, 1-6.
- [3] Caldeira J., Nord B. (2021). Deeply uncertain: Comparing methods of uncertainty quantification in deep learning algorithms. Machine Learning: Science and Technology 2, 015002.
- [4] Gweon H. (2022). A power-controlled reliability assessment for multi-class probabilistic classifiers. Advances in Data Analysis and Classification. Online first.
- [5] Martensson G., Ferreira D., Granberg T., Cavallin L., Oppedal K. et al. (2020). The reliability of a deep learning model in clinical out-of-distribution MRI data: A multicohort study. Medical Image Analysis 66, 101714.
Robust linear diskriminant analysis
Linear discriminant analysis (LDA) represent a well known statistical classification method. Its various regularized versions are available for high-dimensional data with the number of variables exceeding (perhaps largely) the number of data samples. Recent proposal include regularized robust versions of LDA that are resistant to the presence of outlying values in the data. These consider robust estimates of the mean and of the covariance matrix. The aim of the work will be to propose and investigated novel versions of robust regularized LDA and to perform extensive simulations to study their performance. These will be based on new robust estimates of the covariance matrix: One idea is to exploit the novel MRWCD estimator (minimum regularized weighted covariance determinant) [1]. Another idea is to exploit the estimator based on numerical linear algebra manipulations, particularly on the minimum product of a small number of the largest eigenvalues of the covariance matrix [2].
- [1] Kalina J., Tichavský J. (2022): The minimum weighted covariance determinant estimator for high-dimensional data. Advances in Data Analysis and Classification 16, 977-999.
- [2] Kalina J. (2020): The minimum weighted covariance determinant estimator revisited. Communications in Statistics Simulation and Computation 51, 3888-3900.
Knápek Jaroslav, prof. Ing., CSc.
Modelling the knowledge base of economic processes in OntoUml
Specialist supervisor: Ing. Petra Pavlíčková, Ph.D.
In the business economy, a lot of data is currently generated and then stored in enterprise information systems (ERP). The data thus obtained and stored are a suitable basis for decision-making by managers at all levels of business management. To support managerial decision-making, it is desirable to formalise expert knowledge on the financial health of the enterprise and the management of the main economic processes in the enterprise in a way that makes them more efficient and increases the profit of the enterprise. Therefore, the framework topic of the dissertation focuses on finding a suitable method of representing the knowledge base in the area of managing the economic processes of the enterprise, in particular using modelling tools used in medical knowledge modelling (GLIF, knowledge ontology). The main direction of the research will be oriented towards modelling the knowledge base of economic processes primarily using OntoUML methodologies.
Knop Dušan, doc. RNDr., Ph.D.
Complexity of problems in social choice and network processes
Numerous topics in (computational) social choice are nowadays important parts of various systems in our day to day life. These include, e.g., recommender systems, or distribution of goods stored in food banks. Now, as the use of these crucial tools grows so does the need for efficient computation of solutions to these problems. The main (suggested) toolbox used to pursue this task is the framework of parameterized (or, multivariate) complexity that is becoming more and more popular in the past decades. The practical need for the computation also yields the possibility to a more applied research using, e.g., ILP or SAT solvers. The biggest challenges in the area include problems which are complete for “higher classes” of classical complexity (i.e., possibly above NP); e.g., the Judgement Aggregation problem.
Kordík Pavel, doc. Ing., Ph.D.
Algorithms and architectures of recommender systems
In the recommendation systems, we are currently focusing research on a few open problems that have deep theoretical underpinnings, but whose solutions also have very concrete practical applications. We are exploring the utilization of deep neural networks to reduce the cold start problem of recommender systems, the design of transformers to predict shopping baskets. In the field of general machine learning, we focus on reinforced learning to optimize longer-term metrics such as user satisfaction, on transfer learning methods to incorporate new referral databases, and on using AutoML to optimize the architecture and hyperparameters of recommender systems.
Křikava Filip, doc. Ing., Ph.D.
Fast and robust data-analysis pipelines
Data analysis is typically performed by composing a series of discrete tools and libraries into a data analysis pipeline. These pipelines are at the core of data-driven science that has been central to most disciplines and today see an explosion in the widespread use of computational methods and available data. As the number of tools and size of data keep growing, we face problems with the scalability of the pipelines and the trustworthiness of their results.
The goal of this work is to research ways to make data analysis pipelines scalable (accommodate growing data and computational needs) and trustworthy (facilitate auditing of the analysis result). The research will go along two axes. The first will focus on extending the R programming language with transparent horizontal and vertical scaling. The second will study a combination of static and dynamic program analysis techniques to gain insight into the nature and severity of programming errors in the code of data-analysis pipelines, and propose algorithms for their detection and possible automated repair.
Kroha Petr, prof. Dr. Ing., CSc.
Text extraction
Text extraction is used in many fields to reduce text by omitting its parts that are not relevant for the user. This includes, for example, summarizing text. The topic comes from the field of text mining that is an area full of open problems. In our published work, we used the techniques of linguistic analysis of sentences (part-of-speech tagging) and clustering according to subsets of sentences (chunking) to analyze the texts of functional requirements specification of software products. We generated questions from the texts to improve the specification of functional requirements by analyzing the answers. The aim of the work is to use proven techniques from our publications in the field of text extraction, compare them with current results of published methods and find better new methods.
Kubátová Hana, prof. Ing., CSc.
Dependability models and reliability parameters’ computation with respect to realistic properties of modeled systems
Specialist supervisor: Ing. Martin Kohlík, Ph.D.
Currently used dependability models are often based on simplified processes leading to unrealistic estimations of dependability and reliability parameters of the modeled systems [1] or rough pessimistic estimations [2]. The aim of the research should be a methodology of the dependability model design allowing fast and accurate calculations of the dependability parameters of the system. The methodology should take over-time changes (e.g. aging, maintenance, repairs) of the system, the structure (blocks and their fail-safes) of the system, and the ways of acquiring dependability parameter data from real applications into account [3].
- [1] Electronic Reliability Design Handbook - MIL-HDBK-338B. US Department of Defense, 1998.
- [2] M. Kohlík, "Hierarchical Dependability Models Based on Markov Chains", Dissertation thesis, Czech Technical University in Prague, 2015.
- [3] Daňhel, M.: "Prediction and Analysis of Mission Critical Systems Dependability", Dissertation thesis, Czech Technical University in Prague, 2018.
Design methodology of dependable fault-tolerant and attack resistant systems
The research of methods and processes in the design of systems with pre-defined reliability parameters using programmable hardware sources (FPGA, processors). The reaserch of the impact of a redundancy at different levels (space, time, software, hardware) to the attack resistance of the whole system. The research of automatization methods of the design processes including the dependability models construction and dependability parameters computations. Partial results and proposed methods will be evaluated by real-life application and benchmarks.
Formalization and automatization of digital system design methods
The research area will be based on formal methods and models (Petri Nets, Markov chains, UML diagrams) to use them for simplificationa and automatization of digital design process. The connection of verification methods and dependability modeling at all design periods to obtain an optimized structure according different parameters is assumed. The aim should be the combination of different models and detailed study of their relations and possible automatic conversions. Partial results and proposed methods will be evaluated by real-life applications and benchmarks. An integral part of the topic is the study of possible new models used in industry and / or research.
New architectures with guaranteed level of dependability parameters for reconfigurable circuits
Specialist supervisor: Ing. Pavel Kubalík, Ph.D.
Main aims of this research are:
- Design of new architectures based on on-line error control codes available for implementation in reconfigurable hardware (FPGAs). The fulfillment of required reliable parameters is essential, together with a low area overhead, appropriate operational frequency, and low power consumption for mission-critical systems (intelligent cars, robots, etc.) where FPGA are more and more used due to their properties (and price).
- Proposal of appropriate methods to automatically select the best type of fault-tolerance and safety with respect to a particular application, its requirements and the necessary constrains, including design speed.
- Utilization of existing models and modifying them to solve this problem at the system level, and linking them to the hierarchical dependability models, created at the department.
The target platform will be FPGA circuits allowing fault recovery of a part design or completely change of an implemented function. The research will be mainly focused on the utilization of new error control codes for an optimized architecture enabled error detection and correction suitable for FPGA. The standard known fault-tolerant structures such as TMR or a duplex will be taken into account, too.
All proposed architectures and the partial results will be tested on standard known benchmarks and on dedicated sets of own circuits. The evaluation criterion will be mainly focused on realistic calculations of dependable parameters, together with low area occupation requirements.
Research of Dependability Improvement at ISA Level
The proposed research has to verify possibilities, how to achieve (and guarantee) predefined system design constraints (size, power-consumption, dependability characteristics, fault-tolerance and attack-resistance levels via trade-off between hardware and software parts and types.
It is supposed to use Codasip system and RISC-V processor (open-source hardware instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles, designed (not only) for embedded applications, where low-power and performance are emphasized. Codasip can provide their processors with high-level tools that enable to configure the RISC-V cores, to automatically generate toolchains and synthesizable, human-readable Verilog RTL. Codasip system (https://codasip.com/) offers a verification environment, too.
The research will contain proper evaluation of obtained architecture by improved dependability models. The possible methods will be ISA improvement (e.g. adding the cryptographic instructions, specialized block, using of more application-specific processors). Final experiments will be performed by simulations and FPGA implementations.
Kůrková Věra, RNDr., DrSc.
Robustness of learning of deep and shallow networks
Specialist supervisor: RNDr. Petra Vidnerová, Ph.D.
Deep networks became state-of-the-art performance on a variety of pattern-recognition tasks, most notably visual classiffication ones. Several recent studies revealed that some deep networks can be easily fooled by changing an image in a way imperceptible to humans which can cause a deep network to label the image as something else entirely. Thus it is important to investigate robustness of learning of networks with respect to adversarial patterns.
The goal of the dissertation is to design multiobjective algorithms generating imperceptible perturbations of images leading to their misclassifications. Evolutionary algorithms for generation of adversarial patterns will be developed that maximize the error of the model while being as similar to original training patterns as possible. The role of depth of networks and locality properties of computational units in misclassiffication will be analyzed. Methods of making neural networks more robust to such adversarial patterns will be investigated.
Langr Daniel, doc. Ing., Ph.D.
Numerical methods for computer modeling of atomic nuclei
Symmetry-adapted no-core shell model (SA-NCSM) is a state-of-the-art method for ab-initio (from first principles) modeling of atomic nuclei. The principle of this model is based on computation of nuclear wave functions in the form of eigenvectors of matrix operators relevant to a given nucleus. In the current implementation of this model, elements of these matrix operators are explicitly stored in a computer memory. This limits its applicability to only relatively light nuclei, even when the largest contemporary existing supercomputers are utilized. The goal of this dissertation thesis topic is to research and develop suitable parallel and scalable numerical algorithms and corresponding data structures that would enable “on-the-fly” computation of matrix operators in SA-NCSM calculations. Specific research of interest is the development and implementation of scalable parallel algorithms for finding extremal eigenvalues and corresponding eigenvectors of large sparse symmetric matrices.
Lórencz Róbert, prof. Ing., CSc.
Advanced Framework for Threat Monitoring and Detection in Linux Environments
Specialist supervisor: Ing. Simona Fornůsek, Ph.D.
In contemporary computing environments, the security of Linux-based systems is of paramount importance due to their widespread adoption in critical infrastructure and enterprise settings. Traditional methods of threat monitoring and detection often fall short in effectively identifying and mitigating sophisticated cyber threats. This dissertation thesis will propose an advanced framework leveraging machine learning, anomaly detection and behavioral analysis techniques to enhance the monitoring and detection capabilities of threats within Linux environments.
By integrating various methodologies from the fields of cybersecurity and machine learning, the framework will address the evolving nature of cyber threats while minimizing false positives and false negatives. Through the development and implementation of novel algorithms and models, the proposed framework will seek to provide a proactive approach to security, enabling organizations to detect and respond to threats swiftly and effectively.
The research will also include the exploration of the effectiveness of machine learning, anomaly detection, and behavioral analysis algorithms for threat detection in Linux environments, alongside an in-depth analysis of adversarial defenses such as various exploitation and evasion techniques, and obfuscation, commonly employed by adversaries. Additionally, the efficiency of existing detection techniques against the adversarial techniques currently in use will be evaluated, while proposing improvements to enhance their efficacy.
This research will contribute to the advancement of cybersecurity practices in Linux environments by providing a robust and adaptable solution tailored to the complexities of modern-day cyber threats.
Combined attacks on cryptographic modules
In the field of hardware cryptographic devices, there is a continual competition between the development of new attacks and, on the contrary, defenses against them. An attack usually aims to reveal secret information, such as a secret symmetric key, a private key, or a message. One of the relatively new attack approaches is a combination of passive and active attacks on cryptographic devices. The aim of this work is to explore new possibilities of combination of active and passive physical attacks with knowledge of linear, differential or algebraic cryptanalysis.
Dedicated Hardware for Modular Arithmetic
The aim is the design and implementation of dedicated hardware architectures for computing modular arithmetic operations. The results are applicable in elliptic curve cryptography, as well as in other systems that utilize modular arithmetic.
Machine Learning-Based Malware Detection for Linux
The number of malware attacks targeting the Linux operating system has increased recently, and existing detection systems are insufficient. This growth is partly due to the growth in the number of IoT devices that use different variants of Linux. Malware detection models for Linux are significantly less studied than detection models for the Windows operating system. As a result, Linux malware detection models are not as advanced and effective, and there is a lot of room for improvement. Machine learning algorithms play an important role in detecting and classifying malware into families and are a common part of Windows antivirus programs. Differences between Linux and Windows operating systems, such as different file formats, must be taken into account when extracting data and preprocessing it. The dissertation ability of the topic is based on solving data preparation problems where deep knowledge of the Linux operating system is required, as well as on designing efficient machine learning-based detection systems that provide the highest possible accuracy with an acceptable false positive rate.
Mixed-radix conversion (MRC) algorithm for converting results from a system of linear congruences into a system of linear equations
The solution of an integer system of linear equations (SLE) without rounding errors can be done by dividing the solving process into systems of linear congruences (SLC), and then converting the results into a set of solutions of the original SLE. The so-called MRC algorithm is used for this conversion, which has the complexity O(nm2), where n is the matrix dimension and m is the number of SLK (modules) used.
The aim of this work is to find a more efficient way of using the MRC algorithm that benefits from the knowledge of mutual data dependency of the SLE solution. It is also possible to design a parallelization of the newly designed algorithm. The result is an MRC-based method with less than O(nm2) complexity for solving the conversion process of SLC results to SLE results.
Modeling behavior of semiconductor components due to ionizing radiation
The behavior of various semiconductor circuits is also dependent, among other factors, on the environment in which they operate. Desirable information for users of various HW devices is the reliability of these devices on age, and to some extent the associated resistance of the semiconductor components to ionizing radiation.
The topic of the dissertation is mathematical modeling of the behavior of HW semiconductor components at various technological levels, depending on irradiation with ionizing/particulate radiation. The aim of this work is to create a model of HW device behavior including aging factors and material degradation due to radiation. The results will be useful for determining the reliability/error-free lifetime of circuitry exposed to radiation or long-term use.
Post-quantum cryptography
The study of suitable post-quantum cryptosystems has long been in the interest of cryptologists. The reason for this is the thriving field of quantum computer technology, which could endanger the security of asymmetric cryptosystems by using suitable factorization algorithms.
The topic of the dissertation is the study and analysis of existing and new methods of post-quantum cryptographic algorithms. The goal is to create an asymmetric cryptosystem that is resistant against quantum-based attacks and is simple to implement and secure.
One of the candidates for post-quantum cryptosystems suitable for analysis and eventual improvement is the McEliece asymmetric encryption algorithm based on binary Goppa codes. This algorithm complies with the security requirements for asymmetric cryptosystems of today, but there is a problem with its large spatial complexity. Trying to reduce the size of the keys in this algorithm can be a good initial challenge for further research.
Quantum Machine Learning for Malware Detection
Specialist supervisor: Aurél Gábor Gábris, Ph.D.
The increase in computing power, along with the growing amount of data, has recently resulted in the use of machine learning, which has achieved impressive results in various domains, including malware detection. On average, almost 1.5 million new malware samples are generated every day, and due to the increasing size of data, as well as the physical limitations of classical computers, machine learning algorithms are running into limits due to computing power. For this reason, scientists are investigating the possibility of using quantum computing to speed up machine learning algorithms, while some works [1,2] in malware detection have already appeared. The thesis aims to apply quantum machine learning (e.g., Quantum Support Vector Machine [3] or Quantum Neural Networks [4]) to the problem of malware detection and compare it with classical machine learning algorithms. A quantum computing simulator or a quantum computer from IBM, currently available based on an agreement with CTU, can be used. The dissertation ability of the topic is based on the review of the use of quantum machine learning algorithms for classification tasks from the domain of malware detection and the identification of its advantages and disadvantages compared to classical machine learning models.
- [1] Mercaldo, F., Ciaramella, G., Iadarola, G., Storto, M., Martinelli, F., & Santone, A. (2022). Towards explainable quantum machine learning for mobile malware detection and classification. Applied Sciences, 12(23), 12025.
- [2] Barrué, G., & Quertier, T. (2023). Quantum Machine Learning for Malware Classification. arXiv preprint arXiv:2305.09674.
- [3] Havlíček, V., Córcoles, A. D., Temme, K., Harrow, A. W., Kandala, A., Chow, J. M., & Gambetta, J. M. (2019). Supervised learning with quantum-enhanced feature spaces. Nature, 567(7747), 209-212.
- [4] Wan, K. H., Dahlsten, O., Kristjánsson, H., Gardner, R., & Kim, M. S. (2017). Quantum generalisation of feedforward neural networks. npj Quantum information, 3(1), 36.
Research of the behavior of physically unclonable functions (PUFs) and true random number generators (TRNGs)
A quality TRNG is essential for current hardware components of cryptographic. Reliable key generators based on PUF are also required. Such key generation is very much in demand, because the key generated in this way remains the "secret" of the cryptosystem hardware itself.
The topic of the dissertation is the study of the proposed PUF and TRNG in terms of their long-term stable response. The aim of this work is to explore existing and propose new PUF and TRNG solutions that are suitable for long-term generation of high-quality output by TRNG and which also guarantee stable key generation based on PUF responses. The work includes the study and understanding of the behavior of these components at the statistical level and also at the physical/technological level.
Miškovský Vojtěch, Ing., Ph.D.
Side-Channel Analysis, Attacks and Countermeasures
Side-channel analysis is an important topic of computer security as it enables an attacker to reveal sensitive information even if it is protected by cryptographically strong algorithms. It has been more than twenty years since first side-channel attacks were proposed, nevertheless, this topic still offers many research challenges. Attacks against the forthcoming postquantum cryptographic algorithms or so far less researched algorithms, e.g., ARX-based ones, are very actual topics. Another one is utilization of artificial intelligence in SCA. Non-interference and gadget composition are current hot topics in area of SCA countermeasures as so as is high-level synthesis. The subject of the proposed dissertation thesis aims to these challenges and related research areas.
Novotný Martin, Dr.-Ing.
Cryptographic/Cryptanalytical Architectures in Embedded Systems and Reconfigurable Hardware
Research in methods of implementation and acceleration of cryptologic operations and schemes in embedded systems and in reconfigurable hardware, namely in field-programmable gate arrays.
Pokorný Jaroslav, prof. RNDr., CSc.
Conceptual models of graph databases
Specialist supervisor: Ing. Michal Valenta, Ph.D.
In recent years, graph databases have been intensively developed and used in various areas (data lineage, artificial intelligence, optimization of supply chains, data analysis, ...). Representing data using both directed and undirected graphs emphasizes the interconnectedness of data and allows it to be viewed from different perspectives. With the use of graph databases, there is a growing need for research and development of conceptual modeling methods aimed at the subsequent implementation of schemes in graph databases. The dissertation will deal with the research of conceptual modeling methods and the implementation of conceptual models with great expressiveness (for example, OntoUML) in graph databases.
Ratschan Stefan, doc. Dipl.-Ing. Dr. techn.
Constraint Solving
We develop software, theory, and algorithms that can solve mathematical constraints consisting of non-linear equalities and inequalities, conjunctions, disjunctions, and logical quantifiers. We apply the constraint solving technology in several areas (e.g., verification of complex systems, computation of Lyapunov functions), and offer a large range of thesis topics in the area.
More information:
Executable Specifications
One of the big drawbacks of classical methods for specifying software requirements is that it is difficult to check the consistency and adequacy of a given specification. The software engineering community has come up with various remedies to this situation (e.g., model based software engineering, agile software development, executable algebraic specification) that usually aim at allowing the user to see the actual behavior of a given specification as early as possible in the development process, but that makes compromises in terms of expressivity or precision. In the thesis, the student will design an executable formal specification language that fulfills two seemingly contradicting objectives:
• High modeling capabilities in terms of expressivity and precision
• Efficient execution
The key to achieving both objectives at the same time will be the design of annotations for the specification language that provide information necessary of efficient execution. Increasing user efforts in writing annotations will result in increasing efficiency of execution of the given specification.
Robot Motion Planning Under Complex Dynamical Constraints
Assume a certain task for a robot, for example: Move from point A to point B without colliding with any obstacle. Robot motion planning is the problem of finding a plan for the actuators of the robot such that the resulting movement of the robot satisfies the given task. The currently dominant class of algorithms for motion planning are sampling based, rapidly expanding a tree that explores the search space of possible plans. However, such algorithms usually use rough approximations the dynamics of the robotics, which may result in highly sub-optimal plans.
The goal of this topic is to design theory, algorithms, and software for efficiently handling realistic models of robot dynamics in sampling based motion planning. All algorithms will be implemented in the frame of the Open Motion Planning Library (OMPL, https://ompl.kavrakilab.org/ ) which will allow their usage by practitioners in robotics world-wide.
Literature: Andreas Orthey, Constantinos Chamzas, and Lydia E. Kavraki: Sampling-Based Motion Planning: A Comparative Review, Annual Review of Control, Robotics, and Autonomous Systems,Volume 7, 2024
Verification of Complex Systems
The Pendolino train from Prague to Ostrava had severe problems in its initial phase: from time to time it stopped in the middle of nowhere due to a software bug. Similar bugs have been the cause of more severe incidents, for example the crash of the Ariane 5 rocket in 1996. We are doing research that will help to avoid such problems in the future, and offer a large range of thesis topics in the area.
More information:
Schmidt Jan, doc. Ing., Ph.D.
Algorithm acceleration in Electronic Design Automation
Contemporary industrial design and verification of digital systems has strong demands on time-to-market and hence design time. Often, we hear requirements such as “1 percent better result for 1 percent longer computation”. In this situation, accelerating the computation seems desirable. Yet, current algorithms have been designed with sequential processing in mind. They usually are complex, multi-layer iterative heuristics. To accelerate such algorithms, it is necessary either find parallelism e.g., in branch-and-bound heuristics, or replace some parts of the algorithm with better parallelizable algorithms. Further, we need to investigate which architectures and what granularity suit the designed algorithm best. It is necessary to keep work-optimality in mind, as it projects itself directly into energy efficiency.
Šimeček Ivan, doc. Ing., Ph.D.
New parallel algorithms for indexing of data from powder diffraction
Specialist supervisor: Ing. Jan Rohlíček, Ph.D.
X-ray structure analysis of data from powder diffraction is an important analytic method for crystal structure determination of samples that do not offer a suitable monocrystals. The indexation process is one of the critical bottlenecks for application of this method. Without the determination of the crystal lattice parameters, we cannot estimate the crystallic structure. Existing methods for indexation have problems with low-quality data as well as with the indexation of phase mixtures. The goal of this research is to develop more efficient parallel algorithms than the current ones for solving these problems.
Sparse matrix and tensor formats suitable for massively parallel algorithms in numerical linear algebra
Specialist supervisor: doc. Ing. Daniel Langr, Ph.D.
The used sparse matrix storage format has great impact on performance and scalability of the algorithms in numerical linear algebra.
The ideal format ensures minimal memory storage requirements, maximum utilization of floating point vector units, maximum utilization of cache memories, and enables load balanced parallelization of the algorithms on massively parallel systems.
Several sparse matrix formats have been proposed in recent years, but these specialized and efficient formats also have some drawbacks. They suffer from a significant transformation overhead, are designed only for a limited set of matrix operations, or do not support fast adding or removing nonzero elements.
The dissertation goal is to conduct research on new sparse-matrix formats for massively parallel architectures to address these trade-offs and develop optimization heuristics for using these formats for a sparse matrix in numerical linear algebra operations.
Skrbek Miroslav, Ing., Ph.D.
Synthesis of artificial intelligence and machine learning models to programmable hardware
Artificial intelligence and machine learning are increasingly used in real applications and thus significantly penetrates the field of embedded and especially cyberphysical systems. A typical example is the image analysis subsystem for self-driving cars or intelligent sensors. Unlike the big data area, these systems have limited computing power and have high variability in hardware architecture, and in addition, these systems are subject to additional requirements, especially for real-time processing, security and reliability, explainability, power consumption and chip size. From these perspectives, the resulting implementation must be optimized to minimize hardware resources while maintaining functionality and reliability to meet economic goals. Automated conversion of machine learning models such as deep neural networks into ASIC hardware and especially programmable hardware (FPGA) is currently a very current topic.
The subject of the proposed topic is the research of algorithms, procedures, methodologies and tools for the synthesis of artificial intelligence models and machine learning into programmable hardware. Current research is in the field of approximate calculations, hardware optimization using limited accuracy targeted according to the values of individual parameters, including other aspects of mapping to the target platform (consumption, chip size, timing). The subject of research will be not only a simple mapping of the learned model to hardware, but also the optimization of the AI model with respect to the implementation in hardware, which requires a closer connection of AI tools with tools for high-level and logical synthesis. The topic can be extended to the synthesis of networks with spiking models of neurons and the integration of learning algorithms into hardware, which is currently an unsolved or little solved problem.
- [1] Shubham Raif, et al. Logic Synthesis Meets Machine Learning: Trading Exactness for Generalization. arXiv:2012.02530v2 [cs.LG] 15 Dec 2020.
- [2] Chia-Chih Chi and Jie-Hong R. Jiang. Logic Synthesis of Binarized Neural Networks for Efficient Circuit Implementation. In IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN (ICCAD ’18), November 5–8, 2018, San Diego, CA, USA. ACM, New York, NY, USA, 7 pages. https://doi.org/10.1145/3240765.3240822
- [3] Yifan Qian, et al.: Approximate Logic Synthesis in the Loop for Designing Low-Power Neural Network Accelerator. 2021 IEEE International Symposium on Circuits and Systems (ISCAS) | 978-1-7281-9201-7/20/$31.00 ©2021 IEEE | DOI: 10.1109/ISCAS51556.2021.9401451
Starosta Štěpán, doc. Ing., Ph.D.
Combinatorial properties of languages generated by morphisms
Special cases of languages generated by morphism comprise, for instnance, language of fixed points of morphisms, languages of S-adic words [1], and factorial languages of L-systems [2]. The goal of the work is to investigate and describe combinatorial properties of such languages. Namely, their factor/palindromic/abelian complexity; properties preserved when the morphism is recognizable or circular. A possible option for the topic is focus on design of relevant effective algorithms and/or formalization in the proof assistant Isabelle/HOL [3-5].
Literatura:
- [1] Berthé, V., Delecroix, V.: Beyond substitutive dynamical systems: S-adic expansions, RIMS Lecture note ‘Kokyuroku Bessatu’ B46, pp. 81–123 (2014)
- [2] Rozenberg, G. Salomaa, A.: The Book of L, Springer Berlin Heidelberg, 1986
- [3] https://isabelle.in.tum.de/
- [4] Holub, Š., & Starosta, Š. (2021). Formalization of Basic Combinatorics on Words. In L. Cohen & C. Kaliszyk (Eds.), 12th International Conference on Interactive Theorem Proving (ITP 2021) (Vol. 193, pp. 22:1–22:17). Schloss Dagstuhl – Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPIcs.ITP.2021.22
- [5] Klouda, K., Starosta, Š. (2024). The number of primitive words of unbounded exponent in the language of an HD0L-system is finite. Journal of Combinatorial Theory, Series A, 206, 105904. https://doi.org/10.1016/j.jcta.2024.105904
Šůcha Přemysl, doc. Ing., Ph.D.
Scheduling Tests in Medical Laboratories: Reduction of Turn-Around Time
There is no doubt that a timely diagnosis is critical to avoid significant harm and death of patients suffering from such diseases as acute ischemic stroke, heart attack, or serious cases of COVID-19 infection. Therefore, medical laboratories must be able to quickly analyze the sample and provide the result. The key performance indicator measuring how fast a laboratory can analyze a sample is called turn-around time (TAT). A factor influencing TAT is how the samples are sequenced in the laboratory workflow. This topic aims to study scheduling problems related to the samples' processing. Our analysis in medical laboratories and review of the literature has shown that this problem is not sufficiently studied while the scheduling of samples' processing can significantly reduce the TAT of high priority samples.
The goal of this thesis is to propose new mathematical models for the scheduling of samples in laboratories, and design efficient scheduling algorithms where the solution space exploration is accelerated by machine learning methods.
- [1] ATLAS Collaboration, "ATLAS Software and Computing HL-LHC Roadmap", https://cds.cern.ch/record/280291
Surynek Pavel, prof. RNDr., Ph.D.
Counterexample Guided Abstraction Refinement for Multi-robot Planning
Counterexample guided abstraction refinement (CEGAR) is a successful technique in model checking [1,2]. Within the proposed topic, the task is to investigate the possibilities of using the CEGAR technique in multi-robot planning [3,4]. Planning with many robots offers a lot of scope for designing abstractions of the problem that bring appropriate simplifications and allow the solver to solve the problem more easily, for example, some robots in the problem can be ignored, but at the cost of inaccuracies in the resulting solution. The design, search, and checking of counterexamples for a multi-robot system, with the help of which it would be possible to eliminate inaccuracies in the solution, represents another interesting challenge. Some progress has been made in multi-robot path planning, where the abstraction refinement has been made against collisions between robots. However, abstractions for path planning represent only a special case, while within this topic we would like to move the design of abstractions further towards more general multi-robot systems, where the range of robot actions is wider. An interesting direction is the extension of the CEGAR technique with abstractions that produce inaccuracy in the solution, against which the multi-robotic system is robust, and therefore there is no need to refine them.
- [1] Edmund M. Clarke, Anubhav Gupta, Ofer Strichman: SAT-based counterexample-guided abstraction refinement. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 23(7): 1113-1123 (2004)
- [2] Anubhav Gupta, Edmund M. Clarke: Reconsidering CEGAR: Learning Good Abstractions without Refinement. ICCD 2005: 591-598
- [3] Teng Guo, Si Wei Feng, Jingjin Yu: Polynomial Time Near-Time-Optimal Multi-Robot Path Planning in Three Dimensions with Applications to Large-Scale UAV Coordination. IROS 2022: 10074-10080
- [4] Lydia E. Kavraki, Steven M. LaValle: Motion Planning. Springer Handbook of Robotics, 2nd Ed. 2016: 139-162
Lazy Compilation for Classical Planning
In automated planning, the task is to find a sequence of actions that, after performing, meet a certain goal [1, 2]. This topic specifically focuses on compilation of the planning task into another formalism, such as constraint satisfaction (CSP) [3] or propositional satisfiability (SAT) [4]. Especially promising seems to be the lazy compilation, when the task is encoded into the target formalism incompletely. The encoding is subsequently refined in cooperation with the solver by generating counterexamples. The lazy compilation technique has been successfully used in the specific domain of multi-agent path finding [5]. However, the generalization of lazy compilation for classical planning is still an open question.
- [1] Malik Ghallab, Dana S. Nau, Paolo Traverso: Automated Planning and Acting. Cambridge University Press 2016, ISBN 978-1-107-03727-4
- [2] Malik Ghallab, Dana S. Nau, Paolo Traverso: Automated planning - theory and practice. Elsevier 2004, ISBN 978-1-55860-856-6, pp. I-XXVIII, 1-635
- [3] Rina Dechter: Constraint processing. Elsevier Morgan Kaufmann 2003, ISBN 978-1-55860-890-0
- [4] Armin Biere, Marijn Heule, Hans van Maaren, Toby Walsh: Handbook of Satisfiability. Frontiers in Artificial Intelligence and Applications 185, IOS Press 2009, ISBN 978-1-58603-929-5
- [5] Pavel Surynek: Unifying Search-based and Compilation-based Approaches to Multi-agent Path Finding through Satisfiability Modulo Theories. Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019), pp. 1916-1922, Macau, China, International Joint Conferences on Artificial Intelligence, 2019, ISBN (Online): 978-0-9992411-4-1
Multi-agent Path Finding in Continuous Spaces
Specialist supervisor: doc. Dipl.-Ing. Dr. techn. Stefan Ratschan
Multi-agent path finding (MAPF) is the problem of computing collision-free paths from given starting positions to goal positions for a set of agents [1,2]. The MAPF problem is motivated by a number of real-world applications, for example agents can model robots for transporting goods in warehouses. Traditional solutions to this problem are based on models where both time and space are discrete. However, continuous models would be more realistic. The current state-of-the-art in this topic allows agents to move continuously along linear trajectories [3,4]. The goal of this work is to contribute to more general techniques, for example to allow agents to move along non-linear trajectories, or to model certain kinematic properties of agents, such as acceleration, or other properties that manifest themselves in a continuous model of the problem.
- [1] David Silver: Cooperative Pathfinding. AIIDE 2005: 117-122
- [2] Ko-Hsin Cindy Wang, Adi Botea: MAPP: a Scalable Multi-Agent Path Planning Algorithm with Tractability and Completeness Guarantees. J. Artif. Intell. Res. 42: 55-90 (2011)
- [3] Thayne T. Walker, Nathan R. Sturtevant, Ariel Felner: Extended Increasing Cost Tree Search for Non-Unit Cost Domains. IJCAI 2018: 534-540
- [4] Anton Andreychuk, Konstantin S. Yakovlev, Pavel Surynek, Dor Atzmon, Roni Stern: Multi-agent pathfinding with continuous time. Artif. Intell. 305: 103662 (2022)
Tvrdík Pavel, prof. Ing., CSc.
Adaptive algorithms for flow control and congestion control in data center communication networks
Specialist supervisor: Ing. Jan Fesl, Ph.D.
Typical tasks addressed in high-speed internet computer networks are flow control and congestion control. Both of these tasks are solved by network protocols at different layers of the OSI/OSI model.
In comparison with the Internet, data center (DC) communication networks are specific in their conditions and differ significantly in terms of flow control or congestion control. Communication links have higher speed and throughput and high reliability. More importantly, in DC networks compared to the Internet, there is a real possibility to process more global information about the status and load of communication links or network elements (their computer resources), or the possibility of adaptive reconfiguration of the entire DC network according to current communication requirements.
Analyze the properties of published algorithms for flow control and congestion control in DC communication networks. Create a realistic model of a data center communication network and use it to investigate the liḿtations of these algorithms. Develop a methodology for evaluating the quality of these algorithms based on the specific requirements on network traffic in DCs. Based on this analytical survey, design adaptive algorithms for flow control and congestion protection in DC networks utilizing more global network traffic information. Then validate your designs by testing them within the model and evaluate their quality according to your methodology.
Algorithms for optimal deployment of virtual machines in data centers
Specialist supervisor: Ing. Jan Fesl, Ph.D.
Data centers (DCs) play an important role in many different fields due to the growing importance of cloud computing, but their efficient management and administration pose non-trivial algorithmic problems. A prominent example is the problem of optimal placement of virtual machines (VMs) (or containers) across virtualization nodes. Some variants of this problem can be formulated as an multidimensional bin packing problem, which belongs to the class of NP-hard problems that can be approximately solved using optimization heuristics such as genetic algorithms, tabu search, simulated annealing, etc. The problem can also be solved by reformulating it to another computationally equivalent problem, such as SAT, where existing advanced solvers can be used.
Design a methodology to set various optimization criteria, such as energy consumption of running DC infrastructure, network load, communication latency of running applications, guaranteeing high availability, etc. Construct a model of a DC to simulate and test the properties of optimization algorithms for VM placement on DC virtualization nodes. Using this model, analyze the properties of existing algorithms for solving the problem of optimal VM placement on virtualization nodes, identify their shortcomings, and design, validate, and evaluate new algorithms.
Distributed IDS
Specialist supervisor: Ing. Jan Fesl, Ph.D.
Classic Intrusion Detection Systems (IDS) have been playing an important role in the security of computer networks or data centers for many years. In the case of networks with ultra-high data flow, the classic IDS concept can no longer be used due to limited local computing power. In order to eliminate this limitation, it appears promising to use a solution based on distributed IDSs (DIDS). The task of the dissertation will be to study algorithms and methods that will be used both for load distribution between network traffic probes and for the analysis of network traffic within a specific DIDS. To increase efficiency or refining the performance of the prediction of a specific network activity using DIDS collaborative machine learning appears perspective since it fully supports the DIDS concept.
Formal models of DDOS detection/elimination
Specialist supervisor: Ing. Jan Fesl, Ph.D.
Distributed DDOS attacks (classic or slow) pose a major threat in today's Internet as well as in software-defined networks and clouds. An effective defense against such attacks is their fast and reliable detection, which is, however, non-trivial given the number of parameters that need to be taken into account. The topic of the dissertation is the research of mathematical models enabling the design of algorithms for the detection of both types of DDOS attacks. Currently, models based on a combination of machine learning and traditional analytical and statistical methods, which enable fast data preprocessing and thus can significantly reduce the time for making decisions, which is absolutely essential in the case of DDOS attacks, seem to be promising.
Formal verification of network configuration models
The configuration of efficiently functioning computer networks arranged in complex topologies is nowadays considered a non-trivial problem. It often happens that a change in the configuration of a particular element within a computer network affects its other parts (e.g., in terms of availability, latency, quality of service, etc.). However, at the time of activation of the configuration, its effect may not be immediately apparent and it is therefore advisable to first verify the possible effects of the configuration. The task of the dissertation is the development of algorithms and methods for the formal verification of the state of network elements, which can point out a potential problem before the actual activation of the configuration.
Vitvar Tomáš, doc. Ing., Ph.D.
Operations Data Science: Machine learning on massive operational data from systems infrastructures
System infrastructures including operating systems, networking, web servers, load balancers, application servers, etc. produce massive amounts of operations data during systems runtime. Operations data science aims to improve various machine learning methods to help understand patterns, correlations or similarities from the past systems’ behavior in order to improve operations practice, for example, systems maintenance, changes in systems design or root cause analysis of incidents. This includes predicting systems behaviors when unplanned or unexpected events occur such as connectivity lost, failures of servers or spikes in systems workloads. This research will improve supervised and unsupervised learning methods tailored for problems from operational data analytics and cover a wide spectrum of underlying machine learning tasks including classification, clustering or regression.