Knowledge Engineering (version for Czech students)

Explainability in deep learning-based medical image analysis

Author

Martin Lank

Year

2024

Type

Master thesis

Supervisor

Ing. Magda Friedjungová, Ph.D.

Reviewers

Ing. Daniel Vašata, Ph.D.

Department

Department of Applied Mathematics

Summary

In this work, we apply the Grad-CAM++, LayerCAM and SmoothGrad explainability methods to the proposed EfficientNetV2-based convolutional neural network fine-tuned on microscopic histology imaging. The network predicts mean diffusivity (MD) and fractional anisotropy (FA) obtained from diffusion tensor imaging. The aim of the work was to reveal which histology features tend to increase MD and FA. The proposed network achieved an R² of more than 98.5% on the training, validation and test sets, surpassing the network proposed in the preceding work by tens of percentage points in R². Nevertheless, the explainability methods applied to microscopy imaging were less valuable than anticipated. They indicate that certain nuclei have an influence; however, the details of the relationship remain undiscovered.

Thesis on DSpace

Machine Learning Techniques for Laser-Plasma Acceleration Optimization

Author

Matěj Jech

Year

2024

Type

Master thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

doc. Ing. Ivan Šimeček, Ph.D.

Department

Department of Applied Mathematics

Summary

The thesis deals with the analysis of data from a laser-plasma particle accelerator in collaboration with the scientific institution ELI Beamlines. Within the scope of the work, a data pre-processing pipeline was designed and a generative model simulating the course of physics experiments was developed. The model is conditioned on a vector of experimental parameters and generates image data showing the energy spectrum of the accelerated electron beam. The developed model can be used as a partial substitute for real experiments, which are costly in terms of time and finances. It can also be used as a simulation of real experiments for various optimization methods. This thesis defines the process of training and testing candidate models with three different architectures and based on four hyperparameters. The resulting model can generate data at a rate of 1.8 images per second and has been evaluated based on a number of metrics, including the expert opinion of scientists, as a trustworthy tool to simulate the electron acceleration process.

Thesis on DSpace

Detection and removal of watermarks from image data

Author

Tomáš Halama

Year

2023

Type

Master thesis

Supervisor

Ing. Miroslav Čepek, Ph.D.

Reviewers

Ing. Magda Friedjungová, Ph.D.

Department

Department of Applied Mathematics

Summary

Digital image watermarking is a widely used technique for protecting intellectual property or authenticating digital media, but it can negatively impact image quality and usability. This motivates the need for removing watermarks from images, and deep learning presents a potential solution. This thesis develops a deep learning method for watermark removal, including a survey of existing techniques and the proposal of a novel architecture. The method's performance is evaluated in terms of watermark detection accuracy and image reconstruction quality.

Thesis on DSpace

Short-Term Precipitation Forecasting from Satellite Data Using Machine Learning

Author

Jiří Pihrt

Year

2023

Type

Master thesis

Supervisor

Mgr. Petr Šimánek

Reviewers

doc. Ing. Kamil Dedecius, Ph.D.

Department

Department of Applied Mathematics

Summary

Geostationary meteorological satellites are a source of global and frequent weather observations, but they do not directly observe precipitation. We research existing methods for inferring and forecasting rainfall from satellite data. The aim of this thesis is to predict high-resolution precipitation radar observations up to 8 hours ahead from multi-spectral geostationary satellite images that cover a larger context but have a lower resolution. We develop a novel deep learning model for this task, utilizing the U-Net and PhyDNet neural networks. We name it WeatherFusionNet, as it fuses three different ways to process the satellite data: predicting future satellite images, estimating precipitation in the input sequence, and using the input sequence directly. To train and test it on real data, we participate in the NeurIPS Weather4cast 2022 competition, which provides spatially and temporally aligned satellite imagery and target precipitation radar data. WeatherFusionNet achieved first place in the Core challenge of the competition. We further experiment with several different models, try including static data in the input, and compare our model with a direct radar-to-radar model.

Thesis on DSpace

Constraint Programming in Scheduling for Garage

Author

Petr Švec

Year

2023

Type

Master thesis

Supervisor

prof. RNDr. Pavel Surynek, Ph.D.

Reviewers

Ing. Daniel Vašata, Ph.D.

Department

Department of Applied Mathematics

Summary

This thesis deals with the implementation of a system for scheduling workshop work in a car repair shop. The thesis analyses the customer requirements and, based on them, defines a model that uses constraint programming. Based on the proposed model, we implemented the solver using the Choco library. We validated the solver on synthetic and practice-motivated data. Various properties of the generated solutions for the test instances were measured. The focus was on universal use with maximum parameterization for the needs of individual clients and integrations.

Thesis on DSpace

Scalable Gaussian processes for surrogate modelling in Bayesian optimization

Author

Iveta Šárfyová

Year

2023

Type

Master thesis

Supervisor

Ing. Jiří Vošmik

Reviewers

Ing. Daniel Vašata, Ph.D.

Department

Department of Applied Mathematics

Summary

Bayesian optimisation is a global optimisation method suitable for finding extrema of expensive-to-evaluate black-box objective functions. Gaussian Processes are frequently used as models for approximating such functions. However, their cubic time complexity limits their deployment to applications in small-data regimes. This thesis provides an overview of state-of-the-art scalable Gaussian Processes for regression. The experiments performed within this work deal with tasks of regression and Bayesian optimisation, both utilising several selected Gaussian Process models. Evaluation is done using multiple metrics, some of which are particularly appropriate for probabilistic models. Our results suggest that the same few models consistently outperform the others in both tasks.

Thesis on DSpace

Anomaly detection on the CERN data centre monitoring data

Author

Antonín Dvořák

Year

2022

Type

Master thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

doc. Ing. Kamil Dedecius, Ph.D.

Department

Department of Applied Mathematics

Summary

One of the many tasks of CERN cloud service operators is to make sure that the desired computational power is delivered to all users of the scientific community. This task is accomplished by carefully setting threshold-based alarming on top of the infrastructure performance time series metrics. In order to maximize the efficiency of the cloud infrastructure and to reduce the monitoring effort for service operators, we have developed a fully automated Anomaly Detection System that leverages unsupervised machine learning methods for time series metrics. Moreover, adopting ensemble methods, we combine traditional (Isolation forest) and deep learning (Gated recurrent unit/Long short-term memory Autoencoders) approaches. This work presents a description of the CERN monitoring infrastructure, problem formulation, design of the Anomaly Detection Pipeline, description of used models, creation of the dataset and performance of the implemented models compared to the performance of the Current Alarming System.

Thesis on DSpace

Lazy Compilation in Classical Planning

Author

Zuzana Fílová

Year

2022

Type

Master thesis

Supervisor

prof. RNDr. Pavel Surynek, Ph.D.

Reviewers

Ing. Daniel Vašata, Ph.D.

Department

Department of Applied Mathematics

Summary

This diploma thesis focuses on lazy compilation in classical planning. The theoretical part summarizes the basics of classical planning. Key concepts of the classical representation of planning problems are defined and basic planning algorithms are presented, in particular search in the planning state space and techniques using the planning graph. The compilation of the planning problem into the propositional satisfiability problem (SAT) is discussed at the end of this part. Based on the obtained knowledge, a new method for lazy compilation of planning problems into SAT has been proposed. Unlike classical compilation, this method creates and modifies the propositional formula gradually. In the practical part of the work, a planner was implemented using two compilation variants: the proposed method for lazy compilation and classical compilation. The planner was tested on planning problems from the International Planning Competition (IPC). The experiments focused on evaluating the success of the planner based on lazy compilation and comparing the results with the planner using the classical compilation method. A total of 79 problems of varying difficulty from four domains were used, of which the lazy planner was able to solve 63 faster than the classical planner. The experiments pointed out the advantages and possible disadvantages of lazy compilation. Their results indicate that the use of lazy compilation has the potential to improve the performance of the planner.

Thesis on DSpace

Power line vegetation management using UAV images

Author

Radek Ježek

Year

2022

Type

Master thesis

Supervisor

Ing. Lukáš Brchl

Reviewers

doc. Ing. Štěpán Starosta, Ph.D.

Department

Department of Applied Mathematics

Summary

The electric utility companies spend large amounts of money and effort every year to ensure the safe and uninterrupted operation of the electric power infrastructure. The most common source of outages is vegetation damaging power lines, for example, fallen trees. For this reason, companies perform regular inspections and maintenance of power line corridors, especially in forests and densely vegetated areas, creating a high demand for inexpensive and highly automated methods of power line corridor surveys. This work aims to create a robust algorithm for automatic detection of vegetation encroachment in the power line corridor using an Unmanned Aerial Vehicle (UAV), the techniques of photogrammetry, and computer vision. The study will cover the workflow for power line corridor inspection from comprehensive guidelines for data acquisition through power line 3D reconstruction to vegetation encroachment detection and visualization of the results.

Thesis on DSpace

The Study of Linear Self-Attention Mechanism in Transformer

Author

Uladzislau Yorsh

Year

2022

Type

Master thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

doc. Ing. Pavel Kordík, Ph.D.

Department

Department of Applied Mathematics

Summary

As the quadratic complexity of an attention mechanism in the Transformer architecture places a high demand on processing long sequences, the goal of this research is to explore possibilities of linear attention in Transformer-like architecture and implement new methods.
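The contrast between quadratic and linear attention can be illustrated with a minimal sketch. It assumes one common kernel-based formulation of linear attention with the feature map elu(x) + 1; the summary does not say which variants the thesis explores, so this is an illustration and not the thesis's method. All function names are hypothetical.

```python
import math

def phi(x):
    # Feature map elu(x) + 1, which keeps every similarity score positive.
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def quadratic_attention(Q, K, V):
    # Computes all n * n similarity scores explicitly: O(n^2) in sequence length.
    d_v = len(V[0])
    out = []
    for q in Q:
        fq = phi(q)
        weights = [dot(fq, phi(k)) for k in K]
        total = sum(weights)
        out.append([sum(w * v[c] for w, v in zip(weights, V)) / total
                    for c in range(d_v)])
    return out

def linear_attention(Q, K, V):
    # Summarizes keys and values once, then reuses the summary for every query:
    # O(n) in sequence length for fixed feature dimensions.
    d_k, d_v = len(K[0]), len(V[0])
    S = [[0.0] * d_v for _ in range(d_k)]  # sum over j of phi(k_j) v_j^T
    z = [0.0] * d_k                        # sum over j of phi(k_j)
    for k, v in zip(K, V):
        fk = phi(k)
        for r in range(d_k):
            z[r] += fk[r]
            for c in range(d_v):
                S[r][c] += fk[r] * v[c]
    out = []
    for q in Q:
        fq = phi(q)
        norm = dot(fq, z)
        out.append([sum(fq[r] * S[r][c] for r in range(d_k)) / norm
                    for c in range(d_v)])
    return out
```

Both functions compute the same result; the linear version only reorders the matrix products so that no n × n score matrix is ever formed.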

Thesis on DSpace

Improving deep learning precipitation nowcasting by using prior knowledge

Author

Matej Choma

Year

2022

Type

Master thesis

Supervisor

Mgr. Petr Šimánek

Reviewers

Mgr. Petr Novák, Ph.D.

Department

Department of Applied Mathematics

Summary

Deep learning methods dominate short-term high-resolution precipitation nowcasting in terms of prediction error. However, their operational usability is limited by difficulties explaining dynamics behind the predictions, which are smoothed out and missing the high-frequency features due to optimizing for mean error loss functions. This thesis summarizes our progress in addressing these issues. Firstly, we present Intensity Classification Loss to improve the prediction of severe rainfall. The model is trained to predict the probability of precipitation with an intensity over 40 dBZ as a secondary output, which is compared to binary ground truth. Experiments have shown that this approach helps predict severe rainfall but does not predict precipitation with higher intensities than the selected threshold. Secondly, we experiment with hand-engineering of the advection-diffusion differential equation into a PhyCell to introduce more accurate physical prior to a PhyDNet model that disentangles physical and residual dynamics. Results indicate that while PhyCell can learn the intended dynamics, training of PhyDNet remains driven by loss optimization, resulting in a model with the same prediction capabilities.

Thesis on DSpace

Bayesian filtering of state-space models with unknown covariance matrices

Author

Tomáš Vlk

Year

2021

Type

Master thesis

Supervisor

doc. Ing. Kamil Dedecius, Ph.D.

Reviewers

Ing. Ondřej Tichý, Ph.D.

Department

Department of Applied Mathematics

Summary

This thesis explores the problem of distributed Bayesian sequential estimation of state-space models with unknown process and measurement noise covariance matrices. This is a frequent problem in real-world scenarios, where the information about noise covariance matrices for specific sensors may not be available. The solution proposed in this thesis is built upon the variational Bayesian paradigm, which is used for the estimation of the states as well as the unknown measurement noise covariance matrix. To improve performance, the measurements and posterior estimates are shared between adjacent nodes in the network. The thesis also shows a way of optimizing the process noise covariance matrix.

Thesis on DSpace

Anomaly detection using Extended Isolation Forest

Author

Adam Valenta

Year

2020

Type

Master thesis

Reviewers

doc. Ing. Pavel Kordík, Ph.D.

Department

Department of Applied Mathematics

Summary

The thesis deals with anomaly detection algorithms with a focus on the Extended Isolation Forest algorithm. Extended Isolation Forest generalizes its predecessor algorithm, the Isolation Forest. The original Isolation Forest algorithm brings a brand new form of detection, although the algorithm suffers from bias coming from tree branching. Extension of the algorithm removes the bias by adjusting the branching, and the original algorithm becomes just a special case. Extended Isolation Forest is implemented into the H2O-3 Machine Learning open-source platform. Implementation is required to run on a distributed computing system with a Map/Reduce library.
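The difference in branching can be sketched as follows. This is a minimal illustration that assumes the usual formulation of Extended Isolation Forest, in which a node splits the data with a random hyperplane instead of a single-feature threshold. The function names are hypothetical and do not come from H2O-3.

```python
import random

def axis_split_goes_left(x, feature, threshold):
    # Isolation Forest style branching: compare one feature with a threshold.
    return x[feature] <= threshold

def hyperplane_split_goes_left(x, normal, intercept):
    # Extended Isolation Forest style branching: test which side of a
    # hyperplane the point lies on, i.e. the sign of (x - intercept) . normal.
    return sum((xi - pi) * ni for xi, pi, ni in zip(x, intercept, normal)) <= 0.0

def random_hyperplane(points, rng):
    # Assumed sampling scheme: Gaussian normal vector and an intercept drawn
    # uniformly from the bounding box of the points in the node.
    dim = len(points[0])
    normal = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    intercept = [rng.uniform(min(p[d] for p in points), max(p[d] for p in points))
                 for d in range(dim)]
    return normal, intercept
```

With a normal vector that has a single non-zero component, the hyperplane test reduces to the axis-parallel test, which is the sense in which the original algorithm is a special case.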

Thesis on DSpace

Neural Networks Based Domain Adaptation in Spectroscopic Sky Surveys

Author

Ondřej Podsztavek

Year

2020

Type

Master thesis

Supervisor

RNDr. Petr Škoda, CSc.

Reviewers

Ing. Kamil Dedecius, Ph.D.

Department

Department of Applied Mathematics

Summary

We present an analysis of the impact of neural-based domain adaptation in astronomical spectroscopy. Domain adaptation addresses the problem of applying prior knowledge to new data of interest. Therefore, we selected the problem of quasar identification in the Large Sky Area Multi-Object Fiber Spectroscopic Telescope survey using labelled data from the Sloan Digital Sky Survey. We chose to experiment with four neural models for domain adaptation: Deep Domain Confusion, Deep Correlation Alignment, Domain-Adversarial Network and Deep Reconstruction-Classification Network. However, our experiments reveal that these models cannot improve classification performance in comparison to a convolutional neural network that does not consider domain adaptation. Using dimensionality reduction, statistics of the selected methods and misclassifications, we show that the domain adaptation methods are not robust enough to be applied to the complex and dirty astronomical data.

Thesis on DSpace

Recommendations Model Based on Recurrent Neural Networks

Author

Ladislav Martínek

Year

2020

Type

Master thesis

Supervisor

Ing. Tomáš Řehořek, Ph.D.

Reviewers

Ing. Mgr. Ladislava Smítková Janků, Ph.D.

Department

Department of Applied Mathematics

Summary

This diploma thesis deals with matters of recommendation systems. The aim is to use recurrent neural networks (LSTM, GRU) to predict the subsequent interactions using sequential data from user behavior. Matrix factorization adapted for datasets with implicit feedback is used to create a representation of items (embeddings). An algorithm for creating recurrent models using the embeddings is designed and implemented in this thesis. Furthermore, an evaluation method respecting the sequential nature of the data is proposed. This evaluation method uses recall and catalog coverage metrics. Experiments are performed systematically to determine the dependencies on the observed methods and hyperparameters. The measurements were performed on three datasets. On the most extensive dataset, I managed to achieve more than double recall against other recommendation techniques, which were represented by collaborative filtering, reminder model, and popularity model. The findings, possible improvement by hyper-parametrization, and different possible means of model improvement are discussed at the end of the work.
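The two metrics named above can be sketched with their common textbook definitions. This is an assumption: the exact definitions and the sequential evaluation protocol used in the thesis are not given in the summary, and the function names are hypothetical.

```python
def recall_at_k(recommended, relevant, k):
    # Fraction of the relevant items (e.g. a user's actual subsequent
    # interactions) that appear among the top-k recommendations.
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    hits = len(set(recommended[:k]) & relevant_set)
    return hits / len(relevant_set)

def catalog_coverage(recommendation_lists, catalog):
    # Fraction of the catalog that is recommended to at least one user.
    catalog_set = set(catalog)
    if not catalog_set:
        return 0.0
    recommended = set().union(*recommendation_lists) & catalog_set
    return len(recommended) / len(catalog_set)
```

For example, `recall_at_k(["a", "b", "c"], ["b", "d"], 2)` evaluates to 0.5, because one of the two relevant items is in the top two, and `catalog_coverage([["a", "b"], ["b", "c"]], ["a", "b", "c", "d"])` evaluates to 0.75.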

Thesis on DSpace

Sequential Bayesian Poisson regression

Author

Radomír Žemlička

Year

2020

Type

Master thesis

Supervisor

Ing. Kamil Dedecius, Ph.D.

Department

Department of Applied Mathematics

Summary

The Poisson regression is a popular generalized linear model used to model discrete count variables. This thesis is focused on the problem of its sequential estimation under potentially slowly time-varying regression coefficients. A convenient approximation by normal distribution is used to do so in the Bayesian setting. Also, a calibration technique is discussed to enhance the estimation quality. Finally, a use case of the proposed approach in the signal processing domain is suggested, in particular, its application in diffusion networks to perform distributed collaborative estimation.

Thesis on DSpace

Deep Latent Factor Models for Recommender Systems

Author

Radek Bartyzal

Year

2019

Type

Master thesis

Supervisor

Ing. Tomáš Řehořek, Ph.D.

Reviewers

MSc. Juan Pablo Maldonado Lopez, Ph.D.

Department

Department of Applied Mathematics

Summary

Recommendation systems help users discover relevant items. One type of model used to generate recommendations is the latent factor model. We survey state-of-the-art neural-network-based latent factor models and implement four of them. We also design and implement a novel architecture of a deep latent factor model called Hybrid cSDAE that is able to process both the rating and attribute information. We comprehensively evaluate the implemented models on standard datasets.

Thesis on DSpace

Detection of material defects on foamed insulating panels

Author

Tomáš Duda

Year

2018

Type

Master thesis

Supervisor

doc. RNDr. Ing. Marcel Jiřina, Ph.D.

Reviewers

doc. Ing. Ivan Šimeček, Ph.D.

Department

Department of Applied Mathematics

Summary

This master's thesis deals with the automatic detection of material defects on foamed insulating panels using methods of image processing. The process of foamed glass production and current approach to output quality control by a human worker is described. The description of installed hardware for image data acquisition is provided. Related systems for automatic material inspection are reviewed and an analysis of various methods for image texture description is provided. Conceptual design of foamed glass panels inspection system is presented. Acquired images of panels are described and an annotation application is developed. A suitable image preprocessing algorithm is proposed as well as methods for detection of different kinds of foamed glass defects. The final design of detection algorithm is supported by measurement of the accuracy of several methods. The proposed algorithm is implemented and the final accuracy of the inspection system is measured. The results are discussed and possible future improvements are proposed. The developed system was successfully deployed in a production environment.

Thesis on DSpace

Approximate Pattern Matching In Sparse Multidimensional Arrays Using Machine Learning Based Methods

Author

Anna Kučerová

Year

2017

Type

Master thesis

Supervisor

Ing. Luboš Krčál

Reviewers

prof. Ing. Jan Holub, Ph.D.

Department

Department of Theoretical Computer Science

Summary

The main goal of this work is to propose a solution to approximate pattern matching using machine-learning-based methods. This is done with the help of Locality Sensitive Hashing (LSH) and existing algorithms. The idea of LSH is used to search for the positions of potential results, which are then verified as in existing algorithms. Previous work focused primarily on low-dimensional pattern matching. The outcome of this work is an algorithm together with time measurements and a comparison with existing solutions. Some of the compared algorithms had only been designed theoretically and had not been implemented until now. The solution also uses a binary format used in a commercial array database.

Thesis on DSpace

Criminality prediction

Author

Veronika Maurerová

Year

2017

Type

Master thesis

Supervisor

doc. Ing. Pavel Kordík, Ph.D.

Department

Department of Theoretical Computer Science

Summary

The emphasis on work efficiency and the increasing interest in data processing, machine learning and artificial intelligence have made predictive analysis a part of police activities, especially in the domain of crime prevention. For example, police patrols are scheduled based on predictive analysis of the highest-risk areas in the city. This thesis focuses on supervised learning methods and their capability to find hidden patterns in real historical crime data. The objective is to predict future crime with a certain probability using algorithms based on decision trees and neural networks.

Thesis on DSpace

Scalability of Predictive Modeling Algorithms

Author

Tomáš Frýda

Year

2017

Type

Master thesis

Supervisor

doc. Ing. Pavel Kordík, Ph.D.

Reviewers

Ing. Karel Klouda, Ph.D.

Department

Department of Theoretical Computer Science

Summary

This thesis has two main goals: (1) to parallelize FAKE GAME by integrating it into H2O, an open-source machine learning framework, and (2) to evaluate the anytime properties of machine learning algorithms and the influence of hyper-parameter optimization on them. To meet these objectives, I have integrated FAKE GAME into H2O and, in order to evaluate anytime properties, implemented a new tool called Benchmarker. The evaluation of anytime properties shows that for some problems FAKE GAME models outperform state-of-the-art models from H2O in both accuracy and performance. Moreover, the evaluation of hyper-parameter optimization shows little success when optimizing H2O machine learning algorithms. I hypothesise that the negligible performance improvement, and for some optimized models even lower performance than with the default configuration, is caused by the automatic hyper-parameter tuning that H2O performs by default for some hyper-parameters.

Thesis on DSpace

Neural networks with memory

Author

Ondřej Kužela

Year

2016

Type

Master thesis

Supervisor

doc. RNDr. Ing. Marcel Jiřina, Ph.D.

Reviewers

Ing. Josef Pavlíček, Ph.D.

Department

Department of Theoretical Computer Science

Summary

Neural networks with memory are a family of neural networks that, in addition to the classic memory for long-term dependencies in the form of weights, contain another form of memory. Such a memory serves to retain mid-term dependencies, sometimes also called long-short-term dependencies, and can be of two types: internal or external. In this thesis, I offer a summarizing overview of the family of neural networks with memory. Based on an analysis of the existing models, I also propose a new model, the Recurrent Neural Modules with External Memory. This model offers a new and innovative approach to the use of external memory within neural networks, since it deploys the external memory at the scope of parts of the network and thus deploys multiple external memories within one network. The performance of the newly proposed model was evaluated on the Air Travel Information System (ATIS) dataset.

Thesis on DSpace

Automatic text summarization

Author

Šimon Hlaváč

Year

2015

Type

Master thesis

Supervisor

doc. RNDr. Ing. Marcel Jiřina, Ph.D.

Reviewers

doc. Ing. Pavel Kordík, Ph.D.

Department

Department of Theoretical Computer Science

Summary

This work presents the basic methods used in automatic text summarization and genetic algorithms. Furthermore, a system for automatic summarization based on graph structures and Markov chains was designed, implemented and properly tested. This study also discusses learning the proper setting of the importance weights of the individual summarization methods by a naive approach and by genetic algorithms, both of which were also implemented and properly tested. The system also supports parallel processing and caching to speed up its operation.

Thesis on DSpace

Text signals relevance improvement for full text search

Author

Jan Hnízdil

Year

2015

Type

Master thesis

Supervisor

Ing. Jan Šedivý, CSc.

Reviewers

doc. Ing. Pavel Kordík, Ph.D.

Department

Department of Theoretical Computer Science

Summary

Although web search became a standard and often favored way of finding information many years ago, the task of finding documents relevant to a given user query still has many weak spots that need improvement. This thesis seeks new text relevance signals to improve full-text search and user satisfaction, using datasets provided by Seznam.cz. First, major LTR algorithms, evaluation metrics and commonly used text signals known from the literature are analyzed and evaluated. Second, a system for testing and evaluating new signals was designed and implemented. Finally, a number of experiments with the new text signals were conducted, and the results were compared with anonymized baseline signals provided by Seznam.cz.

Thesis on DSpace

Data mining of high school alumni performance at the university

Author

Eliška Hrubá

Year

2014

Type

Master thesis

Supervisor

doc. Ing. Pavel Kordík, Ph.D.

Reviewers

Ing. Stanislav Kuznetsov, Ph.D.

Department

Department of Theoretical Computer Science

Supporting the Diagnosis of Borreliosis by Machine Learning Methods

Author

Jan Motl

Year

2013

Type

Master thesis

Supervisor

doc. Ing. Pavel Kordík, Ph.D.

Reviewers

Ing. Tomáš Bartoň, Ph.D.

Department

Department of Theoretical Computer Science
