Mgr. Alexander Kovalenko, Ph.D.

alexander.kovalenko@fit.cvut.cz
TH:A-1347


Theses

Sample theses

Bachelor theses

Self-supervised model for efficient sound recognition trained on aggregated data

Author

Vojtěch Houska

Year

2021

Type

Bachelor thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

doc. Ing. Pavel Kordík, Ph.D.

Department

Department of Applied Mathematics

Summary

The thesis summarizes state-of-the-art approaches in deep learning. It discusses the application of self-supervised autoencoders and pre-processing techniques used in sound recognition. The YouTube platform served as a source of weakly labeled data to train such models. Latent space properties of the proposed autoencoders were compared and tested using K-means clustering. The implementation of Adversarially Constrained Autoencoder Interpolation failed to outperform a randomly initialized autoencoder. The reasons are discussed further, and several recommendations for future research are proposed.

Thesis on DSpace

Machine Learning Techniques for Source Code Pattern Recognition

Author

Rudolf Raevskiy

Year

2022

Type

Bachelor thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

Pierre Donat-Bouillud, Ph.D.

Department

Department of Applied Mathematics

Summary

The automated understanding of code semantics is crucial for helping developers write reliable and optimized code. In recent years, there has been growing interest in applying machine learning to source code, with the aim of automatically discovering bugs, generating comments, or understanding and improving the code. This work reports on deep learning techniques applied to various levels of abstraction of source code. We experiment with a dataset consisting of R language source code. The R language has a large community, consisting mostly of statisticians; however, R libraries are prone to containing suboptimal code. The main contribution of this work is a model trained on a large R dataset, which is the first step toward an automated tool for writing better R code. We primarily focus on Abstract Syntax Trees (ASTs), while also considering other forms of representation. Different abstractions add structure to the input and therefore help the model generalize better across the dataset. We train and evaluate several models based on various code representations. A transformer-based architecture was chosen as the backbone model for this task, as it outperforms its counterparts in this domain. The result is RASTaBERTa, which is, to the best of our knowledge, the state-of-the-art transformer-based model for the R language and can be trained further for specific tasks such as classification, bug and anomaly detection, and bug fixing.

Thesis on DSpace

A Machine Learning Approach for Job Posting and CV Alignment

Author

Karolina Zegeryte

Year

2024

Type

Bachelor thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

Ing. Miroslav Čepek, Ph.D.

Department

Department of Applied Mathematics

Summary

The main goal of this Bachelor's thesis is to develop a comprehensive and reliable Machine Learning model designed to normalize the representation of skills in job postings and resumes. The developed system facilitates smoother and more efficient recruitment processes by effectively addressing the discrepancies in how skills and experiences are represented in job advertisements and resumes. This improvement significantly reduces the potential misalignment between job seekers and recruiters. The methodology involves collecting and preprocessing a substantial dataset comprising diverse job postings and resumes. Given the absence of readily available training, testing, and validation data in the public domain, a suitable dataset has to be curated manually to fine-tune pre-trained Language Models (LMs). Both real and generated data will be selected and processed for these purposes. The system utilizes Machine Learning techniques to extract skills from text by combining a pre-trained BERT language model with a pre-trained spaCy model. Both models are to be fine-tuned on the curated dataset. After the skills are extracted, the system merges them based on Similarity Metrics and Transformers' predictions for more efficient comparison. These techniques help normalize and match the extracted skills with standardized skill representations. Additionally, the study proposes the development of matching algorithms that leverage Similarity Metrics and Deep Learning techniques to accurately align job postings with corresponding resumes based on standardized skill representations. After the normalization of skills in resumes and job postings, algorithms such as Jaccard Similarity, Cosine Similarity, and Transformers will be applied to match resumes with job vacancies. The performance of these models will be evaluated using metrics including Precision, Recall, Accuracy, Loss, and F1 Score.
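As an illustration of the Jaccard Similarity matching step mentioned in the summary, the following is a minimal sketch and not the thesis's implementation. It assumes that skills have already been extracted and normalized into plain strings; the function names `jaccard_similarity` and `rank_resumes` and the sample data are hypothetical.

```python
def jaccard_similarity(skills_a, skills_b):
    """Return |A ∩ B| / |A ∪ B| for two collections of skill strings."""
    # Assumption: case-insensitive comparison after trimming whitespace.
    set_a = {s.strip().lower() for s in skills_a}
    set_b = {s.strip().lower() for s in skills_b}
    if not set_a and not set_b:
        return 0.0  # convention chosen here for two empty skill sets
    return len(set_a & set_b) / len(set_a | set_b)


def rank_resumes(job_skills, resumes):
    """Sort resumes (name -> skills) by descending similarity to a job posting."""
    scored = [
        (name, jaccard_similarity(job_skills, skills))
        for name, skills in resumes.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical example data, for illustration only.
    job = ["Python", "SQL", "Docker"]
    candidates = {
        "resume_1": ["python", "sql"],
        "resume_2": ["java", "docker"],
    }
    for name, score in rank_resumes(job, candidates):
        print(f"{name}: {score:.2f}")
```

Jaccard similarity only rewards exact overlaps between skill strings, which is why the summary pairs it with a normalization step and with embedding-based measures such as Cosine Similarity.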

Thesis on DSpace

Business Evaluation of ML System for Satellite-Based Urban Hotspots Predictions

Author

Ivan Anikin

Year

2025

Type

Bachelor thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

MSc. Sagnik Bhattacharjee

Department

Department of Software Engineering

Summary

This thesis focuses on the use of machine learning and satellite data (Landsat, Sentinel) for predicting thermal maps in urban environments. Deep learning models, particularly CNN and U-Net architectures, were designed and tested to predict temperature fields based on environmental indicators. Over 200 models were trained and evaluated using different parameter combinations across various cities. The thesis includes an economic assessment of the predictive models' benefits for urban planning, along with a business plan and SWOT analysis.

Machine learning techniques for weather events forecasting

Author

Tommy Chu

Year

2025

Type

Bachelor thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

Ing. Jiří Pihrt

Department

Department of Applied Mathematics

Summary

Accurate extreme weather forecasting is crucial for public safety and minimizing damage. Current deep learning methods for short-term weather forecasting often struggle with the rare and complex dynamics of extreme weather, leading to blurry spatial predictions, underprediction of critical events, and inefficient parameter use. This thesis identifies key shortcomings and experimentally verifies solutions for severe precipitation nowcasting. To address these issues, this research explores alternative structure-based losses and physics-informed models to reduce spatial blurriness. It also utilizes weighted losses and mixture models to improve severe event prediction and investigates task decomposition to enhance parameter efficiency in learning atmospheric processes. The findings offer practical insights for improving machine learning in extreme weather forecasting. They suggest methods to increase accuracy by optimizing for visual fidelity or by decomposing complex forecasting tasks into smaller, manageable sub-problems for robust predictions.

Master theses

Developing an automatic speech recognition system based on Czech spoken language

Author

Richard Werner

Year

2020

Type

Master thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

Ing. Mgr. Ladislava Smítková Janků, Ph.D.

Department

Department of Applied Mathematics

Summary

This thesis deals with automatic speech recognition (ASR) using recurrent neural networks (RNN). The goal is to analyze the state of the art in those fields and propose a suitable Czech open-source voice dataset and an RNN model. The model is then trained on the dataset, and the trained model is used to transcribe another appropriate source of speech data. The output is a trained speech-to-text model, a new open-source dataset, and a system allowing accessible data preprocessing and further extension of datasets. The dataset of choice is the transcribed recordings of Czech Parliament meetings (CPM), and the model used is the DeepSpeech open-source project. The secondary source of speech data is the rest of the recordings gathered from the CPM website. Part of the preprocessing relied on a voice activity detection (VAD) model, which was used as a reference for the audio segmentation. The trained model achieved 12.66 % WER (Word Error Rate) and 4.63 % CER (Character Error Rate), which were sufficient values for the final dataset transcription. After preprocessing, the final dataset consisted of over 580,000 speech utterances ranging in length from roughly 1 to 70 seconds. The project is designed as a Docker image with prepared custom tools and other means to preprocess datasets and feed them to an RNN. Therefore, the output is a trained RNN model, an open-source dataset consisting of labeled recordings, and a ready-to-use Docker image with a toolkit for data preprocessing.
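For readers unfamiliar with the WER and CER metrics reported above, the following minimal sketch shows how they are commonly computed: the Levenshtein edit distance between reference and hypothesis, divided by the reference length. It is an illustration under the assumption of whitespace tokenization and no text normalization, not the evaluation code used in the thesis; the function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitution, insertion, deletion)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    print(wer("dobrý den pane předsedo", "dobrý den paní předsedo"))
    print(cer("dobrý den", "dobry den"))
```

Because both rates are normalized by the reference length only, they can exceed 1.0 when the hypothesis contains many insertions.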

Thesis on DSpace

Anomaly detection on the CERN data centre monitoring data

Author

Antonín Dvořák

Year

2022

Type

Master thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

doc. Ing. Kamil Dedecius, Ph.D.

Department

Department of Applied Mathematics

Summary

One of the many tasks of CERN cloud service operators is to make sure that the desired computational power is delivered to all users of the scientific community. This task is accomplished by carefully setting threshold-based alarming on top of the infrastructure performance time series metrics. In order to maximize the efficiency of the cloud infrastructure and to reduce the monitoring effort for service operators, we have developed a fully automated Anomaly Detection System that leverages unsupervised machine learning methods for time series metrics. Moreover, adopting ensemble methods, we combine traditional (Isolation forest) and deep learning (Gated recurrent unit/Long short-term memory Autoencoders) approaches. This work presents a description of the CERN monitoring infrastructure, problem formulation, design of the Anomaly Detection Pipeline, description of used models, creation of the dataset and performance of the implemented models compared to the performance of the Current Alarming System.

Thesis on DSpace

Machine Learning for Wafer Bin Map Defect Pattern Classification

Author

Jan Šefčík

Year

2022

Type

Master thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

Mgr. Petr Šimánek

Department

Department of Applied Mathematics

Summary

Automatic classification of defect patterns in wafer bin maps is a challenging problem for semiconductor manufacturers. Recently, progress has been made with supervised approaches, but labeled datasets are usually small and of poor quality. The creation of high-quality datasets is expensive and time-consuming, limiting early production. This work analyzes self-supervised and semi-supervised learning approaches that use unlabeled data. Based on an analysis of the resizing problem, this thesis proposes a smaller model that focuses on improving defect classification performance on wafers of diverse sizes. A substantial improvement was achieved on the minor classes, in particular the Scratch class.

Thesis on DSpace

Dictated numbers recognition model for an interactive voice response (IVR) company

Author

Martin Nykodem

Year

2022

Type

Master thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

doc. Ing. Pavel Kordík, Ph.D.

Department

Department of Applied Mathematics

Summary

This thesis focuses on the problem of automatic speech recognition (ASR). Namely, the specific task of this work is to create a machine learning model to recognize numbers in the Czech language, dictated in a phone call. ASR systems face specific domain-related problems of speech recognition. Therefore, to meet certain requirements peculiar to the Czech language, a custom approach for the preprocessing and model development has to be applied. Based on the survey of the popular state-of-the-art and trending approaches in the ASR field, the model applicable for the above-mentioned task is developed. Specificity of the domain, including data preprocessing and model fine-tuning, is discussed. Additionally, a specific domain dataset extension using the available Czech language datasets is presented. Finally, the development progress and improvements discovered during the development process are described. The results show that a 10-fold improvement in the correct recognition of recordings containing a sequence of dictated numbers is attained. The model vastly outperforms the current best solution from Czech speech recognition companies, as well as solutions from Google and Microsoft. Additionally, the lowest WER score of the available non-commercial models for the domain-agnostic dataset for the Czech language Common Voice 8 is achieved.

Thesis on DSpace

The Study of Linear Self-Attention Mechanism in Transformer

Author

Uladzislau Yorsh

Year

2022

Type

Master thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

doc. Ing. Pavel Kordík, Ph.D.

Department

Department of Applied Mathematics

Summary

As the quadratic complexity of an attention mechanism in the Transformer architecture places a high demand on processing long sequences, the goal of this research is to explore possibilities of linear attention in Transformer-like architecture and implement new methods.

Thesis on DSpace

Machine learning based approach for summarizing governance proposals for decentralized autonomous organizations

Author

Herman Tiumentsev

Year

2024

Type

Master thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

Ing. Miroslav Čepek, Ph.D.

Department

Department of Applied Mathematics

Summary

Decentralized Autonomous Organizations (DAOs) are gaining prominence as decentralized entities operating on smart contracts and blockchain technology. However, the complexity of governance proposals within DAOs poses challenges to accessibility and participation in decision-making processes. This thesis addresses the problem of limited accessibility and participation by developing and evaluating a personalized machine learning-based system for summarizing DAO governance proposals. The goals include exploring current DAO governance structures and decision-making processes, identifying challenges in summarizing proposals, evaluating different summarization approaches, and developing a customized summarization system. The system aims to enhance accessibility and participation by providing concise and understandable summaries of DAO governance proposals. Evaluation metrics such as accuracy, comprehensibility, and relevance are used to assess the system's effectiveness. Results indicate improvements in accessibility, highlighting the importance of tailored summarization systems in enhancing decision-making processes within DAOs.

Thesis on DSpace

Advancing Microrobotics for Biomedical Applications through Machine Learning

Author

Daniil Pastukhov

Year

2024

Type

Master thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

doc. Ing. Kamil Dedecius, Ph.D.

Department

Department of Applied Mathematics

Summary

This thesis explores the integration of machine learning techniques in microrobotics, focusing on biological microrobots utilizing sperm cells as a platform. The investigation includes a detailed analysis of relevant works in microrobotics and machine learning in the biomedical context, laying the groundwork for a multifaceted exploration. Key contributions include curating and annotating datasets tailored for training and evaluating models. Object detection models were developed and considered for precisely identifying sperm cells and their heads, while a keypoint estimation model was employed to detect flagellum keypoints. Additionally, an object-tracking system was implemented and evaluated to track the dynamic movements of sperm cell heads, enhancing the understanding of their interactions in dynamic environments. Further, a trajectory prediction model was trained and evaluated. This study marks a notable advancement in the integration of machine learning and microrobotics, offering innovative perspectives and approaches that can be utilized in various biomedical and technological fields. The work contributes to the current understanding of biological microrobots and lays the foundation for future advancements, unlocking the potential for precise control mechanisms and expanding applications in various fields.

Thesis on DSpace

Machine Learning Techniques for Laser-Plasma Acceleration Optimization

Author

Matěj Jech

Year

2024

Type

Master thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

doc. Ing. Ivan Šimeček, Ph.D.

Department

Department of Applied Mathematics

Summary

The thesis deals with the analysis of data from the laser-plasma particle accelerator in collaboration with the scientific institution ELI Beamlines. In the scope of the work, a data pre-processing process was designed and a generative model simulating the course of physics experiments was developed. The model is conditioned on a vector of experimental parameters and generates image data showing the energy spectrum of the accelerated electron beam. The developed model can be used as a partial substitute for real experiments, which are costly in terms of time and finances. It can also be used as a simulation of real experiments for various optimization methods. This thesis defines the process of training and testing candidate models with three different architectures and based on four hyperparameters. The resulting model can generate data at a rate of 1.8 images per second and has been evaluated based on a number of metrics, including the expert opinion of scientists, as a trustworthy tool to simulate the electron acceleration process.

Thesis on DSpace

Resource-Efficient Domain-Specific Generative Language Models

Author

Oleh Kuznetsov

Year

2025

Type

Master thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

doc. Ing. Pavel Kordík, Ph.D.

Department

Department of Applied Mathematics

Summary

This master's thesis addresses the challenges of deploying Large Language Models (LLMs) on resource-constrained systems, particularly single-GPU setups. It aims to develop a comprehensive methodology for enabling smaller LLMs to perform effectively in domain-specific applications while managing high resource demands. The research focuses on the domain of music genre recommendation, proposing a retrieval-based solution. Key contributions include the investigation and implementation of resource-efficient techniques such as Parameter-Efficient Fine-Tuning (PEFT), specifically QLoRA, and black-box knowledge distillation for adapting a compact LLM to a given task. A robust evaluation framework, incorporating a synthetic dataset and blind human A/B testing, was developed to assess the system's performance. The thesis provides an analysis of performance-resource trade-offs and offers practical insights for developing resource-efficient, domain-specific LLM solutions in similar constrained environments.

Local Retrieval Augmented Generation with Large Language Models on Extensive Text Corpora

Author

Jakub Kučera

Year

2025

Type

Master thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

Ing. Miroslav Čepek, Ph.D.

Department

Department of Applied Mathematics

Summary

Large language models (LLMs) excel at many language tasks but struggle with limited context, outdated knowledge, and hallucinations, especially on long, domain-specific texts. This thesis aimed to improve LLM performance in those scenarios by integrating and benchmarking Retrieval-Augmented Generation (RAG) pipelines, comparing vector and knowledge graph retrieval, and developing a custom hybrid RAG solution for decisions of the Czech Supreme Administrative Court. Experiments used both standard QA benchmarks and a new legal dataset created as part of this thesis, with accuracy and hallucination rates evaluated by LLM-as-a-Judge. The best results on legal texts were achieved by a custom hybrid retrieval method (vector and keyword search with filtering of retrieved documents, alpha=0.8), reaching 61.8% adjusted accuracy out of an estimated limit of 77.5% with the Llama 3.1 8B model. Including referenced paragraphs and entities in both retrieval and prompts improved performance, while retrieving more than five passages offered no additional benefit. The thesis demonstrates that optimized hybrid RAG systems can significantly enhance LLM answer quality for complex, specialized tasks, with potential applications in legal and other domains.

Enhancing LLM Performance Through an Agentic Approach in Retrieval-Augmented Generation

Author

Matúš Botek

Year

2025

Type

Master thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

Ing. Miroslav Čepek, Ph.D.

Department

Department of Applied Mathematics

Summary

This thesis explores approaches for integrating tool use and agency into Large Language Model (LLM) systems. It presents the design and implementation of an agentic LLM system augmented with tools for geospatial data processing. The main objective was to develop a system that combines the language understanding capabilities of LLMs with external tools to enhance functionality in spatial analysis, data retrieval, and knowledge extraction. The developed system, called GeoChat Assistant, enables interactive querying over a user-defined geographical area and dynamically invokes specialized tools for spatial computations and data retrieval from Web Map Services (WMS), public APIs, and other geospatial data sources. It is built on the LangGraph framework, which orchestrates the tool use, response generation, and decision-making. This thesis also evaluates the system's performance on a custom dataset of geospatial tasks, experimenting with various LLMs and measuring their impact on tool calling performance and response quality. GeoChat Assistant was also tested by domain experts, collecting relevant human assessments of its capabilities. Results demonstrate that tool-augmented LLM systems offer advantages in domain-specific reasoning and handling complex, multimodal data.

Spatiotemporal Data Forecasting Using Advanced Machine Learning Techniques

Author

Martin Šír

Year

2025

Type

Master thesis

Supervisor

Mgr. Alexander Kovalenko, Ph.D.

Reviewers

Ing. Daniel Vašata, Ph.D.

Department

Department of Applied Mathematics

Summary

The thesis explores machine learning models for spatiotemporal data forecasting, focusing on Fourier Neural Operators (FNO), diffusion models (DDIM), and their combination. The goal was to generate accurate and realistic future frames from historical inputs. Two benchmark datasets were used for evaluation: Moving MNIST and SEVIR. The work includes model implementation, hyperparameter tuning, and evaluation using SSIM, RHD, and FSS metrics. The FNO model trained with FACL loss achieved the best overall performance. While DDIM and the combined model achieved good results on simpler data, they struggled on the more complex dataset. However, the combined model reached results comparable to FNO on Moving MNIST, showing its potential.