Bachelor theses
Machine Learning-Based Prediction of Football Match Statistics
Author
Ondřej Herman
Year
2023
Type
Bachelor thesis
Supervisor
Dr. Rodrigo Augusto da Silva Alves
Reviewers
Ing. Petr Kasalický
Department
Summary
Football, the most widely played and followed sport globally, captivates billions of fans worldwide. The significance of predicting match outcomes has garnered attention from statisticians, machine learning researchers, and avid bettors alike. However, while substantial progress has been made in machine learning for outcome prediction, relatively little focus has been placed on forecasting the statistical aspects of the sport. This study aims to address this gap by exploring machine learning methods to analyze and estimate various match statistics as regression problems. Specifically, I investigate six statistics: corners, shots, shots on target, fouls, yellow cards, and red cards. By conducting experiments on four datasets from different football leagues, I evaluate the performance of eight models. My findings reveal that different methods adapt better to certain statistics, and also that some statistics exhibit different behaviors across leagues. Additionally, I observe that certain features, such as the number of corners or shots, are more predictable due to their higher occurrence rates during matches compared to the number of cards.
Multitask Learning for Cognitive Sciences Triplet Analysis
Author
Tsimafei Stambrouski
Year
2024
Type
Bachelor thesis
Supervisor
Dr. Rodrigo Augusto da Silva Alves
Reviewers
Mgr. Alexander Kovalenko, Ph.D.
Department
Summary
This bachelor thesis focuses on the analysis of the triplet problem, in which the task is to identify the odd one out among three objects. Methodological approaches to this problem differ in their basic focus: some methods evaluate which two objects are most similar, while others identify the object that is most dissimilar. The aim of this thesis is to combine these two perspectives and analyze the results. To solve the problem, a neural network was developed using the TensorFlow library in Python. The research showed that combining both approaches did not produce better results than either method alone. The main output of the work is an explanation of how the combination of opposing views affects the final choice of the odd object. For instance, given the set (Car, Dog, House), one item must be selected as the odd one out: are House and Car the most similar pair, or is Car the odd one out?
Leveraging Large Language Models for Regionalized Recommender Systems
Author
Adam Čapka
Year
2025
Type
Bachelor thesis
Supervisor
Dr. Rodrigo Augusto da Silva Alves
Reviewers
doc. Ing. Pavel Kordík, Ph.D.
Department
Summary
This study explores the application of LLM-generated embeddings for analyzing unstructured textual data, aiming to uncover regional patterns in human behavior and local characteristics. By leveraging these embeddings, a regionalization algorithm is applied to define spatially contiguous regions that exhibit similar linguistic patterns. This approach enables the identification of organically emerging zones of interest, moving beyond arbitrary administrative boundaries. Furthermore, sparse autoencoders are employed to isolate key themes from the embeddings, allowing regionalization based on specific topics, such as environmental conditions, sentiment, or infrastructure. Our findings demonstrate that combining LLM-based embeddings with sparse autoencoders provides a powerful framework for understanding regional variations, with potential applications in recommendation systems, market analysis, and sustainable land planning.
A Large Language Models Framework for Football Event Prediction
Author
Dmytro Borovko
Year
2025
Type
Bachelor thesis
Supervisor
Dr. Rodrigo Augusto da Silva Alves
Reviewers
Ing. Miroslav Čepek, Ph.D.
Department
Summary
This thesis presents a framework for predicting the next event in football matches by leveraging Large Language Models (LLMs) to integrate semantic embeddings from unstructured textual commentary with structured event logs. The research investigates whether LLM-derived embeddings can enhance prediction accuracy compared to traditional sequential models, which rely solely on tabular event data. The methodology encompasses data acquisition and preprocessing, embedding extraction, and the development of a feedforward neural network, evaluated through metrics such as accuracy and Mean Reciprocal Rank (MRR). While the LLM-based models did not outperform conventional approaches, the framework identifies key limitations and demonstrates strong potential for enriching event modeling with semantic context. Its modular design enables extensive experimentation and serves as a reproducible benchmark for future research. This work contributes to the emerging intersection of natural language processing and sports analytics, offering a foundation for further development in using LLMs to enhance understanding and prediction of football match events.
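The two evaluation metrics named in the summary above can be illustrated with a minimal sketch. It assumes each prediction is a list of event labels ordered from most to least likely; the function names, event labels, and example data are hypothetical and are not taken from the thesis.

```python
def reciprocal_rank(ranked_labels, true_label):
    """Return 1/position of the true label in the ranking, or 0.0 if absent."""
    for position, label in enumerate(ranked_labels, start=1):
        if label == true_label:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(rankings, true_labels):
    """Average the reciprocal rank over all predictions."""
    if not rankings or len(rankings) != len(true_labels):
        raise ValueError("rankings and true_labels must be non-empty and aligned")
    total = sum(reciprocal_rank(r, t) for r, t in zip(rankings, true_labels))
    return total / len(rankings)


def top1_accuracy(rankings, true_labels):
    """Fraction of predictions whose first-ranked label is the true one."""
    if not rankings or len(rankings) != len(true_labels):
        raise ValueError("rankings and true_labels must be non-empty and aligned")
    hits = sum(1 for r, t in zip(rankings, true_labels) if r and r[0] == t)
    return hits / len(rankings)


# Hypothetical next-event rankings for three match situations.
example_rankings = [
    ["pass", "shot", "foul"],   # true label ranked 1st
    ["shot", "pass", "foul"],   # true label ranked 2nd
    ["foul", "shot", "pass"],   # true label ranked 3rd
]
example_truth = ["pass", "pass", "pass"]

mrr = mean_reciprocal_rank(example_rankings, example_truth)
accuracy = top1_accuracy(example_rankings, example_truth)
```

In this made-up example, accuracy credits only the first prediction, while MRR also gives partial credit when the true event is ranked second or third.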
An Artificial Intelligence-Based System for Automatic Reflection Question Generation in Educational Settings
Author
Ondřej Holub
Year
2025
Type
Bachelor thesis
Supervisor
Dr. Rodrigo Augusto da Silva Alves
Reviewers
Bc. Ondřej Brém, MSc.
Department
Summary
This thesis presents a system for generating reflection questions in educational settings using large language models (LLMs). The system applies the Socratic method in a multi-round dialogue between two separate LLM instances, each with its own parameters, to improve the quality of the generated questions. The 4o-mini ChatGPT model was used for both instances for ease of testing and evaluation. The final system shows promise in generating high-quality reflective questions, with the Socratic dialogue improving the quality of the results. A few areas for improvement remain, mainly in evaluating the quality of the final questions and in determining the right time to end the dialogue. The prompts created during the development of the system are available in the attachments.
Master theses
Football outcomes prediction with tensor completion embeddings
Author
Martin Kostrubanič
Year
2023
Type
Master thesis
Supervisor
Dr. Rodrigo Augusto da Silva Alves
Reviewers
Ing. Karel Klouda, Ph.D.
Department
Summary
Football is a hugely popular sport with over 3.5 billion fans worldwide, and predicting the outcome of matches has become increasingly important. While several machine learning methods have been used for this purpose, personalized machine learning methods like matrix completion have been neglected. In this thesis, I introduce tensor completion techniques for predicting football match outcomes, using two experimental strands: (1) tensor completion as a prediction method; and (2) extraction of tensor completion embeddings. I consider data from five different leagues, four from Europe and one from South America. The results show that tensor completion matches or outperforms other state-of-the-art prediction methods and is capable of improving the performance of Artificial Neural Networks in this task.
Harnessing Spatial Context for Item Recommendation
Author
Vendula Švastalová
Year
2024
Type
Master thesis
Supervisor
Dr. Rodrigo Augusto da Silva Alves
Reviewers
doc. Ing. Pavel Kordík, Ph.D.
Department
Summary
This thesis explores the development of a location-based recommender system. We propose a novel architecture that integrates a general-purpose location encoder with groups of users and their preferences. Our architecture uses separate embeddings for latitude and longitude to dynamically generate category recommendations for new Points of Interest, such as events, venues, and activities at specific locations. We conduct a series of experiments to demonstrate that the proposed architecture outperforms baseline models. Our models offer more relevant and engaging content for location-based social networks, where they can increase user engagement and community involvement.
Graph-Based Fraud Detection in Recommender Systems
Author
Daniel Bohuněk
Year
2024
Type
Master thesis
Supervisor
Dr. Rodrigo Augusto da Silva Alves
Reviewers
doc. Ing. Pavel Kordík, Ph.D.
Department
Summary
Fraudsters attempt to camouflage their behavior to remain undetected. This can make it challenging to design models capable of reliably discovering them, as negative samples may be contaminated with hidden positives. Existing research has shown that taking advantage of relationships between instances using graph convolution improves the detection ability. This work proposes a siamese graph neural network that can be trained in a semi-supervised fashion using a small set of known fraudsters. It shows improved performance over existing methods and increased resilience against camouflaged fraudsters.
Human Alignment of Natural Language Processing Models
Author
Anastasiia Solomiia Hrytsyna
Year
2024
Type
Master thesis
Supervisor
Dr. Rodrigo Augusto da Silva Alves
Reviewers
Mgr. Alexander Kovalenko, Ph.D.
Department
Summary
Segment-Based Recommendations
Author
Patrik Malý
Year
2025
Type
Master thesis
Supervisor
Dr. Rodrigo Augusto da Silva Alves
Reviewers
doc. Ing. Pavel Kordík, Ph.D.
Department
Summary
We present an approach to segment recommendation, addressing a gap in the recommender systems literature regarding segment-level predictions. Our main contributions include the Segment Preference Distribution Calculation (SPDC) with an integrated masking process and systematic adaptations of traditional and group recommendation models. Evaluating across three diverse datasets (MovieLens, Amazon Electronics, Food.com), we demonstrate that model effectiveness varies significantly based on dataset characteristics. Our adapted models outperform baselines.
Investigating Scoring and Ordering in Multi-Stage Recommender Systems for Book Recommendations Using Large Language Models
Author
Maksim Spiridonov
Year
2025
Type
Master thesis
Supervisor
Dr. Rodrigo Augusto da Silva Alves
Reviewers
doc. Ing. Pavel Kordík, Ph.D.
Department
Summary
This thesis presents a multi-stage recommender system that combines a classical collaborative filtering approach with Large Language Models to retrieve and rerank candidate items based on the user's preferences. The Large Language Model is tuned using prompt tuning and fine-tuning with LoRA. The final reranking stage also provides explanations. The experimental results show that the proposed method outperforms the selected baselines, including an untuned model. Finally, future work is proposed and discussed.
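For readers unfamiliar with LoRA, mentioned in the summary above, the following is a minimal pure-Python sketch of its general idea: the pretrained weight matrix W stays frozen, and only a low-rank update B·A, scaled by alpha / rank, is trained. The dimensions, values, and function names are illustrative assumptions and do not reflect the configuration used in the thesis.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha):
    """Compute W x + (alpha / rank) * B (A x) for a single input vector x."""
    rank = len(A)                      # A has shape (rank, d_in)
    base = matvec(W, x)                # frozen pretrained path
    update = matvec(B, matvec(A, x))   # trainable low-rank path
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


rng = random.Random(0)
d_in, d_out, rank = 4, 3, 2            # illustrative sizes only

# Frozen weights: stand-in for a pretrained layer.
W = [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_out)]
# A starts with small random values, B starts at zero, so the adapted
# layer initially reproduces the pretrained layer exactly.
A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
B = [[0.0] * rank for _ in range(d_out)]

x = [1.0, 2.0, 3.0, 4.0]
initial_output = lora_forward(W, A, B, x, alpha=8)

# After training, B becomes non-zero; here one entry is set by hand to
# show that the output then departs from the frozen layer.
B[0][0] = 0.5
adapted_output = lora_forward(W, A, B, x, alpha=8)
```

Because B starts at zero, fine-tuning begins from the pretrained model's behaviour, and only the entries of A and B need to be updated and stored.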