Dr. Rodrigo Augusto da Silva Alves

rodrigo.alves@fit.cvut.cz
TH:A-1354

Theses

Sample theses

Bachelor theses

Machine Learning-Based Prediction of Football Match Statistics

Author

Ondřej Herman

Year

2023

Type

Bachelor thesis

Supervisor

Dr. Rodrigo Augusto da Silva Alves

Reviewers

Ing. Petr Kasalický

Department

Department of Applied Mathematics

Summary

Football, the most widely played and followed sport globally, captivates billions of fans worldwide. The significance of predicting match outcomes has garnered attention from statisticians, machine learning researchers, and avid bettors alike. However, while substantial progress has been made in machine learning for outcome prediction, relatively little focus has been placed on forecasting the statistical aspects of the sport. This study aims to address this gap by exploring machine learning methods to analyze and estimate various match statistics as regression problems. Specifically, I investigate six statistics: corners, shots, shots on target, fouls, yellow cards, and red cards. By conducting experiments on four datasets from different football leagues, I evaluate the performance of eight models. My findings reveal that different methods adapt better to certain statistics, and also that some statistics exhibit different behaviors across leagues. Additionally, I observe that certain features, such as the number of corners or shots, are more predictable due to their higher occurrence rates during matches compared to the number of cards.

Thesis on DSpace

Multitask Learning for Cognitive Sciences Triplet Analysis

Author

Tsimafei Stambrouski

Year

2024

Type

Bachelor thesis

Supervisor

Dr. Rodrigo Augusto da Silva Alves

Reviewers

Mgr. Alexander Kovalenko, Ph.D.

Department

Department of Applied Mathematics

Summary

This bachelor thesis focuses on the analysis of the triplet problem, in which the task is to identify the odd object out of three. Methodological approaches to this problem differ in their basic focus: some methods evaluate which two objects are most similar, while others identify the object that is most dissimilar. The aim of this thesis is to combine these two perspectives and analyze the results. To solve the problem, a neural network was developed using the TensorFlow library in the Python programming language. The research showed that the combination of both approaches did not produce better results than the individual methods alone. The main output of the work is an explanation of how the combination of opposing views affects the final choice of the odd object. For instance, given the set (Car, Dog, House), one item must be selected as the odd one out: are House and Car the most similar pair, or is Car the odd one out?

Thesis on DSpace

Leveraging Large Language Models for Regionalized Recommender Systems

Author

Adam Čapka

Year

2025

Type

Bachelor thesis

Supervisor

Dr. Rodrigo Augusto da Silva Alves

Reviewers

doc. Ing. Pavel Kordík, Ph.D.

Department

Department of Applied Mathematics

Summary

This study explores the application of LLM-generated embeddings for analyzing unstructured textual data, aiming to uncover regional patterns in human behavior and local characteristics. By leveraging these embeddings, a regionalization algorithm is applied to define spatially contiguous regions that exhibit similar linguistic patterns. This approach enables the identification of organically emerging zones of interest, moving beyond arbitrary administrative boundaries. Furthermore, sparse autoencoders are employed to isolate key themes from the embeddings, allowing regionalization based on specific topics, such as environmental conditions, sentiment, or infrastructure. Our findings demonstrate that combining LLM-based embeddings with sparse autoencoders provides a powerful framework for understanding regional variations, with potential applications in recommendation systems, market analysis, and sustainable land planning.

A Large Language Models Framework for Football Event Prediction

Author

Dmytro Borovko

Year

2025

Type

Bachelor thesis

Supervisor

Dr. Rodrigo Augusto da Silva Alves

Reviewers

Ing. Miroslav Čepek, Ph.D.

Department

Department of Applied Mathematics

Summary

This thesis presents a framework for predicting the next event in football matches by leveraging Large Language Models (LLMs) to integrate semantic embeddings from unstructured textual commentary with structured event logs. The research investigates whether LLM-derived embeddings can enhance prediction accuracy compared to traditional sequential models, which rely solely on tabular event data. The methodology encompasses data acquisition and preprocessing, embedding extraction, and the development of a feedforward neural network, evaluated through metrics such as accuracy and Mean Reciprocal Rank (MRR). While the LLM-based models did not outperform conventional approaches, the framework identifies key limitations and demonstrates strong potential for enriching event modeling with semantic context. Its modular design enables extensive experimentation and serves as a reproducible benchmark for future research. This work contributes to the emerging intersection of natural language processing and sports analytics, offering a foundation for further development in using LLMs to enhance understanding and prediction of football match events.
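As an illustration of the two evaluation metrics named in the summary above, the following is a minimal sketch of top-1 accuracy and Mean Reciprocal Rank for next-event prediction. The function names, event labels, and rankings are hypothetical and are not taken from the thesis; the sketch assumes each prediction is a list of candidate events ordered from most to least likely.

```python
def top1_accuracy(ranked_predictions, true_labels):
    """Fraction of cases where the top-ranked candidate is the true event."""
    hits = sum(1 for ranking, truth in zip(ranked_predictions, true_labels)
               if ranking and ranking[0] == truth)
    return hits / len(true_labels)


def mean_reciprocal_rank(ranked_predictions, true_labels):
    """Average of 1/rank of the true event; a missing event contributes 0."""
    total = 0.0
    for ranking, truth in zip(ranked_predictions, true_labels):
        if truth in ranking:
            total += 1.0 / (ranking.index(truth) + 1)  # ranks start at 1
    return total / len(true_labels)


# Hypothetical rankings for three consecutive events (illustrative only).
rankings = [
    ["pass", "shot", "foul"],   # true event ranked 1st
    ["foul", "pass", "shot"],   # true event ranked 2nd
    ["shot", "foul", "pass"],   # true event ranked 3rd
]
truths = ["pass", "pass", "pass"]

accuracy = top1_accuracy(rankings, truths)
mrr = mean_reciprocal_rank(rankings, truths)
```

In this toy case accuracy only rewards the first ranking, while MRR also gives partial credit when the true event appears lower in the list, which is why the two metrics are often reported together.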

An Artificial Intelligence-Based System for Automatic Reflection Question Generation in Educational Settings

Author

Ondřej Holub

Year

2025

Type

Bachelor thesis

Supervisor

Dr. Rodrigo Augusto da Silva Alves

Reviewers

Bc. Ondřej Brém, MSc.

Department

Department of Applied Mathematics

Summary

This thesis presents a system for generating reflection questions in educational settings using large language models (LLMs). The system uses the Socratic method in a multi-round dialogue between two separate LLM instances, each with its own parameters, to improve the quality of the generated questions. The 4o-mini ChatGPT model was used for the LLM instances for ease of testing and evaluation. The final system shows promise in generating high-quality reflective questions, with the Socratic dialogue improving the quality of the results. A few areas for improvement remain, mainly in evaluating the quality of the final questions and determining the right time to end the dialogue. The prompts created during the development of the system are available in the attachments.

Master theses

Football outcomes prediction with tensor completion embeddings

Author

Martin Kostrubanič

Year

2023

Type

Master thesis

Supervisor

Dr. Rodrigo Augusto da Silva Alves

Reviewers

Ing. Karel Klouda, Ph.D.

Department

Department of Applied Mathematics

Summary

Football is a hugely popular sport with over 3.5 billion fans worldwide, and predicting the outcome of matches has become increasingly important. While several machine learning methods have been used for this purpose, personalized machine learning methods like matrix completion have been neglected. In this thesis, I introduce tensor completion techniques for predicting football match outcomes, using two experimental strands: (1) tensor completion as a prediction method; and (2) tensor completion embeddings extraction. I consider data from five different leagues, four from Europe and one from South America. The results show that tensor completion matches or outperforms other state-of-the-art prediction methods and is capable of improving the performance of Artificial Neural Networks in this task.

Thesis on DSpace

Harnessing Spatial Context for Item Recommendation

Author

Vendula Švastalová

Year

2024

Type

Master thesis

Supervisor

Dr. Rodrigo Augusto da Silva Alves

Reviewers

doc. Ing. Pavel Kordík, Ph.D.

Department

Department of Applied Mathematics

Summary

This thesis explores the development of a location-based recommender system. We propose a novel architecture that integrates a general-purpose location encoder with groups of users and their preferences. Our architecture uses separate embeddings for latitude and longitude to dynamically generate category recommendations for new Points of Interest, such as events, venues, and activities at specific locations. We conduct a series of experiments to demonstrate that the proposed architecture outperforms baseline models. Our models offer more relevant and engaging content for location-based social networks, where they can increase user engagement and community involvement.

Thesis on DSpace

Graph-Based Fraud Detection in Recommender Systems

Author

Daniel Bohuněk

Year

2024

Type

Master thesis

Supervisor

Dr. Rodrigo Augusto da Silva Alves

Reviewers

doc. Ing. Pavel Kordík, Ph.D.

Department

Department of Applied Mathematics

Summary

Fraudsters attempt to camouflage their behavior to remain undetected. This can make it challenging to design models capable of reliably discovering them, as negative samples may be contaminated with hidden positives. Existing research has shown that taking advantage of relationships between instances using graph convolution improves the detection ability. This work proposes a siamese graph neural network that can be trained in a semi-supervised fashion using a small set of known fraudsters. It shows improved performance over existing methods and increased resilience against camouflaged fraudsters.

Thesis on DSpace

Human Alignment of Natural Language Processing Models

Author

Anastasiia Solomiia Hrytsyna

Year

2024

Type

Master thesis

Supervisor

Dr. Rodrigo Augusto da Silva Alves

Reviewers

Mgr. Alexander Kovalenko, Ph.D.

Department

Department of Applied Mathematics

Thesis on DSpace

Segment-Based Recommendations

Author

Patrik Malý

Year

2025

Type

Master thesis

Supervisor

Dr. Rodrigo Augusto da Silva Alves

Reviewers

doc. Ing. Pavel Kordík, Ph.D.

Department

Department of Applied Mathematics

Summary

We present an approach to segment recommendation, addressing a gap in the recommender systems literature regarding segment-level predictions. Our main contributions include the Segment Preference Distribution Calculation (SPDC) with an integrated masking process and systematic adaptations of traditional and group recommendation models. Evaluating across three diverse datasets (MovieLens, Amazon Electronics, Food.com), we demonstrate that model effectiveness varies significantly based on dataset characteristics. Our adapted models outperform baselines.

Thesis on DSpace

Investigating Scoring and Ordering in Multi-Stage Recommender Systems for Book Recommendations Using Large Language Models

Author

Maksim Spiridonov

Year

2025

Type

Master thesis

Supervisor

Dr. Rodrigo Augusto da Silva Alves

Reviewers

doc. Ing. Pavel Kordík, Ph.D.

Department

Department of Applied Mathematics

Summary

This thesis presents a multi-stage recommender system that combines a classical Collaborative Filtering approach with Large Language Models to retrieve and rerank candidate items based on the user's preferences. The Large Language Model is tuned using Prompt Tuning and Fine Tuning with LoRA. The final reranking also provides explanations. The results of the experiments show that the proposed method outperforms selected baselines, including an untuned model. Finally, future work is proposed and discussed.