doc. Ing. Pavel Kordík, Ph.D.

Theses

Dissertation theses

Algorithms and architectures of recommender systems

Level
Topic of dissertation thesis
Topic description

In recommender systems, our research currently focuses on a few open problems that have deep theoretical underpinnings but whose solutions also have very concrete practical applications. We are exploring the use of deep neural networks to mitigate the cold-start problem of recommender systems and the design of transformers to predict shopping baskets. In the field of general machine learning, we focus on reinforcement learning to optimize longer-term metrics such as user satisfaction, on transfer learning methods to incorporate new referral databases, and on using AutoML to optimize the architecture and hyperparameters of recommender systems.

Bachelor theses

Text to query using large language models on local infrastructure

Author
Patrik Laurinc
Year
2023
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Mgr. Alexander Kovalenko, Ph.D.
Summary
In a world teeming with digital data, simplifying human interaction with databases is an ever-pressing challenge. This undergraduate thesis delves into the intriguing potential of large language models (LLMs) to transform natural text into query languages, such as SQL.

Question Answering Algorithms in Natural Language

Author
Matúš Žilinec
Year
2019
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Magda Friedjungová
Summary
This bachelor's thesis surveys state-of-the-art algorithms for natural language question answering, focusing on machine reading comprehension and deep learning based models. The transformer architecture is explored and evaluated on the newly released NaturalQuestions dataset. The thesis analyzes particular errors and limitations of current algorithms and discusses their possible improvements.

Deep learning for algorithmic trading

Author
Vojtěch Mikšů
Year
2014
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Zdeněk Konfršt, Ph.D.

Lip Reading using Deep Neural Networks

Author
Jan Horák
Year
2018
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
MSc. Juan Pablo Maldonado Lopez, Ph.D.
Summary
Lip reading, the skill of inferring a spoken word or whole sentence from visual information alone, is a very hard yet interesting task because of the variety of speakers, languages, and articulations. In this bachelor thesis I analyze known lip-reading methods, report their accuracy, and aim to verify whether artificial intelligence methods, namely deep neural networks, are suitable for solving this problem. In the practical part, I present the results both in terms of the accuracy of the trained neural network on test data and by creating and publishing a web application that shows how difficult it would really be to use such a tool for real-time speech recognition based on lip reading.

Classification of URLs Using Deep Neural Networks

Author
Matyáš Skalický
Year
2019
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
MSc. Juan Pablo Maldonado Lopez, Ph.D.
Summary
My work explores the field of automatic URL classification with particular attention to character-level deep neural networks. It summarizes recent advancements in the field and proposes a working model which outperforms the enterprise baseline on a real world dataset.

Music Recommender System

Author
Ondřej Šofr
Year
2018
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Tomáš Kalvoda, Ph.D.
Summary
This thesis deals with the field of personalized recommendation of music. Modern approaches are described and analyzed, especially the methods of collaborative filtering. The main focus is the processing of temporal dimension data of user actions and its usage to improve recommendation systems. The most important part is the analysis of models predicting the activity of users. Prediction accuracy and efficiency of solutions are compared with emphasis on the usability in practice. This thesis contains experimental results of presented methods tested on real-world data.

Scalability of Nearest Neighbor Algorithm

Author
Antonín Dvořák
Year
2019
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
doc. Ing. Ivan Šimeček, Ph.D.
Summary
This thesis deals with efficient algorithms for k-nearest-neighbor (kNN) search. We introduce three different approaches: the first is k-means clustering for kNN, the second is approximate kNN graph construction using locality-sensitive hashing, and the last is kNN on a graphics processing unit. Most of the implementation is written in C++. Finally, we measure the real efficiency of the given algorithms.

Software module for skill based search of students

Author
Adam Jankovec
Year
2019
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Michal Valenta, Ph.D.
Summary
This thesis focuses on the development process of a web application for a skill-based search of students, which generates student skill sets from grades, collected during students' study at Faculty of Information Technology at Czech Technical University in Prague. The result is a proof of concept, which brings a possibility to filter students based on their skill set, which can be used by companies and teachers to search for candidates for their assignments.

Efficient training algorithms for polynomial models

Author
Martin Procházka
Year
2013
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
doc. RNDr. Ing. Marcel Jiřina, Ph.D.
Summary
This thesis contains a compilation of several polynomial regression methods. The research focuses mainly on the self-organization methods, evolutionary techniques and crossovers of the two. I have implemented a set of algorithms which I have then compared on three different benchmarks. I have observed the effects of the complexification and the effects of the number of training instances on the algorithms' performances. For 2-D datasets there are comprehensible visualisations of models' predictions and their interpretation.

Optimization of Recommender Systems

Author
Radek Bartyzal
Year
2016
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
prof. Ing. RNDr. Martin Holeňa, CSc.
Summary
This thesis will review classification of recommender systems, the most frequently used algorithms behind them and approaches how to evaluate them with a special attention paid to the online methodologies. It will continue with description of a new SAOOA algorithm inspired by evolutionary strategies that is capable of optimizing the recommender systems online using a Gaussian mixture model. The exact functionality of the algorithm will be explained using simulated experiments and tested on real world recommender systems. Methods to visualize the quality of different parameter settings of a recommender system or the state of the algorithm will be presented as well.

Algorithms for Origami Folding

Author
Jiří Nádvorník
Year
2013
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
doc. Ing. Ivan Šimeček, Ph.D.

Improving Learning to Rank Algorithms

Author
Huy Hoang Vu
Year
2018
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
MSc. Juan Pablo Maldonado Lopez, Ph.D.
Summary
In this thesis I explore existing approaches to the learning to rank problem and collaborative filtering methods, and apply them to Yandex's dataset provided in the Personalized Web Search Challenge competition on Kaggle.com. I build on the existing submissions by replicating the top competitor's feature extraction from the dataset. Then I implement and apply ES-Rank and matrix factorization on these features and test if matrix factorization based collaborative filtering significantly increases the overall performance of the algorithm. Then I compare the performance of the implemented algorithms to other submissions on Kaggle. Lastly I analyze the time complexity of my solution.

Estimating Customer Lifetime Value for Media Houses

Author
Bekbolot Khudaiberdiev
Year
2022
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Mgr. Ladislava Smítková Janků, Ph.D.
Summary
The goal of this work is to explore methods of predicting CLV in a subscription-based business setting, compare these methods' performance on a publicly available dataset, and implement an open-source extension that will help media houses estimate CLV. The sBG/NBD, gradient boosting regressor, and neural network models are compared on a publicly available dataset from the music streaming service KKBox. Based on the experiments, the best model for predicting CLV on both the individual and cohort levels is the gradient boosting regressor, which is subsequently integrated into the open-source extension. The end product of this work is a widget extension of the REMP project, open-source software that helps media houses monetize their content.

Job-board predictions

Author
Martin Šmíd
Year
2017
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
RNDr. Kateřina Trlifajová, Ph.D.
Summary
This thesis looks into the area of predictive modeling and text mining methods. Based on the job advertisements data, the goal is to predict number of responses to the job offer. Individual attributes are examined and interesting data relations are shown. Subsequently, predictive models are proposed, built, and compared. By using model confidence, the model performance is tuned. Lastly, variable importances for the best model are discussed.

Personalization of Data Visualization User Interfaces

Author
Martin Endršt
Year
2016
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Tomáš Řehořek, Ph.D.
Summary
This bachelor thesis explores the possibility of using an interactive evolutionary algorithm to personalize data visualisation. The first step was to survey applications that use interactive evolutionary algorithms and analyse their approach to user interaction with the evolution. Based on this survey, a web application demonstrating three different approaches to the problem was developed, using JavaScript as the primary programming language along with the D3.js library to render the data visualisations.

User subaccounts detection for better recommendation

Author
Tomáš Vopat
Year
2019
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
MSc. Juan Pablo Maldonado Lopez, Ph.D.
Summary
Account sharing in online streaming services has a negative impact on recommendation systems and consequently on the quality of services provided to the users. This thesis aims to design a method capable of detecting such accounts. To address this issue, we designed and implemented algorithms based on collaborative filtering and a sliding window method that reveal shared accounts. The proposed algorithms detect other users' activity on an account with high accuracy; this activity can then be filtered out so that the recommendation system receives only data relevant to a given user. Furthermore, shared accounts can be restricted, since sharing lets users avoid paying for a subscription.

Web service for advanced text analysis

Author
Jan Švejda
Year
2017
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Karel Klouda, Ph.D.
Summary
Many statistical and machine learning models that tackle the problems of natural language processing have the potential to assist humans in various domains. In this thesis, an application is created that gives access to these models, applied to the field of Internet news, through a web service and a website. This provides writers and editors of articles with more information, makes integration of such models with other systems possible, and allows people interested in natural language processing to experiment with the models interactively and learn about them. Both the web service and the website were successfully designed and implemented with security and scalability in mind. The application is designed in such a way that extending it with new functionality in the future will be easy.

Grammar interactive evolution of graph interfaces

Author
Petr Hanzl
Year
2019
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
doc. RNDr. Pavel Surynek, Ph.D.
Summary
This bachelor thesis investigates the use of evolutionary algorithms for generating graphical interfaces based on a user's preferences. These graphical interfaces can be described by a grammar, a set of rules that describes all feasible settings of the graphical interfaces. Additionally, a prototype was created from the information obtained in the initial investigation. The prototype was designed to generate graph visualizations for given datasets, and it measures the number of iterations needed to produce the desired visualization. The number of iterations has no deeper meaning on its own, as interactive evolutionary algorithms depend on randomness and the user's preferences. However, the measurements act as a proof of concept, indicating that the prototype is functioning as intended.

Recommendation of new movies to obtain

Author
Tatiana Lekýrová
Year
2019
Type
Bachelor thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Summary
This bachelor's thesis explores the problem of movie content acquisition in a recommendation domain. It proposes two solutions to tackle the cold-start setup of predicting popularity of newly obtained content. The first method it proposes is content-based filtering and the second is neural network embeddings. The implementation is supported by the state-of-the-art analysis. The evaluation of the methods shows that content-based filtering has better results in contrast to neural network embeddings.

Master theses

Machine Learning in Game Playing using Visual Input

Author
Martin Brázdil
Year
2016
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
doc. RNDr. Ing. Marcel Jiřina, Ph.D.
Summary
Game playing using visual input is a complex problem that touches almost every topic in machine learning. Nevertheless, it is a simplified version of the ultimate goal of creating a universally intelligent agent. This thesis has several goals: to establish a consistent fundamental theory, to describe relevant modern techniques, and to propose a novel technique and verify it experimentally.

Visualizing behavior of model ensembles

Author
Jan Fabián
Year
2013
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Jaroslav Kuchař, Ph.D.

Recommender system powered by text mining methods

Author
Radovan Lupták
Year
2013
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Mgr. Ondřej Háva, Ph.D.

Educational performance dashboard

Author
Marie Remešová
Year
2015
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Michal Valenta, Ph.D.
Summary
This thesis deals with methods for portlet development for the Liferay portal. It covers the description, acquisition, and subsequent visualisation of data from the faculty's employee load database. The theoretical part examines visualisation methods suitable for this type of application and the choice of colors for different color scales. The core of the thesis is the processing and subsequent visualisation of data using the D3.js library. During the work on the thesis, a set of dashboards was created that helps faculty employees get a better overview of their workloads and, thanks to a newly implemented comparison mechanism, helps managers run the faculty and its departments.

Visual search in big graph structures

Author
Petr Šuták
Year
2016
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Mgr. Martin Podloucký
Summary
The aim of this work is to explore methods for searching in large graphs and, based on their analysis, to select a suitable algorithm for the implementation of visual search over graphs. The search module will be part of SVAT, a tool for visual data analysis. Based on the system requirements, a solution will be proposed and its success tested. The outcome of this work will be a functional module ready for use in industry.

Strengthening feedback loop from industry to university

Author
Aleš Fišer
Year
2014
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Marcel Hlopko

Sharing and electronic document circulation infrastructure

Author
Filip Rajnoch
Year
2013
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Vladimír Mezera

Analytic framework for eshop interactions data

Author
Lukáš Dvořák
Year
2016
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Tomáš Řehořek, Ph.D.
Summary
This thesis identifies the specifics of e-shops and surveys suitable methods for analysing e-shop transactional data. The outcome is the design of an analytical framework for reporting transactional and sales data to owners that can be applied to e-shops of arbitrary type and size, independently of implementation. Two real-world case studies using the framework design are part of the thesis.

Faculty Data Warehouse

Author
Stanislav Kuznetsov
Year
2012
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Michal Valenta, Ph.D.

3D Printing Automation Approaches

Author
Marek Žehra
Year
2014
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Jan Trávníček, Ph.D.

Frequent Sequence Mining Algorithm

Author
Martin Hadáček
Year
2012
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Miroslav Čepek, Ph.D.

Comparison of statistical and data mining approaches to chemometrical analysis of oil derivatives

Author
František Hána
Year
2014
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Tomáš Bartoň, Ph.D.

Faculty Portal Application Interface

Author
Tomáš Králík
Year
2012
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Vladimír Mezera

Recommendation based on product images

Author
Kristýna Tauchmanová
Year
2020
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Tomáš Kalvoda, Ph.D.
Summary
The aim of this master thesis is the analysis of recommendation systems and the design of such systems using product images. The work includes analysis of the provided data and design of recommendation algorithms for various scenarios. A part of the work is also a theoretical introduction to the issues of recommendation systems and image processing. At the end, the thesis also deals with the offline evaluation of the proposed models.

Algorithms for better understanding of recommended content and user clusters

Author
Pavel Hlubík
Year
2021
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Daniel Vašata, Ph.D.
Summary
Studying the relationship between an audience and the content it consumes is vital for content creators. In this thesis, we propose and experimentally evaluate various methods to visualize aspects of the audience. We introduce a new technique of embedding users and items into the same latent vector space, which achieves promising results on the well-known MovieLens dataset. To visualize the audience we further use self-organizing maps, whose use for this type of task is, to the best of our knowledge, a novel approach. The last method of this kind is the newly published MDE framework, which overcomes many drawbacks of t-SNE without compromising quality, as we show experimentally. We also address other views of user interactions. Users and items are connected with the help of Sankey diagrams, which offer detailed insight into which groups of users interact with which content and are more informative than simply classifying users into a single category. We also propose an approach to visualizing users' interactions over time, which may help to analyze time-dependent behavior.

Distributed computation environment

Author
Adam Činčura
Year
2014
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
prof. Ing. Pavel Tvrdík, CSc.

Portal for support of educational activities and student evaluation

Author
Zdeněk Balák
Year
2017
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Jiří Hunka
Summary
This thesis is concerned with a learning management system, specifically a system to be used at FIT CTU. It describes its creation from analysis through design to implementation, using software engineering methods as well as web development technologies. The result is a functioning prototype that can be used to support courses. The practical part of the thesis can be used to further extend and develop the learning management system. The text of the thesis can serve as inspiration for web application development.

R&D data mining

Author
Vojtěch Medonos
Year
2013
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Martin Šlapák, Ph.D.

Training set Construction Methods

Author
Tomáš Borovička
Year
2012
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
doc. RNDr. Ing. Marcel Jiřina, Ph.D.

Modern Methods for News Classification

Author
Martin Pirkl
Year
2014
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Tomáš Bartoň, Ph.D.

Understanding Documents with Text Mining Methods

Author
Sergii Stamenov
Year
2017
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Mgr. Petr Paščenko
Summary
A keyword is a term that captures the topic of a document. Keywords can be used for quick article summarization or search optimization. In this work we test whether keywords can improve the quality of document clustering and document classification compared to unigrams. Further, we describe a text mining system that can extract keywords from a document and cluster documents as part of the back end of a news web portal. We also explore possible methods of organizing keywords into a hierarchy and visualizing them over time.

Personalized recommendations for students

Author
Čeněk Žid
Year
2022
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Rodrigo Augusto da Silva Alves, Ph.D.
Summary
This thesis provides an overview on job recommendation for students. The research part describes current state-of-the-art methods of recommender systems and analyses current research for student profiling. The experimental part focusses on the implementation of several different methods described in the research part. These methods are tested and one of those methods is selected, specifically a method based on the term frequency-inverse document frequency algorithm with a custom set of keywords. The general model deals with recommendation based on interactions and is extended with recommendation based on student profiles using the selected method. The presented recommendation system is tested in two experiments. Overall, the results suggest a significant improvement with recommendation using the presented method.

Predictive models in Logistics: Comparison of traditional time series techniques with an artificial neural network model approach

Author
Ruhi Ravichandran
Year
2015
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Jan Motl
Summary
Demand forecasting is a crucial part of managing any supply chain network, since inaccurate forecasting often leads to inventory mismanagement, which in turn causes big losses for companies. Though most companies have some forecasting techniques in place, it is equally important to know whether the techniques being used are best suited to their requirements. This thesis provides a comparative study of traditional time series methods, namely Holt-Winters, exponential smoothing, and ARIMA, with an artificial neural network model in order to forecast inventory levels of multiple SKUs at the last mile of the supply chain, which is a retail store. The comparison is performed using various forecasting accuracy measures. The study provides insights to a company that manufactures a number of SKUs as to which forecasting techniques would best suit its need for managing inventory at the store level, and why. It also sheds light on the factors that affect model performance, for example events linked to sales, such as promotions, the holiday season, and weekends, or the granularity at which the forecasting is done, whether weekly or monthly. Modelling and analysis were performed in the R programming environment. The data used for this study is real-world point-of-sale data provided by a leading fast-moving consumer goods company. It is industry-based research intended to help the company improve its demand forecasting techniques.

Educational and Research Performance Dashboards

Author
Petr Dušek
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Jiří Mlejnek

Vehicle Routing Problem with Time Windows solved via Machine Learning and Optimization Heuristics

Author
Adam Zvada
Year
2021
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Mgr. Ladislava Smítková Janků, Ph.D.
Summary
Novel machine learning approaches have been proposed to solve the vehicle routing problem, yet the variant with soft-constrained time windows has received almost no attention, even though it is a must for any production-ready logistics planner. This thesis proposes a new method for solving the vehicle routing problem with soft-constrained time windows (VRPTW) using deep reinforcement learning. The model is built upon the Transformer architecture and uses a Graph Attention Network to embed the input instance. The model uses a proposed reward function that incorporates the time window constraint. The thesis also explores metaheuristic methods for solving VRPTW, which are used to benchmark the model's performance. The result of this thesis is an end-to-end deep learning model that solves VRPTW, but it is still outperformed by metaheuristic solvers.

Reporting for Efficient Management of Faculty

Author
Adam Kučera
Year
2012
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Michal Valenta, Ph.D.

Algorithms for time-space behaviour analysis of suspects

Author
Ján Tkačík
Year
2016
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Zdeněk Buk, Ph.D.
Summary
Many studies have shown that human mobility patterns have a high degree of both spatial and temporal regularity. This suggests that it is possible to create a model of human behaviour for mobility pattern extraction and location prediction. Detectives empowered with such models will be able to detect potential threats faster as well as detect suspicious behaviour of suspects automatically. We have implemented an application for detecting personally important places and for predicting mobility from personal location history alone. For place detection, a new algorithm, AgarClust, has been proposed. The sequence learning potential of recurrent neural networks has been used for mobility prediction. We implemented and used a Neural Turing Machine (NTM) to model a person's behaviour and to predict their future locations. We showed that the NTM predictor is able to model many mobility patterns with accuracy almost equal to the maximum predictability. The resulting application will help detectives within the Police of the Czech Republic to notice threats faster and make their work more efficient.

Supporting the Diagnosis of Borreliosis by Machine Learning Methods

Author
Jan Motl
Year
2013
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Tomáš Bartoň, Ph.D.

Alumni portal

Author
Tomáš Janda
Year
2013
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Daniel Matocha, MSc.

Analytic web services for the academia industry collaboration portal

Author
Jiří Maroušek
Year
2016
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Aleš Fišer
Summary
The work deals with the implementation of analytical methods based on text mining algorithms, which are designed to facilitate and automate user activity on the faculty's portal for collaboration with industry. The second part is the design and implementation of web services that provide access to the outputs of the analytical methods. The text of the thesis covers the theory behind the text mining algorithms used, analyzes external data usable for the analytical methods, documents the implementation of both parts, and finally discusses the quality of the results.

Recommending related images to articles

Author
Matouš Pištora
Year
2019
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Karel Klouda, Ph.D.
Summary
This diploma thesis focuses on the analysis of state-of-the-art algorithms in image processing and text mining including modern deep learning and neural networks. A system capable of recommending images related to an article based on the text of the article has been designed and implemented with the use of supervised learning. Multiple image and text feature algorithms have been evaluated along with numerous regression algorithms. The system was extended to multiple languages and domains.

Neural Autoencoders in Recommender Systems

Author
Michal Bajer
Year
2019
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Jaroslav Kuchař, Ph.D.
Summary
This thesis is concerned with the potential usage of neural autoencoders in recommender systems, their ability to predict user behaviour and the differences between variants of the models. The goal of the thesis is to explore the possible solutions, determine suitable metrics for measuring the quality of recommendations, implement the promising solutions and compare their performance on available datasets. The result of the thesis is an analysis and a discussion of possible solutions, experimental study of the effects of hyperparameters on the quality of recommendations and the choice of the most suitable model based on performed experiments.

Machine learning based query analysis for an online medical clinic

Author
Adam Jankovec
Year
2021
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Jitka Hrabáková, Ph.D.
Summary
This thesis focuses on machine learning automation of processes of an online medical clinic, while following the Cross-industry standard process for data mining. The result is a successful deployment, into a production environment, of a service, which predicts medical fields based on patients' queries, and the establishment of cooperation between the Faculty of Information Technology and the uLékaře.cz company.

Tools and approaches for new generation journalism

Author
Jakub Bartel
Year
2015
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
prof. Dr. Ing. Petr Kroha, CSc.
Summary
This thesis deals with practical implementation of the methods used in the field of machine learning into the area of online journalism. It explains collection, processing and further usage of text data, social signals and data received from analytic systems for web traffic monitoring. The system described in the thesis is able to personalize the process of following the development of up-to-date world events - topics and events that are currently popular - as well as to predict potential popularity of individual topics in the future. For this purpose it uses a tool for content recommendation, which is combined with a tool for predictive modelling in a nontrivial manner.

Data mining of high school alumni performance at the university

Author
Eliška Hrubá
Year
2014
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Stanislav Kuznetsov, Ph.D.

Visual detection of relations in databases

Author
Matyáš Krutský
Year
2012
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Jaroslav Kuchař, Ph.D.

Anytime Learning with Auto-Sizing Neural Networks

Author
Vojtěch Cahlík
Year
2022
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
prof. Ing. RNDr. Martin Holeňa, CSc.
Summary
Anytime algorithms produce approximate results whose quality improves with computation time. The thesis focuses on applying anytime algorithms to machine learning tasks using auto-sizing neural networks, which are deep learning models that can efficiently prune their components during training and are trainable with gradient-based optimization methods. As part of the thesis, auto-sizing is extended into a novel technique called dynamic auto-sizing, which allows the size and structure of the models to change dynamically during training according to the applied regularization strength, and the technique is incorporated into several anytime learning algorithms. The experimental evaluation shows that dynamic auto-sizing models can successfully be used in various classification and regression tasks and often provide an improvement in predictive performance over traditional approaches.

Information support for clinical studies

Author
Václav Čadek
Year
2013
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Tomáš Borovička

Application to analysis of text messages for investigators

Author
Štěpán Škorpil
Year
2015
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Summary
This thesis describes the design and implementation of software for processing Czech-language text files, which could be used by the Police of the Czech Republic to fight crime. Of the complete solution, the first part has been developed here, intended for scanning text documents, clustering them and searching for similar ones. The aims of the thesis were achieved by finding suitable algorithms for stemming Czech text.

Methods for Metallurgical Samples Identification

Author
Juraj Leškanič
Year
2012
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
prof. Ing. Michal Haindl, DrSc.

Learning to Rank Query Suggestions

Author
František Hejl
Year
2016
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
doc. RNDr. Ing. Marcel Jiřina, Ph.D.
Summary
This thesis studies the problem of ranking results in start and end point autocompletion in a mobile public transit application. Several metrics for evaluating autocompletion quality were suggested and applied for learning an autocompletion algorithm using a dataset of historical real-world searches. The best autocompletion algorithm can predict both start and end points with 29% accuracy and just the start or end point alone with 53% and 50% accuracies respectively. It has been implemented as an efficient library and integrated into the mobile application.

Recommendation algorithms optimization

Author
Jakub Drdák
Year
2018
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Tomáš Řehořek, Ph.D.
Summary
Various recommendation algorithms have been proposed in recent years. However, they all have one thing in common: their hyper-parameters must be tuned in order to achieve good results. This work focuses on selecting modern and scalable algorithms. The aim is to design and implement an optimization procedure capable of fine-tuning their hyper-parameters and to evaluate the results on real-world datasets.

Criminality prediction

Author
Veronika Maurerová
Year
2017
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Summary
The emphasis on work efficiency and the increasing interest in data processing, machine learning and artificial intelligence have made predictive analysis part of police activities, especially in the domain of crime prevention. For example, police patrols are scheduled based on a predictive analysis of the highest-risk areas in the city. This thesis focuses on supervised learning methods and their capability to find hidden patterns in real historical crime data. The objective is to predict future crime with a certain probability using algorithms based on decision trees and neural networks.

Anomaly detection methods

Author
Jan Krejcar
Year
2014
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
doc. RNDr. Ing. Marcel Jiřina, Ph.D.
Summary
In some cases it is more valuable to detect an outlier in a system than to search for normal patterns (medical diagnoses, intrusion detection, fraud detection). Therefore, the search for outliers, known as anomaly detection, is one of the basic parts of data mining. The aim of this thesis is to create a taxonomy of anomaly detection methods, then to choose and implement suitable methods for the needs of a telecommunication company, and to improve these methods for a semi-supervised anomaly detection system.

Liferay communication infrastructure

Author
Marcel Mika
Year
2013
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Mgr. Monika Součková

Historic Landmarks Identification

Author
Peter Bábics
Year
2017
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Tomáš Kalvoda, Ph.D.
Summary
I analyzed existing approaches to recognizing objects from photos and the possible sources of data for historical buildings in Prague, and created a suitable set of photographs for machine learning. I then tested various models and approaches for recognizing historical sites using deep convolutional networks. Finally, the selected model was implemented in a simple framework, which I connected to the chatbot Golem, and I tested the overall integration using the mobile application Messenger.

Portlets opening access to university courseware

Author
Daniel Heglas
Year
2015
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
doc. Ing. Jan Schmidt, Ph.D.

UI design for accessing educational resources

Author
Jan Pavlovský
Year
2013
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
doc. Ing. Jan Schmidt, Ph.D.

Regression modelling for fulltext search engine

Author
Jakub Jirků
Year
2013
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Jan Černý

User authentication and role management for applications of the faculty

Author
Martin Matuška
Year
2012
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Martin Bílý

Scalability of Predictive Modeling Algorithms

Author
Tomáš Frýda
Year
2017
Type
Master thesis
Supervisor
doc. Ing. Pavel Kordík, Ph.D.
Reviewers
Ing. Karel Klouda, Ph.D.
Summary
This thesis has two main goals: (1) to parallelize FAKE GAME by integrating it into H2O, an open-source machine learning framework, and (2) to evaluate the anytime properties of machine learning algorithms and the influence of hyper-parameter optimization on them. To meet these objectives, I integrated FAKE GAME into H2O and, in order to evaluate anytime properties, implemented a new tool called Benchmarker. The evaluation of anytime properties shows that for some problems FAKE GAME models outperform state-of-the-art models from H2O in both accuracy and performance. Moreover, the evaluation of hyper-parameter optimization shows little success when optimizing H2O machine learning algorithms. I hypothesise that the negligible performance improvement, and for some optimized models even lower performance than with the default configuration, is caused by the automatic hyper-parameter tuning that H2O performs by default for some hyper-parameters.