Ing. Tomáš Řehořek, Ph.D.

Theses

Bachelor theses

Framework for evaluation of frequent sequences algorithms in Recommender Systems

Author
Michal Bajer
Year
2017
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
doc. Ing. Pavel Kordík, Ph.D.
Summary
This bachelor thesis deals with the analysis of sequential data, the search for frequent patterns in time-based data, and their use for recommending products to customers of Internet shops based on their purchase history. The aim of the thesis is to design and implement a framework that processes collected data, finds patterns, and determines the probable continuation of a customer's purchase sequence. Another goal is to allow the evaluation of the success of several different algorithms and to choose the most suitable one for a specific purpose. The result of the thesis is the design, a discussion of options, and the implementation of a functional framework for product recommendation and success evaluation. Finally, the quality of the recommendations was analysed using the implemented algorithms. The implementation was done in Java.

Evaluation of Distributed Database Systems in Context of Personalized Recommendation

Author
Filip Křesťan
Year
2016
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Summary
The aim of this thesis is to compare distributed database systems in terms of performance on two use cases which closely follow the needs of an item-based collaborative filtering recommender algorithm. The first use case is a cache of precomputed predictions. The second use case is a sparse matrix representation with a focus on the selection of a whole matrix row and column. In the first part, the recommendation algorithm is presented and set in the context of the specified use cases. In the next part, possible representations of a sparse matrix with a focus on the specified operations are explored. Based on the obtained information, four distributed database systems suitable for the specified use cases were selected: Aerospike, Couchbase, Riak and MemSQL. Also in this part, the Yahoo! Cloud Serving Benchmark was chosen as a benchmarking framework suitable for the specified use cases and the selected database systems. The central part of this thesis describes in detail the methodology used for the performance comparison, the benchmark framework extensions required for this experiment and the environment the test was conducted in. Finally, the methodology is applied in the last part of this section, which describes the execution of the performance comparison. The results of the measurement, which are presented in the last chapter, show that Aerospike outperformed the rest of the selected database systems in both use cases in terms of throughput (operations per second) and operation latency. The results also suggest that both workloads could be combined in one Aerospike database cluster with great advantage.

Web environment for algorithms on binary trees

Author
Jakub Melezínek
Year
2013
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
Ing. Jiří Chludil

Content Recommendation using Text Mining Methods

Author
Martin Barus
Year
2014
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
doc. Ing. Pavel Kordík, Ph.D.

Analyzing Impact of Interaction Context in Collaborative Filtering

Author
Martin Scheubrein
Year
2019
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
Ing. Kamil Dedecius, Ph.D.
Summary
Collaborative filtering is one of the most successful techniques used in recommender systems. The basic algorithms utilize the history of interactions between users and items. However, recommenders deployed in production often have at least one more dimension of data available: the timestamp of the interaction. Such interaction circumstances are collectively referred to as the context. This thesis exploits the additional information in order to improve overall recommender accuracy. Several novel approaches to incorporating context into traditional collaborative filtering are proposed. An evaluation framework is designed, and the proposed algorithms are extensively evaluated with different parameters and contexts on multiple datasets. Results show that even mostly static datasets benefit from the proposed context-aware approach. A recall increase of about 5-35 % was observed in comparison with traditional collaborative filtering algorithms.

Interactive Geometric Sketchpad for Web Client

Author
David Šenkýř
Year
2014
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
Ing. Pavel Štěpán

Tool for Analysis of Artificial Neural Networks

Author
Vít Steklý
Year
2013
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
Ing. Ivo Petr, Ph.D.

Design and Implementation of Feature Selection Module in Modgen Recommender System

Author
Michal Režnický
Year
2016
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
doc. Ing. Pavel Kordík, Ph.D.
Summary
This bachelor thesis deals with the extension of an existing recommender system with new functionality in the form of a module based on customer requirements. The thesis is intended to represent a common application of software engineering. In this particular case, the customer intends to extend the Modgen recommender system with a module implementing feature selection. The work starts with the assignment and the collection of requirements for the new module using software engineering methods. This is followed by an analysis of the existing system to gain a better picture of the environment in which the module will be implemented. Further analysis covers the methods and algorithms that can be used. Based on the analysis, I introduce the design of the new module and its placement in the system, and the module is implemented on the basis of this design. Then various testing data are evaluated, and finally the results are discussed and summarized.

Recommender system for online gaming discount portal

Author
Tomáš Fedor
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
doc. Ing. Pavel Kordík, Ph.D.

Evaluation of Local Sensitive Hashing (LSH) Algorithm in Recommender Systems

Author
Ladislav Martínek
Year
2018
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
doc. Ing. Pavel Kordík, Ph.D.
Summary
This thesis deals with the approximation of the user-based k-nearest neighbors algorithm using locality-sensitive hashing methods and the application of these methods in recommender systems. First, the thesis describes recommender systems, collaborative filtering, k-nearest neighbors algorithms and their approximation using locality-sensitive hashing methods. Based on the analysis, a framework was designed and implemented to test various parameterizations of locality-sensitive hashing methods. The framework can test the precision of the nearest neighbors algorithm or the success rate of recommendation (using the recall rate depending on the catalog coverage). The described methods are tested on two separate databases using the framework. From the tests of individual methods and parameterizations, the possibilities of their combinations are derived to achieve optimal models. During the testing of the optimal models I was able to achieve very satisfactory results. At around 3 % of the reference solution time, the success rate reached was between 97 % and 99 %. Finally, I discuss the results and the different findings from the testing of the individual methods.

Recommendation Models Based on Images

Author
Martin Pavlíček
Year
2018
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
doc. Ing. Pavel Kordík, Ph.D.
Summary
The aim of this thesis is to design, implement and compare a set of new content-based recommender models that use image processing methods for item similarity extraction, with a focus on web and e-commerce recommendations. The proposed models are meant as an alternative to the widely used collaborative filtering-based recommender systems, which have a set of problems, including the cold-start problem. In this thesis, modern methods such as the ORB algorithm or artificial neural networks are used for image similarity extraction. The proposed models are tested offline in the latter part of this thesis using the recall and catalog coverage metrics on real-world e-shop datasets. The artificial neural network-based recommender model achieved the best results in offline tests of all proposed models and was subsequently evaluated in an online A/B test against the production collaborative filtering-based recommender model. A total of 7435 users took part in this test, and the proposed model based on an artificial neural network showed a click-through rate comparable to that of the production collaborative filtering-based model.

Advanced gradient optimization methods of artificial neural networks in RapidMiner 5

Author
Tomáš Richtr
Year
2014
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
prof. Dr. Ing. Petr Kroha, CSc.
Summary
This bachelor thesis deals with creating a plugin for RapidMiner 5 that provides an artificial neural network operator with advanced learning algorithms. The created operator extends the capabilities of RapidMiner 5 for training artificial neural networks with the Delta-Bar-Delta, Quickprop and Rprop algorithms. The thesis contains an analysis of the learning algorithms and of the current solution in RapidMiner 5, the design of the plugin, its implementation, and final testing of the created plugin on different data sets.

Web Framework for Stepping and Visualization of Algorithms

Author
David Šolc
Year
2013
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
prof. Ing. Pavel Tvrdík, CSc.

Comparison of various success measures in collaborative filtering

Author
Martin Hak
Year
2014
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
doc. Ing. Pavel Kordík, Ph.D.

Client-Server Application for Two Player Games using WebSocket

Author
Radmir Usmanov
Year
2013
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
Ing. Petr Jendele

Content-Based Recommendation Model Trained Using Interaction Similarity

Author
Petr Kasalický
Year
2018
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
doc. Ing. Pavel Kordík, Ph.D.
Summary
This bachelor's thesis describes recommender systems and two major approaches, collaborative filtering and content-based recommendation. A new hybrid approach that combines these two methods is proposed. This method increases the recall of content-based recommendation by up to 216% and allows more precise recommendation of newly added items, which suffer from the cold-start problem. The designed and implemented approach uses machine learning methods such as embeddings and artificial neural networks, which are also briefly introduced along with a way of evaluating the quality of recommendations.

Vector Graphics Editor for Web Client

Author
Michal Kvasnička
Year
2013
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
Mgr. Peter Franek, Ph.D.

Comparison of different neuroevolution algorithms on several problem domains

Author
Tomáš Frýda
Type
Bachelor thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
Ing. Jan Drchal, Ph.D.

Master theses

Analysis of Collaborative Filtering Algorithms on Multimedial Data

Author
Jakub Pištěk
Year
2013
Type
Master thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
doc. Ing. Pavel Kordík, Ph.D.

Recommendations Model Based on Recurrent Neural Networks

Author
Ladislav Martínek
Year
2020
Type
Master thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
Ing. Mgr. Ladislava Smítková Janků, Ph.D.
Summary
This diploma thesis deals with matters of recommendation systems. The aim is to use recurrent neural networks (LSTM, GRU) to predict the subsequent interactions using sequential data from user behavior. Matrix factorization adapted for datasets with implicit feedback is used to create a representation of items (embeddings). An algorithm for creating recurrent models using the embeddings is designed and implemented in this thesis. Furthermore, an evaluation method respecting the sequential nature of the data is proposed. This evaluation method uses recall and catalog coverage metrics. Experiments are performed systematically to determine the dependencies on the observed methods and hyperparameters. The measurements were performed on three datasets. On the most extensive dataset, I managed to achieve more than double recall against other recommendation techniques, which were represented by collaborative filtering, reminder model, and popularity model. The findings, possible improvement by hyper-parametrization, and different possible means of model improvement are discussed at the end of the work.

Counterfactual Learning-to-Rank in Personalized Search

Author
Michael Kolínský
Year
2023
Type
Master thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
Rodrigo Augusto da Silva Alves, Ph.D.
Summary
This thesis explores the field of search engines, with a particular emphasis on Counterfactual Learning to Rank, position bias, and document selection bias in historical interactions, personalization of search results, and success metrics for offline ranking evaluation. The study aims to design and implement a framework to learn suitable models utilizing Counterfactual Learning to Rank methods that are used to compare the ranking performance of the models and train unbiased models. Additionally, some document-specific search features as well as user-specific features are proposed to enhance the performance of these models. Offline experiments were conducted on two significantly different provided industrial datasets to assess the retrieval performance of various models using the selected methods. Part of the experiments are dedicated to the comparison of different personalization approaches. The performance of these models was evaluated using appropriate success metrics for offline counterfactual evaluation, as well as other offline evaluation metrics. In conclusion, this research contributes to search engine optimization. The study's findings have implications for the personalization of search results and the development of more effective search engine algorithms.

Recommendation Algorithms for Hierarchical Data

Author
Kryštof Zindulka
Year
2023
Type
Master thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
doc. Ing. Pavel Kordík, Ph.D.
Summary
Classic recommendation approaches consist of recommending items to users. Often there is some hierarchical structure, or multiple such structures, above these items. This paper presents a method for recommending elements of a hierarchical structure. Parts of such a structure are referred to as segments. There are several practical situations where it makes sense to recommend elements of a hierarchical structure, such as recommending series instead of episodes. In this paper, a new way of converting items into segments is proposed. For recommending segments, modified classical recommendation algorithms are used. The quality of the proposed solution is presented through experiments conducted on two industrial datasets from streaming platforms.

Evaluation of Matrix Factorization Algorithms in Collaborative Filtering

Author
Tomáš Richtr
Year
2016
Type
Master thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
doc. Ing. Pavel Kordík, Ph.D.
Summary
This diploma thesis is concerned with matrix factorization methods in collaborative filtering. After an initial analysis of matrix factorization methods, a procedure for evaluating factorization models using the recall and catalog coverage metrics is designed. This design is implemented, and a set of experiments on the MovieLens and GoOut datasets is carried out with the stochastic gradient descent algorithm as implemented by the LIBMF and Apache Mahout libraries. The results of this thesis are the outputs of these experiments and a prepared process for the effective evaluation of factorization algorithms.

Deep Latent Factor Models for Recommender Systems

Author
Radek Bartyzal
Year
2019
Type
Master thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
MSc. Juan Pablo Maldonado Lopez, Ph.D.
Summary
Recommendation systems help users discover relevant items. One type of model used to generate recommendations is the latent factor model. We survey state-of-the-art neural network-based latent factor models and implement four of them. We also design and implement a novel architecture of a deep latent factor model called Hybrid cSDAE that is able to process both the rating and attribute information. We comprehensively evaluate the implemented models on standard datasets.

Comparison of Online and Offline Evaluation Metrics in Recommender Systems

Author
Petr Kasalický
Year
2021
Type
Master thesis
Supervisor
Ing. Tomáš Řehořek, Ph.D.
Reviewers
Ing. Karel Klouda, Ph.D.
Summary
The goal of this work is to explore Recommender Systems and methods of evaluating them. The focus is on comparing online and offline approaches of evaluation, as their relationship is highly questionable. In research, recall is commonly used to optimize recommendation algorithms. However, recall can suffer from various problems and may not always correspond to online metrics such as click-through rate. This claim is experimentally verified by measuring the correlation between recall and the click-through rate using a production Recommender System. As shown by a set of exhaustive experiments with a large number of models and metrics on several industrial datasets, the correlation between recall and the click-through rate is not always guaranteed. This raises a big question about current methods of comparing models in research. As a partial improvement of offline metrics, a new approach of measuring recall is introduced to reflect better the sequence nature of interactions as well as their non-random distribution and increase correlation with the click-through rate.