Mgr. Martin Jureček, Ph.D.

Profile
Publications
Projects
Teaching
Theses

Theses

Sample theses

Bachelor theses

Real Time Attack of A5/1

Author

Tomáš Hradský

Year

2015

Type

Bachelor thesis

Supervisor

Mgr. Martin Jureček

Reviewers

doc. Ing. Ivan Šimeček, Ph.D.

Department

Department of Computer Systems

Summary

This bachelor thesis is focused on analysis of cipher A5/1 and on ways of exploiting its weaknesses. A5/1 is a stream cipher which is used to provide over-the-air privacy for mobile telecommunication. We focus mainly on time-memory tradeoff attacks which allow the attack to be conducted within tens of seconds on the expense of a vast precomputation phase. The result of the practial part of this thesis is the possibility to uncover the secret key from captured data.

Thesis on DSpace

Correlation Attack on A5/1

Author

Martin Holec

Year

2015

Type

Bachelor thesis

Supervisor

Mgr. Martin Jureček

Reviewers

doc. Ing. Ivan Šimeček, Ph.D.

Department

Department of Computer Systems

Summary

Description of A5/1 cipher, basics of cryptographic attacks and analysis of correlation attack. This thesis is helpful for everyone who is intrested in stream ciphers, basic ideas of cryptographic attacks and correlation attack on A5/1 in particular.

Thesis on DSpace

Static detection of malicious PE files

Author

Jakub Ács

Year

2018

Type

Bachelor thesis

Supervisor

Mgr. Martin Jureček

Reviewers

prof. Ing. Róbert Lórencz, CSc.

Department

Department of Computer Systems

Summary

Since the classical used approaches for malicious software (malware) detection are failing to provide sufficient level of protection, it is becoming clear that these will have to be substituted or at least enhanced by new, inovative methods in the future. This thesis focuses on utilizing machine learning techniques for malware detection. Using static features extracted from the PE files like imported functions, we were able to train various machine learning models for malware detection. The best performing model reached almost 95\% accuracy. This model can be used for instance, for preliminary detection of malicious PE files. Another purpose of the thesis can be found in the following research, it could serve as another input for future automatic malware analysis studies.

Thesis on DSpace

Cryptanalysis of RSA based on factorization

Author

Petr Horák

Year

2020

Type

Bachelor thesis

Supervisor

Mgr. Martin Jureček

Reviewers

prof. Ing. Róbert Lórencz, CSc.

Department

Department of Computer Systems

Summary

This bachelor's thesis deals with RSA cryptanalysis based on integer factorization. Described are some of the factorization methods. Emphasis is laid on their computational complexities and comparison in real-life scenarios of use. Another topic in this work is a discussion of security issues of RSA which can occur by incautious use of this cryptographic method. Selected algorithms are implemented in Magma language for speed testing on the sample data.

Thesis on DSpace

Static malware detection using recurrent neural networks

Author

Matouš Kozák

Year

2020

Type

Bachelor thesis

Supervisor

Mgr. Martin Jureček

Reviewers

prof. Ing. Róbert Lórencz, CSc.

Department

Department of Theoretical Computer Science

Summary

An ever-growing number of malicious attacks on our IT infrastructure calls for new and better methods of protection. In this thesis, we focus on the use of recurrent neural networks as an agile and accurate way of detecting malware. We only used features extracted from files in the PE file format to represent the suspicious programs which we used to train various types of recurrent neural networks. In this work, we present unique neural network architecture combining dense and stacked LSTM layers to classify PE files. We worked with our dataset of 30,154 files collected from available resources with which we achieved an accuracy of 98.41%, while only 0.5% of benign samples were misclassified as malware on our balanced dataset. All this was accomplished with only 250 epochs of training. These results prove that machine-learning algorithms, especially LSTM networks, can be used as a quick and reliable tool for malware detection.

Thesis on DSpace

Interpretability of machine learning-based results of malware detection using a set of rules

Author

Jan Dolejš

Year

2021

Type

Bachelor thesis

Supervisor

Mgr. Martin Jureček, Ph.D.

Reviewers

prof. Ing. Róbert Lórencz, CSc.

Department

Department of Computer Systems

Summary

Machine learning methods have been quite successful in a variety of applications. Antivirus companies use them for quick and reliable malware detection, providing their users with a safer environment from ceaseless daily threats. However, machine learning methods such as deep neural networks are often considered black boxes as the reasoning behind their decisions may often be unclear. Their interpretability is important and helps understand potential errorful decisions. This thesis studies rule-learning algorithms and explores their potential to interpret the outcomes of machine learning algorithms. Two publicly available datasets with Portable Executable file attributes and tailor-made implementations of rule-learning algorithms were used throughout the work. Results showed that algorithm RIPPER is mostly successful at this task; it achieved high accuracies while maintaining compact sets of rules, making rule-learning algorithms a useful alternative to signature-based methods.

Thesis on DSpace

Hard Mathematical Problems in Cryptography

Author

Marek Holík

Year

2022

Type

Bachelor thesis

Supervisor

Mgr. Martin Jureček, Ph.D.

Reviewers

RNDr. Jiřina Scholtzová, Ph.D.

Department

Department of Computer Systems

Summary

This thesis describes protocols and ciphers which base their security on integer factorization problem, quadratic residuosity problem and discrete logarithm problem. Random instances of mathematical problems are generated based on the described protocols and ciphers for comparison of Magma, SageMath and MATLAB implementations. Efficiency is evaluated in terms of average time needed and success rate for solving instances of a fixed length. The time for solving one instance is limited to one hour.

Thesis on DSpace

Master theses

Semi-supervised learning for malware detection

Author

Michal Buchovecký

Year

2019

Type

Master thesis

Supervisor

Mgr. Martin Jureček

Reviewers

prof. Ing. Róbert Lórencz, CSc.

Department

Department of Information Security

Summary

Nowadays, the use of machine learning for malware detection is not very popular. One of the reasons is that labelling of malware and benign files necessary for machine learning is very expensive process. This thesis is focused on malware detection by semi - supervised learning. Semi-supervised learning is a machine learning technique that makes use of labelled as well as unlabelled samples for training. Information obtained from executable files in PE format was used for training. In the thesis it is showed that it is possible to reach better accuracy using semi - supervised learning, compared to purely supervised approach.

Thesis on DSpace

Simulation of malware detection model

Author

Libor Šlechta

Year

2020

Type

Master thesis

Supervisor

Mgr. Martin Jureček

Reviewers

prof. Ing. Róbert Lórencz, CSc.

Department

Department of Information Security

Summary

The number of harmful programs is still rising, and attackers invent new techniques to avoid detection every day. This thesis focuses on automatic malware classification by using machine learning algorithms. The main difference from other work in this field of study is experiment that mimics behavior of a detection model. Four machine learning algorithms were used for this experiment. Neural networks, K-nearest neighbors, Decision trees and Naive Bayes classifier. The best result was achieved by using a variant of neural network. Multilayer perceptron network simulated the model with accuracy of 98,68 %.

Thesis on DSpace

Algebraic Cryptanalysis of Small Scale Variants of the AES

Author

Marek Bielik

Year

2021

Type

Master thesis

Supervisor

Mgr. Martin Jureček, Ph.D.

Reviewers

prof. Ing. Róbert Lórencz, CSc.

Department

Department of Theoretical Computer Science

Summary

This work proposes and demonstrates new advances in algebraic cryptanalysis of small scale derivatives of the Advanced Encryption Standard (AES). We model the AES as a system of polynomial equations over GF(2), which involves only the variables of the initial key, and we subsequently attempt to solve such a system. We show, for example, that one of the attacks can recover the secret key for one round of the AES-128 under one minute on a contemporary CPU. This attack requires only two known plaintexts and their corresponding ciphertexts.

Thesis on DSpace

Reducing Overdefined Systems of Polynomial Equations Derived from Small Scale Variants of the AES

Author

Jana Berušková

Year

2023

Type

Master thesis

Supervisor

Mgr. Martin Jureček, Ph.D.

Reviewers

prof. Ing. Róbert Lórencz, CSc.

Department

Department of Information Security

Summary

This master thesis deals with the possibility of increasing efficiency of the calculation of the secret key of the AES cipher (or its small scale derivatives) by different types of reduction of overdefined system of polynomial equations, which models the cipher. The reduction is done by selecting and xoring such polynomials, to simplify the overall calculation. One possibility is, for example, xoring polynomials with the same most-significant monomial or finding the most similar polynomials using clustering.

Thesis on DSpace

Algebraic Cryptanalysis of LFSR-based Stream Ciphers

Author

Jiří Soukup

Year

2022

Type

Master thesis

Supervisor

Mgr. Martin Jureček, Ph.D.

Reviewers

prof. Ing. Róbert Lórencz, CSc.

Department

Department of Information Security

Summary

Keystream produced by LFSR-based stream ciphers can be described by a system of polynomial equations. Such ciphers can be cracked by solving the system. Using the F4 algorithm we can effectively solve a system of polynomial equations. Knowing the parameters of an LFSR-based stream cipher we can further speed up the computation by applying the guess-and-determine method. In this thesis, we describe how the F4 algorithm works. And mainly we try to find effective ways of applying the guess-and-determine method to reduce the computation time of the F4 algorithm when used to solve a system of polynomial equations that represent an LFSR-based stream cipher.

Thesis on DSpace

Application of Reinforcement Learning to Creating Adversarial Malware Samples

Author

Matouš Kozák

Year

2023

Type

Master thesis

Supervisor

Mgr. Martin Jureček, Ph.D.

Reviewers

doc. Ing. Ivan Šimeček, Ph.D.

Department

Department of Theoretical Computer Science

Summary

Machine learning is becoming increasingly popular as a go-to approach for many tasks due to its world-class results. As a result, antivirus developers are starting to incorporate machine learning models into their products. While these models improve malware detection capabilities, they also carry the disadvantage of being susceptible to adversarial attacks. Although this sensitivity has been demonstrated for many models in white-box settings, a black-box attack is more applicable in practice for the domain of malware detection. Therefore, we present a black-box scenario in which the attacker only has the predicted label at his disposal and forgoes any other information about the target classifier. Using reinforcement learning algorithms, we implemented an attack against the GBDT classifier trained on the EMBER dataset. We trained several RL agents on a dataset of Windows malware with an emphasis on preserving the original functionality of the malicious samples. We achieved an evasion rate of 58.92% against the targeted classifier using the PPO algorithm. In addition to targeting this detector, we studied how the adversarial attack can be transferred to other models. The agent previously trained against the GBDT classifier scored an evasion rate of 28.91% against MalConv, a model based solely on machine learning. We also tested the generated adversarial examples against top AV programs and achieved an evasion rate ranging from 10.24% to 25.7%. These results prove that not only machine learning-based models are vulnerable to adversarial attacks and that better safeguards need to be taken to protect our systems.

Thesis on DSpace

Generating Malware Family Signatures from Behavioral Graphs using Unsupervised Learning

Author

Tomáš Zvara

Year

2022

Type

Master thesis

Supervisor

Mgr. Martin Jureček, Ph.D.

Reviewers

prof. Ing. Róbert Lórencz, CSc.

Department

Department of Information Security

Summary

The behavioral shield is a component of Avast AV responsible for monitoring the system and identifying suspicious behavior of running processes. The behavior is captured in the form of behavioral graphs. There is ongoing internal research that studies the options to use novel deep learning models, i.e., graph neural networks, to allow high-scale learning on these graphs. This thesis aims to study three different graph embeddings, which were produced by the existing graph neural network models, and verify whether the embedded representations allow distinguishing the malicious behavior of various malware strains. The structure of embedded spaces is analyzed using well-known clustering methods, namely k-means, DBSCAN, and agglomerative clustering. The results of the clustering process are evaluated by intrinsic and extrinsic measures. The hypothesis is that the formed clusters should represent individual malware families and thus can be used to create a behavioral signature to detect them. However, performed experiments show that the applied clustering methods produce low-quality clusters that do not allow separating the selected malware strains. There are two factors that cause the low performance. The first one is the poor expressibility of the behavioral graphs with respect to the individual malware strains. The second one is the low quality of the provided labels.

Thesis on DSpace

Selection of Representative Samples from Datasets for Malware Detection

Author

Lukáš Děd

Year

2024

Type

Master thesis

Supervisor

Mgr. Martin Jureček, Ph.D.

Reviewers

Ing. Matouš Kozák

Department

Department of Information Security

Summary

This thesis focuses on the selection of representative instances for the training set in malware detection. Experiments were conducted on two publicly available datasets containing metadata of Windows PE files, namely the EMBER and SOREL-20M datasets. The theoretical part describes data preprocessing methods, instance selection algorithms, and classification algorithms used in the practical part of this thesis. It also includes a description of the structure of PE files. The practical part outlines the process of preprocessing datasets and main experiments related to the comparison of state-of-the-art instance selection algorithms. As part of the thesis, modifications to the parallel instance selection algorithm PIF were proposed and implemented, and these were also experimentally evaluated and compared with the results of state-of-the-art instance selection algorithms.

Thesis on DSpace

Algebraic Cryptanalysis of Small Scale Variants of the Grain-128AEAD cipher

Author

Daniel Minarovič

Year

2023

Type

Master thesis

Supervisor

Mgr. Martin Jureček, Ph.D.

Reviewers

Ing. Josef Kokeš, Ph.D.

Department

Department of Information Security

Summary

This master thesis deals with algebraic cryptanalysis which converts the Grain-128AEADv2 cipher, hereinafter referred to as Grain, into a system of polynomial equations and then solves it using Groebner bases. Algebraic cryptanalysis will be applied to small scale variants of Grain cipher. For a more efficient calculation of the solution, we will use the Locality Sensitive Hashing method for simplifying the system of polynomial equations. As part of the experiments, we were able to break, for example, a 16-bit small scale variant of the cipher with the number of rounds of 64, as well as the 24-bit variant with number of rounds of 8. The generation time of the equations for the mentioned 24-bit variant took approximately 8 minutes, and the calculation of the solution took 113 seconds. We also applied the Locality Sensitive Hashing algorithm and thereby accelerated the calculation of Groebner bases and solutions. For example, for the 24-bit version with the number of rounds of 8, we were able to reduce the calculation of the solution from 113 seconds to 1,88 seconds.

Thesis on DSpace

A Comparison of Adversarial Learning Techniques for Malware Detection

Author

Pavla Louthánová

Year

2023

Type

Master thesis

Supervisor

Mgr. Martin Jureček, Ph.D.

Reviewers

Ing. Matouš Kozák

Department

Department of Information Security

Summary

Malware is one of the most significant security threats today. Early detection is important for effective malware protection. Machine learning has proven to be a useful tool for automated malware detection. However, research has shown that machine learning models are vulnerable to adversarial attacks. This thesis discusses adversarial learning techniques in malware detection. The aim is to apply some existing methods for generating adversarial malware samples, test their effectiveness against selected malware detectors, and compare the evasion rate achieved and their practical applicability. The thesis begins with an introduction to adversarial machine learning, followed by a description of the portable executable file format and a review of publications that focus on generating adversarial malware samples. The techniques used to generate malware samples for experimental evaluation are then presented. Finally, the experiments performed are described, including observation of the time required to generate samples, changes in sample size after using the generator, testing effectiveness against antivirus programs, combining the use of multiple generators to generate samples, and evaluation of the results. Five generators were selected for the experiments: Partial DOS, Full DOS, GAMMA padding, GAMMA section-injection and Gym-malware. The results showed that applying optimised modifications to previously detected malware can lead to incorrect classification of the file as benign. It was also found that generated malware samples can be successfully used against detection models other than those used to generate them, and that using combinations of generators can create new samples that evade detection. Experiments show that the Gym-malware generator, which uses a reinforcement learning approach, has the greatest practical potential. This generator achieved an average sample generation time of 5.73 seconds and the highest evasion rate of 67%. When used in combination with itself, the evasion rate improved to 78%.

Thesis on DSpace

Unsupersived Instance Selection for Malware Detection

Author

Mehmet Efe Zorlutuna

Year

2023

Type

Master thesis

Supervisor

Mgr. Martin Jureček, Ph.D.

Reviewers

Mgr. Tomáš Rabas

Department

Department of Information Security

Summary

This work proposes a new unsupervised instance selection algorithm for malware detection and evaluates its effectiveness. The proposed algorithm selects informative instances from unlabelled data to train a machine learning model, which is expected to improve its accuracy and efficiency. Experiments show a performance comparison of the algorithm and an existing unsupervised instance selection algorithm.

Thesis on DSpace

Algebraic Cryptanalysis of Small-Scale Variants of Stream Cipher E0

Author

Jan Dolejš

Year

2024

Type

Master thesis

Supervisor

Mgr. Martin Jureček, Ph.D.

Reviewers

prof. Ing. Róbert Lórencz, CSc.

Department

Department of Information Security

Summary

This work introduces and demonstrates innovative progress in the algebraic analysis of the small-scale variants of the stream cipher E0 from the Bluetooth standard. We design the small-scale variants and represent them using a set of polynomial equations. Our work reveals a possible linear relation between the number of keystream bits and the size of the small-scale E0 variants, improving the performance of the used solvers. Our best run finds the initial configuration in $178.5$ seconds for the 22-bit E0 version. Using local sensitivity hashing, we improved the computational time of the SAT solver from 453.1 seconds to 85.3 seconds for the 19-bit E0 version.

Thesis on DSpace

Reducing the Surface for Adversarial Attacks in Malware Detectors

Author

Benjamín Peraus

Year

2024

Type

Master thesis

Supervisor

Mgr. Martin Jureček, Ph.D.

Reviewers

Ing. Matouš Kozák

Department

Department of Information Security

Summary

Adversarial attacks pose a significant problem in malware detection because they allow trivial modifications to already detected malware to recreate undetectable malware and cause misclassification in machine learning models, even in black-box scenarios. The goal of this work is to study defensive techniques and implement a tool that can mitigate the impact of these attacks by preprocessing samples to minimize the attack surface needed to create adversarial samples. Our technique has been subjected to rigorous testing against a number of adversarial generators. The results of this testing have demonstrated the efficacy of our approach, with a notable reduction in the evasion rate of detection for a select few generators to zero percent. This has been achieved without any adverse impact on the detection accuracy of common malware.

Thesis on DSpace

Algebraic Cryptanalysis of Small Scale Variants of the AES Based on Lattice Basis Reduction

Author

Tomáš Faikl

Year

2025

Type

Master thesis

Supervisor

Mgr. Martin Jureček, Ph.D.

Reviewers

Ing. Ivo Petr, Ph.D.

Department

Department of Information Security

Summary

This thesis explores algebraic cryptanalysis of small-scale AES variants, introducing novel preprocessing techniques to simplify polynomial systems. Key contributions include new methods based on linear codes and lattice theory to improve sparsity and reduce polynomial degree, along with a novel application of the Gilbert-Varshamov bound to assess preprocessing effectiveness for pseudo-random ciphers. Additionally, the summary of Lenstra-Lenstra-Lovász (LLL) algorithm's adoption for linear codes is described based on a very recent effort in the area. Three new preprocessing strategies are proposed, which can be combined for enhanced efficiency. Experimental results show significant improvements in solving polynomial systems, achieving up to 40-fold runtime reduction compared to previous works in our line research of algebraic preprocessing. Pseudo-random ciphers remain unaffected by preprocessing methods, however we provide results with improved runtime by tuning parameters of the F4 algorithm.

Thesis on DSpace

Machine Learning-Based Malware Classification by Type and Family

Author

David Bálik

Year

2025

Type

Master thesis

Supervisor

Mgr. Martin Jureček, Ph.D.

Reviewers

Ing. Eliška Krátká

Department

Department of Information Security

Summary

This thesis addresses the challenge of malware classification using machine learning by developing a novel dataset labeled at both the malware type and family levels. Raw binaries were collected from sources such as VirusShare, VX Underground, and MalwareBazaar, and subsequently labeled with family information parsed from binary names and type-level labels integrated from ClarAVy. The dataset was processed using a unified feature extraction pipeline based on static analysis, particularly extracting features from PE headers, to support advanced classification tasks. The evaluation was focused on three key classification tasks. In the binary classification of malware versus benign samples, Random Forest and XGBoost achieved high accuracy on the full datasets, reaching 98.5% for type-based detection and 98.98% for family-based detection. When using truncated datasets of 1,000 samples to assess performance under limited data conditions, both models still performed strongly, achieving 97.6% for type-based detection and 98.66% for family-based detection. For interclass classification, which distinguishes between malware types or families, the models reached up to 97.5% accuracy on type-level tasks and up to 93.7% on family-level tasks. In the multiclass classification setting, which assigns samples to the correct type or family, SVM achieved 81.1% accuracy on type labels, while Random Forest and XGBoost reached approximately 73.4% on family labels. The results highlight practical trade-offs between accuracy and computational cost, and demonstrate that labeling at both the type and family levels enables more fine-grained and insightful malware classification. The thesis establishes a robust foundation for future research on advanced malware detection and classification.