Application of Distance Metric Learning to Automated Malware Detection

Year
2021
Published
IEEE Access. 2021, 2021(9), 96151-96165. ISSN 2169-3536.
Type
Article
Annotation
Distance metric learning aims to find the most appropriate distance metric parameters to improve similarity-based models such as k-Nearest Neighbors or k-Means. In this paper, we apply distance metric learning to the problem of malware detection. We focus on two tasks: (1) to classify malware and benign files with a minimal error rate, (2) to detect as much malware as possible while maintaining a low false positive rate. We propose a malware detection system using Particle Swarm Optimization that finds the feature weights to optimize the similarity measure. We compare the performance of the approach with three state-of-the-art distance metric learning techniques. We find that metrics trained in this way lead to significant improvements in the k-Nearest Neighbors classification. We conducted and evaluated experiments with more than 150,000 Windows-based malware and benign samples. Features consisted of metadata contained in the headers of executable files in the portable executable file format. Our experimental results show that our malware detection system based on distance metric learning achieves a 1.09 % error rate at 0.74 % false positive rate (FPR) and outperforms all machine learning algorithms considered in the experiment. Considering the second task related to keeping minimal FPR, we achieved a 1.15 % error rate at only 0.13 % FPR.

Active Directory Kerberoasting Attack: Detection using Machine Learning Techniques

Authors
Kotlaba, L.; Fornůsek, S.; Lórencz, R.
Year
2021
Published
Proceedings of the 7th International Conference on Information Systems Security and Privacy. Madeira: SciTePress, 2021. p. 376-383. ISSN 2184-4356. ISBN 978-989-758-491-6.
Type
Proceedings paper
Annotation
Active Directory is a prevalent technology used for managing identities in modern enterprises. As a variety of attacks exist against Active Directory environments, security monitoring is crucial. This paper focuses on the detection of one particular attack: Kerberoasting. The purpose of this attack is to gain access to service accounts’ credentials without the need for elevated access rights. The attack is nowadays typically detected using traditional “signature-based” detection approaches. Those, however, often result in a high number of false alerts. In this paper, we adopt machine learning techniques, particularly several anomaly detection algorithms, for the detection of Kerberoasting. The algorithms are evaluated on data from a real Active Directory environment and compared to the traditional detection approach, with a focus on reducing the number of false alerts.

Improving Classification of Malware Families using Learning a Distance Metric

Year
2021
Published
Proceedings of the 7th International Conference on Information Systems Security and Privacy. Madeira: SciTePress, 2021. p. 643-652. ISSN 2184-4356. ISBN 978-989-758-491-6.
Type
Proceedings paper
Annotation
The objective of malware family classification is to assign a tested sample to the correct malware family. This paper concerns the application of selected state-of-the-art distance metric learning techniques to malware family classification. The goal of distance metric learning algorithms is to find the most appropriate distance metric parameters with respect to some optimization criteria. The distance metric learning algorithms considered in our research learn from metadata, mostly contained in the headers of executable files in the PE file format. Several experiments were conducted on a dataset of 14,000 samples consisting of six prevalent malware families and benign files. The experimental results showed that the average precision and recall of the k-Nearest Neighbors algorithm improved significantly when it used a distance learned on training data rather than a non-learned distance. The k-Nearest Neighbors classifier using the Mahalanobis distance metric learned by the Metric Learning for Kernel Regression method achieved an average precision and recall of 97.04% each. By comparison, Random Forest, which achieved the best classification results among the state-of-the-art ML algorithms considered in our experiments, reached 96.44% average precision and 96.41% average recall.

Representation of PE Files using LSTM Networks

Authors
Jureček, M.; Kozák, M.
Year
2021
Published
Proceedings of the 7th International Conference on Information Systems Security and Privacy. Madeira: SciTePress, 2021. p. 516-525. ISSN 2184-4356. ISBN 978-989-758-491-6.
Type
Proceedings paper
Annotation
An ever-growing number of malicious attacks on IT infrastructures calls for new and efficient methods of protection. In this paper, we focus on malware detection using Long Short-Term Memory (LSTM) networks as a preprocessing tool to increase the classification accuracy of machine learning algorithms. To represent the malicious and benign programs, we used features extracted from files in the PE file format. We created a large dataset on which we performed common feature preparation and feature selection techniques. With the help of various LSTM and Bidirectional LSTM (BLSTM) network architectures, we further transformed the collected features and trained other supervised ML algorithms on both transformed and vanilla datasets. Transformation by deep (4 hidden layers) versions of LSTM and BLSTM networks performed well and decreased the error rate of several state-of-the-art machine learning algorithms significantly. For each machine learning algorithm considered in our experiments, the LSTM-based transformation of the feature space decreased the corresponding error rate by more than 58.60 % compared with the untransformed feature space.

Automatic Detection and Decryption of AES by Monitoring S-box Access

Authors
Kokeš, J.; Matějka, J.; Lórencz, R.
Year
2021
Published
Proceedings of the 7th International Conference on Information Systems Security and Privacy. Madeira: SciTePress, 2021. p. 172-180. ISSN 2184-4356. ISBN 978-989-758-491-6.
Type
Proceedings paper
Annotation
In this paper we propose an algorithm that can automatically detect the use of AES and automatically recover both the encryption key and the plaintext. It makes use of the fact that we can monitor accesses to the AES S-Box and deduce the desired data from these accesses; the approach is suitable for software-based AES implementations, both naïve and optimized. To demonstrate the feasibility of this approach we designed a tool which implements the algorithm for Microsoft Windows running on the Intel x86 architecture. The tool has been successfully tested against a set of applications using different cryptographic libraries and common user applications.

Comparison of three counter value based ROPUFs on FPGA

Year
2020
Published
Proceedings of the 23rd Euromicro Conference on Digital Systems Design. Los Alamitos, CA: IEEE Computer Soc., 2020. p. 205-212. ISBN 978-1-7281-9535-3.
Type
Proceedings paper
Annotation
This paper extends our previous work, in which we proposed a Ring Oscillator (RO) based Physical Unclonable Function (PUF) on FPGA. Our approach is able to extract multiple output bits from each RO pair, in contrast to the classical approach, where the frequencies of ROs are compared. In this work we investigate the behaviour of our proposed PUF design, together with two other similar proposals that are also based on extracting PUF bits from counter values. We evaluate these proposals under stable operating conditions. Furthermore, we compare the behaviour of all three designs when mutually asymmetric and symmetric ROs are used. All of the measurements were performed on Digilent Cmod S7 FPGA boards (Xilinx XC7S25-1CSGA225C).

Lightweight Authentication and Secure Communication Suitable for IoT Devices

Year
2020
Published
Proceedings of the 6th International Conference on Information Systems Security and Privacy. Madeira: SciTePress, 2020. p. 75-83. ISSN 2184-4356. ISBN 978-989-758-399-5.
Type
Proceedings paper
Annotation
In this paper we present protocols for lightweight authentication and secure communication for IoT and embedded devices. The protocols use a combined PUF/TRNG circuit as a basic building block. The goal is to show the possibilities of securing the communication and authentication of embedded systems by using a PUF and a TRNG for secure key generation, without the requirement to store secrets on the device itself. This significantly simplifies key management on simple hardware devices and microcontrollers while still allowing secure communication.

Active Directory Kerberoasting Attack: Monitoring and Detection Techniques

Authors
Kotlaba, L.; Fornůsek, S.; Lórencz, R.
Year
2020
Published
Proceedings of the 6th International Conference on Information Systems Security and Privacy. Madeira: SciTePress, 2020. p. 432-439. ISSN 2184-4356. ISBN 978-989-758-399-5.
Type
Proceedings paper
Annotation
This paper focuses on the detection of the Kerberoasting attack in Active Directory environments. The purpose of the attack is to extract service accounts’ passwords without the need for any special user access rights or privilege escalation, which makes it suitable for the initial phases of a network compromise and for further pivoting to more interesting accounts. The main goal of the paper is to discuss the monitoring possibilities and the setup of detection rules built on top of native Active Directory auditing capabilities, including possible ways to minimize false positive alerts.

Distance Metric Learning using Particle Swarm Optimization to Improve Static Malware Detection

Year
2020
Published
Proceedings of the 6th International Conference on Information Systems Security and Privacy. Madeira: SciTePress, 2020. p. 725-732. ISSN 2184-4356. ISBN 978-989-758-399-5.
Type
Proceedings paper
Annotation
Distance metric learning is concerned with finding appropriate parameters of a distance function with respect to a particular task. In this work, we present a malware detection system based on static analysis. We use a k-nearest neighbors (KNN) classifier with a weighted heterogeneous distance function that can handle nominal and numeric features extracted from the portable executable file format. Our proposed approach attempts to specify the weights of the features using the particle swarm optimization algorithm. The experimental results indicate that KNN with the weighted distance function improves classification accuracy significantly.
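The idea of a weighted heterogeneous distance inside a KNN classifier can be illustrated with a minimal sketch. The annotation does not give the exact distance formula, so the HEOM-style definition below (overlap distance for nominal features, range-normalized difference for numeric ones) is an assumption, as are all function names, feature values, and weights. In the paper the weights are found by particle swarm optimization; here they are simply fixed by hand.

```python
from collections import Counter


def weighted_heterogeneous_distance(a, b, weights, nominal, ranges):
    """Weighted HEOM-style distance (an assumed form, not the paper's exact one).

    nominal: set of indices of nominal features (compared by equality).
    ranges:  per-feature value range used to normalize numeric features.
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in nominal:
            d = 0.0 if x == y else 1.0          # overlap distance
        else:
            d = abs(x - y) / ranges[i] if ranges[i] else 0.0
        total += weights[i] * d * d
    return total ** 0.5


def knn_predict(train, labels, query, weights, nominal, ranges, k=3):
    """Majority vote among the k training samples closest to the query."""
    order = sorted(
        range(len(train)),
        key=lambda j: weighted_heterogeneous_distance(
            train[j], query, weights, nominal, ranges
        ),
    )
    votes = Counter(labels[j] for j in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical PE-header-like features: (header size, subsystem, section count).
train = [(512, "GUI", 3), (520, "GUI", 4), (4096, "CONSOLE", 9), (4000, "CONSOLE", 8)]
labels = ["benign", "benign", "malware", "malware"]
nominal = {1}
ranges = [3584.0, None, 6.0]      # None for the nominal feature
weights = [1.0, 0.5, 2.0]         # fixed by hand here; the paper learns them with PSO

prediction = knn_predict(train, labels, (600, "GUI", 3), weights, nominal, ranges)
```

An optimizer such as PSO would treat `weights` as the particle position and use the classification error on validation data as the fitness to minimize.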

Analysis of the CTU Teaching Survey's Anonymity

Author
Eliška Helikarová
Year
2023
Type
Bachelor thesis
Supervisor
Ing. Josef Kokeš, Ph.D.
Reviewers
Ing. Michal Valenta, Ph.D.
Summary
The bachelor's thesis deals with an anonymity analysis of the Anketa CTU application, an online survey system used by the Czech Technical University for an anonymous course and instructor evaluation. The application falls into the category of anonymous survey systems, meaning the data contained in the survey questionnaires submitted by students should not be linked to the student's identity. The thesis examines this property, its implementation within the application, and discusses possible threats and pitfalls that could potentially compromise the users' privacy. The chosen approach for the anonymity analysis involves a threat analysis using the LINDDUN privacy threat modeling framework. The threat modeling process consists of studying and describing the application using both the publicly available information as well as the source code files and other parts of the system which are not accessible to the general public. After the information gathering phase, the next step is to create data flow diagrams of the system which depict individual system components and data flows. The individual parts of the system, as well as the system as a whole, are then examined for privacy threats that fit into any of the LINDDUN threat categories. The listed threats are then evaluated by their impact on the students' anonymity. The thesis describes the application's inner workings as well as a JWT misconfiguration and other vulnerabilities discovered within the system. Finally, the thesis proposes suitable mitigation strategies and anonymity-enhancing solutions.

Protecting Sensitive Data in Memory in .NET

Author
Viktor Dohnal
Year
2023
Type
Master thesis
Supervisor
Ing. Josef Kokeš, Ph.D.
Reviewers
Ing. Jiří Dostál, Ph.D.
Summary
Applications frequently handle sensitive information, such as passwords and encryption keys, which are typically stored in volatile memory alongside other data. This thesis investigates the efficacy and implementation of memory protection techniques within the .NET ecosystem. The findings have been applied to the analysis of the KeePass password manager, which led to a vulnerability discovery. The vulnerability allows an attacker to recover the master password from memory, even when a workspace is locked or KeePass is no longer running.

A Comparison of Adversarial Learning Techniques for Malware Detection

Author
Pavla Louthánová
Year
2023
Type
Master thesis
Supervisor
Mgr. Martin Jureček, Ph.D.
Reviewers
Ing. Matouš Kozák
Summary
Malware is one of the most significant security threats today. Early detection is important for effective malware protection. Machine learning has proven to be a useful tool for automated malware detection. However, research has shown that machine learning models are vulnerable to adversarial attacks. This thesis discusses adversarial learning techniques in malware detection. The aim is to apply some existing methods for generating adversarial malware samples, test their effectiveness against selected malware detectors, and compare the evasion rate achieved and their practical applicability. The thesis begins with an introduction to adversarial machine learning, followed by a description of the portable executable file format and a review of publications that focus on generating adversarial malware samples. The techniques used to generate malware samples for experimental evaluation are then presented. Finally, the experiments performed are described, including observation of the time required to generate samples, changes in sample size after using the generator, testing effectiveness against antivirus programs, combining the use of multiple generators to generate samples, and evaluation of the results. Five generators were selected for the experiments: Partial DOS, Full DOS, GAMMA padding, GAMMA section-injection and Gym-malware. The results showed that applying optimised modifications to previously detected malware can lead to incorrect classification of the file as benign. It was also found that generated malware samples can be successfully used against detection models other than those used to generate them, and that using combinations of generators can create new samples that evade detection. Experiments show that the Gym-malware generator, which uses a reinforcement learning approach, has the greatest practical potential. This generator achieved an average sample generation time of 5.73 seconds and the highest evasion rate of 67%. When used in combination with itself, the evasion rate improved to 78%.

Exploring Vulnerabilities of the Internet of Things Devices

Author
Zdena Tropková
Year
2023
Type
Master thesis
Supervisor
Ing. Jiří Dostál, Ph.D.
Reviewers
Ing. Tomáš Luňák
Summary
In this thesis, we introduce a ranking list of the ten most common vulnerabilities in Internet of Things devices. The main aim was to provide ranking lists created from public data with a transparent creation methodology, because ranking lists meeting these requirements currently do not exist. For example, the popular OWASP project published its most recent ranking list in 2018, and other existing up-to-date ranking lists do not disclose their creation methodology or the data sources used. Furthermore, we propose a similar ranking list only for camera devices. We also present the most common vulnerability for different smart device categories. In addition, a scraping tool for vulnerability collection was implemented in the Scrapy framework, and an analysis of three vulnerabilities in the context of Internet of Things devices was performed. The selected vulnerability categories are Access Control, Overflow, and Password Management.

Framework for autonomous improvement of network traffic classification

Author
Jaroslav Pešek
Year
2022
Type
Master thesis
Supervisor
Ing. Dominik Soukup
Reviewers
Ing. Simona Fornůsek, Ph.D.
Summary
This diploma thesis deals with the problem of classifying primarily encrypted network traffic by applying machine learning algorithms. Machine learning is a subfield of artificial intelligence which relies heavily on sufficiently large and general datasets. The first goal is to analyze methods that not only improve such classification over time, but also iteratively build the updated dataset. The second goal is to create a prototype of a software framework capable of doing so, while also being able to evaluate the classification. The analysis part introduces the active learning method and discusses the state of the art and the relevance of the methods to the network traffic domain. The design part defines the requirements and designs the solution architecture. The final part of the thesis is focused on experiments. The output of the work is a prototype of the software framework and an evaluation of various active learning methods for the network traffic domain.

Linear Cryptanalysis of Baby Rijndael and Implementation Side Channels of AES

Author
Ing. Josef Kokeš
Year
2022
Type
Dissertation thesis
Supervisor
prof. Ing. Róbert Lórencz, CSc.
Reviewers
Assoc. Prof. Brice Colombier, Ph.D.
Mgr. Jakub Breier, Ph.D.
doc. Ing. Zdeněk Martinásek, Ph.D.

Adaptive mitigation of DDoS attacks based on online analysis

Author
Pavel Šiška
Year
2021
Type
Master thesis
Supervisor
doc. Ing. Tomáš Čejka, Ph.D.
Reviewers
Ing. Simona Fornůsek, Ph.D.
Summary
This thesis deals with the design and implementation of a tool for online packet analysis of network traffic. The main goal is to provide the administrator with the information necessary to set up defence mechanisms for mitigating DDoS attacks. The tool provides an overview of the current structure of the network traffic. It can also identify and recommend mitigation rules to suppress a DDoS attack, based on the characteristics of volumetric DDoS attacks. To store data for analysis, the tool uses special probabilistic data structures called sketches, which can efficiently store large amounts of data with low memory requirements. The performance and functionality of the tool were tested in a lab on test data at speeds of up to 100 Gb/s.
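To illustrate how a sketch can track per-flow traffic volume in bounded memory, here is a minimal sketch of one such structure. The summary does not say which sketch the thesis uses, so the Count-Min sketch below is an assumption chosen for illustration; the class name, the parameters, and the example addresses are hypothetical.

```python
import hashlib


class CountMinSketch:
    """Approximate counter table with depth rows of width counters each.

    Estimates never undercount; hash collisions can only inflate them.
    """

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # Derive an independent-looking hash per row by prefixing the row number.
        digest = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for r in range(self.depth):
            self.rows[r][self._index(r, key)] += count

    def estimate(self, key):
        # The minimum over rows is the least collision-inflated counter.
        return min(self.rows[r][self._index(r, key)] for r in range(self.depth))


# Count bytes per source address (hypothetical packets: address, size in bytes).
sketch = CountMinSketch()
packets = [("198.51.100.7", 1500)] * 50 + [("203.0.113.9", 64)] * 3
for src, size in packets:
    sketch.add(src, size)

heavy = sketch.estimate("198.51.100.7")   # at least 75000
light = sketch.estimate("203.0.113.9")    # at least 192
```

Memory use is fixed at `width * depth` counters regardless of how many distinct addresses appear, which is the property that makes such structures attractive at high line rates.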

Side-channel analysis of Rainbow post-quantum signature

Author
David Pokorný
Year
2021
Type
Master thesis
Supervisor
Ing. Petr Socha
Reviewers
Dr.-Ing. Martin Novotný
Summary
Rainbow, a layered multivariate quadratic digital signature, is a candidate for standardization by the National Institute of Standards and Technology (NIST). In this paper, we present a CPA side-channel attack on the submitted 32-bit reference implementation. We evaluate the attack on an STM32F3 ARM microcontroller. After a successful attack, we propose countermeasures against side-channel attacks. Countermeasures are implemented and evaluated using leakage assessment.

Web skimming analysis

Author
Pavlína Kopecká
Year
2021
Type
Master thesis
Reviewers
Ing. Josef Kokeš, Ph.D.
Summary
This diploma thesis is about attacks on e-commerce websites. It focuses on a method called web skimming, which modifies website source code to steal customers' payment card data directly from the browser. This work analyzes the vulnerabilities that are abused to infiltrate websites, the ways of hiding malicious code in the website source code, and the methods of stealing payment data. It proposes ways to defend against web skimming attacks and implements a browser add-on to prevent these attacks.

Security analysis of Drive Snapshot

Author
Michal Bambuch
Year
2021
Type
Master thesis
Supervisor
Ing. Josef Kokeš, Ph.D.
Reviewers
Ing. Jiří Dostál, Ph.D.
Summary
This thesis addresses the security analysis of Drive Snapshot. It presents the results of reverse engineering the key parts of the program, describes the cryptographic algorithms used, and evaluates the application's security. During the security analysis, several security vulnerabilities were discovered that could weaken the cryptography used or compromise the security of passwords or created backups.

Detection of IoT Malware in Computer Networks

Author
Daniel Uhříček
Year
2021
Type
Master thesis
Supervisor
Ing. Karel Hynek, Ph.D.
Reviewers
Ing. Jiří Dostál, Ph.D.
Summary
This master thesis deals with the problem of IoT malware and the possibilities of its detection in computer networks using flow-based monitoring concepts. We present solutions for each of the identified critical aspects of IoT malware network behavior separately. Furthermore, we propose a novel method to discover infected devices using a combination of network indicators. The proposed detection method was implemented in the form of a software prototype capable of processing real network traffic as part of the NEMEA system. The final solution was evaluated both on anonymized captures and up-to-date malware samples.

Advanced error control codes using Wolfram Mathematica

Author
Stanislav Koleník
Year
2021
Type
Master thesis
Supervisor
Ing. Pavel Kubalík, Ph.D.
Reviewers
Ing. Jiří Buček, Ph.D.
Summary
Error-control codes are used in digital communication systems to protect data against noise during transmission. There are many methods to achieve this kind of protection, all of which are mathematical in nature. A set of teaching materials in the Wolfram Mathematica computing system has been developed in the past to demonstrate some of these methods. The aim of this work is to extend the set by adding some more advanced codes.

Security monitoring of Active Directory environment based on Machine Learning techniques

Author
Lukáš Kotlaba
Year
2021
Type
Master thesis
Supervisor
Ing. Simona Fornůsek, Ph.D.
Reviewers
Ing. Jiří Dostál, Ph.D.
Summary
Active Directory is a central point of administration and identity management in many organizations. Ensuring its security is indispensable to protect user credentials, enterprise systems, and sensitive data from unauthorized access. Security monitoring of Active Directory environments is typically performed using signature-based detection rules. However, those are not always effective and sufficient, especially for attacks similar to legitimate activity from the auditing perspective. This thesis applies machine learning techniques for detecting two such attack techniques - Password Spraying and Kerberoasting. Several machine learning algorithms are utilized based on features from Windows Event Log and evaluated on data originating from a real Active Directory environment. Best approaches are implemented as detection rules for practical use in the Splunk platform. In experimental comparison with signature-based approaches, the proposed solution was able to improve detection capabilities, and at the same time, reduce the number of false alarms for both considered attack techniques.

Automatic Malware Detection

Author
Mgr. Martin Jureček
Year
2021
Type
Dissertation thesis
Supervisor
prof. Ing. Róbert Lórencz, CSc.
Reviewers
Prof. Mark Stamp; Assoc. Prof. Carles Mateu, PhD.; Ing. Sebastián García, Ph.D.

Hardware generated keys for cryptographic systems and protocols

Author
Ing. Simona Buchovecká
Year
2021
Type
Dissertation thesis
Supervisor
prof. Ing. Róbert Lórencz, CSc.
Reviewers
Assoc. Prof. Jens-Peter Kaps, PhD.; Assoc. Prof. Florent Bernard; doc. Ing. Dominik Macko, PhD.

The Ring Oscillator based PUF on FPGAs

Author
Ing. Filip Kodýtek
Year
2021
Type
Dissertation thesis
Supervisor
prof. Ing. Róbert Lórencz, CSc.
Reviewers
Prof. Kris Gaj, PhD.; Assoc. Prof. Brice Colombier; doc. Ing. Zdeněk Vašíček, Ph.D.