Department of Theoretical Computer Science

Publications

The complexity landscape of decompositional parameters for ILP: Programs with few global variables and constraints

Authors
Dvorak, P.; Eiben, E.; Ganian, R.; Knop, D.; Ordyniak, S.
Year
2021
Published
Artificial Intelligence. 2021, 300 ISSN 0004-3702.
Type
Article
Annotation
Integer Linear Programming (ILP) has a broad range of applications in various areas of artificial intelligence. Yet in spite of recent advances, we still lack a thorough understanding of which structural restrictions make ILP tractable. Here we study ILP instances consisting of a small number of "global" variables and/or constraints such that the remaining part of the instance consists of small and otherwise independent components; this is captured in terms of a structural measure we call fracture backdoors which generalizes, for instance, the well-studied class of N-fold ILP instances.

Local linear set on graphs with bounded twin cover number

Authors
Year
2021
Published
Information Processing Letters. 2021, 170 ISSN 1872-6119.
Type
Article
Annotation
Inspired by recent results for the so-called fair and local linear problems, we define two problems, LOCAL LINEAR SET and LOCAL LINEAR VERTEX COVER. Here the task is to find a set of vertices X subset of V (which is a vertex cover) of the input graph (V, E) such that the local linear constraints l(v) <= vertical bar X boolean AND N(v)vertical bar <= u(v) hold for all v is an element of V; here, l(v) and u( v) are on the input. We prove that both problems are in FPT for the parameter twin cover number (using integer linear programming).

Parameterized complexity of configuration integer programs

Authors
Knop, D.; Koutecky, M.; Levin, A.; Mnich, M.; Onn, S.
Year
2021
Published
Operations Research Letters. 2021, 49(6), 908-913. ISSN 0167-6377.
Type
Article
Annotation
Configuration integer programs (IP) have been key in the design of algorithms for NP-hard high-multiplicity problems. First, we develop fast exact (exponential-time) algorithms for Configuration IP and matching hardness results. Second, we showcase the implications of these results to bin-packing and facility-location-like problems.

Kernelization of graph hamiltonicity: Proper H-graphs

Authors
Chaplick, S.; Fomin, F.V.; Golovach, P.A.; Knop, D.; Zeman, P.
Year
2021
Published
SIAM Journal on Discrete Mathematics. 2021, 35(2), 840-892. ISSN 0895-4801.
Type
Article
Annotation
We obtain new polynomial kernels and compression algorithms for Path Cover and Cycle Cover, the well-known generalizations of the classical Hamiltonian Path and Hamiltonian Cycle problems. Our choice of parameterization is strongly influenced by the work of Biró, Hujter, and Tuza, who in 1992 introduced H-graphs, intersection graphs of connected subgraphs of a subdivision of a fixed (multi-)graph H. In this work, we turn to proper H-graphs, where the containment relationship between the representations of the vertices is forbidden. As the treewidth of a graph measures how similar the graph is to a tree, the size of graph H is the parameter measuring the closeness of the graph to a proper interval graph. We prove the following results. Path Cover admits a kernel of size O(||H||^8), where ||H|| is the size of graph H. In other words, we design an algorithm that for an n-vertex graph G and integer k >= 1, in time polynomial in n and ||H||, outputs a graph G' of size O(||H||^8) and k' <= |V(G')| such that the vertex set of G is coverable by k vertex-disjoint paths if and only if the vertex set of G' is coverable by k' vertex-disjoint paths. Hamiltonian Cycle admits a kernel of size O(||H||^8). Cycle Cover admits a polynomial kernel. We prove it by providing a compression of size O(||H||^10) into another NP-complete problem, namely, Prize Collecting Cycle Cover, that is, we design an algorithm that, in time polynomial in n and ||H||, outputs an equivalent instance of Prize Collecting Cycle Cover of size O(||H||^10). In all our algorithms we assume that a proper H-decomposition is given as a part of the input.

Graph isomorphism restricted by lists

Authors
Klavík, P.; Knop, D.; Zeman, P.
Year
2021
Published
Theoretical Computer Science. 2021, 860 51-71. ISSN 0304-3975.
Type
Article
Annotation
The complexity of graph isomorphism (GRAPHISO) is a famous problem in computer science. For graphs G and H, it asks whether they are the same up to a relabeling of vertices. In 1981, Lubiw proved that list restricted graph isomorphism (LISTISO ) is NP -complete: for each u∈V(G), we are given a list L(u)⊆V(H) of possible images of u. After 35 years, we revive the study of this problem and consider which results for GRAPHISO can be modified to solve LISTISO. We prove: 1) Under certain conditions, GI -completeness of a class of graphs implies NP -completeness of LISTISO. 2) Several combinatorial algorithms for GRAPHISO can be modified to solve LISTISO: for trees, planar graphs, interval graphs, circle graphs, permutation graphs, and bounded treewidth graphs. 3) LISTISO is NP -complete for cubic colored graphs with sizes of color classes bounded by 8 with all lists of size at most 3.

High-Multiplicity Fair Allocation Made More Practical

Authors
Bredereck, R.; Figiel, A.; Kaczmarczyk, A.; Knop, D.; Niedermeier, R.
Year
2021
Published
AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, Virtual Event, United Kingdom, May 3-7, 2021. New York: ACM, 2021. p. 260-268. ISSN 1548-8403. ISBN 978-1-7138-3262-1.
Type
Proceedings paper
Annotation
The envy-free, Pareto-efficient allocation of indivisible goods leads to computationally hard problems. There is a big variety of modeling issues, such as agent-specific utility functions or (high numbers of) different types of goods. In recent work, Bredereck et al. [ACM EC~2019] addressed this topic by showing (theoretical) fixed-parameter tractability results for "high-multiplicity fair allocation", exploiting parameters such as the number of agents or maximum absolute utility values. To this end, they used a number of tools from (theoretical) integer linear programming. We "engineer" their work towards practical usefulness, thereby being able to solve all real-world instances from the state-of-art online platform "spliddit.org for provably fair solutions". Besides providing the foundations for a fast tool for fair allocations, we also offer a flexible framework with the possibility to relax fairness or efficiency demands so to, e.g., allow tradeoffs between fairness and social welfare. Moreover, our framework provides ways to interpret and explain "solution paths" which makes it possible to perform further explorations in cases when no envy-free and efficient allocations exist.

Constant Factor Approximation for Tracking Paths and Fault Tolerant Feedback Vertex Set

Authors
Blažej, V.; Choudhary, P.; Knop, D.; Křišťan, J.; Suchý, O.; Valla, T.
Year
2021
Published
Approximation and Online Algorithms - 19th International Workshop, WAOA 2021, Lisbon, Portugal, September 6-10, 2021, Revised Selected Papers. Springer, Cham, 2021. p. 23-38. Lecture Notes in Computer Science (LNCS). vol. 12982. ISSN 0302-9743. ISBN 978-3-030-92701-1.
Type
Proceedings paper
Annotation
Consider a vertex-weighted graph G with a source s and a target t. Tracking Paths requires finding a minimum weight set of vertices (trackers) such that the sequence of trackers in each path from s to t is unique. In this work, we derive a factor 66-approximation algorithm for Tracking Paths in weighted graphs and a factor 4-approximation algorithm if the input is unweighted. This is the first constant factor approximation for this problem. While doing so, we also study approximation of the closely related r -Fault Tolerant Feedback Vertex Set problem. There, for a fixed integer r and a given vertex-weighted graph G, the task is to find a minimum weight set of vertices intersecting every cycle of G in at least r+1 vertices. We give a factor O(r^2) approximation algorithm for r -Fault Tolerant Feedback Vertex Set if r is a constant.

Fine-Grained View on Bribery for Group Identification

Authors
Boehmer, N.; Bredereck, R.; Knop, D.; Luo, J.
Year
2020
Published
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence. International Joint Conferences on Artificial Intelligence Organization, 2020. p. 67-73. ISBN 978-0-9992411-6-5.
Type
Proceedings paper
Annotation
Given a set of individuals qualifying or disqualifying each other, group identification is the task of identifying a socially qualified subgroup of individuals. Social qualification depends on the specific rule used to aggregate individual qualifications. The bribery problem in this context asks how many agents need to change their qualifications in order to change the outcome. Complementing previous results showing polynomial-time solvability or NP-hardness of bribery for various social rules in the constructive (aiming at making specific individuals socially qualified) or destructive (aiming at making specific individuals socially disqualified) setting, we provide a comprehensive picture of the parameterized computational complexity landscape. Conceptually, we also consider a more fine-grained concept of bribery cost, where we ask how many single qualifications need to be changed, and a more general bribery goal that combines the constructive and destructive setting.

Balancing the Spread of Two Opinions in Sparse Social Networks (Student Abstract)

Year
2022
Published
Proceedings of the 36th AAAI Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2022. p. 12987-12988. ISSN 2159-5399. ISBN 978-1-57735-876-3.
Type
Proceedings paper
Annotation
We propose a new discrete model for simultaneously spreading two opinions within a social network inspired by the famous Target Set Selection problem. We are given a social network, a seed-set of agents for each opinion, and two thresholds per agent. The first threshold represents the willingness of an agent to adopt an opinion if she has no opinion at all, while the second threshold states the readiness to acquire a second opinion. The goal is to add as few agents as possible to the initial seed-sets such that, once the process started with these seed-set stabilises, each agent has either both opinions or none. We perform an initial study of its computational complexity. It is not surprising that the problem is NP-hard even in quite restricted settings. Therefore, we investigate the complexity of the problem from the parameterized point-of-view with special focus on sparse networks, which appears often in practice. Among other things, we show that the proposed problem is in the FPT complexity class if we parameterize by the vertex cover number of the underlying graph.

Automorphisms of the cube n^d

Authors
Dvořák, P.; Valla, T.
Year
2021
Published
Discrete Mathematics. 2021, 2021(344(3)), ISSN 0012-365X.
Type
Article
Annotation
Consider a hypergraph n^d where the vertices are points of the d-dimensional cube [n]^d and the edges are all sets of n points such that they are in one line. We study the structure of the group of automorphisms of n^d, i.e., permutations of points of [n]^d preserving the edges. In this paper we provide a complete characterization. Moreover, we consider the Colored Cube Isomorphism problem of deciding whether for two colorings of the vertices of n^d there exists an automorphism of n^d preserving the colors. We show that this problem is GI-complete.

A Parameterized Complexity View on Collapsing k-Cores

Authors
Luo, J.; Molter, H.; Suchý, O.
Year
2021
Published
Theory of Computing Systems. 2021, 65(8), 1243-1282. ISSN 1432-4350.
Type
Article
Annotation
We study the NP-hard graph problem Collapsed k-Core where, given an undirected graph G and integers b, x, and k, we are asked to remove b vertices such that the k-core of remaining graph, that is, the (uniquely determined) largest induced subgraph with minimum degree k, has size at most x. Collapsed k-Core was introduced by Zhang et al. (2017) and it is motivated by the study of engagement behavior of users in a social network and measuring the resilience of a network against user drop outs. Collapsed k-Core is a generalization of r-Degenerate Vertex Deletion (which is known to be NP-hard for all r >= 0) where, given an undirected graph G and integers b and r, we are asked to remove b vertices such that the remaining graph is r-degenerate, that is, every its subgraph has minimum degree at most r. We investigate the parameterized complexity of Collapsed k-Core with respect to the parameters b, x, and k, and several structural parameters of the input graph. We reveal a dichotomy in the computational complexity of Collapsed k-Core for k <= 2 and k >= 3. For the latter case it is known that for all x >= 0 Collapsed k-Core is W[P]-hard when parameterized by b. For k <= 2 we show that Collapsed k-Core is W[1]-hard when parameterized by b and in FPT when parameterized by (b + x). Furthermore, we outline that Collapsed k-Core is in FPT when parameterized by the treewidth of the input graph and presumably does not admit a polynomial kernel when parameterized by the vertex cover number of the input graph.

Minimum Eccentricity Shortest Path Problem with Respect to Structural Parameters

Authors
Kučera, M.; Suchý, O.
Year
2021
Published
Combinatorial Algorithms. Cham: Springer International Publishing, 2021. p. 442-455. Lecture Notes in Computer Science. vol. 12757. ISSN 0302-9743. ISBN 978-3-030-79986-1.
Type
Proceedings paper
Annotation
The Minimum Eccentricity Shortest Path Problem consists in finding a shortest path with minimum eccentricity in a given undirected graph. The problem is known to be NP-complete and W[2]-hard with respect to the desired eccentricity. We present fpt algorithms for the problem parameterized by the modular width, distance to cluster graph, the combination of distance to disjoint paths with the desired eccentricity, and maximum leaf number.

On Kernels for d-Path Vertex Cover

Authors
Červený, R.; Choudhary, P.; Suchý, O.
Year
2022
Published
47th International Symposium on Mathematical Foundations of Computer Science (MFCS 2022). Wadern: Schloss Dagstuhl - Leibniz Center for Informatics, 2022. p. 29:1-29:14. Leibniz International Proceedings in Informatics (LIPIcs). vol. 241. ISSN 1868-8969. ISBN 978-3-95977-256-3.
Type
Proceedings paper
Annotation
In this paper we study the kernelization of the d-Path Vertex Cover (d-PVC) problem. Given a graph G, the problem requires finding whether there exists a set of at most k vertices whose removal from G results in a graph that does not contain a path (not necessarily induced) with d vertices. It is known that d-PVC is NP-complete for d≥ 2. Since the problem generalizes to d-Hitting Set, it is known to admit a kernel with O(dk^d) edges. We improve on this by giving better kernels. Specifically, we give kernels with O(k^2) vertices and edges for the cases when d = 4 and d = 5. Further, we give a kernel with O(k^4d^{2d+9}) vertices and edges for general d.

Automata Approach to Inexact Tree Pattern Matching Using 1-degree Edit Distance

Year
2021
Published
Proceedings of the Prague Stringology Conference 2021. Praha: CESKE VYSOKE UCENI TECHNICKE V PRAZE, 2021. p. 1-15. ISBN 978-80-01-06869-4.
Type
Proceedings paper
Annotation
We compare labeled ordered trees based on unit cost 1-degree edit distance that uses operations vertex relabeling, leaf insertion, and leaf deletion. Given an input tree T and a tree pattern P, we find all subtrees in T that match P with up to k errors. We show that this problem can be solved by finite automaton when T and P are represented in linear, prefix bar, notation. First, we solve this problem by a pushdown automaton. Then, we show that it can be transformed into a nondeterministic finite automaton due to its restricted use of the pushdown store. We also show a simulation of the nondeterministic finite automaton by dynamic programming.

Hierarchical Bitmap Indexing for Range Queries on Multidimensional Arrays

Authors
Krčál, L.; Ho, S.-S.; Holub, J.
Year
2022
Published
Database Systems for Advanced Applications. Springer, Cham, 2022. p. 509-525. Lecture Notes in Computer Science. vol. 13245. ISSN 0302-9743. ISBN 978-3-031-00122-2.
Type
Proceedings paper
Annotation
Bitmap indices are widely used in commercial databases for processing complex queries, due to their efficient use of bit-wise operations. Bitmap indices apply natively to relational and linear datasets, with distinct separation of the columns or attributes, but do not perform well on multidimensional array scientific data. We propose a new method for multidimensional array indexing that considers the spatial component of multidimensional arrays. The hierarchical indexing method is based on sparse n-dimensional trees for dimension partitioning, and bitmap indexing with adaptive binning for attribute partitioning. This indexing performs well on range queries involving both dimension and attribute constraints, as it prunes the search space early. Moreover, the indexing is easily extensible to membership queries. The indexing method was implemented on top of a state of the art bitmap indexing library Fastbit, using tables partitioned along any subset of the data dimensions. We show that the hierarchical bitmap index outperforms conventional bitmap indexing, where an auxiliary attribute is required for each dimension. Furthermore, the adaptive binning significantly reduces the amount of bins and therefore memory requirements.

Backward Pattern Matching on Elastic Degenerate Strings

Authors
Procházka, P.; Cvacho, O.; Krčál, L.; Holub, J.
Year
2021
Published
Proceedings of 14th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2021). Lisboa: SCITEPRESS – Science and Technology Publications, Lda, 2021. p. 50-59. vol. 3. ISSN 2184-4305. ISBN 978-989-758-490-9.
Type
Proceedings paper
Annotation
Recently, the concept of Elastic Degenerate Strings (EDS) was introduced as a way of representing a sequenced population of the same species. Several on-line Elastic Degenerate String Matching (EDSM) algorithms were presented so far. Some of them provide practical implementation. We propose a new on-line EDSM algorithm BNDM-EDS. Our algorithm combines two traditional algorithms BNDM and the Shift-And that were adapted to the specifics needed by Elastic Degenerate Strings. BNDM-EDS is running in O (Nmd m w e) worst-case time. This implies O (Nm) time for small patterns, where m is the length of the searched pattern, N is the size of EDS, and w is the size of the computer word. The algorithm uses O (N + n) space, where n is the length of EDS. BNDM-EDS requires a simple preprocessing step with time and space O (m). Experimental results on real genomic data show superiority of BNDM-EDS over state-of-the-art algorithms.

Data Structures to Represent a Set of k-long DNA Sequences

Authors
Chikhi, R.; Holub, J.; Medvedev, P.
Year
2021
Published
ACM Computing Surveys. 2021, 54(1), 17:1-17:22. ISSN 0360-0300.
Type
Article
Annotation
The analysis of biological sequencing data has been one of the biggest applications of string algorithms. The approaches used in many such applications are based on the analysis of k-mers, which are short fixed-length strings present in a dataset. While these approaches are rather diverse, storing and querying a k-mer set has emerged as a shared underlying component. A set of k-mers has unique features and applications that, over the past 10 years, have resulted in many specialized approaches for its representation. In this survey, we give a unified presentation and comparison of the data structures that have been proposed to store and query a k-mer set. We hope this survey will serve as a resource for researchers in the field as well as make the area more accessible to researchers outside the field.

PFP Compressed Suffix Trees

Authors
Boucher, C.; Cvacho, O.; Gagie, T.; Holub, J.; Manzini, G.; Navarro, G.; Rossi, M.
Year
2021
Published
Proceedings of the Workshop on Algorithm Engineering and Experiments (ALENEX2021). Philadelphia: SIAM, 2021. p. 60-72. ISSN 2164-0300. ISBN 978-1-61197-647-2.
Type
Proceedings paper
Annotation
Prefix-free parsing (PFP) was introduced by Boucher et al. (2019) as a preprocessing step to ease the computation of Burrows-Wheeler Transforms (BWTs) of genomic databases. Given a string S, it produces a dictionary D and a parse P of overlapping phrases such that BWT(S) can be computed from D and P in time and workspace bounded in terms of their combined size |PFP(S)|. In practice D and P are significantly smaller than S and computing BWT(S) from them is more efficient than computing it from S directly, at least when S is the concatenation of many genomes. In this paper, we consider PFP(S) as a data structure and show how it can be augmented to sup- port full suffix tree functionality, still built and fitting within O(|PFP(S)|) space. This entails the efficient computation of various primitives to simulate the suffix tree: computing a longest common extension (LCE) of two positions in S; reading any cell of its suffix array (SA), of its inverse (ISA), of its BWT, and of its longest common prefix array (LCP); and computing minima over ranges and next/previous smaller value queries over the LCP. Our experimental results show that the PFP suffix tree can be efficiently constructed for very large repetitive datasets and that its operations perform competitively with other compressed suffix trees that can only handle much smaller datasets.

Conversion of Finite Tree Automata to Regular Tree Expressions By State Elimination

Year
2020
Published
Proceedings of the Prague Stringology Conference 2020. Praha: Czech Technical University in Prague, 2020. p. 11-22. ISBN 978-80-01-06749-9.
Type
Proceedings paper
Annotation
Regular tree languages can be accepted and described by finite tree automata and regular tree expressions, respectively. We describe a new algorithm that converts a finite tree automaton to an equivalent regular tree expression. Our algorithm is analogous to the well-known state elimination method of the conversion of a string finite automaton to an equivalent string regular expression. We define a generalised finite tree automaton, the transitions of which read the sets of trees described by regular tree expressions. Our algorithm eliminates states of the generalised finite tree automaton, which is analogous to the elimination of states in converting the string finite automaton.

Forward Linearised Tree Pattern Matching Using Tree Pattern Border Array

Authors
Year
2020
Published
Proceedings of the Prague Stringology Conference 2020. Praha: Czech Technical University in Prague, 2020. p. 61-73. ISBN 978-80-01-06749-9.
Type
Proceedings paper
Annotation
We define a tree pattern border array as a property of linearised trees analogous to border arrays from the string domain. We use it to define a new forward tree pattern matching algorithm for ordered trees, which finds all occurrences of a single given linearised tree pattern in a linearised input tree. As with the classical Morris-Pratt algorithm, the tree pattern border array is used to determine shift lengths in the matching phase of the tree pattern matching algorithm. We compare the new algorithm with the best performing previously existing algorithms based on backward linearised tree pattern matching algorithms, (non-)linearised tree pattern matching algorithms using finite tree automata or stringpath matchers. We show that the presented algorithm outperforms these for single tree pattern matching.

Adapting Stable Matchings to Evolving Preferences

Authors
Bredereck, R.; Chen, J.; Knop, D.; Luo, J.; Niedermeier, R.
Year
2020
Published
Proceedings of the AAAI Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2020. p. 1830-1837. ISSN 2159-5399. ISBN 978-1-57735-835-0.
Type
Proceedings paper
Annotation
Adaptivity to changing environments and constraints is key to success in modern society. We address this by proposing “incrementalized versions” of Stable Marriage and Stable Roommates. That is, we try to answer the following question: for both problems, what is the computational cost of adapting an existing stable matching after some of the preferences of the agents have changed. While doing so, we also model the constraint that the new stable matching shall be not too different from the old one. After formalizing these incremental versions, we provide a fairly comprehensive picture of the computational complexity landscape of Incremental Stable Marriage and Incremental Stable Roommates. To this end, we exploit the parameters “degree of change” both in the input (difference between old and new preference profile) and in the output (difference between old and new stable matching). We obtain both hardness and tractability results, in particular showing a fixed-parameter tractability result with respect to the parameter “distance between old and new stable matching”.

Parameterized Algorithms for Finding a Collective Set of Items

Authors
Bredereck, R.; Faliszewski, P.; Kaczmarczyk, A.; Knop, D.; Niedermeier, R.
Year
2020
Published
Proceedings of the AAAI Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2020. p. 1838-1845. ISSN 2159-5399. ISBN 978-1-57735-835-0.
Type
Proceedings paper
Annotation
We extend the work of Skowron et al. (AIJ, 2016) by considering the parameterized complexity of the following problem. We are given a set of items and a set of agents, where each agent assigns an integer utility value to each item. The goal is to find a set of k items that these agents would collectively use. For each such collective set of items, each agent provides a score that can be described using an OWA (ordered weighted average) operator and we seek a set with the highest total score. We focus on the parameterization by the number of agents and we find numerous fixed-parameter tractability results (however, we also find some W[1]-hardness results). It turns out that most of our algorithms even apply to the setting where each agent has an integer weight.

Multidimensional Stable Roommates with Master List

Authors
Bredereck, R.; Heeger, K.; Knop, D.; Niedermeier, R.
Year
2020
Published
Web and Internet Economics - 16th International Conference, WINE 2020, Beijing, China, December 7-11, 2020, Proceedings. Cham: Springer, 2020. p. 59-73. ISBN 978-3-030-64945-6.
Type
Proceedings paper
Annotation
Since the early days of research in algorithms and complexity, the computation of stable matchings is a core topic. While in the classic setting the goal is to match up two agents (either from different “gender” (this is Stable Marriage) or “unrestricted” (this is Stable Roommates)), Knuth [1976] triggered the study of three- or multidimensional cases. Here, we focus on the study of Multidimensional Stable Roommates, known to be 𝖭𝖯 -hard since the early 1990’s. Many 𝖭𝖯 -hardness results, however, rely on very general input instances that do not occur in at least some of the specific application scenarios. With the quest for identifying islands of tractability, we look at the case of master lists. Here, as natural in applications where agents express their preferences based on “objective” scores, one roughly speaking assumes that all agent preferences are “derived from” a central master list, implying that the individual agent preferences shall be similar. Master lists have been frequently studied in the two-dimensional (classic) stable matching case, but seemingly almost never for the multidimensional case. This work, also relying on methods from parameterized algorithm design and complexity analysis, performs a first systematic study of Multidimensional Stable Roommates under the assumption of master lists.

Analyzing Large Code Repositories

Author
Ing. Petr Máj
Year
2023
Type
Dissertation thesis
Supervisor
doc. Ing. Jan Janoušek, Ph.D.
Reviewers
Prof. Tobias Wrigstad
Max Schaefer, MSc., DPhil.
Mgr. Tomáš Petříček, Ph.D.

A string automata approach to tree pattern matching and indexing

Author
Ing. Eliška Šestáková
Year
2023
Type
Dissertation thesis
Supervisor
doc. Ing. Jan Janoušek, Ph.D.
Reviewers
prof. RNDr. Alexandr Meduna, CSc.
prof. Philip Bille
prof. Giovanni Pighizzini

Application of Reinforcement Learning to Creating Adversarial Malware Samples

Author
Matouš Kozák
Year
2023
Type
Master thesis
Supervisor
Mgr. Martin Jureček, Ph.D.
Reviewers
doc. Ing. Ivan Šimeček, Ph.D.
Summary
Machine learning is becoming increasingly popular as a go-to approach for many tasks due to its world-class results. As a result, antivirus developers are starting to incorporate machine learning models into their products. While these models improve malware detection capabilities, they also carry the disadvantage of being susceptible to adversarial attacks. Although this sensitivity has been demonstrated for many models in white-box settings, a black-box attack is more applicable in practice for the domain of malware detection. Therefore, we present a black-box scenario in which the attacker only has the predicted label at his disposal and forgoes any other information about the target classifier. Using reinforcement learning algorithms, we implemented an attack against the GBDT classifier trained on the EMBER dataset. We trained several RL agents on a dataset of Windows malware with an emphasis on preserving the original functionality of the malicious samples. We achieved an evasion rate of 58.92% against the targeted classifier using the PPO algorithm. In addition to targeting this detector, we studied how the adversarial attack can be transferred to other models. The agent previously trained against the GBDT classifier scored an evasion rate of 28.91% against MalConv, a model based solely on machine learning. We also tested the generated adversarial examples against top AV programs and achieved an evasion rate ranging from 10.24% to 25.7%. These results prove that not only machine learning-based models are vulnerable to adversarial attacks and that better safeguards need to be taken to protect our systems.

Structured Committee Election

Author
Terezie Hrubanová
Year
2023
Type
Bachelor thesis
Supervisor
doc. RNDr. Dušan Knop, Ph.D.
Reviewers
Mgr. Michal Opler, Ph.D.
Summary
This thesis focuses on winner selection in committee election. We provide an overview of voting systems for single-winner and multi-winner election. Winner selection is often computationally hard, therefore we introduce election with structured preferences. In election with structured preferences, the votes are in some way restricted which may help to create more efficient winner selection algorithms. We describe some of the known structured preferences. We provide an overview of the current literature on the topic of structured and nearly structured preferences. We also review current work on committee election with structured preferences and usage of ordered weighted average (OWA) in committee election. We propose polynomial-time algorithms for finding a winning committee for approval election with OWA vector (0, ..., 0, 1) and interval preference restrictions. In such election, a voter approves a committee only if she approves all of its members. We use this property and show two approaches for finding a winning committee.

Implementation of a statically typed, lazy, pure functional programming language

Author
Jan Liam Verter
Year
2022
Type
Master thesis
Supervisor
Ryan Michael Culpepper
Reviewers
doc. Ing. Filip Křikava, Ph.D.
Summary
This thesis presents an implementation of a functional, statically typed programming language inspired by Haskell. It mainly focuses on the challenges of implementing the type system for such an language. The implementation is based on multiple resources covering implementations of different type system features. The thesis also covers other aspects of the language implementation--lexical and syntactic analysis, translation into a smaller functional core language, and non-strict evaluation.

Randomized Indexing for Approximate Selection Queries on Multidimensional Arrays

Author
Ing. Luboš Krčál
Year
2022
Type
Dissertation thesis
Supervisor
prof. Ing. Jan Holub, Ph.D.
Reviewers
prof. Ing. Michal Krátký, Ph.D.
Kwo-Sen Kuo, PhD.
Dimitar Mišev, PhD.

Complexity of Games on Graphs

Author
Ing. Václav Blažej
Year
2022
Type
Dissertation thesis
Supervisor
doc. RNDr. Tomáš Valla, Ph.D.
Reviewers
Prof., Dr. rer. nat. Robert Bredereck
RNDr. Martin Balko, Ph.D.
Paweł Rzążewski, PhD.

Haskell Dynamic Tracing

Author
Ondřej Kvapil
Year
2021
Type
Bachelor thesis
Supervisor
doc. Ing. Filip Křikava, Ph.D.
Reviewers
Vitaly Bragilevsky, MSc.
Summary
Haskell is one of the most well-known instances of a programming language that uses non-strict semantics. On the one hand, this brings the convenience of infinite data structures, user-defined control flow, and the possibility to avoid unnecessary computation. On the other hand, these benefits are hampered by the runtime overhead and hard-to-predict the behaviour of call-by-need. This begs the question: Is laziness worth it? To answer this question, we need to understand how laziness is used in the wild. To this end, we develop a tool for dynamic analysis used to trace the evaluation of function parameters. It is implemented as a compiler plugin for the Glasgow Haskell Compiler.

Stability in Hedonic Games with Structured Diversity Preferences

Author
Jan Šmolík
Year
2021
Type
Bachelor thesis
Supervisor
doc. RNDr. Dušan Knop, Ph.D.
Reviewers
Ing. Václav Blažej, Ph.D.
Summary
In the Roommate diversity problem, agents that belong to one of the two types have to be allocated to fixed size rooms based on their preferences, which depend solely on the fraction of agents of their own type among their potential roommates. In this work we study the stability of outcomes with respect to the room size and structured preferences. We provide a new randomized algorithm for finding core stable outcomes and show the results which we have obtained with the algorithm. We have found a small single-peaked instance with empty core and have verified, that every instance of the Roommate diversity problem with room size = 3, all preferences single peaked and number of agents &amp;lt;= 30, room size = 3 and number of agents &amp;lt;= 15, room size = 4, all preferences single peaked and number of agents = 30, admits an outcome that is core stable. We present arguments why we believe that every instance of this problem with room size = 3, room size = 4 and all preferences single peaked, admits an outcome that is core stable.

Algorithms for Combinatorics on Words

Author
Martin Rejmon
Year
2021
Type
Master thesis
Supervisor
doc. Ing. Štěpán Starosta, Ph.D.
Reviewers
doc. Ing. Jan Janoušek, Ph.D.
Summary
In this thesis, several algorithms for problems from the mathematical field combinatorics on words are thoroughly explained. The algorithms are also implemented in the free and open-source mathematics software system SageMath with the goal of their eventual integration into the same system. The algorithms include: classifying mortal and bounded letters of a morphism, deciding whether a morphism is injective, finding a simplification of a morphism, finding all infinite repetitions in a D0L system, and finding all factors (up to a certain length) of words of the language of a PD0L system. Furthermore, the problem of deciding whether a morphism of a D0L system is injective on the set of factors of words of the language of the system is discussed. The decidability of this problem is an open question, which is not answered in this thesis, but it is shown why it is nontrivial and where some approaches to solving this problem fail.

Low-Latency Heap Profiling

Author
Jan Tománek
Year
2021
Type
Master thesis
Supervisor
Ing. Daniel Langr, Ph.D.
Reviewers
doc. Ing. Ivan Šimeček, Ph.D.
Summary
Frequent heap allocations can have a significant impact on the overall performance of a program. The problem becomes even more critical in multithreaded applications, where too many interactions with heap can grow into the main bottleneck and effectively diminish scalability. Profiling such applications with existing heap profilers is almost impossible due to the enormous runtime overhead they add. This thesis describes the typical memory layout of a Linux process, briefly shows the internals of the GNU C memory allocator, analyzes available heap profiling tools, and guides the reader through the process of constructing a~custom heap profiler. Further, the thesis proposes a concept of low-latency heap profiling. Based on that, a new profiler Heaprof has been developed and compared with other heap profiling tools. Experimental results revealed that Heaprof was the only tool capable of profiling without further compromising the program's scalability. For 8-threaded benchmarks, Heaprof achieved up to 8 times profiling speedup.

Tiny x86 - Architecture Simulator for Educational Purposes

Author
Ivo Strejc
Year
2021
Type
Master thesis
Supervisor
Ing. Petr Máj, Ph.D.
Reviewers
Ing. Tomáš Pecka
Summary
This thesis presents tiny x86 architecture and virtual machine designed to help students understand various compiler techniques and their effect on the program performance. Compared to existing instruction set architectures, tiny x86 is simpler, easier to use as it comes with a C++ API as opposed to binary encodings and does not limit itself to single design principles (both CISC and RISC features are supported). The VM also offers extensive configuration options, allowing it to (de-)emphasize various architecture features (register pressure, memory latency, instruction timings, etc.). The VM is already used in the NI-GEN (Code Generation) course at FIT CTU, where its simplicity allows the students to write full compiler pipeline during the term.