Ing. Ondřej Guth, Ph.D.

Publications

Automata Approach to Inexact Tree Pattern Matching Using 1-degree Edit Distance

Year
2021
Published in
Proceedings of the Prague Stringology Conference 2021. Praha: CESKE VYSOKE UCENI TECHNICKE V PRAZE, 2021. p. 1-15. ISBN 978-80-01-06869-4.
Type
Conference paper
Abstract
We compare labeled ordered trees using the unit-cost 1-degree edit distance, whose operations are vertex relabeling, leaf insertion, and leaf deletion. Given an input tree T and a tree pattern P, we find all subtrees in T that match P with up to k errors. We show that this problem can be solved by a finite automaton when T and P are represented in a linear notation, the prefix bar notation. First, we solve the problem by a pushdown automaton. Then, we show that the pushdown automaton can be transformed into a nondeterministic finite automaton because of its restricted use of the pushdown store. We also show a simulation of the nondeterministic finite automaton by dynamic programming.
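
The linear representation mentioned in the abstract can be illustrated with a small sketch. It assumes the usual definition of the prefix bar notation (the label of a vertex, followed by the notations of its children from left to right, followed by a closing bar); the abstract itself does not spell this definition out, and the tuple-based tree encoding and the function name `prefix_bar` are hypothetical choices made only for this illustration.

```python
# A tree is encoded as (label, [children]); this encoding is an assumption
# made for the sketch, not something taken from the paper.

def prefix_bar(tree):
    """Return the prefix bar notation of an ordered labeled tree as a list of symbols.

    Assumed definition: label, then the notation of each child in order,
    then a single bar "|" that closes the vertex.
    """
    label, children = tree
    symbols = [label]
    for child in children:
        symbols.extend(prefix_bar(child))
    symbols.append("|")
    return symbols


# A root "a" with two leaf children "b" and "c".
example = ("a", [("b", []), ("c", [])])
print(" ".join(prefix_bar(example)))  # a b | c | |
```

Under this assumption every subtree corresponds to a contiguous factor of the linear notation, which is what allows string-style automata to be applied to trees.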

Co-Teaching Computer Science Across Borders: Human-Centric Learning at Scale

Authors
Piech, C.; Yan, L.; Einstein, L.; Saavedra, A.; Bozkurt, B.; Šestáková, E.; Guth, O.; McKeown, N.
Year
2020
Published in
L@S '20: Proceedings of the Seventh ACM Conference on Learning @ Scale. New York: Association for Computing Machinery, 2020. p. 103-113. ISBN 978-1-4503-7951-9.
Type
Conference paper
Abstract
Programming is fast becoming a required skill set for students in every country. We present CS Bridge, a model for cross-border co-teaching of CS1, along with a corresponding open-source course-in-a-box curriculum made for easy localization. In the CS Bridge model, instructors and student-teachers from different countries come together to teach a short, stand-alone CS1 course to hundreds of local high school students. The corresponding open-source curriculum has been specifically designed to be easily adapted to a wide variety of local teaching practices, languages, and cultures. Over the past six years, the curriculum has been used to teach CS1 material to over 1,000 high school students in Colombia, the Czech Republic, Turkey, and Guinea. A large majority of our students continue on to study CS or CS-related fields in university. More importantly, many of our undergraduate student-teachers stay involved with teaching beyond the program. Joint teaching creates a positive, high-quality learning experience for students around the world and a powerful, high-impact professional development experience for the teaching team, instructors and student-teachers alike.

On approximate enhanced covers under Hamming distance

Authors
Year
2020
Published in
Discrete Applied Mathematics. 2020, 274, p. 67-80. ISSN 0166-218X.
Type
Journal article
Abstract
A border p of a string x is an enhanced cover of x if the number of positions of x that lie within some occurrence of p is the maximum among all borders of x. (String p is a border of x if p is both a proper prefix and a suffix of x.) In this paper, two more general notions based on enhanced covers are introduced: a k-approximate enhanced cover and a relaxed k-approximate enhanced cover, where a fixed maximum number of errors k under the Hamming distance is considered. The k-approximate enhanced cover of x is its border, and its k-approximate occurrences are also considered in the covered number of positions of x. The relaxed k-approximate enhanced cover of x is a factor of x and a k-approximate border of x. Algorithms that compute all the variations of k-approximate enhanced covers mentioned above are presented in this paper.
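
The definitions in this abstract can be made concrete with a brute-force sketch. It is not the algorithm from the paper: it simply enumerates the borders of x and counts the positions covered by their occurrences, with k = 0 giving the exact enhanced cover and k > 0 also counting k-approximate occurrences under the Hamming distance. All function names are hypothetical, and any additional restrictions the paper places on approximate covers (for example on the relation between k and the length of p) are not modeled here. The relaxed variant is not covered.

```python
def hamming(a, b):
    """Hamming distance of two equal-length strings."""
    return sum(c != d for c, d in zip(a, b))


def borders(x):
    """All non-empty borders of x: proper prefixes that are also suffixes."""
    return [x[:m] for m in range(1, len(x)) if x.endswith(x[:m])]


def covered_positions(p, x, k=0):
    """Number of positions of x lying within some occurrence of p with at most k mismatches."""
    m = len(p)
    covered = set()
    for i in range(len(x) - m + 1):
        if hamming(p, x[i:i + m]) <= k:
            covered.update(range(i, i + m))
    return len(covered)


def enhanced_covers(x, k=0):
    """Borders of x covering the maximum number of positions, with that maximum."""
    best, result = 0, []
    for p in borders(x):
        count = covered_positions(p, x, k)
        if count > best:
            best, result = count, [p]
        elif count == best:
            result.append(p)
    return result, best


print(borders("abacaba"))          # ['a', 'aba']
print(enhanced_covers("abacaba"))  # (['aba'], 6)
```

For x = "abacaba" the border "a" covers 4 positions and the border "aba" covers 6, so "aba" is the enhanced cover. This naive version runs in cubic time in the worst case and is meant only to pin down the definitions.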

Computing All Approximate Enhanced Covers with the Hamming Distance

Authors
Year
2016
Published in
Proceedings of the Prague Stringology Conference 2016. Praha: Czech Technical University in Prague, 2016. p. 146-157. ISBN 978-80-01-05996-8.
Type
Conference paper
Abstract
A border p of a string x is an enhanced cover of x if the number of positions of x that lie within some occurrence of p is the maximum among all borders of x. In this paper, a more general notion based on the enhanced cover is introduced: a k-approximate enhanced cover, where a fixed maximum number of errors k under the Hamming distance is considered. The k-approximate enhanced cover of x is its border, and its k-approximate occurrences are also considered in the covered number of positions of x. An O(n^2)-time and O(n)-space algorithm that computes all k-approximate enhanced covers of a string of length n is presented.

On left and right seeds of a string

Authors
Christou, M.; Crochemore, M.; Guth, O.; Iliopoulos, C.S.; Pissis, S.P.
Year
2012
Published in
Journal of Discrete Algorithms. 2012, 17, p. 31-44. ISSN 1570-8667.
Type
Journal article
Abstract
We consider the problem of finding the repetitive structure of a given string y of length n. A factor u of y is a cover of y if every letter of y lies within some occurrence of u in y. A string v is a seed of y if it is a cover of a superstring of y. A left seed of y is a prefix of y that is a cover of a superstring of y. Similarly, a right seed of y is a suffix of y that is a cover of a superstring of y. An integer array LS is the minimal left-seed (resp. maximal left-seed) array of y if LS[i] is the minimal (resp. maximal) length of left seeds of y[0..i]. The minimal right-seed (resp. maximal right-seed) array RS of y is defined in a similar fashion. In this article, we present linear-time algorithms for computing all left and right seeds of y, a linear-time algorithm for computing the minimal left-seed array of y, a linear-time solution for computing the maximal left-seed array of y, an O(n log n)-time algorithm for computing the minimal right-seed array of y, and a linear-time solution for computing the maximal right-seed array of y. All algorithms use linear auxiliary space.
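
The notions of cover, left seed, and minimal left-seed array can be checked on small inputs with a naive sketch. It is far from the linear-time algorithms of the article and only mirrors the definitions. It relies on one observation that is an assumption of this sketch rather than a statement from the abstract: a prefix u of y is a left seed exactly when its full occurrences cover y from the left and any uncovered tail of y is a proper prefix of u that starts no later than the end of the covered part. All function names are hypothetical.

```python
def occurrences(u, y):
    """Start positions of all full occurrences of u in y, in increasing order."""
    m = len(u)
    return [i for i in range(len(y) - m + 1) if y[i:i + m] == u]


def is_cover(u, y):
    """True if every position of y lies within some occurrence of u."""
    covered_up_to = 0
    for i in occurrences(u, y):
        if i > covered_up_to:        # position covered_up_to is not covered
            return False
        covered_up_to = i + len(u)
    return bool(u) and covered_up_to == len(y)


def is_left_seed(u, y):
    """True if the prefix u of y is a cover of some superstring of y."""
    n, m = len(y), len(u)
    if m == 0 or not y.startswith(u):
        return False
    covered_up_to = 0
    for i in occurrences(u, y):
        if i > covered_up_to:
            break
        covered_up_to = i + m
    if covered_up_to == n:
        return True
    # The rest of y must be a proper prefix of u that overhangs the end of y.
    return any(u.startswith(y[i:]) for i in range(n - m + 1, covered_up_to + 1))


def min_left_seed_array(y):
    """LS[i] = length of the shortest left seed of y[0..i] (brute force)."""
    result = []
    for i in range(len(y)):
        prefix = y[:i + 1]
        result.append(next(m for m in range(1, i + 2) if is_left_seed(prefix[:m], prefix)))
    return result


print(is_cover("aba", "abaababa"))   # True
print(is_left_seed("aba", "abaab"))  # True: "aba" covers the superstring "abaaba"
print(min_left_seed_array("abaab"))  # [1, 2, 2, 3, 3]
```

Right seeds can be handled symmetrically by applying the same functions to the reversed strings.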

On the Right-Seed Array of a String

Authors
Christou, M.; Crochemore, M.; Guth, O.; Iliopoulos, C.S.; Pissis, S.P.
Year
2011
Published in
Proceedings of the 17th Annual International Computing and Combinatorics Conference (COCOON 2011). Berlin: Springer-Verlag, 2011. Lecture Notes in Computer Science. ISSN 0302-9743. ISBN 978-3-642-22684-7.
Type
Conference paper
Abstract
We consider the problem of finding the repetitive structure of a given fixed string y. A factor u of y is a cover of y if every letter of y falls within some occurrence of u in y. A factor v of y is a seed of y if it is a cover of a superstring of y. There exist linear-time algorithms for solving the minimal cover problem. The minimal seed problem is of much higher algorithmic difficulty, and no linear-time algorithm is known. In this article, we solve one of its variants: computing the minimal and maximal right-seed array of a given string. A right seed of y is the shortest suffix of y that is a cover of a superstring of y. An integer array RS is the minimal right-seed (resp. maximal right-seed) array of y if RS[i] is the minimal (resp. maximal) length of right seeds of y[0..i]. We present an O(n log n)-time algorithm that computes the minimal right-seed array of a given string, and a linear-time solution to compute the maximal right-seed array.

Using Finite Automata Approach for Searching Approximate Seeds of Strings

Authors
Guth, O.; Melichar, B.
Year
2010
Published in
Intelligent Automation and Computer Engineering. New York: Springer, 2010. p. 347-360. ISSN 1876-1100. ISBN 978-90-481-3516-5.
Type
Book chapter
Abstract
A seed is a type of regularity of strings. A restricted approximate seed w of a string T is a factor of T such that w covers a superstring of T under some distance rule. In this paper, the problem of searching for all restricted seeds with the smallest Hamming distance is studied, and a polynomial time and space algorithm for solving the problem is presented. It searches for all restricted approximate seeds of a string with a given limited approximation using the Hamming distance, and it computes the smallest distance for each seed found. The solution is based on a finite (suffix) automata approach that provides a straightforward way to design algorithms for many problems in stringology. Therefore, it is shown that the set of problems solvable using finite automata includes the one studied in this paper.