Ing. Petr Máj, Ph.D.

Theses

Bachelor theses

Serialization of functions and environments in the R language

Author
Michal Vácha
Year
2017
Type
Bachelor thesis
Supervisor
Ing. Petr Máj
Reviewers
Ing. Petr Špaček, Ph.D.
Summary
This thesis is a step towards a greater endeavor: making R code more reliable and maintainable. To do that, a tool for automatic test generation called Genthat has been developed. It captures traces of function calls and then generates test cases from them. To do that is has to be able to serialize arbitrary object a function may take as an argument or return as a result. Although R is a functional language and therefore the usage of closures is abundant, they have not been supported by Genthat. In my work, I have implemented the serialization of closures and improved other areas, namely serialization of language expressions and the code clarity of generated tests.

Implementation of lambda expressions evaluator

Author
Jan Liam Verter
Year
2019
Type
Bachelor thesis
Supervisor
Ing. Petr Máj
Reviewers
RNDr. Jiřina Scholtzová, Ph.D.
Summary
l-calculus is a fundamental concept in computer science and as such is taught at almost all universities with a computer science programme, including FIT CTU. But for many students, learning the l-calculus and understanding its significance and impact on programming languages is a challenging task. This thesis describes a l-calculus evaluator and its front-end designed to help stu- dents understand l-calculus by treating it more like a programming language and by effortless integration with existing course materials.

SECD Virtual Machine Debugger

Author
Vojtěch Rozhoň
Year
2022
Type
Bachelor thesis
Supervisor
Ing. Petr Máj, Ph.D.
Reviewers
Ing. Tomáš Pecka
Summary
This thesis describes an implementation of the debugger of the SECD machine, consisting of the SECD virtual machine and a frontend part. The tiny-lisp programming language is designed to serve as a language that will be evaluated by the virtual machine. The thesis discusses how to compile tiny-lisp expressions, including macros, to the SECD bytecode. The implementation of the debugger focuses on helping students understand key concepts of the SECD machine by interactively showing connections between the source code and the SECD bytecode. The thesis includes a design and implementation of individual parts of the virtual machine and the frontend module.

TinyGo Language

Author
Maksim Gusev
Year
2024
Type
Bachelor thesis
Supervisor
Ing. Petr Máj, Ph.D.
Reviewers
Ing. Tomáš Pecka
Summary
This thesis presents the TinyGo language and its compiler. TinyGo is a simplified version of the real programming language GoLang. It was created for the educational purpose of studying compiler construction and preserving the most important design principles behind Go. The compiler works for a TinyGo language, generating the target code for a tiny x86 VM from a \acrshort{nie-gen} course. The compiler is separated for an easier understanding of parts of it and open for any extension and upgrade.

Master theses

GNU-R Debugger Bytecode Support

Author
Aleš Saska
Year
2019
Type
Master thesis
Supervisor
Ing. Petr Máj
Reviewers
Ing. Konrad Siek, Ph.D.
Summary
First part of this thesis is about analysis and implementation of the bytecode disassembler tool for GNU-R language. The second part of the work is improving a debugging of the GNU-R bytecode subsystem by implementation of the native bytecode debugger without any negative side-effects on the language performance. During this work there was also designed and implemented the bytecode stack printer. At the last phase there was implemented a feature supporting a simulation of conditional breakpoints in the GNU-R language.

Improvements of the RIR bytecode toolchain

Author
Jan Ječmen
Year
2017
Type
Master thesis
Supervisor
Ing. Petr Máj
Reviewers
Ing. Filip Křikava, Ph.D.
Summary
R is a dynamic programming language that, despite being over 20 years old, is still widely used. RIR is an alternative to its bytecode compiler and interpreter that aims to facilitate adding static analyses and optimization passes easily. RIR is under development and does not currently match the performance of standard R. This thesis attempts to amend the situation. It extends the RIR internal representation, adds new features to its compiler and refactors its interpreter. The average slowdown versus standard R is brought down by about one half in the Shootout benchmarks.

Record and Replay debugging in R

Author
Kryštof Slavík
Year
2018
Type
Master thesis
Supervisor
Ing. Petr Máj
Reviewers
Ing. Filip Křikava, Ph.D.
Summary
Non-determinism in programs often causes unwanted behaviour to appear and disappear seemingly randomly. Record and Replay debugger is a tool which helps programmers to isolate such behaviour by recording a program's execution once when the bug appears and then replaying it later multiple times in the exact same way with the bug present while providing classic debugging facilities. This thesis focuses on implementation and integration of such tool into R programming language which is commonly used in mathematics, mainly in statistics.

Swift for Embedded Systems

Author
Alan Dragomirecký
Year
2019
Type
Master thesis
Supervisor
Ing. Petr Máj
Reviewers
Ing. Tomáš Zahradnický, Ph.D.
Summary
Released in 2014, Swift quickly become one of the fastest growing programming languages, and, in addition to its original field of use in application development, is finding its place on servers and recently also in data science. However, no work has been published regarding the use of Swift on the smallest computers with highly-constrained resources - embedded systems. This thesis aims to be the first step in extending Swift's possibilities towards this segment. It describes the process of adding support for a new bare-metal platform to the Swift compiler and its related tools. As a result, Swift is presented as a viable alternative to existing embedded platforms.

A Modular and Extensible Tool for Software Fault Localization

Author
Petr Nevyhoštěný
Year
2020
Type
Master thesis
Supervisor
Ing. Petr Máj
Reviewers
doc. Ing. Ivan Šimeček, Ph.D.
Summary
Localization of faults has been recognized as one of the most tedious and time-consuming tasks in software development. Yet it is still performed mostly manually. Software fault localization (SFL) is a research field that studies the development of automated techniques helping developers with this activity. Despite the amount of effort put into the research, applications of proposed methods do not fully meet the practical needs. To that end, this thesis describes \emph{Aardwolf}, a new fault localization tool and framework whose main goal is to ease the usage of SFL techniques in real-world projects. This is achieved by highly modular design allowing convenient extensibility, and by applying recommendations that can be found in published user studies. To demonstrate Aardwolf's capabilities, three different SFL techniques and frontends for C and Python programming languages were implemented. Integration into two larger software projects was described to present its applicability. Preliminary results show that the effectiveness is comparable with the literature. The focus on user experience and tool's extensibility is however a major improvement over the current state in the field.

SOM Virtual Machine Implementation

Author
Rudolf Rovňák
Year
2021
Type
Master thesis
Supervisor
Ing. Petr Máj, Ph.D.
Reviewers
Ing. Konrad Siek, Ph.D.
Summary
This work presents an implementation of a virtual machine for Simple Object Machine programming language, based on Smalltalk. Additionally, it analyzes existing implementations, along with an analysis of the provided solution. Included is a design and implementation of a parser, bytecode with a compiler and runtime environment with garbage collector to allow executing programs written in SOM.

Tiny x86 - Architecture Simulator for Educational Purposes

Author
Ivo Strejc
Year
2021
Type
Master thesis
Supervisor
Ing. Petr Máj, Ph.D.
Reviewers
Ing. Tomáš Pecka
Summary
This thesis presents tiny x86 architecture and virtual machine designed to help students understand various compiler techniques and their effect on the program performance. Compared to existing instruction set architectures, tiny x86 is simpler, easier to use as it comes with a C++ API as opposed to binary encodings and does not limit itself to single design principles (both CISC and RISC features are supported). The VM also offers extensive configuration options, allowing it to (de-)emphasize various architecture features (register pressure, memory latency, instruction timings, etc.). The VM is already used in the NI-GEN (Code Generation) course at FIT CTU, where its simplicity allows the students to write full compiler pipeline during the term.

Implementation of Object Oriented Languages

Author
Rasul Seidagul
Year
2022
Type
Master thesis
Supervisor
Ing. Petr Máj, Ph.D.
Reviewers
Ing. Konrad Siek, Ph.D.
Summary
This work presents tinyC+, a language extension over a C-like language that supports Object-Oriented Programming. TinyC+ has been developed to aid the Code Generation course taught at FIT CTU. It serves as a demonstration of how the high-level OOP features are compiled. Therefore the primary design goals of tinyC+ were to provide minimal, yet functional object-oriented language so that its addition to the course would take very little explanation. TinyC+ is then transpiled into tinyC, a C-like language already used in the class. Therefore this thesis surveys the object-oriented features of several common programming languages, defines and justifies their minimal viable selection, and describes the transpiler implementation in detail.

TinyC Optimizing Compiler

Author
Martin Slávik
Year
2023
Type
Master thesis
Supervisor
Ing. Petr Máj, Ph.D.
Reviewers
doc. Ing. Ivan Šimeček, Ph.D.
Summary
This thesis is about compiler creation and its internal parts for educational purposes. It should compile programs described with the TinyC language to the instruction set specified by the virtual architecture tiny86. TinyC is a C-like language with a smaller subset of types and a limited set of features. The tiny86 Virtual Machine is inspired by x86 architecture but has a simpler instruction set, so it is more understandable for students. The tiny86 instruction set preserves interesting and important features that the x86 architecture offers. Throughout the compilation, optimization techniques are used to achieve better performance of the compiled code. The thesis compares the naive compilation and compilation with optimization techniques turned on.

Modular Compiler for the TinyC Language

Author
Martin Prokopič
Year
2023
Type
Master thesis
Supervisor
Ing. Petr Máj, Ph.D.
Reviewers
Ing. Tomáš Pecka
Summary
This thesis introduces an ahead-of-time compiler for the tinyC language targeting the tiny86 VM (both used in the NIE-GEN course) implemented in Scala. The compiler seamlessly integrates with the existing toolchain, supports all tinyC features and uses tree covering instruction selection and graph coloring register allocation. The main modules (frontend, middleend, backend) are cleanly separated and can be extended either from Scala, or externally through a custom text-serializable LLVM-like SSA intermediate representation, which encourages use of the compiler for educational purposes.

x86-64 native backend for TinyC

Author
Michal Vlasák
Year
2023
Type
Master thesis
Supervisor
Ing. Petr Máj, Ph.D.
Reviewers
Ing. Konrad Siek, Ph.D.
Summary
This thesis describes a compiler back end compiling TinyC IR to x86-64 native machine code. The goal was to create a compiler that would showcase additional difficulties imposed by the x86-64 architecture, especially compared to the simplified Tiny86 architecture targeted by students in the NI-GEN course. In the theoretical part a close attention is paid to x86-64's history and the true nature of its limitations. Existing compiler back end research is surveyed and evaluated with respect to x86-64's capabilities. A design for a x86-64 TinyC back end is presented. It is based on instruction selection by extensive peephole optimization and graph coloring register allocation. The implementation itself is presented thoroughly and many practical details are considered and explained. The compiled programs are either able to use the system C runtime, or on Linux can be bundled with a custom minimalistic runtime allowing the programs to run without any dependencies. The text of this thesis and the implementation are freely available, and can be used not only by NI-GEN students.

Tiny86 Debugger

Author
Filip Gregor
Year
2023
Type
Master thesis
Supervisor
Ing. Petr Máj, Ph.D.
Reviewers
Ing. Tomáš Pecka
Summary
Programmers often need to inspect the state of their programs at runtime. A special tool called a debugger exists precisely for this purpose. Despite its widespread use, very few know how exactly this tool works. This is partly because it must be supported at multiple layers, like the CPU, the operating system, and the compiler. Most of the courses that teach about these do not delve into debugging. This thesis explores what support must be provided by the CPU and operating system to enable native-level debugging. A small debugger implementation is demonstrated on the x86-64 architecture and the Linux operating system. Focus is then shifted onto compiler support for source-level debugging. Using this knowledge, the thesis presents a debugger for the T86 architecture and the TinyC language, both of which are used by the NI-GEN compilers course at FIT, CTU. This debugger is fully functional, making it easier for students to work with the T86. Additionally, the thesis presents a design and implementation of a novel format of debugging information that keeps the interesting concepts from real-world debuggers while being extremely simple to use on both machine and human level, which is ideally suited for its intended classroom use.