Ing. Daniel Langr, Ph.D.

+420224359859
daniel.langr@fit.cvut.cz
TH:A-1150

Theses

Sample theses

Bachelor theses

Graphical User Interface for Visualization of Very Large Sparse Matrices

Author

Ivo Kolář

Year

2014

Type

Bachelor thesis

Supervisor

Ing. Daniel Langr, Ph.D.

Reviewers

doc. Ing. Ivan Šimeček, Ph.D.

Department

Department of Software Engineering

Master theses

Parallel Sorting in C++11

Author

Klára Horáčková

Year

2019

Type

Master thesis

Supervisor

Ing. Daniel Langr, Ph.D.

Reviewers

doc. Ing. Ivan Šimeček, Ph.D.

Department

Department of Theoretical Computer Science

Summary

A new parallel variant of the quicksort algorithm called C++11Sort is presen- ted in this thesis. C++11Sort is implemented based on the C++11 threads and the respective synchronization techniques, i.e. without the use of the non- standard C++ extensions and external libraries. In the search section, existing implementations of the parallel versions of the quicksort algorithm are descri- bed. These implementations are afterwards experimentally compared to the C++11Sort algorithm. C++11Sort achieves excellent results in those expe- riments. When sorting two billions of integers, it is faster than the fastest existing implementation by 28 %.

Thesis on DSpace

Efficient concurrent memoization system

Author

Viacheslav Kroilov

Year

2020

Type

Master thesis

Supervisor

Ing. Daniel Langr, Ph.D.

Reviewers

Ing. Jiří Kašpar

Department

Department of Theoretical Computer Science

Summary

An automatic memoization system - also known as a software cache - stores a limited number of recently accessed elements and speeds up consequent accesses to them. Least-Recently-Used (LRU) is a popular replacement policy for hardware and software caches. However, its concurrent implementation suffers from high contention due to the list reordering performed both on lookup and insertion. A novel LRU-inspired concurrent software cache, called DeferredLRU, is presented in this thesis. The main goal of the design was to make it scalable and suitable for many-core systems. These properties were achieved by using a different approach to tracking item access order. This approach substantially decreases the number of list reinsertions --- the main factor of the contention in a regular LRU cache. DeferredLRU throughput and hit-rate are sensitive to the meta-parameter setting. By fine-tuning meta parameters for specific inputs, it was possible to achieve higher hit-rate than of a regular LRU cache for every tested input. The relative difference was up to 7.8%. DeferredLRU performance was compared to existing alternatives, including corresponding caches from Intel TBB and Facebook HHVM projects. Tested caches were evaluated with up to 32 threads (on a 16 HW cores CPU). In 32 threads evaluation, DeferredLRU was faster in all 16 tests. When accesses were distributed among multiple smaller caches for better parallelism (this approach is called binning), DeferredLRU was faster in 11 of 16 tests and was close to best-performing caches in 5 other tests. DeferredLRU with binning was up to 28.8 times faster on 32 threads compared to single-threaded performance.

Thesis on DSpace

Performance analysis of the LSU3shell program

Author

Martin Kočička

Year

2019

Type

Master thesis

Supervisor

Ing. Daniel Langr, Ph.D.

Reviewers

Ing. Jiří Kašpar

Department

Department of Theoretical Computer Science

Summary

Ab initio approaches to nuclear structure exploration are at the forefront of current nuclear physics research. LSU3shell is an implementation of the ab initio method called symmetry-adapted no-core shell model (SA-NCSM) optimized for distributed HPC systems. The goal of this thesis was the analysis of the LSU3shell program with focus primarily on dynamic memory allocation, research of methods that can be used to improve performance and memory usage, and their application on LSU3shell. The focus on dynamic memory allocation proved to be the right one, leading to many possible optimizations. I was able to reduce the run time by 41 % on average, thus potentially saving up to 1.4 million core-hours of our total BlueWaters allocation.

Thesis on DSpace

Low-Latency Heap Profiling

Author

Jan Tománek

Year

2021

Type

Master thesis

Supervisor

Ing. Daniel Langr, Ph.D.

Reviewers

doc. Ing. Ivan Šimeček, Ph.D.

Department

Department of Theoretical Computer Science

Summary

Frequent heap allocations can have a significant impact on the overall performance of a program. The problem becomes even more critical in multithreaded applications, where too many interactions with heap can grow into the main bottleneck and effectively diminish scalability. Profiling such applications with existing heap profilers is almost impossible due to the enormous runtime overhead they add. This thesis describes the typical memory layout of a Linux process, briefly shows the internals of the GNU C memory allocator, analyzes available heap profiling tools, and guides the reader through the process of constructing a~custom heap profiler. Further, the thesis proposes a concept of low-latency heap profiling. Based on that, a new profiler Heaprof has been developed and compared with other heap profiling tools. Experimental results revealed that Heaprof was the only tool capable of profiling without further compromising the program's scalability. For 8-threaded benchmarks, Heaprof achieved up to 8 times profiling speedup.

Thesis on DSpace

Survey of Sparse Matrix Storage Formats for Parallel Systems

Author

Radek Bittara

Year

2015

Type

Master thesis

Supervisor

Ing. Daniel Langr, Ph.D.

Reviewers

doc. Ing. Ivan Šimeček, Ph.D.

Department

Department of Computer Systems

Summary

The goal of this thesis is to provide a overview of published sparse matrix storage formats in computer memory. This thesis contains a description of the principles and potential use in a sparse matrix-vector multiplication for modern parallel systems.

Thesis on DSpace