doc. Ing. Daniel Langr, Ph.D.

daniel.langr@fit.cvut.cz
TH:A-1150

Profile
Publications
Projects
Teaching
Theses

Theses

Topics of theses
Sample theses

Dissertation theses

Numerical methods for computer modeling of atomic nuclei

Supervisor

doc. Ing. Daniel Langr, Ph.D.

Level

Topic of dissertation thesis

Topic description

Symmetry-adapted no-core shell model (SA-NCSM) is a state-of-the-art method for ab-initio (from first principles) modeling of atomic nuclei. The principle of this model is based on computation of nuclear wave functions in the form of eigenvectors of matrix operators relevant to a given nucleus. In the current implementation of this model, elements of these matrix operators are explicitly stored in a computer memory. This limits its applicability to only relatively light nuclei, even when the largest contemporary existing supercomputers are utilized. The goal of this dissertation thesis topic is to research and develop suitable parallel and scalable numerical algorithms and corresponding data structures that would enable “on-the-fly” computation of matrix operators in SA-NCSM calculations. Specific research of interest is the development and implementation of scalable parallel algorithms for finding extremal eigenvalues and corresponding eigenvectors of large sparse symmetric matrices.

Bachelor theses

Graphical User Interface for Visualization of Very Large Sparse Matrices

Author

Ivo Kolář

Year

2014

Type

Bachelor thesis

Supervisor

doc. Ing. Daniel Langr, Ph.D.

Reviewers

doc. Ing. Ivan Šimeček, Ph.D.

Department

Department of Software Engineering

Master theses

Survey of Sparse Matrix Storage Formats for Parallel Systems

Author

Radek Bittara

Year

2015

Type

Master thesis

Supervisor

Ing. Daniel Langr, Ph.D.

Reviewers

doc. Ing. Ivan Šimeček, Ph.D.

Department

Department of Computer Systems

Summary

The goal of this thesis is to provide a overview of published sparse matrix storage formats in computer memory. This thesis contains a description of the principles and potential use in a sparse matrix-vector multiplication for modern parallel systems.

Thesis on DSpace

Performance analysis of the LSU3shell program

Author

Martin Kočička

Year

2019

Type

Master thesis

Supervisor

Ing. Daniel Langr, Ph.D.

Reviewers

Ing. Jiří Kašpar

Department

Department of Theoretical Computer Science

Summary

Ab initio approaches to nuclear structure exploration are at the forefront of current nuclear physics research. LSU3shell is an implementation of the ab initio method called symmetry-adapted no-core shell model (SA-NCSM) optimized for distributed HPC systems. The goal of this thesis was the analysis of the LSU3shell program with focus primarily on dynamic memory allocation, research of methods that can be used to improve performance and memory usage, and their application on LSU3shell. The focus on dynamic memory allocation proved to be the right one, leading to many possible optimizations. I was able to reduce the run time by 41 % on average, thus potentially saving up to 1.4 million core-hours of our total BlueWaters allocation.

Thesis on DSpace

Parallel Sorting in C++11

Author

Klára Horáčková

Year

2019

Type

Master thesis

Supervisor

Ing. Daniel Langr, Ph.D.

Reviewers

doc. Ing. Ivan Šimeček, Ph.D.

Department

Department of Theoretical Computer Science

Summary

A new parallel variant of the quicksort algorithm called C++11Sort is presen- ted in this thesis. C++11Sort is implemented based on the C++11 threads and the respective synchronization techniques, i.e. without the use of the non- standard C++ extensions and external libraries. In the search section, existing implementations of the parallel versions of the quicksort algorithm are descri- bed. These implementations are afterwards experimentally compared to the C++11Sort algorithm. C++11Sort achieves excellent results in those expe- riments. When sorting two billions of integers, it is faster than the fastest existing implementation by 28 %.

Thesis on DSpace

Efficient concurrent memoization system

Author

Viacheslav Kroilov

Year

2020

Type

Master thesis

Supervisor

Ing. Daniel Langr, Ph.D.

Reviewers

Ing. Jiří Kašpar

Department

Department of Theoretical Computer Science

Summary

An automatic memoization system - also known as a software cache - stores a limited number of recently accessed elements and speeds up consequent accesses to them. Least-Recently-Used (LRU) is a popular replacement policy for hardware and software caches. However, its concurrent implementation suffers from high contention due to the list reordering performed both on lookup and insertion. A novel LRU-inspired concurrent software cache, called DeferredLRU, is presented in this thesis. The main goal of the design was to make it scalable and suitable for many-core systems. These properties were achieved by using a different approach to tracking item access order. This approach substantially decreases the number of list reinsertions --- the main factor of the contention in a regular LRU cache. DeferredLRU throughput and hit-rate are sensitive to the meta-parameter setting. By fine-tuning meta parameters for specific inputs, it was possible to achieve higher hit-rate than of a regular LRU cache for every tested input. The relative difference was up to 7.8%. DeferredLRU performance was compared to existing alternatives, including corresponding caches from Intel TBB and Facebook HHVM projects. Tested caches were evaluated with up to 32 threads (on a 16 HW cores CPU). In 32 threads evaluation, DeferredLRU was faster in all 16 tests. When accesses were distributed among multiple smaller caches for better parallelism (this approach is called binning), DeferredLRU was faster in 11 of 16 tests and was close to best-performing caches in 5 other tests. DeferredLRU with binning was up to 28.8 times faster on 32 threads compared to single-threaded performance.

Thesis on DSpace

Low-Latency Heap Profiling

Author

Jan Tománek

Year

2021

Type

Master thesis

Supervisor

doc. Ing. Daniel Langr, Ph.D.

Reviewers

doc. Ing. Ivan Šimeček, Ph.D.

Department

Department of Theoretical Computer Science

Summary

Frequent heap allocations can have a significant impact on the overall performance of a program. The problem becomes even more critical in multithreaded applications, where too many interactions with heap can grow into the main bottleneck and effectively diminish scalability. Profiling such applications with existing heap profilers is almost impossible due to the enormous runtime overhead they add. This thesis describes the typical memory layout of a Linux process, briefly shows the internals of the GNU C memory allocator, analyzes available heap profiling tools, and guides the reader through the process of constructing a~custom heap profiler. Further, the thesis proposes a concept of low-latency heap profiling. Based on that, a new profiler Heaprof has been developed and compared with other heap profiling tools. Experimental results revealed that Heaprof was the only tool capable of profiling without further compromising the program's scalability. For 8-threaded benchmarks, Heaprof achieved up to 8 times profiling speedup.

Thesis on DSpace

Evaluation of thread pool implementations by resolution of webserver requests

Author

Martin Mucha

Year

2024

Type

Master thesis

Supervisor

doc. Ing. Daniel Langr, Ph.D.

Reviewers

doc. Ing. Ivan Šimeček, Ph.D.

Department

Department of Software Engineering

Summary

Utilizing a pool of worker threads to scale web servers is a widely adopted approach employed by many server implementations. Individual thread pool implementations can significantly vary in scheduling algorithms and task encapsulation. In this thesis, we design and implement our own thread pool utilizing C++20 coroutines and work-stealing scheduling. To ensure an efficient scheduling, we utilize lock-free structures. Our implementation not only achieves significantly lower scheduling overhead compared to OpenMP implementations but also performs comparably to, and in one test outperforms, the oneTBB thread pool implementation. Finally, we conduct testing of various versions of our thread pool in a server environment to assess performance across different server requests types.

Thesis on DSpace

Eigensolver problem for large sparse symmetric matrices

Author

Matěj Razák

Year

2024

Type

Master thesis

Supervisor

doc. Ing. Daniel Langr, Ph.D.

Reviewers

doc. Ing. Ivan Šimeček, Ph.D.

Department

Department of Theoretical Computer Science

Summary

This work is devoted to selected methods for finding eigenvalues and eigenvectors of symmetric sparse matrices over the MPI+OpenMP hybrid parallel programming model in the C++ language. The aim of the thesis is to improve the existing implementation of the one-vector and block Lanczos algorithm. Alternative implementations developed are single-vector and block versions of the Lanczos algorithm and the LOBPCG method. This is followed by testing and mutual comparisons of individual implementations. The scale of comparison is time efficiency and the number of iterations of the algorithm. A comparison with an existing implementation showed better efficiency of my solution.

Thesis on DSpace