Laboratoř výzkumu programování (PRL@PRG)

V Laboratoři výzkumu programování (PRL@PRG) se zabýváme hlubším pochopením programovacích jazyků a jejich využitím. Laboratoř vede prof. Jan Vitek z Khoury College of Computer Sciences, Northeastern University v Bostnu.

Více o nás

Administrativní pracovníci

Projekty

Evolving Language Ecosystems

Období
2016 - 2021
Popis
The Evolving Language Ecosystems project explores the fundamental techniques and algorithms for evolving programming languages and their ecosystems. Our purpose is to reduce the cost of wide-ranging language changes and obviate the need for devising entirely new languages. Our findings will grant both researchers and practitioners a greater degree of freedom when experimenting with new ideas on how to express computation.

Big Code: Škálovatelná analýza rozsáhlých bází programů

Program
OPVVV - Operační program Výzkum, vývoj a vzdělávání - Strukturální fondy EU
Poskytovatel
Evropská komise
Kód
EF15_003/0000421, CZ.02.1.01/0.0/0.0/15_003/0000421
Období
2019 - 2022
Popis
V rámci projektu BigCode bude na Českém vysokém učení technickém v Praze, Fakultě informačních technologií (FIT) vytvořen Institut pro škálovatelnou analýzu kódu (ISCA), první výzkumné centrum v ČR zaměřené na analýzu rozsáhlých bází programových kódů na internetu, které představují obrovský znalostní potenciál, který ale neumíme zatím využít. Projekt BigCode si klade za cíl tuto bázi pomocí technik programovacích jazyků a statistického strojového učení zanalyzovat a umožnit porozumění získaným informacím. Požadované prostředky budou použity na vybudování hardwarové a softwarové infrastruktury centra a na mzdové prostředky pro mezinárodně uznávané výzkumné pracovníky. Výzkumná náplň projektu je v souladu s již existujícím výzkumem na FIT -- softwarové a znalostní inženýrství, vytěžování dat, paralelní výpočty. Díky mezinárodním kontaktům členů výzkumného týmu projektu BigCode centrum následně předpokládá získávání podpory od předních softwarových firem, jako jsou Google a Oracle.

Publikace

Scala Implicits are Everywhere: A large-scale study of the use of Implicits in the wild

Autoři
Křikava, F.; Vítek, J.; Miller, H.
Rok
2019
Publikováno
Volume 3, Issue OOPSLA October 2019. New York: ACM, 2019. p. 1-28. ISSN 2475-1421.
Typ
Stať ve sborníku
Anotace
The Scala programming language offers two distinctive language features implicit parameters and implicit conversions, often referred together as implicits. Announced without fanfare in 2004, implicits have quickly grown to become a widely and pervasively used feature of the language. They provide a way to reduce the boilerplate code in Scala programs. They are also used to implement certain language features without having to modify the compiler. We report on a large-scale study of the use of implicits in the wild. For this, we analyzed 7,280 Scala projects hosted on GitHub, spanning over 8.1M call sites involving implicits and 370.7K implicit declarations across 18.7M lines of Scala code.

From Macros to DSLs: The Evolution of Racket

Autoři
Culpepper, R.; Felleisen, M.; Flatt, M.; Krishnamurthi, S.
Rok
2019
Publikováno
3rd Summit on Advances in Programming Languages (SNAPL 2019). Saarbrücken: Dagstuhl Publishing,, 2019. p. 1-19. ISSN 1868-8969. ISBN 978-3-95977-113-9.
Typ
Stať ve sborníku
Anotace
The Racket language promotes a language-oriented style of programming. Developers create many domain-specific languages, write programs in them, and compose these programs via Racket code. This style of programming can work only if creating and composing little languages is simple and effective. While Racket’s Lisp heritage might suggest that macros suffice, its design team discovered significant shortcomings and had to improve them in many ways. This paper presents the evolution of Racket’s macro system, including a false start, and assesses its current state.

Všechny publikace

R melts brains: an IR for first-class environments and lazy effectful arguments

Autoři
Ječmen, J.; Vítek, J.; Flückiger, O.; Chari, G.; Yee, M.; Hain, J.
Rok
2019
Publikováno
Proceedings of the 15th ACM SIGPLAN International Symposium on Dynamic Languages. ACM New York, 2019. p. 55-66. ISBN 978-1-4503-6996-1.
Typ
Stať ve sborníku
Anotace
The R programming language combines a number of features considered hard to analyze and implement efficiently: dynamic typing, reflection, lazy evaluation, vectorized primitive types, first-class closures, and extensive use of native code. Additionally, variable scopes are reified at runtime as first-class environments. The combination of these features renders most static program analysis techniques impractical, and thus, compiler optimizations based on them ineffective. We present our work on PIR, an intermediate representation with explicit support for first-class environments and effectful lazy evaluation. We describe two dataflow analyses on PIR: the first enables reasoning about variables and their environments, and the second infers where arguments are evaluated. Leveraging their results, we show how to elide environment creation and inline functions.

On Julia's Efficient Algorithm for Subtyping Union Types and Covariant Tuples (Artifact).

Autoři
Vítek, J.; Nardelli, F.; Chung, B.
Rok
2019
Publikováno
33rd European Conference on Object-Oriented Programming (ECOOP 2019). Wadern: Schloss Dagstuhl - Leibniz Center for Informatics, 2019. p. 1-24. ISSN 1868-8969. ISBN 978-3-95977-111-5.
Typ
Stať ve sborníku
Anotace
The Julia programming language supports multiple dispatch and provides a rich type annotation language to specify method applicability. When multiple methods are applicable for a given call, Julia relies on subtyping between method signatures to pick the correct method to invoke. Julia’s subtyping algorithm is surprisingly complex, and determining whether it is correct remains an open question. In this paper, we focus on one piece of this problem: the interaction between union types and covariant tuples. Previous work normalized unions inside tuples to disjunctive normal form. However, this strategy has two drawbacks: complex type signatures induce space explosion, and interference between normalization and other features of Julia’s type system. In this paper, we describe the algorithm that Julia uses to compute subtyping between tuples and unions – an algorithm that is immune to space explosion and plays well with other features of the language. We prove this algorithm correct and complete against a semantic-subtyping denotational model in Coq

On the Impact of Programming Languages on Code Quality: A Reproduction Study

Autoři
Vítek, J.; Máj, P.; Hollenbeck, C.; Berger, E.; Vitek, O.
Rok
2019
Publikováno
ACM Transactions on Programming Languages and Systems,Vol. 41, No. 4. 2019, 41(4), 1-24. ISSN 0164-0925.
Typ
Článek
Anotace
In a 2014 article, Ray, Posnett, Devanbu, and Filkov claimed to have uncovered a statistically significant association between 11 programming languages and software defects in 729 projects hosted on GitHub. Specifically, their work answered four research questions relating to software defects and programming languages. With data and code provided by the authors, the present article first attempts to conduct an experimental repetition of the original study. The repetition is only partially successful, due to missing code and issues with the classification of languages. The second part of this work focuses on their main claim, the association between bugs and languages, and performs a complete, independent reanalysis of the data and of the statistical modeling steps undertaken by Ray et al. in 2014. This reanalysis uncovers a number of serious flaws that reduce the number of languages with an association with defects down from 11 to only 4. Moreover, the practical effect size is exceedingly small. These results thus undermine the conclusions of the original study. Correcting the record is important, as many subsequent works have cited the 2014 article and have asserted, without evidence, a causal link between the choice of programming language for a given task and the number of software defects. Causation is not supported by the data at hand; and, in our opinion, even after fixing the methodological flaws we uncovered, too many unaccounted sources of bias remain to hope for a meaningful comparison of bug rates across languages.

Can android run on time? Extending and measuring the android platform's timeliness

Autoři
Vítek, J.; Yan, Y.; Gokul, G.; Dantu, K.; Ko, S.Y.; ZIAREK, L.
Rok
2019
Publikováno
ACM Transactions on Embedded Computing Systems. 2019, 17(6), ISSN 1539-9087.
Typ
Článek
Anotace
Time predictability is dificult to achieve in the complex, layered execution environments that are common inmodern embedded devices such as smartphones. We explore adopting the Android programming model for arange of embedded applications that extends beyond mobile devices, under the constraint that changes towidely used libraries should be minimized. The challenges we explore include: the interplay between real-timeactivities and the rest of the system, how to express the timeliness requirements of components, and how wellthose requirements can be met on stock embedded platforms. We detail the design and implementation of ourmodifications to the Android framework along with a real-time VM and OS, and provide experimental datavalidating feasibility over five applications.

 How to Evaluate the Performance of Gradual Type Systems

Autoři
Vítek, J.; Greeman, B.; Takikawa, A.; New, M.S.; Feltey, D.; Findler, R.B.; Felleisen, M.
Rok
2019
Publikováno
JOURNAL OF FUNCTIONAL PROGRAMMING. 2019, 29 ISSN 0956-7968.
Typ
Článek
Anotace
A sound gradual type system ensures that untyped components of a program can never break the guarantees of statically typed components. This assurance relies on runtime checks, which in turn impose performance overhead in proportion to the frequency and nature of interaction between typed and untyped components. The literature on gradual typing lacks rigorous descriptions of methods for measuring the performance of gradual type systems. This gap has consequences for the implementors of gradual type systems and developers who use such systems. Without systematic evaluation of mixed-typed programs, implementors cannot precisely determine how improvements to a gradual type system affect performance. Developers cannot predict whether adding types to part of a program will significantly degrade (or improve) its performance. This paper presents the first method for evaluating the performance of sound gradual type systems. The method quantifies both the absolute performance of a gradual type system and the relative performance of two implementations of the same gradual type system. To validate the method, the paper reports on its application to twenty programs and three implementations of Typed Racket.

Feature-specific profiling

Autoři
Vítek, J.; Andersen, L.; St-Amour, V.; Felleisen, M.
Rok
2019
Publikováno
ACM Transactions on Programming Languages and Systems. 2019, 41(1), ISSN 0164-0925.
Typ
Článek
Anotace
While high-level languages come with significant readability and maintainability benefits, their performance remains difficult to predict. For example, programmers may unknowingly use language features inappropriately, which cause their programs to run slower than expected. To address this issue, we introduce feature-specific profiling, a technique that reports performance costs in terms of linguistic constructs. Feature-specific profilers help programmers find expensive uses of specific features of their language. We describe the architecture of a profiler that implements our approach, explain prototypes of the profiler for two languages with different characteristics and implementation strategies, and provide empirical evidence for the approach's general usefulness as a performance debugging tool. © 2018 Copyright held by the owner/author(s).

Verifying a Concurrent Garbage Collector using a Rely-Guarantee Methodology

Autoři
Vítek, J.; Zakowski, Y; Cachera, D; Demange, D; Petri, G; Pichardie, D; Jagannathan, S
Rok
2017
Publikováno
International Conference on Interactive Theorem Prooving (ITP). Berlin: Springer-Verlag, 2017.
Typ
Stať ve sborníku

Julia: Dynamism and Performance Reconciled by Design

Autoři
Vítek, J.; Bezanson, J.; Chung, B.; Karpinski, S.; Shah, V.B.; Zoubritski, L.
Rok
2018
Publikováno
Journal Proceedings of the ACM on Programming Languages,Volume 2 Issue, OOPSLA. New York: ACM, 2018. ISSN 2475-1421.
Typ
Stať ve sborníku
Anotace
Julia is a programming language for the scientific community that combines features of productivity languages, such as Python or MATLAB, with characteristics of performance-oriented languages, such as C++ or Fortran. Julia's productivity features include: dynamic typing, automatic memory management, rich type annotations, and multiple dispatch. At the same time, Julia allows programmers to control memory layout and leverages a specializing just-in-time compiler to eliminate much of the overhead of those features. This paper details the design choices made by the creators of Julia and reflects on the implications of those choices for performance and usability.

Julia Subtyping: A Rational Reconstruction

Autoři
Vítek, J.; Chung, B.; Belyakova, Y.; Pelenitsyn, A.; Bezanson, J.; Nardelli, F.Z.
Rok
2018
Publikováno
Journal Proceedings of the ACM on Programming Languages,Volume 2 Issue, OOPSLA. New York: ACM, 2018. ISSN 2475-1421.
Typ
Stať ve sborníku
Anotace
Programming languages that support multiple dispatch rely on an expressive notion of subtyping to specify method applicability. In these languages, type annotations on method declarations are used to select, out of a potentially large set of methods, the one that is most appropriate for a particular tuple of arguments. Julia is a language for scientific computing built around multiple dispatch and an expressive subtyping relation. This paper provides the first formal definition of Julia's subtype relation and motivates its design. We validate our specification empirically with an implementation of our definition that we compare against the existing Julia implementation on a collection of real-world programs. Our subtype implementation differs on 122 subtype tests out of 6014476. The first 120 differences are due to a bug in Julia that was fixed once reported; the remaining 2 are under discussion.

Self-contained development environments

Autoři
Vítek, J.; Chari, G.; Pimás, J.; Flückiger, O.
Rok
2018
Publikováno
DLS 2018 Proceedings of the 14th ACM SIGPLAN International Symposium on Dynamic Languages. New York: ACM, 2018. p. 76-87. ISBN 9781450360302.
Typ
Stať ve sborníku
Anotace
Operating systems are traditionally implemented in lowlevel, performance-oriented programming languages. These languages typically rely on minimal runtime support and provide unfettered access to the underlying hardware. Tradition has benefits: developers control the resources that the operating system manages and few performance bottlenecks cannot be overcome with clever feats of programming. On the other hand, this makes operating systems harder to understand and maintain. Furthermore, those languages have few built-in barriers against bugs. This paper is an experiment in side-stepping operating systems, and pushing functionality into the runtime of high-level programming languages. The question we try to answer is how much support is needed to run an application written in, say, Smalltalk or Python on bare metal, that is, with no underlying operating system. We present a framework named NopSys that allows this, and we validate it with the implementation of CogNos a Smalltalk virtual machine running on bare x86 hardware. Experimental results suggest that this approach is promising.

Contextual Equivalence fo probabilistic languages

Autoři
Culpepper, R.; Wand, M.; Cobb, A.; Giannakopoulos, T.
Rok
2018
Publikováno
Journal Proceedings of the ACM on Programming Languages,Volume 2, Issue ICFP. New York: ACM, 2018. ISSN 2475-1421.
Typ
Stať ve sborníku
Anotace
We present a complete reasoning principle for contextual equivalence in an untyped probabilistic language. The language includes continuous (real-valued) random variables, conditionals, and scoring. It also includes recursion, since the standard call-by-value fixpoint combinator is expressible. We demonstrate the usability of our characterization by proving several equivalence schemas, including familiar facts from lambda calculus as well as results specific to probabilistic programming. In particular, we use it to prove that reordering the random draws in a probabilistic program preserves contextual equivalence. This allows us to show, for example, that (let x = e1 in let y = e2 in e0) =ctx (let y = e2 in let x = e1 in e0) (provided x does not occur free in e2 and y does not occur free in e1) despite the fact that e1 and e2 may have sampling and scoring effects.

Tests from traces: Automated unit test extraction for R

Rok
2018
Publikováno
ISSTA 2018 - Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis. ACM PRESS, 2018. p. 232-241. ISBN 978-1-4503-5699-2.
Typ
Stať ve sborníku
Anotace
Unit tests are labor-intensive to write and maintain. This paper looks into how well unit tests for a target software package can be extracted from the execution traces of client code. Our objective is to reduce the effort involved in creating test suites while minimizing the number and size of individual tests, and maximizing coverage. To evaluate the viability of our approach, we select a challenging target for automated test extraction, namely R, a programming language that is popular for data science applications. The challenges presented by R are its extreme dynamism, coerciveness, and lack of types. This combination decrease the efficacy of traditional test extraction techniques. We present Genthat, a tool developed over the last couple of years to non-invasively record execution traces of R programs and extract unit tests from those traces. We have carried out an evaluation on 1,545 packages comprising 1.7M lines of code. The tests extracted by Genthat improved code coverage from the original rather low value of 267,496 lines to 700,918 lines. The running time of the generated tests is 1.9 times faster than the code they came from. © 2018 Association for Computing Machinery.

KafKa: Gradual typing for objects

Autoři
Vítek, J.; Chung, B.; Li, P.; Nardelli, F.Z.
Rok
2018
Publikováno
Leibniz International Proceedings in Informatics, LIPIcs. Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik, 2018. ISBN 978-3-95977-079-8.
Typ
Stať ve sborníku
Anotace
A wide range of gradual type systems have been proposed, providing many languages with the ability to mix typed and untyped code. However, hiding under language details, these gradual type systems embody fundamentally different ideas of what it means to be well-typed. In this paper, we show that four of the most common gradual type systems provide distinct guarantees, and we give a formal framework for comparing gradual type systems for object-oriented languages. First, we show that the different gradual type systems are practically distinguishable via a three-part litmus test. We present a formal framework for defining and comparing gradual type systems. Within this framework, different gradual type systems become translations between a common source and target language, allowing for direct comparison of semantics and guarantees. © Benjamin Chung, Paley Li, Francesco Zappa Nardelli, and Jan Vitek.

Correctness of speculative optimizations with dynamic deoptimization

Autoři
Vítek, J.; Fluckiger, O.; Scherer, G.; Yee, M.; Goel, A.; Ahmed, A.
Rok
2018
Publikováno
Proceedings of the ACM on Programming Languages (PACMPL). 2018, 2 ISSN 2475-1421.
Typ
Článek
Anotace
High-performance dynamic language implementations make heavy use of speculative optimizations to achieve speeds close to statically compiled languages. These optimizations are typically performed by a just-in-time compiler that generates code under a set of assumptions about the state of the program and its environment. In certain cases, a program may execute code compiled under assumptions that are no longer valid. The implementation must then deoptimize the program on-the-fly; this entails finding semantically equivalent code that does not rely on invalid assumptions, translating program state to that expected by the target code, and transferring control. This paper looks at the interaction between optimization and deoptimization, and shows that reasoning about speculation is surprisingly easy when assumptions are made explicit in the program representation. This insight is demonstrated on a compiler intermediate representation, named sourir, modeled after the high-level representation for a dynamic language. Traditional compiler optimizations such as constant folding, unreachable code elimination, and function inlining are shown to be correct in the presence of assumptions. Furthermore, the paper establishes the correctness of compiler transformations specific to deoptimization: namely unrestricted deoptimization, predicate hoisting, and assume composition.

Parallelizing Julia with a Non-invasive DSL

Autoři
Vítek, J.; Anderson, T A; Liu, H; Kuper, L; Totoni, E; Shpeisman, T
Rok
2017
Publikováno
31st European Conference on Object-Oriented Programming (ECOOP 2017). Wadern: Schloss Dagstuhl - Leibniz Center for Informatics, 2017. p. 41-429. ISSN 1868-8969. ISBN 9783959770354.
Typ
Stať ve sborníku
Anotace
Computational scientists often prototype software using productivity languages that offer high-level programming abstractions. When higher performance is needed, they are obliged to rewrite their code in a lower-level efficiency language. Different solutions have been proposed to address this trade-off between productivity and efficiency. One promising approach is to create embedded domain-specific languages that sacrifice generality for productivity and performance, but practical experience with DSLs points to some road blocks preventing widespread adoption. This paper proposes a non-invasive domain-specific language that makes as few visible changes to the host programming model as possible. We present ParallelAccelerator, a library and compiler for high-level, high-performance scientific computing in Julia. ParallelAccelerator's programming model is aligned with existing Julia programming idioms. Our compiler exposes the implicit parallelism in high-level array-style programs and compiles them to fast, parallel native code. Programs can also run in "library-only" mode, letting users benefit from the full Julia environment and libraries. Our results show encouraging performance improvements with very few changes to source code required. In particular, few to no additional type annotations are necessary.

Contracts-based Control Integration into Software Systems

Autoři
Křikava, F.; Rouvoy, R.; Collet, P.; Seintourier, L.
Rok
2017
Publikováno
Software Engineering for Self-Adaptive Systems III - Assurances. Berlin: Springer, 2017. p. 251-281. ISSN 0302-9743. ISBN 978-3-319-74182-6.
Typ
Kapitola v knize
Anotace
Among the different techniques that are used to design self-adaptive software systems, control theory allows one to design an adaptation policy whose properties, such as stability and accuracy, can be formally guaranteed under certain assumptions. However, in the case of software systems, the integration of these controllers to build complete feedback control loops is manual. More importantly it requires an extensive handcrafting of non-trivial implementation code. This may lead to inconsistencies and instabilities as no systematic and automated assurance can be obtained on the fact that the initial assumptions for the designed controller still hold in the resulting system. In this chapter, we rely on the principles of design-by-contract to ensure the correction and robustness of a self-adaptive software system built using feedback control loops. Our solution raises the level of abstraction upon which the loops are specified by allowing one to define and automatically verify system-level properties organized in contracts. They cover behavioral, structural and temporal architectural constraints as well as explicit interaction. These contracts are complemented by a first-class support for systematic fault handling. As a result, assumptions about the system operation conditions become more explicit and verifiable in a systematic way.

Hadoop-Benchmark: Rapid Prototyping and Evaluation of Self-Adaptive Behaviors in Hadoop Clusters

Autoři
Křikava, F.; Zhang, B.; Rouvoy, R.; Senturier, L.
Rok
2017
Publikováno
2017 IEEE/ACM 12th International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS). USA: IEEE Computer Society, 2017. p. 175-181. ISBN 978-1-5386-1550-8.
Typ
Stať ve sborníku
Anotace
Optimizing Hadoop executions has attracted a lot of research contributions in particular in the domain of self- adaptive software systems. However, these research efforts are often hindered by the complexity of Hadoop operation and the difficulty to reproduce experimental evaluations that makes it hard to compare different approaches to one another. To address this limitation, we propose a research acceleration platform for rapid prototyping and evaluation of self-adaptive behavior in Hadoop clusters. Essentially, it provides automated approach to provision reproducible Hadoop environments and execute acknowledged benchmarks. It is based on the state- of-the-art container technology that supports both distributed configurations as well as standalone single-host setups. We demonstrate the approach on a complete implementation of a concrete Hadoop self-adaptive case study.

Making Android Run on Time

Autoři
Vítek, J.; Yan, Y; Dantu, K; Ko, S Y; Ziarek, L
Rok
2017
Publikováno
Real-Time and Embedded technology and Applications Symposium (RTAS 2017). San Francisco: American Institute of Physics and Magnetic Society of the IEEE, 2017.
Typ
Stať ve sborníku

Helenos: A Realistic Benchmark for Distributed Transactional Memory

Autoři
Siek, K.; Kobyliński, P.; Baranowski, J.; Wojciechowski, P.
Rok
2017
Publikováno
Software - Practice and Experience. 2017, 48(3), 528-549. ISSN 0038-0644.

Kontaktní osoba

Ing. Lucie Lerch

Kde nás najdete?

Laboratoř výzkumu programování
Fakulta informačních technologií
České vysoké učení technické v Praze

Místnost TH:A-1250, 1252, 1254
(Budova A, 12. patro)
Thákurova 7
Praha 6 – Dejvice
160 00

Za obsah stránky zodpovídá: doc. Ing. Štěpán Starosta, Ph.D.