Active deep learning method for the discovery of objects of interest in large spectroscopic surveys

Autoři
Škoda, P.; Podsztavek, O.; Tvrdík, P.
Rok
2020
Publikováno
Astronomy & Astrophysics. 2020, 643 ISSN 1432-0746.
Typ
Článek
Anotace
Context. Current archives of the LAMOST telescope contain millions of pipeline-processed spectra that have probably never been seen by human eyes. Most of the rare objects with interesting physical properties, however, can only be identified by visual analysis of their characteristic spectral features. A proper combination of interactive visualisation with modern machine learning techniques opens new ways to discover such objects. Aims. We apply active learning classification methods supported by deep convolutional neural networks to automatically identify complex emission-line shapes in multi-million spectra archives. Methods. We used the pool-based uncertainty sampling active learning method driven by a custom-designed deep convolutional neural network with 12 layers. The architecture of the network was inspired by VGGNet, AlexNet, and ZFNet, but it was adapted for operating on one-dimensional feature vectors. The unlabelled pool set is represented by 4.1 million spectra from the LAMOST data release 2 survey. The initial training of the network was performed on a labelled set of about 13 000 spectra obtained in the 400 Å wide region around Hα by the 2 m Perek telescope of the Ondˇrejov observatory, which mostly contains spectra of Be and related early-type stars. The differences between the Ondˇrejov intermediate-resolution and the LAMOST low-resolution spectrographs were compensated for by Gaussian blurring and wavelength conversion. Results. After several iterations, the network was able to successfully identify emission-line stars with an error smaller than 6.5%. Using the technology of the Virtual Observatory to visualise the results, we discovered 1 013 spectra of 948 new candidates of emission-line objects in addition to 664 spectra of 549 objects that are listed in SIMBAD and 2 644 spectra of 2 291 objects identified in an earlier paper of a Chinese group led by Wen Hou. The most interesting objects with unusual spectral properties are discussed in detail.

An In-sight into How Compression Dictionary Architecture can Affect the Overall Performance in FPGAs

Autoři
Bartík, M.; Beneš, T.; Kubalík, P.
Rok
2020
Publikováno
IEEE Access. 2020, 2020(8), 183101-183116. ISSN 2169-3536.
Typ
Článek
Anotace
This paper presents a detailed analysis of various approaches to hardware implemented compression algorithm dictionaries, including our optimized method. To obtain comprehensive and detailed results, we introduced a method for the fair comparison of programmable hardware architectures to show the benefits of our approach from the perspective of logic resources, frequency, and latency. We compared two generally used methods with our optimized method, which was found to be more suitable for maintaining the memory content via (in)valid bits in any mid-density memory structures, which are implemented in programmable hardware such as FPGAs (Field Programmable Gate Array). The benefits of our new method based on a “Distributed Memory” technique are shown on a particular example of compression dictionary but the method is also suitable for another use cases requiring a fast (re-)initialization of the used memory structures before each run of an algorithm with minimum time and logic resources consumption. The performance evaluation of the respective approaches has been made in Xilinx ISE and Xilinx Vivado toolkits for the Virtex-7 FPGA family. However the proposed approach is compatible with 99% of modern FPGAs.

Designing types for R, empirically

Autoři
Turcotte, A.; Goel, A.; Křikava, F.; Vitek, J.
Rok
2020
Publikováno
Proceedings of the ACM on Programming Languages (PACMPL). 2020, 4(OOPSLA), 1-25. ISSN 2475-1421.
Typ
Článek
Anotace
The R programming language is widely used in a variety of domains. It was designed to favor an interactive style of programming with minimal syntactic and conceptual overhead. This design is well suited to data analysis, but a bad fit for tools such as compilers or program analyzers. In particular, R has no type annotations, and all operations are dynamically checked at run-time. The starting point for our work are the two questions: what expressive power is needed to accurately type R code? and which type system is the R community willing to adopt? Both questions are difficult to answer without actually experimenting with a type system. The goal of this paper is to provide data that can feed into that design process. To this end, we perform a large corpus analysis to gain insights in the degree of polymorphism exhibited by idiomatic R code and explore potential benefits that the R community could accrue from a simple type system. As a starting point, we infer type signatures for 25,215 functions from 412 packages among the most widely used open source R libraries. We then conduct an evaluation on 8,694 clients of these packages, as well as on end-user code from the Kaggle data science competition website.

Partitioning graphs into induced subgraphs

Autoři
Rok
2020
Publikováno
Discrete Applied Mathematics. 2020, 272 31-42. ISSN 0166-218X.
Typ
Článek
Anotace
We study the Partition into H problem from the parameterized complexity point of view. In the Partition into H problem the task is to partition the vertices of a graph G into sets V_1,...,V_r such that the graph H is isomorphic to the subgraph of G induced by each set V_i for i =1,...,r. The pattern graph H is fixed. For the parameterization we consider three distinct structural parameters of the graph G - namely the tree-width, the neighborhood diversity, and the modular-width. For the parameterization by the neighborhood diversity we obtain an FPT algorithm for every graph H. For the parameterization by the tree-width we obtain an FPT algorithm for every connected graph H, thus resolving an open question of Gajarský et al. (2013). Finally, for the parameterization by the modular-width we derive an FPT algorithm for every prime graph H.

Reducing the Impact of Intensive Dynamic Memory Allocations in Parallel Multi-Threaded Programs

Autoři
Langr, D.; Kočička, M.
Rok
2020
Publikováno
IEEE Transactions on Parallel and Distributed Systems. 2020, 31(5), 1152-1164. ISSN 1045-9219.
Typ
Článek
Anotace
Frequent dynamic memory allocations (DyMAs) can significantly hinder the scalability of parallel multi-threaded programs. As the number of threads grows, DyMAs can even become the main performance bottleneck. We introduce modern tools and methods for evaluating the impact of DyMAs and present techniques for its reduction, which include scalable heap implementations, small buffer optimization, and memory pooling. Additionally, we provide a survey of state-of-the-art implementations of these techniques and study them experimentally by using a benchmark program, server simulator software, and a real-world high-performance computing application. As a result, we show that relatively small modifications in parallel program’s source code or a way of its execution may substantially reduce the runtime overhead associated with the use of dynamic data structures.

Time-headway distribution for random-sequential-update TASEP with periodic and open boundaries

Autoři
Rok
2020
Publikováno
Journal of Traffic and Transportation Engineering. 2020, 7(1), 30-41. ISSN 2095-7564.
Typ
Článek
Anotace
The temporal-headway distribution for Totally Asymmetric Simple Exclusion Process (TASEP) with random-sequential update is investigated. Considering the stationary/steady state of the process, exact formula for the step-headway distribution is derived for conditions when the stationary measure is Bernoulli, i.e., for periodic boundaries and for open boundaries with entering boundary rate alpha and leaving boundary rate beta satisfying alpha + beta = 1. The step-headway formula for general values of boundary rates is calculated numerically by means of the matrix product ansatz.

Multi-Agent Path Finding with Mutex Propagation

Autoři
Zhang, H.; Li, J.; Surynek, P.; Koenig, S.; Kumar, S.
Rok
2020
Publikováno
Proceedings of the Thirtieth International Conference on Automated Planning and Scheduling. Menlo Park: AAAI Press, 2020. p. 323-332. ISSN 2334-0835. ISBN 978-1-57735-824-4.
Typ
Stať ve sborníku
Anotace
Mutex propagation is a form of efficient constraint propagation popularly used in AI planning to tightly approximate the reachable states from a given state. We utilize this idea in the context of Multi-Agent Path Finding (MAPF). When adapted to MAPF, mutex propagation provides stronger constraints for conflict resolution in Conflict-Based Search (CBS), a popular optimal MAPF algorithm, and provides it with the ability to identify and reason with symmetries in MAPF.

Fine-Grained View on Bribery for Group Identification

Autoři
Boehmer, N.; Bredereck, R.; Knop, D.; Luo, J.
Rok
2020
Publikováno
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence. International Joint Conferences on Artificial Intelligence Organization, 2020. p. 67-73. ISBN 978-0-9992411-6-5.
Typ
Stať ve sborníku
Anotace
Given a set of individuals qualifying or disqualifying each other, group identification is the task of identifying a socially qualified subgroup of individuals. Social qualification depends on the specific rule used to aggregate individual qualifications. The bribery problem in this context asks how many agents need to change their qualifications in order to change the outcome. Complementing previous results showing polynomial-time solvability or NP-hardness of bribery for various social rules in the constructive (aiming at making specific individuals socially qualified) or destructive (aiming at making specific individuals socially disqualified) setting, we provide a comprehensive picture of the parameterized computational complexity landscape. Conceptually, we also consider a more fine-grained concept of bribery cost, where we ask how many single qualifications need to be changed, and a more general bribery goal that combines the constructive and destructive setting.

Side-Channel Attack on the A5/1 Stream Cipher

Rok
2019
Publikováno
Proceedings of the 22nd Euromicro Conference on Digital Systems Design. Los Alamitos, CA: IEEE Computer Soc., 2019. p. 633-638. ISBN 978-1-7281-2862-7.
Typ
Stať ve sborníku
Anotace
In this paper we present cryptanalysis of the A5/1 stream cipher used in GSM mobile phones. Our attack is based on power analysis where we assume that the power consumption while clocking 3 LFSRs is different than when clocking 2 LFSRs. We demonstrate a simple power analysis (SPA) attack and discuss existing differential power analysis (DPA). We present the attack for recovering secret key based on the information on clocking bits of LFSRs that was deduced from power analysis. The attack has a 100% success rate, requires minimal storage and it does not requires any single bit of a keystream. An average time complexity of our attack based on SPA is around 233 where the computation unit is a resolution of system of linear equations over the Z2. Recovering the secret key using information from the DPA has a constant complexity.

Za obsah stránky zodpovídá: doc. Ing. Štěpán Starosta, Ph.D.