Ing. Pavel Kubalík, Ph.D.

Quantized Neural Network with Linearly Approximated Functions on Zynq FPGA

Authors

Skrbek, M.; Kubalík, P.; Kohlík, M.; Borecký, J.; Hülle, R.

Year

2024

Published in

2024 13th Mediterranean Conference on Embedded Computing (MECO). Piscataway: Institute of Electrical and Electronics Engineers, 2024. p. 98-101. ISSN 2637-9511. ISBN 979-8-3503-8756-8.

Type

Proceedings paper

DOI

10.1109/MECO62516.2024.10577851

Department

Department of Digital Design

Abstract

This paper focuses on neural network implementation on FPGAs. Linearly approximated functions combined with quantization are used to implement neural networks efficiently in hardware. Well-known benchmarks were used for learning, evaluation, and hardware testing. Approximation-aware and quantization-aware learning were used to obtain the weights for the hardware neurons. We implemented a neural network with an 8-bit architecture in VHDL and synthesized it for a Zynq FPGA in Vivado. The resulting design, running at a 100 MHz clock frequency, was carefully tested against hardware-accurate models written in Wolfram Mathematica and C++. We report a decrease in FPGA resources and chip utilization compared to a 16-bit architecture implementation.
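The combination of quantization and a linearly approximated activation can be illustrated with a minimal sketch. It is not the authors' design: the "hard sigmoid" clip(x/4 + 0.5, 0, 1) is a generic piecewise-linear stand-in, and the fixed-point format (8 bits, 4 fractional bits) and all function names are assumptions chosen for illustration.

```python
# Illustrative only: generic 8-bit fixed-point neuron with a piecewise-linear
# activation. The format (4 fractional bits) and the hard-sigmoid shape are
# assumptions, not the approximation used in the paper.

def quantize(x, frac_bits=4, bits=8):
    """Round a real value to a signed fixed-point code with saturation."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    code = round(x * (1 << frac_bits))
    return max(lo, min(hi, code))


def dequantize(code, frac_bits=4):
    """Convert a fixed-point code back to a real value."""
    return code / (1 << frac_bits)


def hard_sigmoid_q(code, frac_bits=4):
    """Piecewise-linear sigmoid on fixed-point codes: clip(x/4 + 0.5, 0, 1)."""
    one = 1 << frac_bits
    y = (code >> 2) + (one >> 1)   # x/4 by arithmetic shift, then add 0.5
    return max(0, min(one, y))     # saturate to the range [0, 1]


def neuron(inputs_q, weights_q, frac_bits=4):
    """Multiply-accumulate in a wide accumulator, rescale, then activate."""
    acc = sum(i * w for i, w in zip(inputs_q, weights_q))
    acc >>= frac_bits              # products carry 2*frac_bits fractional bits
    return hard_sigmoid_q(acc, frac_bits)
```

The shift-based division by four needs no multiplier, which is the kind of saving that makes linear approximations attractive in programmable logic.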

Evaluation of the Medium-sized Neural Network using Approximative Computations on Zynq FPGA

Authors

Skrbek, M.; Kubalík, P.; Kohlík, M.; Borecký, J.; Hülle, R.

Year

2023

Published in

Proceedings of 2023 12th Mediterranean Conference on Embedded Computing (MECO). Piscataway: IEEE, 2023. p. 1-4. ISSN 2637-9511. ISBN 979-8-3503-2291-0.

Type

Proceedings paper

DOI

10.1109/MECO58584.2023.10155065

Department

Department of Digital Design

Abstract

Integrating artificial intelligence technologies into embedded systems requires efficient implementation of neural networks in hardware. The paper presents a Zynq 7020 FPGA implementation and evaluation of a medium-sized dense neural network based on approximate computation with linearly approximated functions. Three well-known benchmarks were used for classification accuracy evaluation and hardware testing. We use our highly pipelined neural hardware architecture, which takes weights from block RAMs to save logic resources and allows them to be updated from the processing system. The architecture scales well, allowing us to estimate the number of neurons that can be implemented in programmable logic from single-neuron resources. We reached nearly full chip utilization while preserving a high clock frequency for the FPGA used.

Approximate arithmetic for modern neural networks and FPGAs

Authors

Skrbek, M.; Kubalík, P.

Year

2022

Published in

Proceedings of the 11th Mediterranean Conference on Embedded Computing (MECO 2022). Institute of Electrical and Electronics Engineers, Inc., 2022. p. 351-354. ISSN 2377-5475. ISBN 978-1-6654-6828-2.

Type

Proceedings paper

DOI

10.1109/MECO55406.2022.9797141

Department

Department of Digital Design

Abstract

Approximate arithmetic is a very important approach for implementing neural networks in embedded hardware. The requirements of real-time applications, together with the size of deep neural networks, force designers to simplify the neural processing elements. Both reducing the precision of model parameters to a few bits and using approximate arithmetic increase computational power and save on-chip resources compared with exact computation. FPGAs have improved since linearly approximated functions were first shown to be suitable for implementing neural networks in hardware, so we decided to re-implement the processing element on a modern FPGA and to present implementation results regarding speed and resource consumption. A neural processing element based on linearly approximated functions was implemented in Vivado and tested on an xc7 FPGA. The results show that the architecture saves significant resources and that a clock frequency above 100 MHz can be achieved in a pipelined design.

Feasibility of a Neural Network with Linearly Approximated Functions on Zynq FPGA

Authors

Skrbek, M.; Kubalík, P.

Year

2022

Published in

2022 29th IEEE International Conference on Electronics, Circuits and Systems (ICECS). New York: Institute of Electrical and Electronics Engineers, 2022. ISBN 978-1-6654-8823-5.

Type

Proceedings paper

DOI

10.1109/ICECS202256217.2022.9970813

Department

Department of Digital Design

Abstract

This paper focuses on the feasibility of a neural network with linearly approximated functions on a modern FPGA. An approximate multiplier and linearly approximated activation functions were used for a neural network implemented on a Zynq FPGA. We propose a novel architecture for a fully functional, layered, and configurable neural network.

An In-sight into How Compression Dictionary Architecture can Affect the Overall Performance in FPGAs

Authors

Bartík, M.; Beneš, T.; Kubalík, P.

Year

2020

Published in

IEEE Access. 2020, 2020(8), 183101-183116. ISSN 2169-3536.

Type

Journal article

DOI

10.1109/ACCESS.2020.3029691

Department

Department of Digital Design

Abstract

This paper presents a detailed analysis of various approaches to hardware-implemented compression algorithm dictionaries, including our optimized method. To obtain comprehensive and detailed results, we introduce a method for the fair comparison of programmable hardware architectures that shows the benefits of our approach in terms of logic resources, frequency, and latency. We compared two commonly used methods with our optimized method, which was found to be more suitable for maintaining memory content via (in)valid bits in any mid-density memory structure implemented in programmable hardware such as FPGAs (Field Programmable Gate Arrays). The benefits of our new method, based on a "Distributed Memory" technique, are shown on the particular example of a compression dictionary, but the method is also suitable for other use cases that require fast (re-)initialization of memory structures before each run of an algorithm with minimum time and logic resource consumption. The performance of the respective approaches was evaluated in the Xilinx ISE and Xilinx Vivado toolkits for the Virtex-7 FPGA family. However, the proposed approach is compatible with 99% of modern FPGAs.

Low Power Wireless Data Transfer for Internet of Things: GSM Network Measuring Results

Authors

Kubalík, P.; Procházka, V.; Kubátová, H.

Year

2020

Published in

Proceedings of the 9th Mediterranean Conference on Embedded Computing - MECO'2020. Institute of Electrical and Electronics Engineers, Inc., 2020. p. 181-185. ISSN 2637-9511. ISBN 978-1-7281-6949-1.

Type

Proceedings paper

DOI

10.1109/MECO49872.2020.9134348

Department

Department of Digital Design
Faculty of Information Technology

Abstract

This paper describes the properties of wireless data transfer for the Internet of Things (IoT), focusing on low device power consumption. It presents measurements of latency, throughput, and power consumption of a GSM module connected to an Arduino during data transfer to a remote server running on a PC, and discusses the measurement methodology. Power consumption measurements include sending files of various sizes from the GSM module via the GSM network to the server. Conclusions regarding battery lifespan for the GSM module are drawn. Throughput over the GSM network for this module is analyzed, and a static component of the sending time, independent of file size, is identified. The throughput is measured in order to further analyze the usability of such a device in IoT. The usability of LTE in such a configuration for fast data transfer is also discussed. The latency between the GSM module and the server is approximated because it may influence power consumption.

Novel Partial Correlation Method Algorithm for Acquisition of GNSS Tiered Signals

Authors

Svatoň, J.; Vejražka, F.; Schmidt, J.; Kubalík, P.; Borecký, J.

Year

2020

Published in

Navigation. 2020, 67(4), 745-762. ISSN 0028-1522.

Type

Journal article

DOI

10.1002/navi.390

Department

Department of Digital Design

Abstract

This paper presents a new modified Single Block Zero-Padding (mSBZP) Partial Correlation Method (PCM) Parallel Code Search (PCS) algorithm for effective acquisition of weak GNSS tiered signals using coherent processing of their secondary code (SC) component. Two problems are discussed: the acquisition of primary codes with a longer period using FFT blocks of limited length, and the use of PCS in the presence of SC bit transitions. The PCM and SC bit transitions form parasitic fragments in the Cross-Ambiguity Function (CAF) that degrade signal detection performance. A novel analysis of this mechanism and its impact is presented. A novel mSBZP-PCM-PCS algorithm is proposed, which does not degrade the CAF. The algorithm is then combined with an SC bit transition removal scheme and sequential search to construct an estimator for weak tiered signal acquisition. The performance of the method is demonstrated by analysis and computer simulation using Galileo E1C and GPS L1C-P signals.

Design of a High-Throughput Match Search Unit for Lossless Compression Algorithms

Authors

Bartík, M.; Beneš, T.; Kubalík, P.

Year

2019

Published in

The 9th IEEE Annual Computing and Communication Workshop and Conference (CCWC). Piscataway: IEEE, 2019. p. 732-738. ISBN 9781728105543.

Type

Proceedings paper

DOI

10.1109/CCWC.2019.8666521

Department

Department of Digital Design

Abstract

This paper presents an attempt to combine recent research on hardware- and software-based high-throughput universal lossless compression algorithms and their implementations, resulting in a case study focusing on one of the most critical parts of compression algorithms: the Match Search Unit (MSU) and its parallelization. The presented FPGA design combines ideas of the LZ4 algorithm (which is derived from the widely used LZ77) with state-of-the-art hardware architectures for lossless compression that are also based on LZ77. This approach might lead to a smaller, better organized, or more efficient "building block" for modern implementations of hardware-driven lossless compression algorithms. The presented design focuses on optimizing the main problem of the LZ77 family, namely constructing and searching a compression dictionary. In particular, we combine a Live Value Table (LVT) with multi-ported memory to improve the bandwidth of the dictionary, and use the Fibonacci hashing principle originating from the LZ4 algorithm to decrease the latency of the MSU and to achieve a higher overall throughput. An FPGA of the Xilinx Virtex-7 family was used for design synthesis.
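The Fibonacci hashing principle mentioned in the abstract can be sketched in software as follows. This is a minimal single-port model of a dictionary lookup, not the paper's multi-ported LVT design; the table width `HASH_BITS`, the 4-byte window, and the function names are assumptions for illustration. The multiplier 2654435761 is the constant used by the LZ4 reference implementation.

```python
# Illustrative sketch of Fibonacci (multiplicative) hashing for match search.
# HASH_BITS and the helper names are assumptions, not the paper's parameters.
HASH_BITS = 12            # assumed dictionary size: 2**12 entries
FIB_CONST = 2654435761    # roughly 2**32 divided by the golden ratio


def fib_hash(word32, bits=HASH_BITS):
    """Map a 32-bit word to a `bits`-wide dictionary index."""
    product = (word32 * FIB_CONST) & 0xFFFFFFFF   # keep the low 32 bits
    return product >> (32 - bits)                 # use the top bits as index


def find_matches(data, bits=HASH_BITS):
    """Return (position, candidate) pairs where a 4-byte match is confirmed."""
    table = {}     # hash index -> last position seen (models the dictionary)
    matches = []
    for pos in range(len(data) - 3):
        word = int.from_bytes(data[pos:pos + 4], "little")
        idx = fib_hash(word, bits)
        cand = table.get(idx)
        # A hash hit is only a candidate; the bytes must be compared as well.
        if cand is not None and data[cand:cand + 4] == data[pos:pos + 4]:
            matches.append((pos, cand))
        table[idx] = pos
    return matches
```

One multiplication and one shift produce the index, which is why this hash maps well onto FPGA DSP blocks and keeps lookup latency low.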

High Throughput and Low Latency LZ4 Compressor on FPGA

Authors

Beneš, T.; Bartík, M.; Kubalík, P.

Year

2019

Published in

2019 International Conference on ReConFigurable Computing and FPGAs. Piscataway, NJ: IEEE, 2019. ISSN 2640-0472. ISBN 978-1-7281-1957-1.

Type

Proceedings paper

DOI

10.1109/ReConFig48160.2019.8994794

Department

Department of Digital Design

Abstract

This paper presents an FPGA design implementing a single LZ4 lossless compression IP block, providing a throughput of 6 Gbps combined with extremely low latency, while still retaining full binary compatibility with the original LZ4 format. The best-known competitor is capable of processing up to 2 Gbps per block/engine with unknown latency. The presented design uses two key features: a low-latency 8-way match search unit and consequently a match buffer which allows encoding LZ4 sequences independently to reduce stalls in the data processing pipeline. The design was evaluated on several compression corpora with an average compression ratio of 1.7.

Ultra High Resolution Jitter Measurement Method for Ethernet Based Networks

Authors

Hynek, K.; Beneš, T.; Bartík, M.; Kubalík, P.

Year

2019

Published in

The 9th IEEE Annual Computing and Communication Workshop and Conference (CCWC). Piscataway: IEEE, 2019. p. 847-851. ISBN 9781728105543.

Type

Proceedings paper

DOI

10.1109/CCWC.2019.8666446

Department

Department of Digital Design

Abstract

This paper presents a new approach to network jitter measurement and analysis in asynchronous data networks such as Ethernet. The developed monitoring device is capable of analyzing an incoming stream at 1 Gb/s with a resolution of up to 8 ns. The system architecture supports network speeds of up to 100 Gb/s. The presented architecture provides several statistical functions, such as measuring network jitter by the Interarrival Histograms method, which yields both the mean value and the peak-to-peak value. The architecture was implemented and tested on a Xilinx Kintex UltraScale FPGA using the Avnet AES-KU040-DB-G development board.
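The interarrival-histogram idea can be illustrated with a short software sketch. It only models the statistics named in the abstract (histogram, mean, peak-to-peak); the function name and the way timestamps are quantized to the 8 ns resolution are assumptions, not a description of the hardware.

```python
# Illustrative sketch: jitter statistics from packet arrival timestamps.
# The quantization step models an assumed 8 ns timestamp counter.
from collections import Counter

RESOLUTION_NS = 8   # timestamp resolution quoted in the abstract


def interarrival_stats(timestamps_ns):
    """Return (histogram, mean, peak_to_peak) of interarrival times in ns."""
    # Quantize arrivals to counter ticks, as a hardware timestamp unit would.
    ticks = [t // RESOLUTION_NS for t in timestamps_ns]
    gaps = [(b - a) * RESOLUTION_NS for a, b in zip(ticks, ticks[1:])]
    if not gaps:
        raise ValueError("need at least two timestamps")
    histogram = Counter(gaps)              # gap length -> number of packets
    mean = sum(gaps) / len(gaps)
    peak_to_peak = max(gaps) - min(gaps)
    return histogram, mean, peak_to_peak
```

For arrivals at 0, 800, 1608, and 2400 ns, the gaps are 800, 808, and 792 ns, so the mean is 800 ns and the peak-to-peak jitter is 16 ns.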

Performance Comparison of Multiple Approaches of Status Register for Medium Density Memory Suitable for Implementation of a Lossless Compression Dictionary

Authors

Bartík, M.; Ubik, S.; Kubalík, P.; Beneš, T.

Year

2018

Published in

Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. New York: ACM, 2018. p. 290. ISBN 978-1-4503-5614-5.

Type

Proceedings paper

DOI

10.1145/3174243.3174976

Department

Department of Digital Design

Abstract

This paper presents a performance comparison of various approaches to implementing a status register suitable for maintaining (in)valid bits in mid-density memory structures implemented in Xilinx FPGAs. An example of such a structure with a status register is a dictionary for Lempel-Ziv based lossless compression algorithms, where the dictionary has to be initialized before each run of the algorithm with minimum time and logic resource consumption. The performance of the designs was evaluated in the Xilinx ISE and Vivado toolkits for the Virtex-7 FPGA. This research has been partially supported by the CTU project SGS17/017/OHK3/1T/18 "Dependable and attack-resistant architectures for programmable devices" and by the project "E-infrastructure CESNET modernization", no. CZ.02.1.01/0.0/0.0/16 013/0001797.

Proposal of a Memory Architecture for Pre and Post-Correlation coherent Processing of GNSS Signal with SoC based Acquisition Unit

Authors

Svatoň, J.; Vejražka, F.; Kubalík, P.; Schmidt, J.

Year

2018

Published in

Proceedings of the 6th Prague Embedded Systems Workshop. ČVUT v Praze, Fakulta informačních technologií, 2018. p. 21-25. ISBN 978-80-01-06456-6.

Type

Proceedings paper

Department

Department of Digital Design

Abstract

This contribution describes the architecture of an additional memory system for an existing frequency-domain GNSS (Global Navigation Satellite Systems) signal acquisition unit. The unit is designed for an FPGA-based HW receiver and has three 4K FFT blocks. The receiver is based on the Xilinx ZYNQ System on Chip (SoC) platform. The proposed additional memories are used as accumulators of complex signal samples and are placed before or after the acquisition unit. They make it possible to process GNSS signals of different navigation systems more effectively with limited resources.

Acquisition of Modern GNSS Signals in SoC ZYNQ with its Limited Computational Resources in Frequency Domain

Authors

Svatoň, J.; Vejražka, F.; Kubalík, P.; Schmidt, J.

Year

2017

Published in

Proceedings of the 5th Prague Embedded Systems Workshop. Praha: katedra číslicového návrhu, 2017. pp. 64-66. ISBN 978-80-01-06178-7.

Type

Proceedings paper

Department

Department of Digital Design

Abstract

The objective of this contribution is the design of optimal algorithms for a universal GNSS acquisition unit. The unit is designed for an FPGA-based HW receiver and is implemented in the frequency domain with three 4K FFT blocks. The unit is able to acquire the usual civil signals (GPS C/A, BeiDou B1, IRNSS L5/S-band, and GLONASS L1OF) directly and to acquire the longer-code Galileo E1 signal with a proposed improved partial correlation algorithm. Pre-correlation and, mainly, post-correlation methods are analyzed and selected with respect to implementation on the target Xilinx ZYNQ System on Chip (SoC) platform with limited computing resources.

Design of a Residue Number System Based Linear System Solver in Hardware

Authors

Buček, J.; Kubalík, P.; Lórencz, R.; Zahradnický, T.

Year

2017

Published in

Journal of Signal Processing Systems. 2017, 87(3), 343-356. ISSN 1939-8018.

Type

Journal article

DOI

10.1007/s11265-016-1146-1

Department

Department of Digital Design
Department of Computer Systems

Abstract

This paper focuses on the error-free solution of dense linear systems using residual arithmetic in hardware. The designed Modular System uses hardware-identical Residual Processors (RPs) to solve independent systems of linear congruences and combines their solutions into the solution of the given linear system. This approach uses the residue number system, which is based on the Chinese remainder theorem. To exploit parallel processing and the cooperation of the individual components efficiently, a hardware architecture of the Modular System with several RPs is designed. A Xilinx FPGA with a MicroBlaze processor was used to verify the proposed architecture. Experimental results are obtained for an evaluation FPGA board with a Virtex 6. The implementation results serve for subsequent theoretical analysis of the system performance for various linear system sizes and for further improvement of the system. The proposed system can be useful as a special hardware peripheral or as part of an embedded system for solving large nonsingular systems of linear equations with integer, rational, or floating-point coefficients with arbitrary precision.
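The residue number system idea behind the Modular System can be illustrated with a small sketch: a computation is carried out independently modulo several pairwise coprime moduli, and the exact result is recovered with the Chinese remainder theorem. The moduli and function names below are assumptions for illustration; the paper's RPs solve whole systems of congruences, which this sketch does not attempt.

```python
# Illustrative sketch of residue arithmetic with CRT reconstruction.
# The moduli are arbitrary small primes chosen for the example.
from math import prod

MODULI = [251, 241, 239]   # pairwise coprime; product is 14457349


def to_residues(x, moduli=MODULI):
    """Represent an integer by its remainders, one per modulus."""
    return [x % m for m in moduli]


def crt_reconstruct(residues, moduli=MODULI):
    """Recover the unique integer in [0, prod(moduli)) with these residues."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        partial = big_m // m
        # pow(partial, -1, m) is the modular inverse (Python 3.8 or newer).
        total += r * partial * pow(partial, -1, m)
    return total % big_m


def multiply_in_rns(a, b, moduli=MODULI):
    """Multiply channel by channel; each channel could run in parallel."""
    ra, rb = to_residues(a, moduli), to_residues(b, moduli)
    return [(x * y) % m for x, y, m in zip(ra, rb, moduli)]
```

Because each residue channel works with small numbers and never exchanges carries with the others, the channels map naturally onto identical parallel processors; the result is exact as long as it stays below the product of the moduli.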

Methods and Hardware architecture for Multi-constellation GNSS signal acquisition unit in frequency domain

Authors

Svatoň, J.; Vejražka, F.; Kubalík, P.; Schmidt, J.

Year

2017

Published in

ENC2017_Programme_NonCopyright. Lausanne: The Swiss Institute of Navigation, 2017. pp. 252-261.

Type

Proceedings paper

Department

Department of Digital Design

Abstract

The objective of this contribution is the design of a universal GNSS acquisition unit for an FPGA-based HW receiver that is capable of direct acquisition of the usual civil signals (GPS C/A, BeiDou B1, IRNSS L5/S-band, and GLONASS L1OF). Due to the high computational complexity and the latency requirements, frequency-domain processing with parallel code search is adopted. Optimal processing methods are analyzed even for the long codes of Galileo E1 or future GPS L1C signals. For each block of the acquisition unit, a method is selected with respect to implementation on the target Xilinx ZYNQ System on Chip (SoC) platform. The unit is intended as an HW acquisition accelerator with minimal SW handling requirements for the receiver under development.

A Novel and Efficient Method to Initialize FPGA Embedded Memory Content in Asymptotically Constant Time

Authors

Bartík, M.; Ubik, S.; Kubalík, P.

Year

2016

Published in

ReConFig’16. Piscataway: IEEE, 2016. ISBN 978-1-5090-3707-0.

Type

Proceedings paper

DOI

10.1109/ReConFig.2016.7857146

Department

Department of Digital Design

Abstract

This paper describes the analysis and implementation of a new method for maintaining valid content of FPGA memory blocks with an asymptotically constant-time synchronous clear capability, which can be useful for (re)initialization to a single default value. One particular application is high-speed real-time LZ77 lossless compression, where a dictionary has to be (re)initialized before each run of the implemented compression algorithm. The method is based on the two most widely used techniques for clearing memory content: a linear pass through the memory that clears each cell by writing a default value, and a register field that provides an (in)valid bit for each memory cell. Our solution combines these two techniques with the use of FPGA distributed memory blocks implemented in LUTs (Look-Up Tables) to overcome the negative features of each previous method without losing most of their positive features. Our solution provides a balance between the two previous techniques and exceeds them in speed, resource utilization, and latency of (re)initialization.
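The second baseline technique named in the abstract, one (in)valid bit per memory cell, can be modeled in a few lines. This sketch shows only that baseline, not the paper's combined distributed-memory method; the class and method names are assumptions for illustration.

```python
# Illustrative software model of a memory guarded by per-cell valid bits.
# It models the baseline technique only, not the paper's proposed method.

class ValidBitMemory:
    def __init__(self, depth, default=0):
        self.default = default
        self.data = [default] * depth    # data array; may hold stale values
        self.valid = [False] * depth     # models the register field

    def clear(self):
        # In hardware, a register field can be reset in one clock cycle.
        # The data array itself is deliberately left untouched.
        self.valid = [False] * len(self.data)

    def write(self, addr, value):
        self.data[addr] = value
        self.valid[addr] = True

    def read(self, addr):
        # An invalid cell reads as the default, whatever the array holds.
        return self.data[addr] if self.valid[addr] else self.default
```

The fast clear comes at the price of one flip-flop per cell, while the linear pass needs no extra storage but takes one cycle per cell; this trade-off is what a combined method has to balance.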

Nová a efektivní metoda pro zajištení platnosti dat ve vestavných pamětech FPGA se zaměřením na kompresi IP packetů v reálném čase [A Novel and Efficient Method for Ensuring Data Validity in FPGA Embedded Memories, Focused on Real-Time IP Packet Compression]

Authors

Bartík, M.; Ubik, S.; Kubalík, P.

Year

2016

Published in

Počítačové Architektury & Diagnostika PAD 2016 - Sborník příspěvků. Brno: Vysoké učení technické v Brně, 2016. p. 89-92. ISBN 978-80-214-5376-0.

Type

Proceedings paper

Department

Department of Digital Design
Department of Computer Systems

Abstract

This paper deals with a new and efficient way of ensuring data validity in embedded memories inside an FPGA, suitable for implementing dictionaries in lossless compression algorithms realized in hardware (FPGA). The key to this innovation is a clever use of the properties of LUTs (Look-Up Tables), which makes it possible to use fewer resources and reach a higher frequency of the whole system than commonly used implementations. The method is designed for high throughput and low latency, which makes it suitable for real-time compression of jumbo IP packets containing multimedia data. The method can also be applied to other data structures that are mapped into embedded block RAMs in an FPGA.

LZ4 Compression Algorithm on FPGA

Authors

Bartík, M.; Ubik, S.; Kubalík, P.

Year

2015

Published in

21st IEEE International Conference on Electronics, Circuits, and Systems. New York: Institute of Electrical and Electronics Engineers, 2015. p. 179-182. ISBN 978-1-4799-2451-6.

Type

Proceedings paper

DOI

10.1109/ICECS.2015.7440278

Department

Department of Digital Design
Department of Computer Systems

Abstract

This paper describes the analysis and implementation of the LZ4 compression algorithm. LZ4 is derived from the standard LZ77 compression algorithm and focuses on compression and decompression speed. The LZ4 lossless compression algorithm was analyzed regarding its suitability for hardware implementation. The first step of this research is a software implementation of LZ4 written with a future hardware implementation in mind. As a second step, a simple hardware implementation of LZ4 is evaluated to find bottlenecks in the original LZ4 code. Xilinx Virtex-6 and 7-Series FPGAs are used to obtain experimental results. These results are compared to an industry competitor.

Rychlé bezztrátové kompresní algoritmy [Fast Lossless Compression Algorithms]

Authors

Bartík, M.; Ubik, S.; Kubalík, P.

Year

2015

Published in

Sborník příspěvků PAD 2015. Zlín: Universita Tomáše Bati ve Zlíně, 2015, pp. 31-36. ISBN 978-80-7454-522-1.

Type

Proceedings paper

Department

Department of Digital Design
Department of Computer Systems

Abstract

The research deals with the LZ4 lossless compression algorithm (based on LZ77) and its suitability for compressing multimedia data and for universal packet compression in real-time network collaboration in delay-sensitive areas.

An ASIC Linear Congruence Solver Synthesized with Three Cell Libraries

Authors

Buček, J.; Kubalík, P.; Lórencz, R.; Zahradnický, T.

Year

2014

Published in

Proceedings of the 21st IEEE International Conference on Electronics Circuits and Systems. Monterey: IEEE Circuits and Systems Society, 2014. pp. 706-709. ISBN 978-1-4799-4242-8.

Type

Proceedings paper

DOI

10.1109/ICECS.2014.7050083

Department

Department of Digital Design
Department of Computer Systems

Abstract

The paper describes an ASIC implementation of a linear congruence solver, part of a parallel system for solution of linear equations, and presents synthesis results for three different standard cell libraries. The previous VHDL design was adapted to three ASIC technologies (130 nm, 110 nm, and 55 nm) from two different vendors and the synthesized results were mutually compared. The comparison results were further used to obtain a view of design properties in higher density technologies.

System Design of an FPGA Linear Solver

Authors

Buček, J.; Kubalík, P.; Lórencz, R.; Zahradnický, T.

Year

2014

Published in

Proceedings of the Work in Progress Session held in connection with the 40th EUROMICRO Conference on Software Engineering and Advanced Applications and the 17th EUROMICRO Conference on Digital System Design. Linz: Johannes Kepler University, 2014, ISBN 978-3-902457-40-0.

Type

Proceedings paper

Department

Department of Digital Design
Department of Computer Systems

Abstract

This work focuses on the design of a Modular System that performs error-free solution of dense linear systems using residue arithmetic in a Xilinx FPGA. The designed system will use a set of Residual Processors (RPs) to solve the linear system in the Residue Number System and will reconstruct the solution from the set afterwards. The currently proposed system architecture has a single RP, a large DDR memory used for data transfer between a PC and the system, and a built-in MicroBlaze processor. Future work will focus on extending the architecture to implement the entire Modular System, consisting of multiple RPs and performing the backward transformation from the residue representation into the set of rational numbers.

System on Chip Design of a Linear System Solver

Authors

Buček, J.; Kubalík, P.; Lórencz, R.; Zahradnický, T.

Year

2014

Published in

2014 International Symposium on System-on-Chip Proceedings. Piscataway: IEEE, 2014. ISBN 9781479968909.

Type

Proceedings paper

DOI

10.1109/ISSOC.2014.6972445

Department

Department of Digital Design
Department of Computer Systems

Abstract

This paper focuses on the error-free hardware solution of dense linear systems using residual arithmetic on a System on Chip Modular System. The designed Modular System uses Residual Processors (RPs) to solve independent linear systems in residue arithmetic and combines the RP solutions into the solution of the linear system. A System on Chip architecture of the Modular System with several RPs is designed, each with a large memory unit used for data transfer and storage. A Xilinx FPGA architecture with a MicroBlaze processor is used to verify the proposed architecture. The experimental results are obtained for an evaluation FPGA board with a Virtex 6 and 1 GiB of DDR memory and serve for further theoretical analysis of the system performance for various linear system sizes and of the system architecture.

Comparison of FPGA and ASIC Implementation of a Linear Congruence Solver

Authors

Buček, J.; Kubalík, P.; Lórencz, R.; Zahradnický, T.

Year

2013

Published in

Proceedings of 16th Euromicro Conference on Digital System Design. Piscataway: IEEE Service Center, 2013. p. 284-287. ISBN 978-0-7695-5074-9.

Type

Proceedings paper

DOI

10.1109/DSD.2013.125

Department

Department of Computer Systems

Abstract

A residual processor (RP) is dedicated hardware for solving sets of linear congruences. RPs are parts of a larger modular system for the error-free solution of linear equations in residue arithmetic. We present new FPGA and ASIC RP implementations, focusing mainly on their memory units, which are the bottleneck of the calculation and therefore determine the efficiency of the system. First, we choose an FPGA to easily test the functionality of our implementation, then we do the same in ASIC, and finally we compare the two implementations. The experimental FPGA results are obtained for the Xilinx Virtex 6, while the ASIC results are obtained from Synopsys tools with a 130 nm standard cell library. The results also present the maximum matrix dimension that fits directly into the FPGA and the achieved speed as a function of the dimension.

Dedicated Hardware Implementation of a Linear Congruence Solver in FPGA

Authors

Buček, J.; Kubalík, P.; Lórencz, R.; Zahradnický, T.

Year

2012

Published in

The 19th IEEE International Conference on Electronics, Circuits, and Systems, ICECS 2012. Monterey: IEEE Circuits and Systems Society, 2012. p. 689-692. ISBN 978-1-4673-1261-5.

Type

Proceedings paper

DOI

10.1109/ICECS.2012.6463632

Department

Department of Computer Systems

Abstract

The residual processor is dedicated hardware for solving sets of linear congruences. It is part of the modular system for solving sets of linear equations without rounding errors using the Residue Number System. We present a new FPGA implementation of the residual processor, focusing mainly on the memory unit, which forms the bottleneck of the calculation and therefore determines the efficiency of the system. An FPGA was chosen because it allows us to implement the designed architecture optimally depending on the size of the problem. The proposed memory architecture of the modular system is implemented using the internal FPGA block RAM. Experimental results are obtained for the Xilinx Virtex 6 family. The results present the maximum matrix dimension that fits directly into the FPGA and the achieved speed as a function of the dimension.

Fault Models Usability Study for On-line Tested FPGA

Authors

Borecký, J.; Kohlík, M.; Kubalík, P.; Kubátová, H.

Year

2011

Published in

Proceedings of the 14th Euromicro Conference on Digital System Design. Los Alamitos: IEEE Computer Society Press, 2011, pp. 287-290. ISBN 978-0-7695-4494-6.

Type

Proceedings paper

DOI

10.1109/DSD.2011.42

Department

Department of Digital Design

Abstract

FPGAs are susceptible to many environmental effects that can cause soft errors (errors that can be corrected by the reconfiguration ability of the FPGA). Two different fault models are discussed and compared in this paper. The first one, the Stuck-at model, is widely used in many applications and is not limited to FPGAs. The second one, the Bit-flip model, can affect the SRAM cells that are used to configure the internal routing of the FPGA and to set up the behavior of the Look-Up Tables (LUTs). The change of LUT behavior is the only Bit-flip effect considered in this paper. A fault model analysis has been performed on small example designs in order to find the differences between the fault models. This paper discusses the relevance of using the two types of models, Stuck-at and Bit-flip, with respect to the dependability characteristics Fault Security (FS) and Self-Testing (ST). Fault simulation using both fault models has been performed to verify the analysis.
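The difference between the two fault models can be made concrete with a small LUT simulation. This is a sketch under simplifying assumptions (a single k-input LUT, a stuck-at fault on one input pin, a bit-flip in one configuration cell); the function names are hypothetical and the paper's fault simulator is not reproduced here.

```python
# Illustrative sketch: stuck-at versus bit-flip faults on a single LUT.
# A LUT is modeled as a list of 2**k configuration bits indexed by its inputs.

def lut_eval(config, inputs):
    """Evaluate a LUT; inputs[0] is the least significant address bit."""
    index = sum(bit << i for i, bit in enumerate(inputs))
    return config[index]


def stuck_at_input(config, k, pin, value):
    """Truth table observed when input `pin` is stuck at `value`."""
    table = []
    for idx in range(2 ** k):
        bits = [(idx >> i) & 1 for i in range(k)]
        bits[pin] = value                  # the faulty pin ignores its driver
        table.append(lut_eval(config, bits))
    return table


def bit_flip(config, cell):
    """Truth table after one SRAM configuration cell is inverted."""
    faulty = list(config)
    faulty[cell] ^= 1
    return faulty


def differing_inputs(good, faulty):
    """Input combinations for which the faulty LUT output is wrong."""
    return [i for i, (g, f) in enumerate(zip(good, faulty)) if g != f]
```

A bit-flip always corrupts exactly one row of the truth table, whereas a stuck-at fault on an input may corrupt several rows or none, so the two models can lead to different test and detection requirements.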

Fault-tolerant and fail-safe design based on reconfiguration

Authors

Kubátová, H.; Kubalík, P.

Year

2011

Published in

Design and Test Technology for Dependable Systems-on-Chip. Hershey, Pennsylvania: IGI Global, 2011. p. 175-194. ISBN 978-1-60960-212-3.

Type

Book chapter

DOI

10.4018/978-1-60960-212-3

Department

Department of Digital Design

Abstract

The main aim of this chapter is to show how to design fault-tolerant or fail-safe systems in programmable hardware (FPGAs), so that FPGAs can also be used in mission-critical applications. RAM-based FPGAs are usually considered unreliable due to the high probability of transient faults (SEUs) and therefore inapplicable in this area. However, FPGAs can be easily reconfigured. Our aim is to use an appropriate type of FPGA reconfiguration and to combine it with well-known methods for fail-safe and fault-tolerant design (duplex, TMR), including on-line testing methods for fault detection that then start the reconfiguration process. The calculation of dependability parameters based on reliability models is an integral part of the proposed methodology. The trade-off between the requested level of dependability of the designed system and the area overhead with respect to possible FPGA faults is the main property and advantage of the proposed methodology.
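Triple modular redundancy (TMR), one of the well-known methods named in the abstract, can be sketched as a bitwise majority voter over three copies of a module. The fault-injection parameters and the mismatch flag are assumptions for illustration; in a scheme like the one described, such a flag could serve as the trigger for reconfiguration, but that link is not modeled here.

```python
# Illustrative sketch of TMR with a bitwise majority voter.
# `faulty_copy` and `fault_mask` are hypothetical fault-injection parameters.

def majority(a, b, c):
    """Bitwise 2-out-of-3 majority of three integer words."""
    return (a & b) | (a & c) | (b & c)


def tmr(module, inputs, faulty_copy=None, fault_mask=0):
    """Run three copies of `module`, vote, and report any disagreement."""
    outputs = [module(inputs) for _ in range(3)]
    if faulty_copy is not None:
        outputs[faulty_copy] ^= fault_mask     # inject a transient fault
    voted = majority(*outputs)
    mismatch = any(out != voted for out in outputs)
    return voted, mismatch
```

A single faulty copy is outvoted, so the output stays correct while the mismatch flag reports that one copy needs repair; the cost is roughly three times the area plus the voter, which is the trade-off the abstract refers to.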

Faults Coverage Improvement based on Fault Simulation and Partial Duplication

Authors

Borecký, J.; Kohlík, M.; Kubátová, H.; Kubalík, P.

Year

2010

Published in

Proceedings of the 13th Euromicro Conference on Digital System Design. Los Alamitos: IEEE Computer Society Press, 2010. pp. 380-386. ISBN 978-0-7695-4171-6.

Type

Proceedings paper

DOI

10.1109/DSD.2010.112

Department

Department of Digital Design

Abstract

A method for improving the coverage of single faults in combinational circuits is proposed. The method is based on Concurrent Error Detection, but uses fault simulation to find critical points, the places where faults are difficult to detect. Partial duplication of the design with regard to these critical points increases the fault coverage at a low area overhead. The higher fault coverage in turn improves the dependability parameters. The proposed modification is tested on railway station safety device designs implemented in an FPGA.

Reliable Railway Station System based on Regular Structure implemented in FPGA

Authors

Borecký, J.; Kubalík, P.; Kubátová, H.

Year

2009

Published in

Proc. of 12th EUROMICRO Conference on Digital System Design. Los Alamitos: IEEE Computer Society, 2009. pp. 348-354. ISBN 978-0-7695-3782-5.

Type

Proceedings paper

DOI

10.1109/DSD.2009.210

Department

Department of Digital Design

Abstract

A method for designing a railway station safety device in an efficient and scalable way is proposed. The safety device for any railway station configuration can be built from five basic blocks. These basic blocks are connected through a universal interface. Each block is based on a finite state machine of the Moore type. Each state machine is divided into three basic parts, and each part is designed as a self-checking circuit that ensures fault detection. Our methodology is intended for final implementation in an FPGA, and hence SEU faults occurring in the system are assumed.
