prof. Dr. Ing. Petr Kroha, CSc.

Automated Semantic Annotation of Data Management Plans: A Systematic Review

Authors

Martínková, J.; Suchánek, M.; Kroha, P.

Year

2025

Published

Data Science Journal. 2025, 24 ISSN 1683-1470.

Type

Article

DOI

10.5334/dsj-2025-016

Departments

Department of Software Engineering

Annotation

Semantic annotation has emerged as a key technique for transforming human-readable data into machine-actionable formats. It corresponds with the growing emphasis on data reusability and research reproducibility. This paper examines tools for semantic annotation using ontologies and controlled vocabularies, with a focus on their application in data management planning. A systematic review identified 34 relevant tools, which show potential for adaptation to the data management plan (DMP) domain. While these tools meet many requirements, they do not fully address all DMP-specific needs. The paper provides an overview of current tools and suggests directions for future research to adapt them for DMP use.

Quality Measurement of Functional Requirements

Authors

Šenkýř, D.; Kroha, P.

Year

2023

Published

Proceedings of the 18th International Conference on Software Technologies. Porto: SciTePress - Science and Technology Publications, 2023. p. 736-743. vol. 1. ISSN 2184-2833. ISBN 978-989-758-665-1.

Type

Proceedings paper

DOI

10.5220/0012148700003538

Departments

Department of Software Engineering

Annotation

In this contribution, we propose a metric to measure the quality of textual functional requirements specifications. Since the main problem of such requirements specifications is their ambiguity, incompleteness, and inconsistency, we developed textual patterns to reveal shortcomings in these properties. As a component of our analysis, we use not only the text of the requirements but also the UML model that we construct during the text analysis. Combining the results of part-of-speech tagging of the text and the modeled properties, we are able to identify a number of irregularities concerning the properties named above. Then, the text needs human intervention to correct or remove the suspicious formulations. As a measure of the requirements specification quality, we denote the number of necessary human interventions. We implemented a tool called TEMOS that can test ambiguity, incompleteness, and inconsistency, and we use its results to evaluate the quality of textual requirements. In this paper, we summarize our project results.

Expanding Normalized Systems from textual domain descriptions using TEMOS

Authors

Šenkýř, D.; Suchánek, M.; Kroha, P.; Mannaert, H.; Pergl, R.

Year

2022

Published

Journal of Intelligent Information Systems. 2022, 59(2), 391-414. ISSN 0925-9902.

Type

Article

DOI

10.1007/s10844-022-00706-8

Departments

Department of Software Engineering

Annotation

Functional requirements on a software system are traditionally captured as text that describes the expected functionality in the domain of a real-world system. Natural language processing methods allow us to extract the knowledge from such requirements and transform it, e.g., into a model. Moreover, these methods can improve the quality of the requirements, which usually suffer from ambiguity, incompleteness, and inconsistency. This paper presents a novel approach to using natural language processing. We use the method of grammatical inspection to find specific patterns in the description of functional requirement specifications (written in English). Then, we transform the requirements into a model of Normalized Systems elements. This may realize a possible component of the eagerly awaited text-to-software pipeline. The input of this method is represented by textual requirements. Its output is a running prototype of an information system created using Normalized Systems (NS) techniques. Therefore, the system is ready to accept further enhancements, e.g., custom code fragments, in an evolvable manner ensured by compliance with the NS principles. A demonstration of pipeline implementation is also included in this paper. The text processing part of our pipeline extends the existing pipeline implemented in our system TEMOS, where we propose and implement methods of checking the quality of textual requirements concerning ambiguity, incompleteness, and inconsistency.

Problem of Inconsistency and Default Consistency Rules

Authors

Šenkýř, D.; Kroha, P.

Year

2021

Published

New Trends in Intelligent Software Methodologies, Tools and Techniques. Amsterdam: IOS Press, 2021. p. 674-687. Frontiers in Artificial Intelligence and Applications. vol. 337. ISSN 0922-6389. ISBN 978-1-64368-194-8.

Type

Proceedings paper

DOI

10.3233/FAIA210063

Departments

Department of Software Engineering

Annotation

We investigate the inconsistency problem in textual functional requirements specifications. We argue that some sources of inconsistency can be revealed during the very first steps of textual requirements analysis. In this paper, we focus on those facts and rules that domain experts find so obvious that they do not even mention them to the analysts during the discussions about the product to be constructed. However, what is very obvious for stakeholders may not be obvious for analysts. We call such rules default consistency rules. We argue that the lack of the default consistency rules leads to incompleteness in the requirements, and it causes inconsistency with all its unpleasant consequences. In this contribution, we describe our approach to the problem of how the missing information can be both identified in the original requirements and found in external sources. We show a motivational example and explain our method.

Problem of Inconsistency in Textual Requirements Specification

Authors

Šenkýř, D.; Kroha, P.

Year

2021

Published

Proceedings of the 16th International Conference on Evaluation of Novel Approaches to Software Engineering. Porto: SciTePress - Science and Technology Publications, 2021. p. 213-220. vol. 1. ISSN 2184-4895. ISBN 978-989-758-508-1.

Type

Proceedings paper

DOI

10.5220/0010421602130220

Departments

Department of Software Engineering

Annotation

In this contribution, we investigate the inconsistency problem in the textual description of functional requirements specifications. In the past, the inconsistency problem was investigated using analysis of the UML models. We argue that some sources of inconsistency can be revealed in the very first steps of textual requirements analysis using linguistic patterns that we developed. We cluster the sentences according to their semantic similarity given by their lexical content and syntactic structure. Our contribution focuses on revealing linguistic contradictions (e.g., a combination of passive voice, antonyms, negated synonyms, etc.) of facts and rules described in different parts of requirements together with contradictions of the internally generated model.

Problem of Semantic Enrichment of Sentences Used in Textual Requirements Specification

Authors

Šenkýř, D.; Kroha, P.

Year

2021

Published

Advanced Information Systems Engineering Workshops. Springer, Cham, 2021. p. 69-80. Lecture Notes in Business Information Processing. vol. 423. ISSN 1865-1348. ISBN 978-3-030-79021-9.

Type

Proceedings paper

DOI

10.1007/978-3-030-79022-6_7

Departments

Department of Software Engineering

Annotation

In this paper, we describe our graph-oriented method used to find semantically similar sentences in external information sources that have a semantic enrichment potential in relation to sentences of textual functional requirements specification. Our motivation is to reduce the incompleteness of requirements that may be a source of inconsistency. We found that some facts and rules are so obvious to domain experts that they do not even mention them in requirements. We call such rules default consistency rules. These rules are often not implemented and cannot be revealed from the requirements because they are not mentioned there.

Patterns for Checking Incompleteness of Scenarios in Textual Requirements Specification

Authors

Šenkýř, D.; Kroha, P.

Year

2020

Published

Proceedings of the 15th International Conference on Evaluation of Novel Approaches to Software Engineering. Porto: SciTePress - Science and Technology Publications, 2020. p. 289-296. ISSN 2184-4895. ISBN 978-989-758-421-3.

Type

Proceedings paper

DOI

10.5220/0009344202890296

Departments

Department of Software Engineering

Annotation

In this contribution, we investigate the incompleteness problem in textual requirements specifications. Missing alternative scenarios are one of the sources of incompleteness, i.e., missing descriptions of processing in the cases when something runs differently than expected. We check the text of the requirements specification using linguistic patterns, and we try to reveal scenarios and alternative scenarios. After that process is finished, we decide whether the set of alternative scenarios is complete. As a result, we generate warning messages. We illustrate our approach with examples.

ADA: Embracing technology change acceleration

Authors

Dvořák, O.; Pergl, R.; Kroha, P.

Year

2019

Published

CIAO! Doctoral Consortium and EEWC Forum and EEWC Posters 2019. Aachen: CEUR Workshop Proceedings, 2019.

Type

Proceedings paper

Departments

Department of Software Engineering

Annotation

The pace of technology change has accelerated in the past decade. Conceptually similar technologies are introduced on nearly a daily basis. On the one hand, IT experts call for applying the most modern approaches and technologies to software projects; on the other, companies suffer from liabilities to the technology used in their legacy solutions. This seems to result in a disturbing situation in which a specific technology of an ongoing software project becomes legacy almost before the project successfully reaches production. This poses a continual challenge for software development, and effective ways of technology transition are sought. Affordance-Driven Assembling (ADA) represents such an effort from the standpoint of enterprise engineering theories. In this paper, we formulate a high-level architecture of a software system based on ADA. We demonstrate the architecture on an example of an object-oriented system. We evaluate the qualities of such an architecture from the perspective of evolvability using Normalized Systems Theory, and we formulate conclusions on the potential of this approach.

Patterns of Ambiguity in Textual Requirements Specification

Authors

Šenkýř, D.; Kroha, P.

Year

2019

Published

New Knowledge in Information Systems and Technologies. Springer, Cham, 2019. p. 886-895. Advances in Intelligent Systems and Computing. vol. 1. ISSN 2194-5357. ISBN 978-3-030-16180-4.

Type

Proceedings paper

DOI

10.1007/978-3-030-16181-1_83

Departments

Department of Software Engineering

Annotation

In this contribution, we investigate the ambiguity problem in textual requirements specifications. We focused on the structural ambiguity and extracted some patterns to indicate this kind of ambiguity. We show that the standard methods of linguistics are not enough in some cases, and we describe a class of ambiguity caused by coreference that needs an underlying domain model or a knowledge base to be solved. Part of our implemented solution is a cooperation of our tool TEMOS with the Prolog inference machine working with facts and rules acquired from OCL conditions of the domain model.

Problem of Incompleteness in Textual Requirements Specification

Authors

Šenkýř, D.; Kroha, P.

Year

2019

Published

Proceedings of the 14th International Conference on Software Technologies. Porto: SciTePress - Science and Technology Publications, 2019. p. 323-330. ISBN 978-989-758-379-7.

Type

Proceedings paper

DOI

10.5220/0007978003230330

Departments

Department of Software Engineering

Annotation

In this contribution, we investigate the incompleteness problem in textual requirements specifications. Incompleteness is a typical problem that arises when stakeholders (e.g., domain experts) consider some information to be generally known and do not mention it to the analyst. A model based on the incomplete requirements suffers from missing objects, properties, or relationships, as we show in an illustrative example. Our presented methods are based on grammatical inspection, semantic networks (ConceptNet and BabelNet), and pre-configured data from on-line dictionaries. Additionally, we show how a domain model has to be used to reveal some missing parts of it. Our experiments have shown that the precision of our methods is about 60–82 %.

Affordance-Driven Software Assembling

Authors

Dvořák, O.; Pergl, R.; Kroha, P.

Year

2018

Published

Advances in Enterprise Engineering XII. Cham: Springer International Publishing AG, 2018. p. 39-54. Lecture Notes in Business Information Processing. vol. 334. ISSN 1865-1356. ISBN 978-3-030-06096-1.

Type

Proceedings paper

DOI

10.1007/978-3-030-06097-8_3

Departments

Department of Software Engineering

Annotation

Nowadays, the pace of technology innovation and disruption accelerates. This poses a challenge of transforming complex functionalities of enterprise systems to a new technological environment. In this paper, we explain how enterprise engineering tau-theory and beta-theory may help to manage the relationship between system function and its construction (F/C), thus making it possible to address the challenges of changing technology more rigorously and efficiently. We introduce the notion of Affordance-Driven Assembling (ADA) and its simplified version Objectified Affordance-Driven Assembling (O-ADA), which together with the so-called Semantic Descriptions represent a software-engineering approach enabling reasoning about users and their purposes versus components and their properties. Our experiments show that engineering methods based on these theories may increase reusability of code.

Hurst Exponent and Trading Signals Derived from Market Time Series

Authors

Kroha, P.; Škoula, M.

Year

2018

Published

ICEIS 2018 Proceedings. Madeira: SciTePress, 2018. p. 371-378. vol. 1. ISBN 978-989-758-298-1.

Type

Proceedings paper

Departments

Department of Software Engineering

Annotation

In this contribution, we investigate whether it is possible to use chaotic properties of time series in forecasting. Time series of market data have components of white noise without any trend, and they have components of brown noise containing trends. We constructed a new technical indicator MH (Moving Hurst) based on Hurst exponent that describes chaotic properties of time series. Further, we stated and proved a hypothesis that this indicator can bring more profit than the very well known indicator MACD (Moving Averages Convergence Divergence) that is based on moving averages of time series values. In our experiments, we tested and evaluated our proposal using hypothesis testing. We argue that Hurst exponent can be used as an indicator of technical analysis under considerations discussed in our paper.

Patterns in Textual Requirements Specification

Authors

Šenkýř, D.; Kroha, P.

Year

2018

Published

Proceedings of the 13th International Conference on Software Technologies. Madeira: SciTePress, 2018. p. 197-204. vol. 1. ISBN 978-989-758-320-9.

Type

Proceedings paper

DOI

10.5220/0006827302310238

Departments

Department of Software Engineering

Annotation

In this paper, we investigate methods of grammatical inspection to identify patterns in textual requirements specification. Unfortunately, a text in natural language also includes many inaccuracies caused by ambiguity, inconsistency, and incompleteness. Our contribution is that, using our patterns, we are able to extract from the text the information that is necessary to fix some of the problems mentioned above. We present our implemented tool TEMOS, which is able to detect some inaccuracies in a text and to generate fragments of the UML class model from textual requirements specification. We use external on-line resources to complete the textual information of requirements.

Evaluation of XPath Queries Over XML Documents Using SparkSQL Framework

Authors

Hricov, R.; Šenk, A.; Kroha, P.; Valenta, M.

Year

2017

Published

Beyond Databases, Architectures and Structures. Towards Efficient Solutions for Data Analysis and Knowledge Representation. Springer, Cham, 2017. p. 28-41. ISSN 1865-0929. ISBN 978-3-319-58274-0.

Type

Proceedings paper

DOI

10.1007/978-3-319-58274-0_3

Departments

Department of Software Engineering

Annotation

In this contribution, we present our approach to querying an XML document that is stored in a distributed system. The main goal of this paper is to describe how to use the Spark SQL framework to implement a subset of expressions from the XPath query language. Five different methods of our approach are introduced and compared, and by this, we also demonstrate the current state of query optimization on the Spark SQL platform, which may be taken as a further contribution of our paper. The subset of expressions from the XPath query language supported by the implemented methods contains all XPath axes except the attribute and namespace axes; predicates are not implemented in our prototype. We present our implemented system, data, measurements, tests, and results. The evaluated results support our belief that our method significantly decreases data transfers in the distributed system that occur during the query evaluation.

Tackling the Flexibility-Usability Trade-off in Component-Based Software Development

Authors

Dvořák, O.; Pergl, R.; Kroha, P.

Year

2017

Published

Recent Advances in Information Systems and Technologies. Berlin: Springer-Verlag, 2017. p. 861-871. ISSN 2194-5365. ISBN 978-3-319-56535-4.

Type

Proceedings paper

DOI

10.1007/978-3-319-56535-4_84

Departments

Department of Software Engineering

Annotation

“Increase flexibility, decrease usability” is a known trade-off influencing the effectiveness of reusing artefacts in many engineering disciplines. We claim that software development is influenced, too. The goal of this paper is to elaborate on flexibility and usability in component-based software development. It explains that equally flexible components can considerably differ in usability costs. Therefore, the architecture of components matters when evaluating the final cost of building software. We propose a model of building components that can help to decrease the costs of software development, while providing a demanded level of flexibility.

Minimization of Data Transfers during MapReduce Computations in Distributed Wide-Column Stores

Authors

Šenk, A.; Hrstka, M.; Kroha, P.; Valenta, M.

Year

2016

Published

New Trends in Databases and Information Systems. Wien: Springer, 2016. pp. 261-274. 637. ISSN 1865-0929. ISBN 978-3-319-44065-1.

Type

Proceedings paper

DOI

10.1007/978-3-319-44039-2_18

Departments

Department of Software Engineering

Annotation

In this contribution, we present our original approach to distributed wide-column store database tuning based on data locality optimization. The main goal of the optimization is the reduction of communication overhead in a distributed environment during Map-Reduce query evaluation. The optimization is realized by minimizing the total number of key-value pairs emitted from mappers. To achieve the goal, we combine several Map-Reduce optimization methods, adapt them to the wide-column store model, and utilize them to overcome architectural limitations. To prove our idea, we implemented the proposed solution in the HBase system, which represents this class of DBMS. We present our data, measurements, and tests. The evaluated results support our idea that this method can significantly decrease data transfers in the distributed system.

Reducing Cold Start Problems in Educational Recommender Systems

Authors

Kuznetsov, S.; Kordík, P.; Řehořek, T.; Kroha, P.; Dvořák, J.

Year

2016

Published

2016 International Joint Conference on Neural Networks (IJCNN). San Francisco: American Institute of Physics and Magnetic Society of the IEEE, 2016. p. 3143-3149. ISSN 2161-4407. ISBN 978-1-5090-0620-5.

Type

Proceedings paper

DOI

10.1109/IJCNN.2016.7727600

Departments

Department of Theoretical Computer Science
Department of Software Engineering

Annotation

Educational data can help us to personalise university information systems. In this paper, we show how educational data can be used to improve the performance of interaction-based recommender systems. Educational data is transformed into student profiles, helping to prevent cold start problems when recommending projects to students with few user interactions. Our results show that our hybrid interaction-based recommender boosted by educational profiles significantly outperforms bestseller recommendation, which is a mainstream recommendation method for cold start users.

Confirmation Engine Design Based on PSI Theory

Authors

Dvořák, O.; Pergl, R.; Kroha, P.

Year

2015

Published

Complementary Proceedings of the Workshops TEE, CoBI, and XOC-BPM at IEEE-COBI 2015. CEUR-WS.org, 2015. p. 0-8. CEUR Workshop Proceedings. ISSN 1613-0073.

Type

Proceedings paper

Departments

Department of Software Engineering

Annotation

Design & Engineering Methodology for Organisations (DEMO) is a methodology for (re)designing and (re)engineering of organisations. Having a strong theoretical background in the PSI theory (Performance in Social Interactions), DEMO deals with communication and interaction between subjects (human beings) that play a crucial role within all company processes. Advanced information systems are used to support processes and communications. In these systems, confirmations are very common patterns. In this paper, we present a design of a confirmation engine based on the transaction axiom of the PSI theory. We discuss the theoretical background of this engine, our implementation, and how this module fits into the IT infrastructure.

Revisiting the BORM OR Diagram Composition Pattern

Authors

Podloucký, M.; Pergl, R.; Kroha, P.

Year

2015

Published

Enterprise and Organizational Modeling and Simulation. Berlin: Springer, 2015. pp. 102-113. Lecture Notes in Business Information Processing. ISSN 1865-1348. ISBN 978-3-319-24625-3.

Type

Proceedings paper

DOI

10.1007/978-3-319-24626-0_8

Departments

Department of Software Engineering

Annotation

This paper addresses the notion of process decomposition as a tool for managing process complexity in BORM Object Relation Diagram. It investigates the composition principle already present in ORD and shows it as ambiguous and mostly unsuitable for that purpose. Substantial changes to the original meta-model of ORD are proposed by introducing a new concept called tasks. The implications of introducing this new concept are then investigated, especially concerning decomposition of communications in a BORM process.

Comparison of Genetic Algorithms for Trading Strategies

Authors

Kroha, P.; Friedrich, M.

Year

2014

Published

SOFSEM 2014: Theory and Practice of Computer Science. Cham: Springer International Publishing AG, 2014, pp. 383-394. LNCS 8327. ISSN 0302-9743. ISBN 978-3-319-04297-8.

Type

Proceedings paper

DOI

10.1007/978-3-319-04298-5

Departments

Department of Software Engineering

Annotation

In this contribution, we describe and compare two genetic systems which create trading strategies. The first system is based on the idea that the connection weight matrix of a neural network represents the genotype of an individual and can be changed by a genetic algorithm. The second system uses genetic programming to derive trading strategies. As input data in our experiments, we used technical indicators of NASDAQ stocks. As output, the algorithms generate trading strategies, i.e., buy, hold, and sell signals. Our hypothesis that strategies obtained by genetic programming bring better results than the buy-and-hold strategy has been proven as statistically significant. We discuss our results and compare them to our previous experiments with fuzzy technology, fractal approach, and with simple technical indicator strategy.

Fuzzy and Fractal Technology in Market Analysis

Authors

Kroha, P.; Lauschke, M.

Year

2012

Published

Revised and Selected Papers of the International Joint Conference, IJCCI 2010. Berlin: Springer, 2012. pp. 247-260. Computational Intelligence. Studies in Computational Intelligence. ISSN 1860-949X. ISBN 978-3-642-27533-3.

Type

Proceedings paper

DOI

10.1007/978-3-642-27534-0

Departments

Department of Software Engineering

Annotation

In this contribution, we investigate the possibilities of using fuzzy and fractal methods for analysing time series of market data. First, we implemented and tested a fuzzy component that provides fuzzification by the Mamdani-Larsen inference method with static rules using not only Gauss but also Cauchy and Mandelbrot distributions. Second, we implemented and tested a fractal component that provides fuzzy clustering by the Takagi-Sugeno method with dynamic fuzzy rules. Looking for an optimum, we simulated many parameter combinations and compared the results. We present some interesting results of our experiments.