doc. Ing. Tomáš Vitvar, Ph.D.

Anomaly Detection in Log Streams based on Time-Contextual Models

Authors

Fedotov, D.; Kuchař, J.; Vitvar, T.

Year

2025

Published in

Web Information Systems Engineering – WISE 2024. Springer Nature Singapore Pte Ltd., 2025. p. 19-29. 1. ISSN 0302-9743. ISBN 978-981-96-0575-0.

Type

Proceedings paper

DOI

10.1007/978-981-96-0576-7_2

Department

Department of Software Engineering

Abstract

Organisations today heavily rely on complex software systems integrated through multiple layers of middleware. This complexity leads to substantial generation of operational data of structured and semi-structured formats which is recorded in log files. The workload of the system fluctuates according to specific periods of the day which impacts the amount and quality of data generated in log files. In this paper, we propose a new log anomaly detection approach that leverages a collection of smaller models designed to capture workload fluctuations over specific time intervals. We demonstrate its effectiveness in detecting anomalies within log streams. Our evaluation uses log data from servers in a production environment, handling a complex back-end system that processes hundreds of requests per second. We show that our method outperforms traditional and widely used anomaly detection methods in data streams in the context of dynamic and time-sensitive workload scenarios.

Time-Aware Log Anomaly Detection Based on Growing Self-organizing Map

Authors

Fedotov, D.; Kuchař, J.; Vitvar, T.

Year

2023

Published in

Service-Oriented Computing. Springer, Cham, 2023. p. 169-177. ISSN 0302-9743. ISBN 978-3-031-48420-9.

Type

Proceedings paper

DOI

10.1007/978-3-031-48421-6_12

Department

Department of Software Engineering

Abstract

A software system generates extensive log data, reflecting its workload and potential failures during operation. Log anomaly detection algorithms use this data to identify deviations in system behavior, especially when errors occur. Workload patterns can vary with time, depending on factors like the time of day or day of the week, affecting log entry volumes. Thus, it’s essential for log anomaly detection to consider temporal information that captures workload variations. This paper introduces a novel log anomaly detection method that incorporates such time information and demonstrates how smaller models enhance anomaly detection precision. We evaluate this method on a high-throughput production workload of a software system, showcasing its superior performance over conventional log anomaly detection methods.

Linked Web APIs Dataset

Authors

Dojčinovski, M.; Vitvar, T.

Year

2018

Published in

Semantic Web. 2018, 9(4), 381-391. ISSN 1570-0844.

Type

Journal article

DOI

10.3233/SW-170259

Department

Department of Software Engineering

Abstract

Web APIs have enjoyed a significant increase in popularity and usage over the last decade. They have become the core technology for exposing functionalities and data. Nevertheless, due to the lack of semantic Web API descriptions, their discovery, sharing, integration, and the assessment of their quality and consumption are limited. In this paper, we present the Linked Web APIs dataset, an RDF dataset with semantic descriptions of Web APIs. It provides semantic descriptions for 11,339 Web APIs, 7,415 mashups and 7,717 developer profiles, which makes it the largest available dataset from the Web APIs domain. The dataset captures the provenance, temporal, technical, functional, and non-functional aspects. In addition, we describe the Linked Web APIs Ontology, a minimal model which builds on top of several well-known ontologies. The dataset has been interlinked and published according to the Linked Data principles. Finally, we describe several possible usage scenarios for the dataset and show its potential.

Posouzení vodního režimu malého povodí s kombinovaným využitím stopovačů [Assessment of the water regime of a small catchment using a combination of tracers]

Authors

Jankovec, J.; Šanda, M.; Vitvar, T.

Year

2017

Published in

Podzemní voda a společnost, sborník příspěvků XV. hydrogeologického kongresu. Význam inženýrské geologie ve výstavbě, sborník příspěvků III. inženýrskogeologického kongresu, Brno 4.-7.9.2017. Brno: Přírodovědecká fakulta, 2017. ISBN 978-80-903635-5-7.

Type

Proceedings paper

Department

Department of Software Engineering

Abstract

The water regime of a mountain catchment was described by a model using the tracers 3H-3He and 18O. The dimensions of the aquifer were determined from a geophysical survey of the site, while its recharge was determined by a detailed model of the unsaturated zone. Sampling boreholes at various depths below the surface made it possible to derive the groundwater travel times and their vertical distribution. These are on the order of decades.

Crowdsourced Corpus with Entity Salience Annotations

Authors

Dojčinovski, M.; Reddy, D.; Kliegr, T.; Vitvar, T.; Sack, H.

Year

2016

Published in

Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016). Paris: European Language Resources Association (ELRA), 2016. p. 3307-3311. ISBN 978-2-9517408-9-1.

Type

Proceedings paper

Department

Department of Software Engineering

Abstract

In this paper, we present a crowdsourced dataset which adds entity salience (importance) annotations to the Reuters-128 dataset, which is a subset of Reuters-21578. The dataset is distributed under a free license and published in the NLP Interchange Format, which fosters interoperability and re-use. We show the potential of the dataset on the task of learning an entity salience classifier and report on the results from several experiments.

Exploiting Temporal Dimension in Tensor-Based Link Prediction

Authors

Kuchař, J.; Dojčinovski, M.; Vitvar, T.

Year

2016

Published in

Web Information Systems and Technologies. Cham: Springer International Publishing, 2016. pp. 211-231. Lecture Notes in Business Information Processing. ISSN 1865-1348. ISBN 978-3-319-30995-8.

Type

Invited or awarded proceedings paper

DOI

10.1007/978-3-319-30996-5_11

Department

Department of Software Engineering

Abstract

In recent years, there has been significant interest in link prediction, an important task for graph-based data structures. Although many approaches based on graph theory and factorizations exist, there is still a lack of methods that can work with multiple types of links and temporal information. The creation time of a link is an important aspect: it reflects the age and credibility of the information. In this paper, we introduce a method that predicts missing links in RDF datasets. We model multiple relations of RDF as a tensor that also incorporates the creation time of links as a key component. We evaluate the proposed approach on real-world datasets: an RDF representation of the ProgrammableWeb directory and a subset of DBpedia focused on movies. The results show that the proposed method outperforms other link prediction approaches.

Personalised, Serendipitous and Diverse Linked Data Resource Recommendations

Authors

Dojčinovski, M.; Vitvar, T.

Year

2015

Published in

Knowledge Engineering and Knowledge Management. Cham: Springer International Publishing AG, 2015. pp. 106-110. Lecture Notes in Artificial Intelligence. ISSN 0302-9743. ISBN 978-3-319-17965-0.

Type

Proceedings paper

DOI

10.1007/978-3-319-17966-7_11

Department

Department of Software Engineering

Abstract

Due to the huge and diverse amount of information, the actual access to a piece of information in the Linked Open Data (LOD) cloud still demands a significant amount of effort. To overcome this problem, a number of Linked Data based recommender systems have been developed. However, they have been primarily developed for a particular domain, they require human intervention in the dataset pre-processing step, and they can hardly be adapted to new datasets. In this paper, we present our method for personalised access to Linked Data, in particular focusing on its applicability and its salient features.

Time-aware Link Prediction in RDF Graphs

Authors

Kuchař, J.; Dojčinovski, M.; Vitvar, T.

Year

2015

Published in

WEBIST 2015 - Proceedings of the 11th International Conference on Web Information Systems and Technologies. Madeira: SciTePress, 2015. ISBN 978-989-758-106-9.

Type

Proceedings paper

DOI

10.5220/0005428403900401

Department

Department of Software Engineering

Abstract

When a link is not explicitly present in an RDF dataset, it does not mean that the link could not exist in reality. Link prediction methods try to overcome this problem by finding new links in the dataset with the support of background knowledge about the links that already exist in the dataset. In dynamic environments that change often and evolve over time, link prediction methods should also take into account the temporal aspects of data. In this paper, we present a novel time-aware link prediction method. We model RDF data as a tensor and take into account the time when the RDF data was created. We use an ageing function to model the retention of information over time: it lowers the significance of older information and promotes more recent information. Our evaluation shows that the proposed method improves the quality of predictions when compared with methods that do not consider the time information.

Personalised Access to Linked Data

Authors

Dojčinovski, M.; Vitvar, T.

Year

2014

Published in

Knowledge Engineering and Knowledge Management. Cham: Springer International Publishing AG, 2014. p. 121-136. Lecture Notes in Artificial Intelligence. ISSN 0302-9743. ISBN 978-3-319-13703-2.

Type

Proceedings paper

DOI

10.1007/978-3-319-13704-9_10

Department

Department of Software Engineering

Abstract

Recent efforts in the Semantic Web community have been primarily focused on developing technical infrastructure and technologies for efficient Linked Data acquisition, publishing and interlinking. Nevertheless, due to the huge and diverse amount of information, the actual access to a piece of information in the LOD cloud still demands a significant amount of effort. In this paper, we present a novel configurable method for personalised access to Linked Data. The method recommends resources of interest from users with similar tastes. To measure the similarity between the users, we introduce a novel resource semantic similarity metric, which takes into account the commonalities and informativeness of the resources. We validate and evaluate the method on a real-world dataset from the Web services domain. The results show that our method outperforms the other baseline methods in terms of accuracy, serendipity and diversity.

Matchmaking of IaaS cloud computing offers leveraging linked data

Authors

Zaremba, M.; Bhiri, S.; Vitvar, T.; Hauswirth, M.

Year

2013

Published in

Proceedings of the 28th Annual ACM Symposium on Applied Computing, SAC '13. New York: ACM, 2013. pp. 383-388. ISBN 978-1-4503-1656-9.

Type

Proceedings paper

DOI

10.1145/2480362.2480440

Department

Department of Software Engineering

Abstract

Cloud Computing is an elastic execution environment becoming the dominating solution for scalable and on-demand computing, and a large market of cloud providers has recently emerged. IaaS is a realisation of the Cloud Computing at the level of processing, storage and networking resources. Currently, users lack a consolidated view of the IaaS market and it is time-consuming and cumbersome to identify the most suitable IaaS offers. IaaS services are highly configurable and their properties are often request-dependent and change dynamically. In this paper we introduce a service matchmaking approach for IaaS. We present models to define expressive search requests and IaaS descriptions which are grounded in lightweight semantic formalisms of RDF and SPARQL, and use Linked Data. Our approach supports dynamic generation of IaaS offers, and their filtering and ranking. We provide a proof-of-concept matchmaker operating on expressive search requests and descriptions of nineteen IaaS services including: Amazon EC2, Google Compute Engine, ElasticHosts, CloudSigma, and Joyent-Cloud.

Personalised Graph-Based Selection of Web APIs

Authors

Dojčinovski, M.; Kuchař, J.; Vitvar, T.; Zaremba, M.

Year

2012

Published in

The Semantic Web -- ISWC 2012. Heidelberg: Springer-Verlag, GmbH, 2012. p. 34-48. Lecture Notes in Computer Science. ISSN 0302-9743. ISBN 978-3-642-35175-4.

Type

Proceedings paper

DOI

10.1007/978-3-642-35176-1_3

Department

Department of Software Engineering

Abstract

Modelling and understanding various contexts of users is important to enable personalised selection of Web APIs in directories such as Programmable Web. Currently, relationships between users and Web APIs are not clearly understood and utilized by existing selection approaches. In this paper, we present a semantic model of a Web API directory graph that captures relationships such as Web APIs, mashups, developers, and categories. We describe a novel configurable graph-based method for selection of Web APIs with personalised and temporal aspects. The method allows users to get more control over their preferences and recommended Web APIs while they can exploit information about their social links and preferences. We evaluate the method on a real-world dataset from ProgrammableWeb.com, and show that it provides more contextualised results than currently available popularity-based rankings.

Reports of the AAAI 2012 Spring Symposia

Authors

Alani, H.; Bo, A.; Manish, J.; Takashi, K.; George, K.; William, L.; David, M.; Caroline, P.; Donald, S.; Keiki, T.; Milind, T.; Vitvar, T.

Year

2012

Published in

AI Magazine. 2012, 33(3), 109-114. ISSN 0738-4602.

Type

Journal article

DOI

10.1609/aimag.v33i3.2428

Department

Department of Software Engineering

Abstract

The Association for the Advancement of Artificial Intelligence, in cooperation with Stanford University’s Department of Computer Science, was pleased to present the 2012 Spring Symposium Series, held Monday through Wednesday, March 26–28, 2012 at Stanford University, Stanford, California USA. The six symposia held were AI, The Fundamental Social Aggregation Challenge (cochaired by W. F. Lawless, Don Sofge, Mark Klein, and Laurent Chaudron); Designing Intelligent Robots (cochaired by George Konidaris, Byron Boots, Stephen Hart, Todd Hester, Sarah Osentoski, and David Wingate); Game Theory for Security, Sustainability, and Health (cochaired by Bo An and Manish Jain); Intelligent Web Services Meet Social Computing (cochaired by Tomas Vitvar, Harith Alani, and David Martin); Self-Tracking and Collective Intelligence for Personal Wellness (cochaired by Takashi Kido and Keiki Takadama); and Wisdom of the Crowd (cochaired by Caroline Pantofaru, Sonia Chernova, and Alex Sorokin). The papers of the six symposia were published in the AAAI technical report series.

Preference-Based Discovery of Dynamically Generated Service Offers

Authors

Zaremba, M.; Vitvar, T.; Bhiri, S.; Hauswirth, M.

Year

2011

Published in

The 8th International Conference on Services Computing. New York: IEEE Computer Society Press, 2011. pp. 338-345. ISBN 978-1-4577-0863-3.

Type

Proceedings paper

DOI

10.1109/SCC.2011.12

Department

Department of Software Engineering

Abstract

The majority of service discovery approaches operate on abstract service descriptions. However, a single service often provides a significant number of possible service offers which are not reflected in abstract service descriptions. In this paper, we define a preference-based discovery model which operates on rich search request descriptions and dynamically generated individual service offers. We define a search request model that includes hard constraints, rich preferences, and flexible input parameters. We use a combination of utility functions and weighted rules for modeling rich preferences. In an experiment, we apply our results to an international shipping scenario to prove the feasibility and usefulness of our approach in a realistic scenario.

RESTful Services with Lightweight Machine-readable Descriptions and Semantic Annotations

Authors

Kopecký, J.; Vitvar, T.; Pedrinaci, C.; Maleshkova, M.

Year

2011

Published in

REST: From Research to Practice. New York: Springer, 2011. p. 473-506. ISBN 978-1-4419-8302-2.

Type

Book chapter

DOI

10.1007/978-1-4419-8303-9_22

Department

Department of Software Engineering

Abstract

REST was originally developed as the architectural foundation for the human-oriented Web, but it has turned out to be a useful architectural style for machine-to-machine distributed systems as well. The most prominent wave of machine-oriented RESTful systems are Web APIs (also known as RESTful services), provided by Web sites such as Facebook, Flickr, and Amazon to facilitate access to the services from programmatic clients, including other Web sites. Currently, Web APIs do not commonly provide machine-processable service descriptions which would help tool support and even some degree of automation on the client side. This chapter presents current research on lightweight service description for Web APIs, building on the HTML documentation that accompanies the APIs. HTML documentation can be annotated with a microformat that captures a minimal machine-oriented service model, or with RDFa using the RDF representation of the same service model. Machine-oriented descriptions (now embedded in the HTML documentation of Web APIs) can also capture the semantics of Web APIs and thus support further automation for clients. The chapter includes a discussion of various types and degrees of tool support and automation possible using the lightweight service descriptions.

Service Offer Discovery Using Genetic Algorithms

Authors

Zaremba, M.; Vitvar, T.; Bhiri, S.; Hauswirth, M.

Year

2011

Published in

9th IEEE European Conference on Web Services. New York: IEEE Computer Society Press, 2011. pp. 23-30. ISBN 978-1-4577-1532-7.

Type

Proceedings paper

DOI

10.1109/ECOWS.2011.9

Department

Department of Software Engineering

Abstract

Available service descriptions are often specified using abstract definitions of service attributes. However, service consumers are mainly interested in concrete, consumable service offers which are specified using concrete values of service attributes. Service offers, due to their request dependence and dynamicity, have to be generated on the fly, which may require interaction with a service. We propose a service description model that facilitates the creation of consumable service offers. A large number of service offers can be generated considering flexible search requests. In order to address that, we propose a novel approach to dynamic generation of service offers. Our approach is based on genetic algorithms and reduces the number of relevant service offers. For evaluation purposes, we apply our approach to the shipping domain, where real shipping services on the Web are used to prove the effectiveness and usability of our approach in a real-world domain.
