Ing. Jaroslav Kuchař, Ph.D.

Publications

Associative Classification in R: arc, arulesCBA, and rCBA

Authors
Hahsler, M.; Johnson, I.; Kliegr, T.; Kuchař, J.
Year
2019
Published in
The R Journal. 2019, 11(2), 254-267. ISSN 2073-4859.
Type
Article
Abstract
Several methods for creating classifiers based on rules discovered via association rule mining have been proposed in the literature. These classifiers are called associative classifiers and the best-known algorithm is Classification Based on Associations (CBA). Interestingly, only very few implementations are available and, until recently, no implementation was available for R. Now, three packages provide CBA. This paper introduces associative classification, the CBA algorithm, and how it can be used in R. A comparison of the three packages is provided to give the potential user an idea about the advantages of each of the implementations. We also show how the packages are related to the existing infrastructure for association rule mining already available in R.

Content-aware Collaborative Filtering in Point-of-Interest Recommendation Systems

Authors
Samigullina, G.; Kuchař, J.
Year
2019
Published in
DATA A ZNALOSTI & WIKT 2019. Košice: Technická univerzita v Košiciach, 2019. p. 20-25. ISBN 978-80-553-3354-0.
Type
Conference paper
Abstract
With the availability of a vast number of users and location-based social networks, the problem of point-of-interest (POI) recommendation has been widely studied and has received significant research attention in recent years. While previous work on POI recommendation mostly focused on investigating spatial, temporal, and social influence, the use of additional content information has not been directly studied. In this paper, we propose a content-aware matrix factorization method that incorporates POI attributes and category information. We propose two variants of the algorithm, which can work with explicit and implicit feedback. Experimental results show that the proposed method improves the quality of recommendation and outperforms most state-of-the-art collaborative filtering algorithms.

Detekce anomálií v otevřených datech o znečištění ovzduší polétavým prachem [Anomaly Detection in Open Data on Air Pollution by Airborne Particulate Matter]

Year
2019
Published in
DATA A ZNALOSTI & WIKT 2019. Košice: Technická univerzita v Košiciach, 2019. p. 66-71. ISBN 978-80-553-3354-0.
Type
Conference paper
Abstract
The sensor network of public lighting on Karlínské náměstí in Prague provides measurements of air pollution by PM10 particulate matter as open data. In this work, we detect anomalies in these data using machine learning algorithms for time series prediction combined with thresholding. We want the machine learning algorithm to learn the regularities in the data so that, when something unexpected happens, thresholding reveals it. We experimented with linear regression and an LSTM recurrent neural network and compared them using mean squared error. Linear regression predicting from the last two measurements turned out to achieve better results. We detected anomalies from the differences between predicted and actual values. The threshold for detecting anomalies was computed from the histogram of differences between predictions and actually measured values. Testing showed that the proposed method can reveal some anomalies in PM10 particulate matter measurements, but it misses many anomalies (for example, gradually rising ones).
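The prediction-and-thresholding scheme described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the helper names (solve_linear_system, fit_two_lag_regression, detect_anomalies) are hypothetical, and the threshold is taken as a high quantile of the absolute prediction errors, which is one plausible way to derive it from the error histogram.

```python
import math
import random


def solve_linear_system(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (m[i][n] - tail) / m[i][i]
    return solution


def fit_two_lag_regression(series):
    """Least-squares fit of x[t] ~ a*x[t-1] + b*x[t-2] + c via the normal equations."""
    rows = [(series[t - 1], series[t - 2], 1.0) for t in range(2, len(series))]
    targets = [series[t] for t in range(2, len(series))]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(3)]
    return solve_linear_system(xtx, xty)


def detect_anomalies(series, coef, quantile=0.99):
    """Flag time steps whose absolute prediction error exceeds a quantile threshold."""
    a, b, c = coef
    steps = range(2, len(series))
    errors = [abs(series[t] - (a * series[t - 1] + b * series[t - 2] + c)) for t in steps]
    ordered = sorted(errors)
    # Assumption: the threshold is the given quantile of the observed errors.
    threshold = ordered[min(len(ordered) - 1, int(quantile * len(ordered)))]
    flagged = [t for t, e in zip(steps, errors) if e > threshold]
    return flagged, threshold


# Synthetic stand-in for PM10 readings: a daily-like cycle plus noise and one spike.
rng = random.Random(0)
pm10 = [20 + 5 * math.sin(2 * math.pi * t / 48) + rng.gauss(0, 0.5) for t in range(500)]
pm10[300] += 15  # injected sudden anomaly

coef = fit_two_lag_regression(pm10)
flagged, threshold = detect_anomalies(pm10, coef)
print(flagged, round(threshold, 2))
```

A sudden spike produces a large one-step prediction error and is flagged, whereas a slowly drifting series keeps each one-step error small, which is consistent with the abstract's remark that gradually rising anomalies are missed.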

Tuning Hyperparameters of Classification Based on Associations (CBA)

Authors
Kliegr, T.; Kuchař, J.
Year
2019
Published in
Proceedings of the 19th Conference Information Technologies - Applications and Theory (ITAT 2019). Aachen: CEUR Workshop Proceedings, 2019. p. 9-16. vol. 2473. ISSN 1613-0073.
Type
Conference paper
Abstract
Classification models composed of crisp rules provide excellent explainability. The limitation of many conventional rule learning algorithms is the separate-and-conquer strategy, which may be slow on large data. Association Rule Classification (ARC) is an alternative approach that can be very fast on massive datasets but is highly sensitive to the choice of metaparameters. Most existing ARC algorithms use default thresholds of 50% for minimum confidence and 1% for minimum support, which can result in excessively long rule generation or underperforming models. Because of the high costs that can be associated with evaluating a single combination, it is impractical to use standard metaparameter optimization approaches. In this paper, we introduce two variant threshold tuning algorithms specifically designed for ARC. Evaluation on 22 standard UCI datasets shows promising results in terms of model size and accuracy in comparison with the default thresholds. The implementation of the proposed algorithms is provided in the R packages rCBA and arc, both available from the CRAN repository.

Dolování z otevřených dat o rozpočtech a výdajích [Mining Open Data on Budgets and Spending]

Authors
Chudán, D.; Svátek, V.; Kuchař, J.; Vojíř, S.
Year
2018
Published in
Acta Informatica Pragensia. 2018, 7(1), 58-73. ISSN 1805-4951.
Type
Article
Abstract
Data mining methods are being applied to an ever greater extent, including in domains that traditionally lack strong support from analytical tools and where manual work by an analyst prevails. Using these methods on fiscal data enables deeper analysis and may yield new findings. Deploying advanced data mining methods is one part of the OpenBudgets.eu project, which focuses on transparency and accountability in the handling of public funds. This survey article summarizes some of the authors' experience from this project, gained while developing, implementing, and applying selected methods for mining fiscal data, in particular anomaly detection and association rule mining. These methods are integrated into the project's central platform, which is available to both advanced and ordinary users interested in analyzing fiscal data. Pilot analyses showed that a problem of data mining analysis in this domain is the large volume of discovered rules and the diverse origins of these rules.

EasyMiner.eu: Web Framework for Interpretable Machine Learning based on Rules and Frequent Itemsets

Authors
Vojíř, S.; Zeman, V.; Kuchař, J.; Kliegr, T.
Year
2018
Published in
Knowledge-Based Systems. 2018, 150, 111-115. ISSN 0950-7051.
Type
Article
Abstract
EasyMiner (http://www.easyminer.eu) is a web-based machine learning system for interpretable machine learning based on frequent itemsets. The system currently offers association rule learning (apriori, FP-Growth) and classification (CBA). For association rule learning and classification, EasyMiner offers a visual interface designed for interactivity, allowing the user to define a constraining pattern for the mining task. The CBA algorithm can also be used for pruning of the rule set, thus addressing the common problem of “too many rules” on the output, and the implementation supports automatic tuning of confidence and support thresholds. The development version additionally supports anomaly detection (FPI and its variations) and linked data mining (AMIE+). EasyMiner is dockerized, and some of its components are available as open-source R packages.

Framework for Distributed Computing on the Web

Authors
Šiller, J.; Kuchař, J.
Year
2018
Published in
Proceedings of the 18th Conference Information Technologies - Applications and Theory (ITAT 2018). Aachen: CEUR Workshop Proceedings, 2018. p. 161-167. vol. 2203. ISSN 1613-0073. ISBN 9781727267198.
Type
Conference paper
Abstract
This work is a brief summary of a master's thesis that focuses on the design and implementation of a framework that uses the computers of website visitors as computing nodes through web browsers. It contains an analysis of the Web environment, a summary of previous approaches and projects, and the design and implementation of the framework. The work describes how the framework handles computing node failures and slow computing nodes, the options for controlling the load the framework places on a website visitor’s computer, strategies for work distribution, and the security of the framework. The work concludes with experimental results and proposed improvements.

Spotlighting Anomalies using Frequent Patterns

Authors
Kuchař, J.; Svátek, V.
Year
2018
Published in
KDD 2017 Workshop on Anomaly Detection in Finance. Proceedings of Machine Learning Research, 2018. p. 33-42. vol. 71. ISSN 1938-7228.
Type
Conference paper
Abstract
Approaches to anomaly detection based on frequent pattern mining follow this paradigm: the more frequent patterns a data instance contains, the less likely it is to be an anomaly. This concept can be used in the financial industry to reveal contextual anomalies. The main contribution of this paper is an approach that includes a novel formula for computing anomaly scores. We evaluated the proposed approach on baseline datasets and present a use case on a real-world financial dataset. We also propose a way to explain the anomaly to users. Implementations of the evaluated algorithms and experiments are available online in R.
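The general paradigm named in this abstract can be illustrated with the classic Frequent Pattern Outlier Factor (FPOF), which averages the supports of the frequent patterns an instance contains. The sketch below is not the novel scoring formula proposed in the paper; it only shows the baseline idea, using brute-force pattern enumeration and a made-up toy dataset. The function names and item labels are hypothetical.

```python
from itertools import combinations


def frequent_patterns(transactions, min_support, max_len=3):
    """Enumerate itemsets up to max_len and keep those meeting min_support."""
    n = len(transactions)
    counts = {}
    for items in transactions:
        for k in range(1, max_len + 1):
            for pattern in combinations(sorted(items), k):
                counts[pattern] = counts.get(pattern, 0) + 1
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


def fpof_scores(transactions, patterns):
    """FPOF: sum of supports of contained frequent patterns / number of frequent patterns."""
    total = len(patterns)
    scores = []
    for items in transactions:
        covered = sum(s for p, s in patterns.items() if set(p) <= items)
        scores.append(covered / total if total else 0.0)
    return scores


# Toy, made-up records; the last one shares no frequent pattern with the rest.
transactions = [
    {"dept=IT", "type=invoice", "amount=low"},
    {"dept=IT", "type=invoice", "amount=low"},
    {"dept=IT", "type=invoice", "amount=low"},
    {"dept=IT", "type=invoice", "amount=low"},
    {"dept=IT", "type=invoice", "amount=high"},
    {"dept=HR", "type=grant", "amount=high"},
]

patterns = frequent_patterns(transactions, min_support=0.5)
scores = fpof_scores(transactions, patterns)
most_anomalous = scores.index(min(scores))
print(most_anomalous, [round(s, 3) for s in scores])
```

A low score marks a likely anomaly. The frequent patterns that an instance fails to contain also give a natural starting point for explaining why it was flagged.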

Vyhledávání obrázků v rozšířené realitě [Image Search in Augmented Reality]

Authors
Chmelař, P.; Kuchař, J.
Year
2018
Published in
Data a znalosti & WIKT. Brno: Vysoké učení technické v Brně. Fakulta informačních technologií, 2018. p. 111-114. 1. ISBN 978-80-214-5679-2.
Type
Conference paper
Abstract
Mobile phone applications have become an everyday part of our lives. In most cases, however, their scope is still strictly limited by the display of the device. One current trend in mobile application development is extending an application's capabilities beyond the device itself, into virtual or augmented reality. The goal of this project is to create a system that enriches mobile applications with search in an augmented reality environment. The main component of the system is a server application providing a service that evaluates matches between a received image and a defined image database. To ease the implementation of search in augmented reality, a library for the iOS platform is available. Image databases can be managed through a REST API or a simple web interface.

EasyMiner – Short History of Research and Current Development

Authors
Kliegr, T.; Kuchař, J.; Vojíř, S.; Zeman, V.
Year
2017
Published in
ITAT 2017: Information Technologies – Applications and Theory. Aachen: CEUR Workshop Proceedings, 2017. p. 235-239. vol. 1885. ISSN 1613-0073.
Type
Conference paper
Abstract
EasyMiner (easyminer.eu) is an academic data mining project providing data mining of association rules, building of classification models based on association rules and outlier detection based on frequent pattern mining. It differs from other data mining systems by adapting the “web search” paradigm. It is web-based, providing both a REST API and a user interface, and puts emphasis on interactivity, simplicity of user interface and immediate response. This paper will give an overview of research related to the EasyMiner project.

InBeat: JavaScript recommender system supporting sensor input and linked data

Authors
Kuchař, J.; Kliegr, T.
Year
2017
Published in
Knowledge-Based Systems. 2017, 135, 40-43. ISSN 0950-7051.
Type
Article
Abstract
Interest Beat (inbeat.eu) is an open source recommender framework that fulfills some of the demands raised by emerging applications that infer ratings from sensor input or use the linked open data cloud for feature expansion. As a recommender algorithm, InBeat uses association rules, which make it possible to explain why a specific recommendation was made. Thanks to the modular architecture, other algorithms can be easily plugged in. InBeat has a pure JavaScript version, which allows processing to be confined to a client-side device. There is also a performance-optimized server-side bundle, which successfully participated in two recent recommender competitions involving large volumes of streaming data. InBeat works on a number of platforms and is also available for Docker.

News Recommender System based on Association Rules @ CLEF NewsREEL 2017

Authors
Golian, C.; Kuchař, J.
Year
2017
Published in
Working Notes of CLEF 2017 - Conference and Labs of the Evaluation Forum. Aachen: CEUR Workshop Proceedings, 2017. vol. 1866. ISSN 1613-0073.
Type
Conference paper
Abstract
Digital editions of newspapers cause information overload, and users have problems choosing what they want to read. Systems that recommend news articles are well suited to solving such problems. Nevertheless, they face challenges unknown to systems recommending books or movies, such as the frequency at which new content is produced. The CLEF NewsREEL challenge makes it possible to compare and evaluate news recommendation systems in an online task, focused on recommending articles to real users, and an offline task, focused on tuning algorithms. This paper deals with an approach based on association rules acting as a classifier. In our approach, we experimented with settings that reduce the number of rules used for classification and increase performance, which is crucial for real recommendations. We evaluated our approach in both tasks of the CLEF NewsREEL 2017 challenge.

Outlier (Anomaly) Detection Modelling in PMML

Authors
Kuchař, J.; Ashenfelter, A.; Kliegr, T.
Year
2017
Published in
RuleML+RR 2017 - Doctoral Consortium, Challenge, Industry Track, Tutorials and Posters. Aachen: CEUR Workshop Proceedings, 2017. vol. 1875. ISSN 1613-0073.
Type
Conference paper
Abstract
PMML is an industry-standard XML-based open format for representing statistical and data mining models. Since PMML does not yet support outlier (anomaly) detection, in this paper we propose a new outlier detection model to foster interoperability in this emerging field. Our proposal is included in the PMML RoadMap for PMML 4.4. We demonstrate the proposed format on one supervised and two unsupervised outlier detection approaches: association rule-based classifier CBA, frequent-pattern based method FPOF and isolation forests.

Recommending News Articles using Rule-based Classifier

Authors
Golian, C.; Kuchař, J.
Year
2017
Published in
Data a znalosti 2017. Plzeň: Západočeská univerzita v Plzni, 2017. p. 51-55. ISBN 978-80-261-0720-0.
Type
Conference paper
Abstract
In this paper, we summarize our experiments with a rule-based classifier as a recommender within the CLEF NewsREEL 2017 challenge. Systems that recommend news articles are well suited to solving information overload in digital editions of newspapers, where users have problems choosing what they want to read. They face challenges unknown to systems recommending books or movies, such as the frequency at which new content is produced. This paper deals with an approach based on association rules acting as a classifier. In our approach, we experimented with settings that reduce the number of rules used for classification and increase performance, which is crucial for real recommendations.

Using EasyMiner API for Financial Data Analysis in the OpenBudgets.eu Project

Authors
Vojíř, S.; Zeman, V.; Kuchař, J.; Kliegr, T.
Year
2017
Published in
RuleML+RR 2017 - Doctoral Consortium, Challenge, Industry Track, Tutorials and Posters. Aachen: CEUR Workshop Proceedings, 2017. vol. 1875. ISSN 1613-0073.
Type
Conference paper
Abstract
This paper presents a use case for the data mining system EasyMiner in the European project OpenBudgets.eu, which is concerned with the publication and analysis of financial data of municipalities. EasyMiner is a web-based data mining system. This paper focuses on its new outlier detection functionality, which relies on frequent pattern mining. In addition, the system supports association rule discovery and the building of rule-based classification models. The system exposes a REST API and can thus be easily integrated into third-party applications.

Využití EasyMiner API v projektu OpenBudgets.eu [Using the EasyMiner API in the OpenBudgets.eu Project]

Authors
Vojíř, S.; Zeman, V.; Kuchař, J.; Kliegr, T.
Year
2017
Published in
Data a znalosti 2017. Plzeň: Západočeská univerzita v Plzni, 2017. p. 56-60. ISBN 978-80-261-0720-0.
Type
Conference paper
Abstract
With the growing popularity of data mining, there is also growing demand for integrating data mining algorithms and systems into more complex, more user-friendly applications. This contribution presents a new version of the EasyMiner system, integrated into a software solution developed within the European project OpenBudgets.eu, which focuses on making the financial data of municipalities accessible and analyzing them. EasyMiner is a web-based data mining system supporting association rule mining, the building of classification models, and, new in the current version, outlier detection. This functionality is available not only through a graphical user interface but also through a comprehensive REST API.

Analýza článků z českých zpravodajských serverů [Analysis of Articles from Czech News Servers]

Authors
Filipová, M.; Kuchař, J.
Year
2016
Published in
Proceedings in Informatics and Information Technologies - (WIKT & DaZ 2016) 11th Workshop on Intelligent and Knowledge Oriented Technologies 35th Conference on Data and Knowledge. Bratislava: Vydavatel'stvo STU, 2016. pp. 97-101. ISBN 978-80-227-4619-9.
Type
Conference paper
Abstract
Today, as the amount of information on the internet keeps growing, automatic processing and sorting of data has become a very popular field of information technology. Online news is one such area. The goal of this project is a tool covering the whole process of basic analysis of articles from Czech news servers. The project focuses primarily on extracting relevant data and analyzing them. The first part, however, also includes a related crawler, which makes it possible to download articles for analysis from news websites. In the second part, the relevant content of the articles and their other attributes are automatically extracted from the downloaded HTML pages. The third part is a text analysis that uses existing methods and tools and focuses on named entity extraction and sentiment analysis of Czech text. The resulting structured data can be queried from various perspectives, enabling various kinds of experiments.

Exploiting Temporal Dimension in Tensor-Based Link Prediction

Year
2016
Published in
Web Information Systems and Technologies. Cham: Springer International Publishing, 2016. pp. 211-231. Lecture Notes in Business Information Processing. ISSN 1865-1348. ISBN 978-3-319-30995-8.
Type
Invited or awarded conference paper
Abstract
In recent years, there has been significant interest in link prediction, an important task for graph-based data structures. Although many approaches based on graph theory and factorizations exist, there is still a lack of methods that can work with multiple types of links and temporal information. The creation time of a link is an important aspect: it reflects the age and credibility of the information. In this paper, we introduce a method that predicts missing links in RDF datasets. We model multiple relations of RDF as a tensor that also incorporates the creation time of links as a key component. We evaluate the proposed approach on real-world datasets: an RDF representation of the ProgrammableWeb directory and a subset of DBpedia focused on movies. The results show that the proposed method outperforms other link prediction approaches.

Využití cloudu pro dolování asociačních pravidel z velkých dat přes webové rozhraní [Using the Cloud for Mining Association Rules from Big Data via a Web Interface]

Authors
Zeman, V.; Vojiř, S.; Kuchař, J.; Kliegr, T.
Year
2016
Published in
Proceedings in Informatics and Information Technologies - (WIKT & DaZ 2016) 11th Workshop on Intelligent and Knowledge Oriented Technologies 35th Conference on Data and Knowledge. Bratislava: Vydavatel'stvo STU, 2016. pp. 259-263. ISBN 978-80-227-4619-9.
Type
Conference paper
Abstract
The EasyMiner web application is an academic tool for knowledge discovery from small and medium-sized data in the form of association rules. The new version of this system uses Apache Hadoop and Apache Spark to process large data sources on the MetaCentrum computing cluster of the CESNET association. The application consists of several microservices that handle uploading big data into the HDFS distributed storage, transforming data in the cluster into a normalized form, and mining knowledge from datasets in the form of association rules using the cluster's computing resources through Apache Spark. These microservices can be accessed through a REST interface, and together they form data mining software operating as a web service (SaaS).

Augmenting a Feature Set of Movies Using Linked Open Data

Authors
Year
2015
Published in
Rule Challenge and Doctoral Consortium @ RuleML 2015. Aachen: CEUR Workshop Proceedings, 2015. ISSN 1613-0073.
Type
Conference paper
Abstract
Augmenting a feature set using mappings to the Web of data is an up-and-coming way to enrich data in the original dataset. Those enrichments are valuable especially for the recent preference learning algorithms and recommender systems. In this paper, we describe the process of mapping and augmenting the movie ratings dataset MovieTweetings from the perspective of RecSysRules 2015 Challenge. The ad-hoc queries to DBpedia are used as an underlying concept. To the best of our knowledge, there is no existing mapping dataset of movies for MovieTweetings. We also provide a brief discussion about the benefits of the augmented feature set for an elementary rule-based representation of the user preferences.

Benchmark of Rule-Based Classifiers in the News Recommendation Task

Authors
Kliegr, T.; Kuchař, J.
Year
2015
Published in
Experimental IR Meets Multilinguality, Multimodality, and Interaction - 6th International Conference of the CLEF Association. Berlin: Springer-Verlag, 2015. p. 130-141. Lecture Notes in Computer Science. ISSN 0302-9743. ISBN 978-3-319-24026-8.
Type
Conference paper
Abstract
In this paper, we present experiments evaluating Association Rule Classification algorithms on the on-line and off-line recommender tasks of the CLEF NewsREEL 2014 Challenge. The second focus of the experimental evaluation is to investigate possible performance optimizations of the Classification Based on Associations algorithm. Our findings indicate that the pruning steps in CBA reduce the number of association rules substantially while not affecting accuracy. Using only part of the data employed for the rule learning phase in the pruning phase may also reduce training time while not affecting accuracy significantly.

EasyMiner/R: Web Interface for Rule Learning and Classification in R

Authors
Vojíř, S.; Zeman, V.; Kuchař, J.; Kliegr, T.
Year
2015
Published in
Rule Challenge and Doctoral Consortium @ RuleML 2015. Aachen: CEUR Workshop Proceedings, 2015. ISSN 1613-0073.
Type
Conference paper
Abstract
EasyMiner is a web-based visual interface for association rule learning. This paper presents a preview of the next release, which uses the R environment as the data processing backend. EasyMiner/R uses the arules package to learn rules. It uses the Classification Based on Associations (CBA) algorithm as a classifier and to perform rule pruning. Experimental results show that EasyMiner with the R-based backend is able to handle larger datasets than the previous version.

Time-aware Link Prediction in RDF Graphs

Year
2015
Published in
WEBIST 2015 - Proceedings of the 11th International Conference on Web Information Systems and Technologies. Madeira: SciTePress, 2015. ISBN 978-989-758-106-9.
Type
Conference paper
Abstract
When a link is not explicitly present in an RDF dataset, it does not mean that the link could not exist in reality. Link prediction methods try to overcome this problem by finding new links in the dataset with the support of background knowledge about the links that already exist in it. In dynamic environments that change often and evolve over time, link prediction methods should also take into account the temporal aspects of data. In this paper, we present a novel time-aware link prediction method. We model RDF data as a tensor and take into account the time when the RDF data was created. We use an ageing function to model the retention of information over time: it lowers the significance of older information and promotes more recent information. Our evaluation shows that the proposed method improves the quality of predictions when compared with methods that do not consider time information.
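One way to picture the ageing idea in this abstract is to weight each RDF link in a sparse (subject, predicate, object) tensor by a function of its age. The sketch below is purely illustrative: the exponential half-life form, the parameter half_life_days, the function names, and the sample triples are all assumptions, since the abstract does not state the exact ageing function used in the paper.

```python
def ageing_weight(age_days, half_life_days=365.0):
    """Assumed ageing function: the weight halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def build_weighted_tensor(triples, now_day, half_life_days=365.0):
    """Build a sparse tensor {(subject, predicate, object): weight} from timestamped links."""
    tensor = {}
    for subject, predicate, obj, created_day in triples:
        weight = ageing_weight(now_day - created_day, half_life_days)
        key = (subject, predicate, obj)
        # If a link was asserted several times, keep its most recent (largest) weight.
        tensor[key] = max(tensor.get(key, 0.0), weight)
    return tensor


# Hypothetical timestamped links (creation time given as a day number).
triples = [
    ("mashupA", "uses", "apiX", 100),
    ("mashupB", "uses", "apiY", 990),
    ("userC", "created", "mashupB", 980),
]

tensor = build_weighted_tensor(triples, now_day=1000)
for key, weight in sorted(tensor.items()):
    print(key, round(weight, 3))
```

A factorization applied to such a tensor would then see recent links as stronger evidence than old ones, which is the effect the abstract describes.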

Bag-of-Entities text representation for client-side (video) recommender systems

Authors
Kuchař, J.; Kliegr, T.
Year
2014
Published in
RecSysTV 2014. 2014.
Type
Conference paper
Abstract
Client-side execution of a recommender system requires enrichment of the content delivered to the user with a list of potentially related content. A possible bottleneck for client-side recommendation is the data volume entailed by transferring the feature set describing each content item to the client, and the computational resources needed to process this feature set. This paper investigates whether representing textual content (e.g. of videos) with a Bag of Entities (BoE) vector generated by a wikifier can yield a classifier with the same accuracy at a smaller size than the standard BoW approach. Experimental evaluation performed on the Reuters-21578 text categorization collection shows that there is a small improvement for small term vector sizes.

Doporučování multimediálního obsahu s využitím senzoru Microsoft Kinect [Recommending Multimedia Content Using the Microsoft Kinect Sensor]

Authors
Kuchař, J.; Kliegr, T.
Year
2014
Published in
Proceedings of the 13th Annual Conference Znalosti 2014. Praha: VŠE, 2014. pp. 84-87. ISBN 978-80-245-2054-4.
Type
Conference paper
Abstract
This contribution presents the online recommender InBeat.eu. The system collects explicit and implicit feedback from users, which is used as an indicator of interest. The demonstration of the system focuses on user interaction with multimedia content, specifically videos and a television news scenario. Videos are automatically semantically annotated using a named entity recognition tool. An important part of the system is the connection to the Microsoft Kinect sensor, which makes it possible to analyze head rotation in order to assess actual viewing of the given video. Association rules representing the preferences of the given user are derived from the collected data. These rules are then used for recommendation.

InBeat: Recommender System as a Service

Authors
Kuchař, J.; Kliegr, T.
Year
2014
Published in
CLEF2014 Working Notes. Tilburg: CEUR Workshop Proceedings, 2014. p. 837-844. CLEF. ISSN 1613-0073.
Type
Conference paper
Abstract
Interest Beat (inbeat.eu) is a service for recommendation of content. InBeat was designed with emphasis on versatility, scalability and extensibility. The core contains the General Analytics INterceptor module, which collects and aggregates user interactions, the Preference Learning module and the Recommender System module. In this paper, we describe the general architecture of InBeat, putting emphasis on its high-performance architecture that was used in the CLEF-NEWSREEL: News Recommendation Evaluation Lab.

KINterestTV - Towards Non-invasive Measure of User Interest While Watching TV

Authors
Leroy, J.; Rocca, F.; Mancas, M.; Madhkour, R.B.; Grisard, F.; Kliegr, T.; Kuchař, J.; Vit, J.; Pirner, I.; Zimmermann, P.
Year
2014
Published in
Innovative and Creative Developments in Multimodal Interaction Systems. Berlin: Springer, 2014. pp. 179-199. IFIP Advances in Information and Communication Technology. ISSN 1868-4238. ISBN 978-3-642-55142-0.
Type
Conference paper
Abstract
Is it possible to determine a user’s interest in media content only by observing the user’s behavior? The aim of this project is to develop an application that can detect whether or not a user is viewing content on the TV and use this information to build the user profile and make it evolve dynamically. Our approach is based on the use of a 3D sensor to study the movements of a user’s head in order to analyze the user’s behavior implicitly. This behavior is synchronized with the TV content (media fragments) and other user interactions (clicks, gestural interaction) to further infer the viewer’s interest. Our approach is tested in an experiment simulating the attention changes of a user in a scenario involving second-screen (tablet) interaction, a behavior that has become common for spectators and a typical source of attention switches.

Learning Business Rules with Association Rule Classifiers

Authors
Kliegr, T.; Kuchař, J.; Sottara, D.; Vojíř, S.
Year
2014
Published in
Rules on the Web. From Theory to Applications. Cham: Springer International Publishing AG, 2014. p. 236-250. Lecture Notes in Computer Science. ISSN 0302-9743. ISBN 978-3-319-09869-2.
Type
Conference paper
Abstract
The main obstacles to the straightforward use of association rules as candidate business rules are the excessive number of rules discovered even on small datasets and the fact that contradicting rules are generated. This paper shows that Association Rule Classification algorithms, such as CBA, solve both these problems, and provides a practical guide on using discovered rules in the Drools BRMS and on setting the ARC parameters. Experiments performed with modified CBA on several UCI datasets indicate that data coverage rule pruning keeps the number of rules manageable, while not adversely impacting the accuracy. The best results in terms of overall accuracy are obtained using minimum support and confidence thresholds. Disjunction between attribute values seems to provide a desirable balance between accuracy and rule count, while negated literals have not been found beneficial.

Orwellian Eye: Video Recommendation with Microsoft Kinect

Authors
Kliegr, T.; Kuchař, J.
Year
2014
Published in
ECAI 2014. Amsterdam: IOS Press, 2014. pp. 1227-1228. Frontiers in Artificial Intelligence and Applications. ISSN 0922-6389. ISBN 978-1-61499-418-3.
Type
Conference paper
Abstract
This paper demonstrates Interest Beat (InBeat.eu) as a recommender system for online videos, which determines user interest in the content based on gaze tracking with Microsoft Kinect in addition to explicit user feedback. The content of the videos is represented using a semantic wikifier. The user profile is constructed from preference rules, which are discovered with an association rule learner.

When TV meets the Web: towards personalised digital media

Authors
Tsatsou, D.; Mancas, M.; Kuchař, J.; Nixon, L.; Vacura, M.; Leroy, J.; Rocca, F.; Mezaris, V.
Year
2014
Published in
Semantic Multimedia Analysis and Processing. Boca Raton: CRC Press, 2014. p. 221-256. Digital Imaging and Computer Vision. ISBN 978-1-4665-7549-3.
Type
Book chapter
Abstract
The rise of new paradigms in the field of television and digital media distribution (e.g. Smart TV, IPTV, Social TV) has opened a new digital world of data communication opportunities but at the same time exacerbated the information overload problem for media consumers and providers. Therefore, the need for personalized content delivery has extended from the traditional web to the networked media domain. This chapter presents comprehensive research in the field of capturing and representing user preferences and context, and an overview of relevant digital media-specific personalized recommendation techniques. Subsequently, it describes the vision and first personalization approach adopted within the LinkedTV EU project for profiling and contextualizing users and providing targeted information and content in a linked media environment.

GAIN: web service for user tracking and preference learning - a smart TV use case

Authors
Kuchař, J.; Kliegr, T.
Year
2013
Published in
RecSys '13 Proceedings of the 7th ACM conference on Recommender systems. New York: ACM, 2013. pp. 467-468. ISBN 978-1-4503-2409-0.
Type
Conference paper
Abstract
GAIN (inbeat.eu) is a web application and service for capturing and preprocessing user interactions with semantically described content. GAIN outputs a set of instances in tabular form suitable for further processing with generic machine-learning algorithms. GAIN is demoed as a component of a "SMART-TV" recommender system. Content is automatically described with DBpedia types using a Named Entity Recognition (NER) system. Interest is determined based on explicit user actions and user's attention computed by 3D head pose estimation. Preference rules are learnt with an association rule mining algorithm. These can be e.g. deployed to a business rules system, acting as a recommender.

GAIN: Analysis of Implicit Feedback on Semantically Annotated Content

Authors
Kuchař, J.; Kliegr, T.
Year
2012
Published in
WIKT 2012: 7th Workshop on Intelligent and Knowledge Oriented Technologies. Slovenská technická univerzita v Bratislave, 2012. pp. 75-78. ISBN 978-80-227-3812-5.
Type
Conference paper
Abstract
The trend in application development is to provide a personalized interface. The availability of the user preference level associated with user actions is the key to the personalization process. This paper describes a "work-in-progress" framework for deriving user preference from actions performed on semantically annotated objects, be it web pages or TV news. The preference level is computed using supervised learning with genetic programming from implicit feedback, which might be time on page for the web domain, or the user engagement level for the TV domain. We provide a tool called GAIN (General Analytics INterceptor) covering the whole approach at wa.vse.cz.

Personalised Graph-Based Selection of Web APIs

Authors
Year
2012
Published in
The Semantic Web -- ISWC 2012. Heidelberg: Springer-Verlag, GmbH, 2012. p. 34-48. Lecture Notes in Computer Science. ISSN 0302-9743. ISBN 978-3-642-35175-4.
Type
Conference paper
Abstract
Modelling and understanding various contexts of users is important to enable personalised selection of Web APIs in directories such as Programmable Web. Currently, relationships between users and Web APIs are not clearly understood and utilized by existing selection approaches. In this paper, we present a semantic model of a Web API directory graph that captures relationships such as Web APIs, mashups, developers, and categories. We describe a novel configurable graph-based method for selection of Web APIs with personalised and temporal aspects. The method allows users to get more control over their preferences and recommended Web APIs while they can exploit information about their social links and preferences. We evaluate the method on a real-world dataset from ProgrammableWeb.com, and show that it provides more contextualised results than currently available popularity-based rankings.

Learning Semantic Web Usage Profiles by Using Genetic Algorithms

Authors
Kuchař, J.; Jelínek, I.
Year
2011
Published in
International Journal on Information Technologies and Security. 2011, 3(4), 3-20. ISSN 1313-8251.
Type
Article
Abstract
A web usage profile is very important in recommender systems. More interesting is the semantically enriched profile, which can describe visitor intents using ontologies and express more information and relations of the visitor's character. Our research is based on processing a semantically enriched clickstream and applying a scoring algorithm based on symbolic regression. The semantic enrichment uses Linked Data principles. The scoring assigns to each pageview a value that represents the visitor's interests. Scoring involves all known attributes of each pageview, including semantic annotation. The score of each pageview is used to establish a visitor profile. The established profile can take the form of ontologies. In this paper, we propose to integrate the scoring algorithm into semantic web usage mining and to publish the visitor profile in an RDF/OWL representation. We suggest merging the profiles from different web sites and integrating additional related information from publicly available reso