Dissertation theses
Operations Data Science: Machine learning on massive operational data from systems infrastructures
System infrastructures including operating systems, networking, web servers, load balancers, application servers, etc. produce massive amounts of operations data during systems runtime. Operations data science aims to improve various machine learning methods to help understand patterns, correlations or similarities from the past systems’ behavior in order to improve operations practice, for example, systems maintenance, changes in systems design or root cause analysis of incidents. This includes predicting systems behaviors when unplanned or unexpected events occur such as connectivity lost, failures of servers or spikes in systems workloads. This research will improve supervised and unsupervised learning methods tailored for problems from operational data analytics and cover a wide spectrum of underlying machine learning tasks including classification, clustering or regression.