Master Thesis Defense on A Distributed and Scalable Real-Time Log Analysis

Oct 20, 2020 14:30 — 16:00
Online (zoom)

Monitoring software behaviour is being done in various ways. Log messages are being output by almost any kind of running software system. Therefore, learning how software behaves from doing analysis over log data can lead to new insights about the system. However, the number of log messages in a computer system grow fast, and analysing the log data by hand is a time-consuming job.

The objective of this study is to propose and implement a scalable architecture for doing real-time log analysis. Log data is structured so that analysis can take place, and the solution is horizontally scalable in every module so that the approach can scale with an ever-growing software solution. The focus of the study is on scalability, and ease-of-use of the implementation of the proposed approach.

The proposed solution can scale horizontally and the test set up showed that reporting features for anomalies remained instantaneous when processing 1.2 million log lines per minute. The usability of the proposed approach is tested in a case study at Weave, where bugs were found by running the proposed solution in a controlled environment.