Software System Performance Debugging with Kernel Events Feature Guidance

Abstract

Many OS kernel-level monitoring and analysis tools have been proposed to diagnose performance problems in production systems. Monitoring application software through low-level kernel events is both efficient and transparent. However, such approaches miss application-specific semantic information, which can help differentiate trace patterns that arise from distinct application logic. This paper introduces new trace analysis techniques based on event features to improve kernel-event-based performance diagnosis tools. Our prototype, AppDiff, is based on two kinds of analysis features: system resource features convert kernel events to resource usage metrics, thereby enabling the detection of various performance anomalies in a unified way; program behavior features infer the application logic behind the low-level events. By combining these features with conditional probability, AppDiff can detect outliers and improve the diagnosis of application performance.
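
As a rough illustration of the idea only, and not AppDiff's actual implementation, the sketch below pairs a resource feature with a behavior feature and scores each trace by a conditional probability. Everything in it is an assumption made for the example: the `(trace_id, syscall, duration_us)` event tuple, the use of the set of system calls as the behavior feature, the 1 ms latency buckets, and the names `summarize`, `bucket`, and `find_outliers`.

```python
from collections import Counter, defaultdict


def bucket(duration_us, width_us=1000):
    """Resource feature (assumed): total latency quantized into fixed-width buckets."""
    return duration_us // width_us


def summarize(events):
    """Turn (trace_id, syscall, duration_us) events into per-trace features.

    Returns {trace_id: (behavior_feature, resource_feature)}, where the
    behavior feature is the set of syscalls seen in the trace (an assumed
    stand-in for inferred application logic).
    """
    total_us = defaultdict(int)
    syscalls = defaultdict(set)
    for trace_id, syscall, duration_us in events:
        total_us[trace_id] += duration_us
        syscalls[trace_id].add(syscall)
    return {t: (frozenset(syscalls[t]), bucket(total_us[t])) for t in total_us}


def find_outliers(features, threshold=0.2):
    """Flag traces whose estimated P(resource | behavior) is below threshold."""
    joint = Counter(features.values())
    behavior_counts = Counter(b for b, _ in features.values())
    outliers = []
    for trace_id, (behavior, resource) in features.items():
        # Empirical conditional probability from co-occurrence counts.
        p = joint[(behavior, resource)] / behavior_counts[behavior]
        if p < threshold:
            outliers.append(trace_id)
    return sorted(outliers)
```

Conditioning on the behavior feature means a latency that is unusual for one kind of application logic is not flagged when it is typical for another, which is the motivation the abstract gives for adding program behavior features.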

Publication
IEEE/IFIP Network Operations and Management Symposium, Krakow, Poland