Toward a Workflow for Identifying Jobs with Similar I/O Behavior Utilizing Time Series Analysis

2021 | conference paper. A publication of Göttingen

Cite this publication

Kunkel J, Betke E. Toward a Workflow for Identifying Jobs with Similar I/O Behavior Utilizing Time Series Analysis. High Performance Computing: ISC High Performance 2021 International Workshops, Revised Selected Papers. ISC HPC; 2021-06-24 - 2021; Frankfurt. Springer; 2021. doi:10.1007/978-3-030-90539-2_10.

Documents & Media

License

Published Version

GRO License

Details

Authors
Kunkel, Julian ; Betke, Eugen
Abstract
One goal of support staff at a data center is to identify inefficient jobs and to improve their efficiency. To this end, a data center deploys monitoring systems that capture the behavior of the executed jobs. While it is easy to use statistics to rank jobs by their utilization of computing, storage, and network resources, it is difficult to find patterns in 100,000 jobs, e.g., to determine whether there is a class of jobs that is not performing well. Similarly, when support staff investigate a specific job in detail, e.g., because it is inefficient or highly efficient, it is relevant to identify jobs related to this blueprint. This allows staff to better understand the usage of the exhibited behavior and to assess the optimization potential. In this article, our goal is to identify jobs similar to an arbitrary reference job. In particular, we sketch a methodology that uses temporal I/O similarity to identify jobs related to the reference job. In practice, we apply several previously developed time series algorithms. We conduct a study that explores the effectiveness of the approach by investigating related jobs for a reference job. The data stem from DKRZ's supercomputer Mistral and include more than 500,000 jobs that were executed over more than 6 months of operation. Our analysis shows that the strategy and algorithms have the potential to identify similar jobs, but more testing is necessary.
Issue Date
2021
Publisher
Springer
Organization
Gesellschaft für wissenschaftliche Datenverarbeitung 
Conference
ISC HPC
ISBN
978-3-030-90538-5
eISBN
978-3-030-90539-2
Conference Place
Frankfurt
Event start
2021-06-24
Event end
2021
Language
English