"A Novel Model for Synthesizing Parallel I/O Workloads in Scientific Applications", in Proceedings of IEEE 2008 Cluster, Tsukuba, Japan, Sept. 29 - Oct. 1, 2008 (acceptance rate: 28/92 = 30%)

One of the challenging issues in evaluating the performance of parallel storage systems through synthetic-trace-driven simulation is accurately characterizing the I/O demands of data-intensive scientific applications. This paper analyzes several I/O traces collected from different distributed systems and concludes that correlations in parallel I/O inter-arrival times are inconsistent: some workloads show little correlation, while others show evident and abundant correlations. Conventional Poisson or Markov arrival processes are therefore inappropriate for modeling I/O arrivals in some applications. Instead, a new and generic model based on the α-stable process is proposed and validated in this paper to accurately model parallel I/O burstiness in workloads with either little or strong correlation. This model can be used to generate reliable synthetic I/O sequences in simulation studies. Experimental results presented in this paper show that the model captures the complex I/O behaviors of real storage systems more accurately and faithfully than conventional models, particularly the burstiness characteristics of parallel I/O workloads.
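The two ingredients named in the abstract, correlation in inter-arrival times and α-stable randomness, can be illustrated with a minimal Python sketch. It is not the paper's model or fitting procedure, which the abstract does not spell out. The function names `lag_autocorrelation` and `sample_symmetric_alpha_stable` are hypothetical, the sampler assumes the standard symmetric case (zero skewness, unit scale) drawn with the Chambers-Mallows-Stuck method, and how such draws would be mapped onto an I/O arrival sequence is left open.

```python
import math
import random


def lag_autocorrelation(xs, lag):
    """Sample autocorrelation of the series xs at the given lag."""
    n = len(xs)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(xs)")
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return cov / var


def sample_symmetric_alpha_stable(alpha, rng):
    """One draw from a standard symmetric alpha-stable law.

    Assumption: zero skewness and unit scale, using the
    Chambers-Mallows-Stuck construction. alpha = 2 gives a normal
    law with variance 2, alpha = 1 gives the Cauchy law.
    """
    if not 0 < alpha <= 2:
        raise ValueError("alpha must be in (0, 2]")
    v = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    while w == 0.0:  # guard against a zero exponential draw
        w = rng.expovariate(1.0)
    if alpha == 1:
        return math.tan(v)
    return (math.sin(alpha * v) / math.cos(v) ** (1 / alpha)
            * (math.cos((1 - alpha) * v) / w) ** ((1 - alpha) / alpha))


if __name__ == "__main__":
    rng = random.Random(42)
    # Heavy-tailed draws (alpha < 2) produce occasional very large values,
    # which is the kind of burstiness a Poisson model cannot reproduce.
    draws = [sample_symmetric_alpha_stable(1.5, rng) for _ in range(1000)]
    print("largest magnitude:", max(abs(d) for d in draws))
    print("lag-1 autocorrelation:", lag_autocorrelation(draws, 1))
```

Independent draws like these should show lag-1 autocorrelation near zero. A trace with "evident and abundant correlations" would instead show values well away from zero over many lags, which is the distinction the abstract draws between the two kinds of workloads.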

BibTeX Entry
author = {Dan Feng and Qiang Zou and Hong Jiang and Yifeng Zhu},
title = {A Novel Model for Synthesizing Parallel I/O Workloads in Scientific Applications},
booktitle = {Proceedings of {IEEE} International Conference on Cluster Computing},
year = {2008},
pages = {xxx-xxx},
address = {Tsukuba, Japan},

Last modified on October 16, 2007