Hierarchical Bloom Filter Arrays (HBA): A Novel, Scalable Metadata Management System for Large Cluster-based Storage. Yifeng Zhu, Hong Jiang, Jun Wang. In Proceedings of the 5th IEEE International Conference on Cluster Computing (Cluster 2004), San Diego, California, Sept. 2004.

Abstract

An efficient, distributed file mapping (or file lookup) scheme is critical to decentralizing metadata management within a group of metadata servers. This paper presents a novel technique called HBA (Hierarchical Bloom Filter Arrays) that maps file names to the servers holding their metadata. Each metadata server keeps two levels of probabilistic arrays, i.e., Bloom Filter Arrays, with different accuracies. One array, with lower accuracy, represents the distribution of the entire metadata and trades accuracy for significantly reduced memory overhead. The other array, with higher accuracy, caches partial distribution information and exploits the temporal locality of file access patterns. Extensive trace-driven simulations have shown our HBA design to be highly effective and efficient in improving the performance and scalability of file systems in clusters with 1,000 to 10,000 nodes (or super-clusters).
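
The two-level lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class names (BloomFilter, TwoLevelLookup), the SHA-256 double-hashing scheme, the filter sizes, and the choice to return None when no single server matches are all assumptions made for illustration. The sketch also omits how entries in the higher-accuracy level would be replaced over time.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter; the hashing scheme is an illustrative assumption."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive k bit positions from one SHA-256 digest via double hashing.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


class TwoLevelLookup:
    """Hypothetical per-server state: one Bloom filter per metadata server at each level."""

    def __init__(self, num_servers, global_bits, cache_bits, num_hashes=4):
        # Lower-accuracy level: small filters covering all files on each server.
        self.global_array = [BloomFilter(global_bits, num_hashes) for _ in range(num_servers)]
        # Higher-accuracy level: larger filters covering only recently used files.
        self.cache_array = [BloomFilter(cache_bits, num_hashes) for _ in range(num_servers)]

    def record_file(self, name, server):
        self.global_array[server].add(name)

    def record_recent(self, name, server):
        self.cache_array[server].add(name)

    @staticmethod
    def _unique_hit(array, name):
        hits = [i for i, bf in enumerate(array) if name in bf]
        return hits[0] if len(hits) == 1 else None

    def lookup(self, name):
        # Try the higher-accuracy cached level first, then the global level.
        server = self._unique_hit(self.cache_array, name)
        if server is not None:
            return server
        # None means zero or several filters matched; the caller must fall
        # back to some other mechanism (an assumption of this sketch).
        return self._unique_hit(self.global_array, name)
```

In this sketch, giving the global level fewer bits per file than the cached level lowers its memory cost but raises its false-positive rate, which makes ambiguous (multi-server) matches at that level more likely.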


BibTeX Entry
@inproceedings{yzhu:cluster04,
  author = {Yifeng Zhu and Hong Jiang and Jun Wang},
  title = {{Hierarchical Bloom Filter Arrays (HBA)}: A Novel, Scalable Metadata Management System for Large Cluster-based Storage},
  booktitle = {Proceedings of {IEEE} International Conference on Cluster Computing},
  month = sep,
  year = {2004},
  address = {San Diego, California}
}

Full Paper
 
Last modified on October 16, 2003