

Related Work

The proposed system has roots in a number of distributed and parallel file systems. The following gives a brief overview of this related work.

Swift [3] and Zebra [4] employ RAID-4/5 parity to provide redundancy. Swift stripes files across servers so that large files benefit from parallel access. Zebra first aggregates each client's writes into a log, in the manner of log-structured file systems, and then stripes the log, which improves small-write performance. In both designs, the parity is calculated by the client nodes. In I/O-intensive applications, this parity computation can consume significant computational resources on the clients, which in a cluster environment also serve as compute nodes. In addition, both systems can tolerate the failure of only a single node; the failure of a second node causes them to cease functioning.
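To make the client-side parity cost concrete, the following is a minimal C sketch, not code from Swift or Zebra, of the XOR parity computation a client performs before writing a stripe; the function and parameter names are illustrative.

    #include <stddef.h>
    #include <string.h>

    /* Compute the RAID-4/5 parity unit of one stripe by XOR-ing its
     * data units together; this is the work that Swift and Zebra push
     * onto client nodes.  Illustrative sketch, not either system's code. */
    static void compute_parity(const unsigned char *const data_units[],
                               size_t num_units, size_t unit_size,
                               unsigned char *parity)
    {
        memset(parity, 0, unit_size);
        for (size_t i = 0; i < num_units; i++)
            for (size_t j = 0; j < unit_size; j++)
                parity[j] ^= data_units[i][j];
    }

For a stripe of n units of s bytes each, the client performs on the order of n*s XOR operations per stripe written, cycles that compete directly with the application on nodes that double as compute nodes.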

PIOUS [5] employs data declustering to exploit the combined file I/O and buffer cache capacities of networked computing resources. It provides limited fault tolerance through a transaction-based approach, which guarantees that writes either succeed completely or fail completely.
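PIOUS's exact protocol is not reproduced here; as a generic illustration of such all-or-nothing write semantics, the following two-phase-commit-style sketch makes a write visible only if every data server votes to accept it. All names and the stub bodies are hypothetical placeholders, not the PIOUS interface.

    #include <stdbool.h>

    #define NUM_SERVERS 4  /* illustrative */

    /* Hypothetical per-server hooks; PIOUS's real interface differs. */
    static bool prepare_write(int s) { (void)s; return true; }  /* stage data durably, vote */
    static void commit_write(int s)  { (void)s; }               /* make staged data visible */
    static void abort_write(int s)   { (void)s; }               /* discard staged data      */

    /* All-or-nothing write: commit on every server only if every
     * server's prepare phase succeeded, otherwise abort everywhere. */
    static bool transactional_write(void)
    {
        bool all_prepared = true;
        for (int s = 0; s < NUM_SERVERS && all_prepared; s++)
            all_prepared = prepare_write(s);

        for (int s = 0; s < NUM_SERVERS; s++) {
            if (all_prepared)
                commit_write(s);
            else
                abort_write(s);
        }
        return all_prepared;
    }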

Petal [6], a block-level distributed storage system, provides fault tolerance by using chained declustering [7]. Chained declustering trades away some of the reliability of RAID-1 in exchange for better load balance across the surviving nodes after one storage node fails [8]. In Petal, the failure of either neighbor of an already-failed node results in data loss, whereas in RAID-1 only the failure of a node's mirror makes data unavailable. In addition, Petal does not provide a file-level interface, and its maximum reported bandwidth, 43.1 MB/s with 4 servers and 42 SCSI disks, falls well short of the aggregate disk bandwidth.
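As a rough illustration of why a failed node's neighbors become critical under chained declustering (this sketch is not Petal's code), block b's primary copy is placed on node b mod N and its secondary on the next node in the chain.

    #include <stdio.h>

    #define NUM_NODES 4  /* illustrative cluster size */

    /* Chained declustering placement: block b's primary copy lives on
     * node b mod N, its secondary on the next node in the chain.  If
     * node i fails, its primary blocks survive only on node (i+1) mod N,
     * and the secondaries it held belong to primaries on node
     * (i-1+N) mod N, so losing either neighbor of a failed node
     * loses data. */
    static int primary_node(int block)   { return block % NUM_NODES; }
    static int secondary_node(int block) { return (block + 1) % NUM_NODES; }

    int main(void)
    {
        for (int b = 0; b < 8; b++)
            printf("block %d: primary on node %d, secondary on node %d\n",
                   b, primary_node(b), secondary_node(b));
        return 0;
    }

The load-balancing benefit is that a failed node's read traffic can be absorbed by its neighbors rather than by a single mirror; the price, as noted above, is that a second failure on either neighbor loses data.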

GPFS [9] is IBM's parallel shared-disk file system for clusters. It achieves high I/O performance by striping data across many disks that are connected to the cluster nodes over a switching fabric, a dedicated storage network. It uses dual-attached RAID controllers and file-level duplication to tolerate disk failures. While CEFT-PVFS requires no additional hardware in a cluster, GPFS typically needs fabric interconnections and RAID controllers.

PVFS [2][10] is an open-source RAID-0-style parallel file system for clusters. It partitions a file into stripe units and distributes these units to disks in a round-robin fashion. PVFS consists of one metadata server and several data servers. All file-content traffic flows between clients and data server nodes in parallel, without going through the metadata server. The fatal disadvantage of PVFS is that, in its current form, it provides no fault tolerance: the failure of any single server node renders the whole file system dysfunctional.
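A minimal sketch of the round-robin mapping that such RAID-0-style striping implies, using illustrative constants rather than PVFS's actual per-file configuration parameters: a logical file offset is mapped to a data server and an offset within that server's local file.

    #include <stdint.h>
    #include <stdio.h>

    #define STRIPE_SIZE 65536  /* bytes per stripe unit (illustrative) */
    #define NUM_SERVERS 8      /* number of data servers (illustrative) */

    /* Map a logical file offset to (server, offset on that server)
     * under RAID-0-style round-robin striping. */
    static void map_offset(uint64_t file_offset,
                           int *server, uint64_t *local_offset)
    {
        uint64_t unit = file_offset / STRIPE_SIZE;   /* stripe unit index  */
        *server       = (int)(unit % NUM_SERVERS);   /* round-robin server */
        *local_offset = (unit / NUM_SERVERS) * STRIPE_SIZE
                      + file_offset % STRIPE_SIZE;   /* offset on server   */
    }

    int main(void)
    {
        int server;
        uint64_t local;
        map_offset(200000, &server, &local);  /* unit 3 -> server 3 */
        printf("offset 200000 -> server %d, local offset %llu\n",
               server, (unsigned long long)local);
        return 0;
    }

Because this mapping involves only the client and the data servers, file content never passes through the metadata server; it is also why the loss of any single data server leaves a hole in every sufficiently large file.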

