Next: Data Consistency
Up: Implementation Overview
Previous: Metadata Management
The metadata server holds the most critical information about
striping and authorization, so its failure brings down the whole
storage system. The metadata server therefore needs to be backed up
to improve reliability. However, the original PVFS cannot support a
backup metadata server because of the way it names striped files. In
PVFS, the striped data on a data server is sieved together and stored
as a single file, and that file is named after the inode number of
the corresponding metadata file to guarantee that the name is unique
on the data servers. This approach has two main disadvantages.
First, the total number of inodes on the metadata server is limited,
so a large number of files may exhaust the inode numbers. Second, and
more significantly, PVFS cannot back up the metadata server even in
principle: when the primary metadata server is down and the backup
metadata server assigns a new file an inode number that the primary
has already used, the data of the new file is wrongly written into
an existing file.
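
The collision described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not PVFS code: the `MetadataServer` class, the dictionary standing in for a data server, and the file names are all hypothetical, and the backup is assumed to allocate inode numbers from its own independent counter.

```python
class MetadataServer:
    """Toy metadata server that hands out inode numbers from a local counter."""

    def __init__(self):
        self._next_inode = 1
        self.inode_of = {}  # user-visible file name -> inode number

    def create(self, file_name):
        inode = self._next_inode
        self._next_inode += 1
        self.inode_of[file_name] = inode
        return inode


def data_file_name(inode):
    # Inode-based naming as described in the text: the data file on a
    # data server is named after the inode number of the metadata file.
    return str(inode)


primary = MetadataServer()
backup = MetadataServer()  # assumption: separate machine, separate inode space

data_server = {}  # data file name -> bytes held by one data server

# Normal operation: the primary creates a file and its stripe is stored.
old_name = data_file_name(primary.create("results.dat"))
data_server[old_name] = b"old data"

# The primary goes down; the backup serves the next create request and
# happens to hand out an inode number the primary has already used.
new_name = data_file_name(backup.create("log.txt"))
collision = new_name in data_server
data_server[new_name] = b"new data"  # lands in the existing data file
```

In this model both files map to the same data file name, so the second write replaces the stripe that belonged to the first file.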
In the design of CEFT-PVFS, we have changed the naming mechanism
and instead use the MD5 sum [16] of the requested file name as the
data file name. Since the data file name no longer depends on an
inode number assigned by a particular metadata server, the metadata
can be directly duplicated to any backup storage device to provide
redundancy.
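
A minimal sketch of name-based naming follows, using the MD5 implementation in Python's standard `hashlib` module. The function name `data_file_name`, the UTF-8 encoding of the file name, and the hexadecimal form of the digest are assumptions made for illustration; the text does not specify how CEFT-PVFS encodes the digest.

```python
import hashlib


def data_file_name(requested_name: str) -> str:
    """Derive the data file name from the requested file name alone."""
    # Assumption: the name is hashed as UTF-8 bytes and written in hex.
    return hashlib.md5(requested_name.encode("utf-8")).hexdigest()


# Any server that sees the same requested name derives the same data
# file name, with no inode counter involved.
name_on_primary = data_file_name("/pvfs/results.dat")
name_on_backup = data_file_name("/pvfs/results.dat")
other_name = data_file_name("/pvfs/log.txt")
```

Because the result depends only on the requested file name, a primary and a backup metadata server agree on the data file name without sharing any allocation state.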
The calculation of MD5 does not introduce significant overhead in
CEFT-PVFS. First, we only need to calculate the MD5 of file names,
which are typically 5-20 bytes long. We measured that the MD5
program runs at about 200 MB/sec on a single node, so hashing a
file name usually takes only 25-100 ns. Second, the MD5
calculation is not a bottleneck because it is distributed across
the client nodes. Each client node calculates the MD5 of its
destination file name and sends the result along with its I/O
request to the metadata server, which can extract it directly from
the request.
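
The cost estimate and the division of work between client and metadata server can be sketched as follows. This is an illustration only: the request layout (a dictionary with `op`, `file_name` and `md5` keys) and the function names are hypothetical, and 1 MB is assumed to mean 10^6 bytes. At 200 MB/sec, names of 5 to 20 bytes correspond to roughly 25 to 100 ns, ignoring any fixed per-call overhead.

```python
import hashlib

# Throughput quoted in the text, assuming 1 MB = 10**6 bytes.
MD5_BYTES_PER_SEC = 200 * 10**6


def estimated_hash_time_ns(name_length_bytes):
    """Time to hash a name at the quoted throughput, in nanoseconds."""
    return name_length_bytes * 10**9 / MD5_BYTES_PER_SEC


def build_request(file_name, operation):
    """Client side: hash the destination file name and attach the digest."""
    digest = hashlib.md5(file_name.encode("utf-8")).hexdigest()
    # Hypothetical request layout, not the CEFT-PVFS wire format.
    return {"op": operation, "file_name": file_name, "md5": digest}


def handle_request(request):
    """Metadata server side: read the digest instead of recomputing it."""
    return request["md5"]


shortest_ns = estimated_hash_time_ns(5)
longest_ns = estimated_hash_time_ns(20)
data_name = handle_request(build_request("results.dat", "write"))
```

Since every client hashes its own file names, the metadata server performs no MD5 work per request in this sketch; it only reads a field that is already present.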
Yifeng Zhu
2003-10-16