Life in Systems: P2P Backup Systems

The System. Peer-to-peer backup systems offer a cheaper alternative to Internet backup sites while potentially providing the added security of spreading backed-up data across multiple, geographically distant hosts. Reliability in such systems depends on ensuring that resources are available to perform the backup, store the data on a remote host, and retrieve the data when needed. This requires that a large portion of the hosts meet the needs of other peers to store and supply data when needed, and that the system ensure no single host is unexpectedly or greedily consuming resources (bandwidth or storage) to the point that it prevents other hosts from properly using the system.

Commercialization of these systems then requires each host to make and conform to a service level agreement, which simultaneously predicts the host's backup behavior and allows the system to ensure resources are available to service this typical behavior.

Service Level Agreements. A typical service level agreement (SLA) would be similar to a user stating, "I have this directory that needs to be backed up every t [minutes, hours, days, weeks, years], and, on average, X bytes of data change in that time frame." Two criteria come from this: the initial size of the directory, which is the size of the storage that will be needed across the system; and the bandwidth needed to meet the update demands, which requires some analysis.
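As a rough illustration, the user's statement could be captured as a small record. This is only a sketch: the class name `BackupSLA`, its field names, and the unit table are hypothetical, and a year is assumed to be 365 days.

```python
from dataclasses import dataclass

# Assumed conversion table; "years" is taken as 365 days for simplicity.
SECONDS_PER_UNIT = {
    "minutes": 60,
    "hours": 3600,
    "days": 86400,
    "weeks": 604800,
    "years": 31536000,
}


@dataclass(frozen=True)
class BackupSLA:
    """Hypothetical record of what a user promises in the agreement."""
    directory_bytes: int    # initial size of the directory (storage needed)
    interval: float         # the "t" in "backed up every t ..."
    unit: str               # one of the keys of SECONDS_PER_UNIT
    avg_changed_bytes: int  # the "X" bytes that change per interval on average

    @property
    def interval_seconds(self) -> float:
        # Normalize t to seconds so later bandwidth math uses one unit.
        return self.interval * SECONDS_PER_UNIT[self.unit]


sla = BackupSLA(directory_bytes=10_000_000_000, interval=1,
                unit="days", avg_changed_bytes=50_000_000)
```

Storing the interval in the user's own unit and converting on demand keeps the agreement readable while giving the system a single unit to reason in.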

Bandwidth. If exactly X bytes of data needed to be backed up in every interval, the bare minimum bandwidth needed for the system to back up the data would be X/t. While this may provide a good estimate, it would be wise to assume that the user of the node being backed up will have spikes and lulls in the amount of data that needs to be backed up from interval to interval. Thus, a measure of the standard deviation of the amount of data changed per interval would allow an upper bound to be determined on the bandwidth needed to service the majority of the user's needs. Furthermore, it would allow the system to reject, or charge extra for, needs that exceed the agreed-upon SLA.
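The estimate can be sketched in a few lines. The function name `bandwidth_bounds` and the multiplier `k` are assumptions made for illustration; the post does not say how many standard deviations the upper bound should cover, so `k = 2` is just a placeholder.

```python
from statistics import mean, pstdev


def bandwidth_bounds(changed_bytes_per_interval, interval_seconds, k=2.0):
    """Return (minimum, provisioned) bandwidth in bytes per second.

    changed_bytes_per_interval: observed bytes changed in past intervals.
    k: assumed number of standard deviations of headroom (not from the post).
    """
    avg = mean(changed_bytes_per_interval)   # the X in X/t
    sd = pstdev(changed_bytes_per_interval)  # spread around X
    minimum = avg / interval_seconds         # bare minimum: X/t
    provisioned = (avg + k * sd) / interval_seconds
    return minimum, provisioned


def exceeds_sla(bytes_this_interval, changed_bytes_per_interval, k=2.0):
    """True if one interval's demand is above the agreed upper bound."""
    avg = mean(changed_bytes_per_interval)
    sd = pstdev(changed_bytes_per_interval)
    return bytes_this_interval > avg + k * sd
```

A node whose history is perfectly steady gets a provisioned rate equal to X/t; the more its history varies, the more headroom it is given, and `exceeds_sla` marks the intervals the system could reject or bill separately.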

A corollary to this argument is that, if bandwidth is available, a node should consume all unused bandwidth to reach a consistent, backed-up state as soon as possible. Under these conditions, each node must still be guaranteed the minimum bandwidth it needs, which requires the system to be able to back off a node that is consuming more bandwidth than it needs in order to ensure that every other node receives its share.
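One way to picture this policy is a two-step allocation: first honor every guarantee, then split whatever is left among the nodes that still want more. The sketch below assumes a single shared capacity figure and an equal split of the spare bandwidth; both are simplifications chosen for illustration, and the function name `allocate` is hypothetical.

```python
def allocate(capacity, nodes):
    """Split capacity among nodes.

    nodes maps a node name to (guaranteed, requested) in bytes per second.
    Returns a dict of node name -> allocated bytes per second.
    """
    if sum(g for g, _ in nodes.values()) > capacity:
        raise ValueError("guarantees exceed capacity")

    # Step 1: each node gets its guarantee, or less if it asks for less.
    alloc = {name: min(g, r) for name, (g, r) in nodes.items()}
    spare = capacity - sum(alloc.values())

    # Step 2: share the spare equally among nodes that still want more,
    # repeating until the spare is used up or every request is met.
    unmet = {name for name, (_, r) in nodes.items() if r > alloc[name]}
    while spare > 1e-9 and unmet:
        share = spare / len(unmet)
        for name in list(unmet):
            extra = min(share, nodes[name][1] - alloc[name])
            alloc[name] += extra
            spare -= extra
            if nodes[name][1] - alloc[name] <= 1e-9:
                unmet.discard(name)
    return alloc
```

Rerunning `allocate` whenever demand changes is what backs a greedy node off: a node that was using all the spare bandwidth drops back toward its guarantee as soon as another node starts requesting its own share.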

Monday, February 20, 2006
