Sunday, February 26, 2006

Rough Software Spec

I'm hoping Landon Cox will send me the Samsara code soon, but, as was discussed, it's not wise to wait on him. Thus, this is a bare minimum description of what the software prototype needs to do in the event we don't get Samsara as a base from which to work.

File System.

The full scale software system would need some kind of configuration interface where the user could specify the set of directories or disk mounts that the software should back up. From this the software would essentially create a single huge block of addressable data, which would then be split into chunks for distribution. It would then need to be able to track changes made to the given set of directories and map those changes into the single block.

For this prototype, it would suffice to use a single huge file to model the addressable data space the full-scale system would create. Some means of modifying this file would then need to be devised. The prototype would then need to notice the change and accurately determine the section of the file that has been modified. We discussed the use of delta encoding for this purpose (we have to get more information about its implementation if Samsara never comes).

At a bare minimum, the prototype could simply maintain several small blocks, a subset of which are assumed to have changed with each backup "epoch."
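A quick sketch of how the bare-minimum change detection could work, assuming the data space is divided into fixed-size chunks whose hashes are compared between epochs (chunk size and function names are my own, not from Samsara):

```python
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size; the real system would tune this

def chunk_hashes(data: bytes) -> list:
    """Hash each fixed-size chunk of the flattened data block."""
    return [hashlib.sha1(data[i:i + CHUNK_SIZE]).digest()
            for i in range(0, len(data), CHUNK_SIZE)]

def changed_ranges(old_hashes, new_hashes):
    """Return (offset, length) ranges whose chunk hashes differ."""
    ranges = []
    for i, (old, new) in enumerate(zip(old_hashes, new_hashes)):
        if old != new:
            ranges.append((i * CHUNK_SIZE, CHUNK_SIZE))
    # any trailing chunks in a grown file also count as changed
    for i in range(len(old_hashes), len(new_hashes)):
        ranges.append((i * CHUNK_SIZE, CHUNK_SIZE))
    return ranges
```

This is cruder than real delta encoding (an insert shifts every later chunk), but it matches the "subset of small blocks assumed to have changed" minimum.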

Questions

Given a set of directories, the prototype flattens them and maps an address space over the result. What then happens as files are added to the directories? Would the flattened block then grow from the middle, shifting all subsequent data to different addresses?

Partnership management.

While Samsara/Pastiche have already tackled the problem of finding partners based on comparable software installations, our system would have to add a service level agreement advertisement and negotiation scheme to this for a full-blown implementation of the Hard Data Proposal. This would require the user to give a configurable estimate of how much data changes, how much that measure varies, how quickly it should be backed up, and how "important" it is to get good service. That importance setting would direct the system to give up proportionately more resources relative to its needs, making a more desirable offer and attracting better partners.

At a minimum, the prototype could assume a set of partnerships (as little as one), each with an assumed service level agreement that may or may not be an equal partnership.
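The assumed-SLA minimum could be captured in a small record like the following; the field names are illustrative guesses at what an SLA needs to carry, based on the parameters described above:

```python
from dataclasses import dataclass

@dataclass
class ServiceLevelAgreement:
    """Hypothetical SLA record; fields are illustrative, not from Samsara."""
    storage_bytes: int        # size of data the partner stores for us
    avg_update_bytes: int     # X: average data changed per interval
    update_deviation: float   # spread around that average
    interval_seconds: int     # t: how often updates must complete
    min_bandwidth_bps: float  # minimum bandwidth the partner guarantees

    def minimum_feasible_bandwidth(self) -> float:
        """Bare minimum rate needed to push an average update in one interval."""
        return self.avg_update_bytes / self.interval_seconds
```

A hard-coded list of one or two of these would stand in for real negotiation.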

Per Partnership needs

For each partnership, the prototype would need to store:

  • From The Partner
    • The service level agreement for the data being stored, which comprises the size of data to store and the minimum bandwidth needed
    • An identifiable tag for each block being stored for the partner (perhaps the starting address and size)
    • The location of all blocks for which the local node is responsible
  • To The Partner
    • The service level agreement for the data being held by the partner
    • An identifiable tag for each block being held by the partner
    • The segments of the address space that the partner is servicing, i.e. storing the equivalent blocks
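The per-partnership bookkeeping above might look like this in the prototype; the structure mirrors the two bullet lists, and all names are my own:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# A block tag, as suggested above: (starting address, size)
BlockTag = Tuple[int, int]

@dataclass
class Partnership:
    """Per-partner state; a sketch of the bookkeeping, not a wire format."""
    # --- From the partner: what we store on its behalf ---
    inbound_sla_bytes: int = 0          # size of data to store for it
    inbound_sla_min_bw: float = 0.0     # minimum bandwidth we owe it
    stored_blocks: Dict[BlockTag, str] = field(default_factory=dict)
    # tag -> local location of the block we hold for the partner

    # --- To the partner: what it stores for us ---
    outbound_sla_bytes: int = 0         # size of data it holds for us
    outbound_sla_min_bw: float = 0.0    # minimum bandwidth it owes us
    serviced_segments: List[BlockTag] = field(default_factory=list)
    # (start, size) segments of our address space the partner services
```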

On an Update or Restore

When a connection is created, the connecting host either wants to update the data stored or retrieve it. The initial exchange on the connection should communicate this, along with a requested amount of bandwidth for the update or restore and an identification of the blocks to be accessed. The serving host can then grant the connecting host a bandwidth allotment between the minimum agreed-upon bandwidth and the requested amount, which could indicate the host wants all available bandwidth.

Responsibilities

The serving host must be able to control this allotment as the update or restore progresses while connections from other partners are created and severed. It is the serving host's responsibility to monitor and control this bandwidth because a connecting host greedily wants as much as possible, while the serving host wants to give up only the smallest amount of resources that ensures the partnership continues. It is then the connecting host's responsibility to ensure its own service level agreement is not being starved.
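The grant decision itself reduces to a small calculation; this is a sketch under the assumptions above (a request of infinity meaning "all available" is my own convention):

```python
def grant_bandwidth(requested: float, sla_minimum: float,
                    capacity_free: float) -> float:
    """
    Serving host's grant: between the SLA minimum and the requested rate,
    capped by what the host currently has free. The SLA minimum is always
    honored, even under contention. float('inf') means "all available".
    """
    return max(sla_minimum, min(requested, capacity_free))
```

In the full system this would be re-evaluated as other connections come and go, not computed once.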

Questions

The question that arises from this is the serving host's ability to manage its network connection. Should it only make agreements such that the sum of the minimum bandwidth requirements is no more than its network capacity? Or can it overshoot its capacity, relying on the fact that not all partners will be using the connection at once? If so, can it then deny or delay a connection based on preoccupied resources? Is this then grounds for termination of an SLA? If not, how frequently can it occur before an SLA should be terminated? (While this would be interesting to research, it would require a large number of hosts and a significantly long experiment. So what assumptions can we make?)

Update Frequency

Based on the user settings, the prototype would have to ensure the given data, either the single file or set of directories, is up to date within a given timeframe. The simplest way to do this would be to have the prototype wake up at regular intervals in step with the user configuration. Upon such an event, the prototype would need to examine the entire set of data in some manner and compare this examination with the one from the last wakeup event. This comparison should yield a set of address ranges that have changed since the last update, which are then compared with each partnership to obtain the set of partners that are storing the corresponding old data. The prototype then connects to each and uploads the newer data while monitoring the upload bandwidth of each connection to ensure it meets its SLA. Once complete, it retires until the next update.
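The wakeup cycle described above could be sketched as follows, assuming a naive block-compare diff and a callback for the actual upload (all names and the 4 KB block size are mine):

```python
def diff_ranges(old: bytes, new: bytes, block: int = 4096):
    """Naive block-compare diff: (start, length) ranges that differ."""
    return [(i, block) for i in range(0, len(new), block)
            if old[i:i + block] != new[i:i + block]]

def update_cycle(snapshot_old: bytes, snapshot_new: bytes,
                 partners: dict, upload) -> None:
    """
    One wakeup event (sketch): diff the data, find which partner services
    each changed range, and push the new bytes. `partners` maps a partner
    id to its list of (start, length) serviced segments; `upload(pid,
    start, data)` is a callback standing in for the real connection.
    """
    for start, length in diff_ranges(snapshot_old, snapshot_new):
        for pid, segments in partners.items():
            for seg_start, seg_len in segments:
                if seg_start <= start < seg_start + seg_len:
                    upload(pid, start, snapshot_new[start:start + length])
```

Bandwidth monitoring per connection is omitted here; it belongs inside the `upload` callback.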

Considerations

To do this, the prototype would need to be able to map the address ranges back to the directory structure, which could potentially map over several files and directories.

The prototype supposedly would have determined whether the outgoing network capacity was sufficient to service the user's demand at the time the user configured the software. But this relied on an average update size and a measure of deviation from it. Thus, the prototype could potentially have more data to update than expected, or a set of circumstances could arise that otherwise impedes the update capability. The prototype must therefore also manage the outgoing bandwidth such that the update occurs as quickly as possible but the local network capacity does not become the bottleneck, restricting the bandwidth of a connection in a way the prototype might incorrectly interpret as the serving host violating the SLA. One way of overcoming this obstacle is to make a single connection at a time, dedicating the entirety of the upload pipe to it, and then allow the serving host to throttle the bandwidth down. Subsequent connections can then be made, one after another, with a similar protocol until there is not enough bandwidth left to properly service another connection.
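The one-at-a-time admission strategy could be sketched like this, assuming each partner's grant and SLA minimum are already known (the tuple layout is my own):

```python
def schedule_uploads(total_capacity: float, grants: list):
    """
    Sequentially admit connections: each is offered all remaining upload
    bandwidth and then throttled down to its grant by the serving host;
    stop when too little is left to meet the next partner's SLA minimum.
    `grants` is a list of (partner_id, granted_rate, sla_minimum).
    """
    remaining = total_capacity
    admitted = []
    for pid, granted, minimum in grants:
        if remaining < minimum:
            break  # not enough left to properly service another connection
        rate = min(granted, remaining)
        admitted.append((pid, rate))
        remaining -= rate
    return admitted
```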

Lastly, it's reasonable to assume that updates to the data do not occur all at once, but rather are spread over the update interval. Therefore, the size of the needed update would increase with the length of the interval. Network utilization would differ, though: larger bursts of transfers would be required at longer intervals, while a more sustained average usage would result from shorter, almost continuous, intervals. This, however, assumes that the set of data to be backed up, if it were a set of directories, does not include directories where running processes would be creating temporary files. (We could potentially examine this trade-off.)

Issues/Questions

This raises the issue of what happens if more data needs to be backed up than the software has time to transfer. Would the software potentially miss an update interval? What then happens if an update occurs while it's trying to transfer the previous snapshot? Can we include the update? Would it simply be easier to include it in the next snapshot? Would that violate what the user needs?

Thursday, February 23, 2006

The Plan

Steps to completing this project:
  1. Practice using Netnice
  2. Hope I get Samsara soon (I have heard from Landon Cox at Duke)
  3. Examine how Samsara/Pastiche find and create storage partnerships
  4. Implement bandwidth reservations/restrictions for all partnerships
  5. Implement SLA negotiation between two peers
  6. Ensure the SLA cannot be violated
  7. Optionally augment the Pastiche routing protocol to include finding peers with desired SLA characteristics
  8. Perform some kind of experimentation on the system to determine.... something
  9. Write a paper describing the work and the experiment

Wednesday, February 22, 2006

Netnice

After discussing the basics of Netnice with someone more knowledgeable about it than I am, I've found a potential snag in its usefulness. Apparently, Netnice is capable of limiting bandwidth on a process-by-process basis, whereas the demands of this application seem to require control on a connection-by-connection basis, all of which may be created within a single process.

However, this may not be an issue. If the Samsara code I'm expected to get soon forks a separate process for each connection, Netnice should work perfectly. Or, if it's multithreaded in a similar way to the fork() call, which I've seen Linux do, Netnice will probably be able to control its bandwidth with the granularity needed.

I'm expecting the daemon to be a self-contained, single process, though, in which case there are two possible solutions I imagine at the moment. At any given time, only a certain number of connections will be established. Each will have a set bandwidth allocation, and the aggregate total of these could be the value Netnice uses to shape the process' bandwidth. The alternative would be to implement a custom shaping scheme involving the packet size and a timer.
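The custom per-connection shaping scheme involving packet size and a timer would amount to a token bucket; a minimal sketch (rates and the class itself are my own, nothing to do with Netnice's actual mechanism):

```python
import time

class TokenBucket:
    """
    Per-connection shaper sketch: tokens accrue at `rate` bytes/second up
    to `burst` bytes; sending `size` bytes waits until tokens suffice.
    """
    def __init__(self, rate: float, burst: float):
        self.rate = rate
        self.burst = burst
        self.tokens = burst
        self.last = time.monotonic()

    def consume(self, size: float) -> None:
        """Block until `size` bytes may be sent at the shaped rate."""
        while True:
            now = time.monotonic()
            self.tokens = min(self.burst,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= size:
                self.tokens -= size
                return
            # sleep roughly until enough tokens have accrued
            time.sleep((size - self.tokens) / self.rate)
```

Each connection would get its own bucket, with the aggregate of all bucket rates being what Netnice enforces on the whole process.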

Once I get the code, I'll reexamine my options...

Administration

A portion of the bandwidth used by any particular node in a P2P backup system has to be dedicated to administrative purposes. After a service level agreement is negotiated, a node must constantly ensure that its partner is available and that its data is available on that partner's shared storage. While the former is relatively negligible, ensuring the data is available could be rather expensive. The Samsara and/or Pastiche papers suggested using a computed hash of the data to ensure the data is stored on the host, but Lillibridge points out that this will not ensure the partnering node will provide the data when it's needed. The benefit gained from not supplying the data (the unused bandwidth) seems rather small against the steep cost of having to hold the data anyway, but it still creates a potential impact on the reliability of the system.

Thus, the added administrative cost is the size of the data block that needs to be transferred, the bandwidth (time) used in performing the transfer, and the frequency with which this needs to occur for all partners of a particular node.

Presumably, this cost is fixed, and hence assumed, for any service level agreement between two partners. It should, however, be chosen so as to minimize the cost of administering the partnership.
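The computed-hash check mentioned above could be sketched as a nonce-based challenge, so the partner can't cache a single precomputed answer; this is my own sketch, not the Samsara/Pastiche mechanism, and it still has Lillibridge's weakness that proving storage doesn't guarantee the partner will supply the data later:

```python
import hashlib
import os

def make_challenge() -> bytes:
    """Verifier picks a fresh random nonce per check."""
    return os.urandom(16)

def respond(nonce: bytes, stored_data: bytes) -> bytes:
    """Partner proves possession by hashing the nonce with the data."""
    return hashlib.sha256(nonce + stored_data).digest()

def verify(nonce: bytes, local_copy: bytes, response: bytes) -> bool:
    """Verifier recomputes over its own copy and compares."""
    return respond(nonce, local_copy) == response
```

Note the verifier needs its own copy of the block (or a stock of precomputed answers), which is exactly the administrative cost discussed above.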

Tuesday, February 21, 2006

Negotiation

Each node in the P2P backup system must be able to negotiate its own service level agreements based on the needs specified by the user. Assuming that a node engages in reciprocal storage relationships with N other nodes in the system, it can agree upon an individual service level agreement with each of the N nodes based upon the portion of the data it wishes to store on any given one of them. The bandwidth needed to perform an update on the data can be divided among the N nodes in proportion to the amount of data that will be stored on each node, but it may not necessarily equal the bandwidth the node agrees to supply to its partners.
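The proportional division is simple arithmetic; a sketch, with the dictionary layout being my own choice:

```python
def divide_update_bandwidth(total_update_bw: float,
                            stored_bytes_per_partner: dict) -> dict:
    """
    Split the node's needed update bandwidth among its N partners in
    proportion to how much of the data each one stores.
    """
    total_stored = sum(stored_bytes_per_partner.values())
    return {pid: total_update_bw * size / total_stored
            for pid, size in stored_bytes_per_partner.items()}
```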

The duty of enforcing a service level agreement falls upon the supplier. It is in the node's best interest to ensure that none of its partners are consuming more bandwidth or storage than was agreed upon. While enforcing storage space limits has been dealt with in Samsara, a node must be able to restrict a connection from a partner to the agreed-upon bandwidth limit if it needs or chooses to allocate the bandwidth elsewhere. Netnice should work well for this provided it allows the bandwidth allotted to a given connection to be arbitrarily changed.

A malicious node may throttle a connection below the agreed-upon bandwidth, but a partner in such a situation would be able to terminate the agreement and find another, more benevolent partner. This would then starve malicious nodes, forcing them at the very least to meet the minimum of any agreed-upon SLA.

Monday, February 20, 2006

P2P Backup Systems

The System. Peer-to-peer backup systems offer a relatively cheap alternative to Internet backup sites while potentially providing the added security of spreading backed-up data across multiple, geographically distant hosts. Reliability in such systems depends on ensuring resources are available to perform the backup, store the data on a remote host, and retrieve the data when needed. This requires that a large portion of the hosts meet the needs of other peers to store and supply data when needed, and that the system ensure no single host is unexpectedly or greedily consuming resources (bandwidth or storage) such that it deprives other hosts from properly using the system.

Commercialization of these systems then requires each host to make and conform to service level agreements, which simultaneously predict the host's backup behavior and allow the system to ensure resources are available to service this typical behavior.

Service Level Agreements. A typical service level agreement would be similar to a user stating, "I have this directory that needs to be backed up every t [minutes, hours, days, weeks, years], and, on average, X bytes of data change in that time frame." Two criteria come from this: the initial size of the directory, which is the size of the storage that will be needed across the system; and the bandwidth needed to meet the update demands, which requires some analysis.

Bandwidth. If at every interval of time X and only X bytes of data needed to be backed up, the bare minimum bandwidth needed for the system to back up the data is X/t. While this may provide a good estimate, it would be wise to assume the user of the backing-up node may have spikes and lulls in the amount of data that needs to be backed up from interval to interval. Thus, a measure of the standard deviation in this average amount of data to be backed up would allow an upper bound to be determined on the amount of bandwidth needed to service the majority of the user's needs. Furthermore, it would allow the system to reject, or charge extra for, needs that exceed the agreed-upon SLA.
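A minimal version of that upper bound, under the assumption (mine, not from the papers) that per-interval update sizes are roughly normally distributed so that k standard deviations above the mean covers most intervals:

```python
def sla_bandwidth_bound(avg_bytes: float, stddev_bytes: float,
                        interval_seconds: float, k: float = 2.0) -> float:
    """
    Upper-bound bandwidth for an SLA: enough to service updates up to
    k standard deviations above the average X within one interval t.
    With k = 2, roughly 95% of intervals are covered if update sizes
    are approximately normal (an assumption for this sketch).
    """
    return (avg_bytes + k * stddev_bytes) / interval_seconds
```

The bare-minimum rate X/t from the text is the k = 0 case.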

A corollary to this argument is that, if the bandwidth is available, the node should consume all unused bandwidth to achieve a consistent, backed-up state as soon as possible. Under these conditions, each node must still be guaranteed the minimum bandwidth it needs to meet its needs, which requires the system to be able to back off a node that is consuming more bandwidth than it needs in order to ensure that every other node receives its share of the bandwidth.