Problem: There is a growing set of Internet-based services that are too big, or too important, to run at a single site. Examples include Web services for e-mail, video and image hosting, and social networking. This is the same reasoning behind Facebook’s need to scale out (see below).
Solution: WheelFS is a wide-area distributed storage system intended to help multi-site applications share data and gain fault tolerance. Different distributed applications might need different properties in a storage system: they might need to see the latest copy of some data, and be willing to pay a price in high delay, or they may want data to be stored durably, or have specific preferences for which site stores a document. Thus, in order to be a usable component in many different systems, a distributed storage system needs to expose a level of control to the surrounding application.
WheelFS allows applications to adjust the default semantics and policies with semantic cues, while maintaining the standard POSIX interface. WheelFS offers a small set of cues in four categories: placement, durability, consistency, and large reads.
Tradeoffs: Every file or directory object has a single “primary” WheelFS storage server that is responsible for maintaining the latest contents of that object. WheelFS maintains the POSIX interface even in the background and accessing a single file could result in communication with several servers, since each subdirectory in the path could be served by a different primary.
In addition, a disadvantage of cues is that they may break software that parses pathnames and assumes that a cue is a directory. Another is that links to pathnames that contain cues may trigger unintuitive behavior. However, they have not encountered examples of these problems.
Future Influence: The main insight from WheelFS is that different distributed applications need different properties in a storage system. In order to use have a distributed storage system as a usable component upon which many different applications are built, the storage system must allow the application to control various policies.
It is unclear whether a truly wide-area distributed storage system is the best approach. A middle ground may be found with distributed storage within each datacenter with updates pushed out to the other datacenters. This will create greater isolation between datacenters to prevent cascading failures.