Position Paper for the SIGOPS Workshop on Fault Tolerance Support in Distributed Systems, 1990

Andrew Birrell
Systems Research Center, Digital Equipment Corporation, 130 Lytton Avenue, Palo Alto, CA 94022, U.S.A.

At SRC we have been exploring the provision and use of fault tolerance in the basic facilities of a distributed system - the physical communications, the name service and the file service. We now have research prototypes of each of these, and we are starting to gain experience in how tolerant they really are.

Our LAN, called "Autonet", provides a mesh-connected network with link speeds of 100 Mbits/sec connecting into full cross-bar switches. Each host is connected to two switches, with automatic fail-over. The network itself is self-configuring: the switches dynamically re-arrange their routing tables in response to failures of links or switches.

Our name service is fully and seamlessly integrated with the file service (which is called "Echo"). Both services provide the same semantics; they differ only in their response to failures. From the point of view of the normal user, the choice of placing a directory in the name service or in the file service is made by considering the desired fault tolerance characteristics. A global path name is rooted in the name service. Name resolution starts in the name service and proceeds until it encounters a "junction" describing a file service volume; the remainder of the path name is then presented to the file service (this hand-off is sketched below). The name service and file service are both accessible through a single interface, so, for example, the Unix "ls" and "find" commands work just as well within the name service as within the file service.

The name service is implemented with the familiar lazy update replication scheme: updates are committed at the initial replica and asynchronously propagated to the other replicas after "success" has been returned to the client. A name service volume is available to a client provided the client can contact at least one replica of the volume. This gives very high availability, but at the cost of weaker consistency guarantees.

The file service is implemented with a replication scheme providing tight consistency: an update does not return to the client until at least a majority of the replicas has committed it. A file service volume is available to a client provided the client can contact a majority of the replicas of that volume. This gives lower availability, but with the benefit of simpler and more powerful semantics - as seen by the client, all replicas of file volumes always contain the same data. (In addition, we maintain tight consistency for cached file system data by using a token-based cache consistency algorithm on client machines.)

In general, we can glue these two components together in arbitrary ways. We can have a file volume as a child of a name service directory, but equally we could have a name service volume as a child of a file service directory. The administrator can choose how to organize his name space, providing the appropriate trade-off between availability and consistency for each object. In practice, we anticipate that the system will be configured to use the name service for the higher level parts of the name space, and the file service for the lower level parts. This corresponds to the observation that the higher level parts change slowly - the weak update semantics are unlikely to disturb users - and so we can benefit from the very high availability of the name service.
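To make the junction mechanism concrete, here is a small Python sketch of the resolution walk described above. The NameService, FileService and Junction classes and the resolve function are invented for illustration - they are not the real Echo interfaces - and they omit replication, caching and error handling; the sketch shows only how a global path name starts at the name-service root and is handed off to a file volume when a junction is encountered.

    class Junction:
        """Entry in a name-service directory that names a file-service volume."""
        def __init__(self, file_service, volume):
            self.file_service = file_service
            self.volume = volume

    class NameService:
        """Toy stand-in: lazily replicated, highly available directories."""
        def __init__(self):
            self.dirs = {}           # directory path -> {entry: child path or Junction}

        def lookup(self, directory, entry):
            return self.dirs[directory][entry]

    class FileService:
        """Toy stand-in: majority-committed, tightly consistent volumes."""
        def __init__(self):
            self.volumes = {}        # volume name -> {path within volume: contents}

        def open(self, volume, remainder):
            return self.volumes[volume]["/".join(remainder)]

    def resolve(name_service, path):
        """Resolve a global path name: walk name-service directories from the
        root until a junction is met, then hand the rest to the file service."""
        components = [c for c in path.split("/") if c]
        current = "/"                # resolution is rooted in the name service
        for i, component in enumerate(components):
            entry = name_service.lookup(current, component)
            if isinstance(entry, Junction):
                # The remainder of the path name is presented to the file service.
                return entry.file_service.open(entry.volume, components[i + 1:])
            current = entry          # still within the name service
        return current               # the whole path named a name-service directory

    # Example: /com/dec lives in the name service; "src" is a junction to a file volume.
    fs = FileService()
    fs.volumes["src-vol"] = {"birrell/notes.txt": "position paper draft"}
    ns = NameService()
    ns.dirs["/"] = {"com": "/com"}
    ns.dirs["/com"] = {"dec": "/com/dec"}
    ns.dirs["/com/dec"] = {"src": Junction(fs, "src-vol")}
    print(resolve(ns, "/com/dec/src/birrell/notes.txt"))   # -> position paper draft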
The lower level parts, in contrast, change frequently, so lazy update propagation would be difficult to live with. Further, the higher level parts of the name space tend to be widely shared, whereas the lower level parts have high locality of access. This also corresponds well to the two replication schemes: the lazy update propagation scheme works well over poorly connected wide area networks, but all existing tight-consistency schemes face substantial performance penalties if the replicas are dispersed across such networks.

We believe that the combination of these facilities will provide us with the basis for a distributed system that is flexible, scalable, and capable of tolerating many failure modes while retaining high availability. Right now (April) the system is just about to enter service at SRC; by September we should have some real experience with its successes and failures.
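To make the contrast between the two update disciplines concrete, here is a small Python sketch. The Replica class and the lazy_update, propagate_lazily and majority_update functions are invented for illustration - they are not Echo or name-service code - and they ignore logging, recovery and real asynchrony. They show only the availability difference described above: a lazily replicated volume needs just one reachable replica, while a majority-committed volume needs a majority, which is also why dispersing its replicas over a wide area network is expensive.

    class Replica:
        """Toy replica holding a key/value store; 'reachable' models partitions."""
        def __init__(self, name, reachable=True):
            self.name = name
            self.reachable = reachable
            self.store = {}

        def commit(self, key, value):
            if not self.reachable:
                raise ConnectionError(self.name + " is unreachable")
            self.store[key] = value

    def lazy_update(replicas, key, value):
        """Name-service style: commit at one reachable replica and return
        "success" immediately; propagation to the others happens later."""
        for initial in replicas:
            if initial.reachable:
                initial.commit(key, value)
                return "success", initial     # the client is done at this point
        raise RuntimeError("no replica reachable: volume unavailable")

    def propagate_lazily(replicas, initial, key, value):
        """The asynchronous propagation pass, modelled here as a later best-effort
        sweep; replicas that are down now converge when they come back, so
        readers may briefly observe stale data."""
        for other in replicas:
            if other is not initial and other.reachable:
                other.commit(key, value)

    def majority_update(replicas, key, value):
        """File-service style: do not return until a majority of the replicas
        has committed the update, so clients never see divergent copies."""
        committed = 0
        for replica in replicas:
            try:
                replica.commit(key, value)
                committed += 1
            except ConnectionError:
                pass
        if committed <= len(replicas) // 2:
            raise RuntimeError("no majority reachable: volume unavailable")
        return "success"

    # With only one of three replicas reachable, the lazily replicated volume
    # stays available while the majority-committed volume does not.
    replicas = [Replica("a"), Replica("b", reachable=False), Replica("c", reachable=False)]
    print(lazy_update(replicas, "/com/dec", "new entry")[0])   # success
    try:
        majority_update(replicas, "report.txt", "contents")
    except RuntimeError as error:
        print(error)                                           # no majority reachable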