File system measurements and their application to the design of efficient operation logging algorithms
Bacon, D.F.
Reliable Distributed Systems, 1991. Proceedings., Tenth Symposium on
30 Sep-2 Oct 1991, Page(s): 21-30
Digital Object Identifier 10.1109/RELDIS.1991.145400
Summary: File system operation in a transparently fault-tolerant system
that uses checkpointing and message logging is discussed. Logging
messages to disk is one of the primary performance costs of such
systems. The author has measured the file system operations performed on
large timesharing systems running Unix in terms of the level of
concurrency (number of consecutive operations that do not change the
state of the file system). By performing much of the data analysis
online within a modified Unix kernel, statistics were collected over a
long period of time with a substantial variation in system load. Using
this data, it is demonstrated that a technique called null logging can
reduce the number of messages logged to disk by a factor of 10 to 25,
depending on the workload. This reduces the overhead of the
fault-tolerance mechanism and allows a large fraction of file system
operations to commit instantaneously.
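
The concurrency measure described in the summary (the number of consecutive operations that do not change file system state) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the operation names, the MUTATING_OPS classification, and the sample trace are hypothetical, and the ratio computed at the end assumes only state-changing operations need a log record, which may not match the paper's actual null logging technique.

```python
from typing import Iterable, List

# Hypothetical classification of state-changing operations (not from the paper).
MUTATING_OPS = {"write", "create", "unlink", "rename", "truncate", "mkdir", "rmdir"}


def read_only_runs(trace: Iterable[str]) -> List[int]:
    """Return lengths of maximal runs of consecutive non-state-changing operations."""
    runs: List[int] = []
    current = 0
    for op in trace:
        if op in MUTATING_OPS:
            # A state-changing operation ends the current run, if any.
            if current:
                runs.append(current)
            current = 0
        else:
            current += 1
    if current:
        runs.append(current)
    return runs


def illustrative_reduction(trace: Iterable[str]) -> float:
    """Ratio of all operations to state-changing ones.

    Assumption: only state-changing operations require a log record.
    """
    ops = list(trace)
    mutating = sum(1 for op in ops if op in MUTATING_OPS)
    return len(ops) / mutating if mutating else float("inf")


# Hypothetical trace, for illustration only.
trace = ["read", "read", "stat", "lookup", "write",
         "read", "read", "read", "stat", "unlink"]
print(read_only_runs(trace))
print(illustrative_reduction(trace))
```

Under these assumptions, longer runs of non-state-changing operations mean fewer log records per operation, which is the intuition behind the reported reduction factor.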