We propose a novel method to study storage system predictability based on the visualization of file successor entropy, a form of conditional entropy drawn from a file access trace. First-order conditional entropy can be used as a measure of predictability, and it is superior to more common measures such as the independent likelihood of data access. For file access data, we developed a visualization tool that produces 3D graphical views of the variation in predictability of successive access events on a per-file basis. The tool supports interactive observation of the variations in predictability according to an arbitrary criterion, e.g. time of day, program identifier, user group, or any other classification of files. Four entropy data sets were extracted from various file system traces. These four data sets are representative of the variability in file access patterns across different machine uses: server, personal workstation, large number of interactive users, and heavy write activity. Visualization results show that there is strong predictability among files and that optimizations exploiting it would be profitable.
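As an illustration of the measure, the following is a minimal sketch of per-file successor entropy computed from an access trace. It assumes the trace is an ordered list of file identifiers and that a file's successor is simply the next file accessed. The function names `successor_entropy` and `conditional_entropy` and the sample trace are hypothetical, and this is not necessarily the procedure used to produce the four data sets.

```python
from collections import Counter, defaultdict
from math import log2


def successor_entropy(trace):
    """Map each file to the entropy (in bits) of the file accessed right after it."""
    # successors[f][s] counts how often file s immediately follows file f.
    successors = defaultdict(Counter)
    for current, nxt in zip(trace, trace[1:]):
        successors[current][nxt] += 1

    entropy = {}
    for f, counts in successors.items():
        total = sum(counts.values())
        # H(successor | f) = sum over s of p(s|f) * log2(1 / p(s|f))
        entropy[f] = sum((c / total) * log2(total / c) for c in counts.values())
    return entropy


def conditional_entropy(trace):
    """First-order conditional entropy H(next | current) over the whole trace."""
    per_file = successor_entropy(trace)
    transitions = len(trace) - 1
    # Weight each file's successor entropy by how often it appears as a predecessor.
    occurrences = Counter(trace[:-1])
    return sum(occurrences[f] / transitions * h for f, h in per_file.items())


# Hypothetical trace: "b" is always followed by "a"; "a" has two different successors.
example_trace = ["a", "b", "a", "b", "a", "c"]

if __name__ == "__main__":
    for name, bits in sorted(successor_entropy(example_trace).items()):
        print(f"{name}: {bits:.3f} bits")
    print(f"overall: {conditional_entropy(example_trace):.3f} bits")
```

A file whose successor never varies has zero successor entropy and is perfectly predictable, while higher values indicate a less predictable next access. Grouping the trace by a criterion such as time of day or user group before calling the function would give the per-class values that a visualization could then plot.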