The choice of database layout, i.e., how database objects such as tables and indexes are assigned to disk drives, can significantly impact the I/O performance of the system. Today, DBAs typically rely on fully striping objects across all available disk drives as the basic mechanism for optimizing I/O performance. Full striping maximizes I/O parallelism, but when query execution involves co-access of two or more large objects, e.g., a merge join of two tables, it may be suboptimal because it increases the number of random I/O accesses on each disk drive. Here, we propose a framework for automating the choice of layout for a given database that also takes into account the effects of co-accessed objects in the workload faced by the system. We formulate layout selection as an optimization problem and present an efficient solution that judiciously balances the trade-off between I/O parallelism and random I/O accesses. Our experiments on Microsoft SQL Server show that our techniques deliver better I/O performance than the traditional approach of fully striping each database object across all disk drives.
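The trade-off between I/O parallelism and random I/O can be illustrated with a toy cost model. The sketch below is not the cost model or search algorithm of this work; it is a hypothetical simplification that assumes every object is read in full at the same time, that an object's blocks are spread evenly over its disks, and that co-located objects cost one extra seek per interleaved block. The names `layout_cost` and `best_layout` and the parameters `transfer` and `seek` are invented for illustration.

```python
from itertools import combinations, product


def layout_cost(sizes, layout, n_disks, transfer=1.0, seek=5.0):
    """Toy elapsed time for reading all objects concurrently.

    sizes:  {object name: size in blocks}
    layout: {object name: set of disk ids the object is striped over}
    Disks work in parallel, so the slowest disk determines the cost.
    """
    worst = 0.0
    for disk in range(n_disks):
        # Blocks of each object that land on this disk (even striping assumed).
        shares = [sizes[obj] / len(layout[obj]) for obj in sizes if disk in layout[obj]]
        if not shares:
            continue
        # Assumption: interleaving co-located objects adds one seek per block
        # of every object except the largest one on the disk.
        elapsed = transfer * sum(shares) + seek * (sum(shares) - max(shares))
        worst = max(worst, elapsed)
    return worst


def best_layout(sizes, n_disks, **cost_params):
    """Exhaustive search over all non-empty disk subsets per object.

    Only practical for a handful of disks and objects.
    """
    subsets = [
        frozenset(c)
        for k in range(1, n_disks + 1)
        for c in combinations(range(n_disks), k)
    ]
    names = list(sizes)
    best = None
    for choice in product(subsets, repeat=len(names)):
        layout = dict(zip(names, choice))
        cost = layout_cost(sizes, layout, n_disks, **cost_params)
        if best is None or cost < best[0]:
            best = (cost, layout)
    return best


sizes = {"A": 400, "B": 400}          # two co-accessed tables, in blocks
everywhere = frozenset(range(4))
full = layout_cost(sizes, {"A": everywhere, "B": everywhere}, n_disks=4)
split = layout_cost(sizes, {"A": frozenset({0, 1}), "B": frozenset({2, 3})}, n_disks=4)
best_cost, best = best_layout(sizes, n_disks=4)
```

Under these assumed parameters, full striping pays for seeks on every disk, while placing the two tables on disjoint disk pairs halves the parallelism per table but avoids the seeks entirely. Setting `seek=0.0` removes the penalty, and full striping is then no worse than any other layout in this model.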