Towards parallel access of multi-dimensional, multi-resolution scientific data

9 Author(s)
Kumar, S. ; SCI Inst., Univ. of Utah, Salt Lake City, UT, USA ; Pascucci, V. ; Vishwanath, V. ; Carns, P.

Large-scale scientific simulations routinely produce data of increasing resolution. Analyzing this data is key to scientific discovery. A critical bottleneck facing data analysis is the I/O time to access the data due to the disparity between a simulation's data layout and the data layout requirements of analysis applications. One method of addressing this problem is to reorganize the data in a manner that makes it more amenable to analysis and visualization. The IDX file format is one example of this approach. It orders data points so that they can be accessed at multiple resolution levels with favorable spatial locality and caching properties. IDX has been used successfully in fields such as digital photography and visualization of large scientific data, and is a promising approach for analysis of HPC data. Unfortunately, the existing tools for writing data in this format only provide a serial interface. HPC applications must therefore either write all data from a single process or convert existing data as a post-processing step, in either case failing to utilize available parallel I/O resources. In this work, we provide an overview of the IDX file format and the existing ViSUS library that provides serial access to IDX data. We investigate methods for writing IDX data in parallel and demonstrate that it is possible for HPC applications to write data directly into IDX format with scalable performance. Our preliminary results demonstrate 60% of the peak I/O throughput when reorganizing and writing the data from 512 processes on an IBM BG/P system. We also analyze the performance bottlenecks and propose future work towards a flexible and efficient implementation.
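The abstract states that IDX orders data points for multi-resolution access with good spatial locality, but it does not spell out the ordering. As an illustration only, the sketch below assumes a hierarchical Z-order (HZ) style scheme built on Morton (Z-order) indices. The function names `morton2d`, `hz_index`, and `hz_order_2d` are hypothetical and are not part of the ViSUS library or any API described here.

```python
# Illustrative sketch only: assumes an HZ-style ordering on top of Z-order
# (Morton) indices. All names here are hypothetical, not the ViSUS API.

def morton2d(x: int, y: int, bits: int) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order index.

    Bits of x land on even positions, bits of y on odd positions.
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def hz_index(z: int, total_bits: int) -> int:
    """Map a Z-order index to a coarse-to-fine (hierarchical) index.

    Indices with more trailing zero bits belong to coarser levels and
    receive smaller hierarchical indices, so a prefix of the ordering
    is a subsampled version of the full grid.
    """
    if z == 0:
        return 0
    trailing_zeros = (z & -z).bit_length() - 1
    # Set a guard bit above the index, then shift out the trailing zeros
    # together with the lowest set bit.
    return (z | (1 << total_bits)) >> (trailing_zeros + 1)


def hz_order_2d(bits: int) -> list[tuple[int, int]]:
    """Return all (x, y) points of a 2^bits x 2^bits grid, coarse to fine."""
    side = 1 << bits
    points = [(x, y) for x in range(side) for y in range(side)]
    return sorted(points, key=lambda p: hz_index(morton2d(p[0], p[1], bits), 2 * bits))
```

Under these assumptions, the first four points of `hz_order_2d(2)` are the stride-2 subsampling of a 4x4 grid, so a reader can fetch a coarse preview by reading only a prefix of the stored data.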

Published in:

Petascale Data Storage Workshop (PDSW), 2010 5th

Date of Conference:

15 Nov. 2010