Programming FFT on DSM multiprocessors

3 Author(s)
Hongzhang Shan ; Dept. of Comput. Sci., Princeton Univ., NJ, USA ; Jianhua Feng ; Hongzhong Shan

How well the shared address space programming model performs for the coarse-grained communicating programs that have traditionally been common in scientific computing is still unclear. We use the challenging one-dimensional FFT, a regular coarse-grained program, as our driving application to study how to obtain high performance for such applications under the shared address space programming model on a hardware-supported cache-coherent distributed memory machine. We find that performance depends heavily on data placement, so proper data placement is critical to the success of this kind of application. Prefetching further improves performance by 10 to 50 percent for the data sets we studied. Naive programming easily creates a performance bottleneck by introducing much more contention, which leads to large performance losses. Properly written shared address space programs deliver much better performance than programs written in other popular programming models, such as MPI and SHMEM.
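To make concrete why data placement and contention matter for a one-dimensional FFT, the sketch below shows the transpose-based "six-step" formulation that is commonly used for parallel 1-D FFTs. The abstract does not say which formulation the authors used, so this is an assumption for illustration only, and the names `dft`, `transpose`, and `six_step_fft` are hypothetical. It is a serial sketch: in a parallel version each processor would typically own a band of rows, the row transforms would be local, and the three transposes would be the all-to-all communication phases where placement and prefetching come into play.

```python
import cmath


def dft(row):
    """Naive O(n^2) DFT, standing in for a local per-row FFT."""
    n = len(row)
    return [
        sum(row[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def transpose(m):
    """Matrix transpose; in a parallel run this is the all-to-all exchange."""
    return [list(col) for col in zip(*m)]


def six_step_fft(x, n1, n2):
    """1-D DFT of length n1*n2 computed as row transforms plus transposes.

    Assumption: x is viewed as an n1-by-n2 row-major matrix.
    """
    n = n1 * n2
    assert len(x) == n
    a = [x[r * n2:(r + 1) * n2] for r in range(n1)]   # n1 x n2 view of x
    a = transpose(a)                                  # step 1: n2 x n1
    a = [dft(row) for row in a]                       # step 2: n1-point row DFTs
    a = [                                             # step 3: twiddle factors
        [a[j2][k1] * cmath.exp(-2j * cmath.pi * j2 * k1 / n) for k1 in range(n1)]
        for j2 in range(n2)
    ]
    a = transpose(a)                                  # step 4: n1 x n2
    a = [dft(row) for row in a]                       # step 5: n2-point row DFTs
    a = transpose(a)                                  # step 6: n2 x n1
    return [v for row in a for v in row]              # flatten in row-major order


if __name__ == "__main__":
    x = [complex(i, -i) for i in range(12)]
    got = six_step_fft(x, 3, 4)
    want = dft(x)
    print(max(abs(g - w) for g, w in zip(got, want)))  # difference should be near zero
```

In this formulation the row transforms touch only locally owned data, while every transpose reads data owned by every other processor. That is one plausible reason why the placement of the matrix rows and the order in which remote blocks are read would dominate performance.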

Published in:

High Performance Computing in the Asia-Pacific Region, 2000. Proceedings. The Fourth International Conference/Exhibition on (Volume: 2)

Date of Conference:

14-17 May 2000