By Topic

Time domain numerical simulation for transient waves on reconfigurable coprocessor platform

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Chuan He ; Texas A&M Univ., College Station, TX, USA ; Wei Zhao ; Mi Lu

A successful application-oriented reconfigurable coprocessor design requires not only a powerful FPGA-based computing engine along with suitable hardware architecture, but also an efficient algorithm tailored for this special application. In this paper, we present our hardware architecture and numerical algorithms designed to speedup the time-domain finite-difference simulation of linear wave propagation problems in 2D and 3D space on FPGA-based reconfigurable platforms. Application fields of this work include seismic modeling and migration, computational electromagnetics, aeroacoustics, marine acoustics, to name a few. By writing first-order linear wave equations into second-order form, we halve the number of unknowns and simplify the treatment of parameters. We also adopt higher-order finite-difference (FD) schemes to further reduce the number of unknowns at the cost of increasing floating-point computations per discrete grid point. By doing so, we relief the bandwidth requirements between the FPGA and onboard memories but put more burden on the computing engine to take full advantage of FPGA's computational potentials. The speed of our design implemented on a Xilinx ML401 Virtex-4 evaluation platform is about 1.5∼4 times faster than a pure software implementation of the same algorithm running on a 3.0 GHz DELL workstation. This impressive result is mainly attributed to the memory architecture design, which is well-tuned for our numerical higher-order FD algorithms and can utilize onboard memory bandwidth more wisely. Furthermore, the good scalability of our design makes it compatible with most commercial reconfigurable coprocessor platforms and correspondingly, the performance would be proportional to their onboard memory bandwidth.

Published in:

Field-Programmable Custom Computing Machines, 2005. FCCM 2005. 13th Annual IEEE Symposium on

Date of Conference:

18-20 April 2005