In this scenario, computational fluid dynamics simulations of turbulence are performed with 64 GPUs and an optimized CFD algorithm using communication/computation overlapping. Detailed timings reveal that the GPUs' internal calculations are so efficient that operations related to data exchange between compute nodes now cause a scaling bottleneck on all but the largest problems.
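The bottleneck described above can be illustrated with a toy cost model. This is a minimal sketch, not the article's method. It assumes the grid is split into cubic subdomains with one per GPU, that computation scales with subdomain volume and halo exchange with subdomain surface, and that perfect overlap hides whichever phase is shorter. The function names and both cost constants are hypothetical and chosen only for illustration.

```python
def step_time(compute_s: float, comm_s: float, overlap: bool) -> float:
    """Time for one solver step in a simple two-phase model.

    With perfect overlap the step costs the longer of the two phases;
    without overlap the phases run back to back.
    """
    return max(compute_s, comm_s) if overlap else compute_s + comm_s


def phase_times(n: int, gpus: int,
                cell_cost_s: float = 1e-9,   # assumed compute time per cell
                halo_cost_s: float = 2e-8):  # assumed exchange time per halo cell
    """Return (compute_s, comm_s) for an n^3 grid split across `gpus` GPUs.

    Assumes cubic subdomains, so the halo grows with the subdomain's
    surface area (six faces) while the work grows with its volume.
    """
    cells_per_gpu = n ** 3 / gpus
    halo_cells = 6 * cells_per_gpu ** (2 / 3)
    return cells_per_gpu * cell_cost_s, halo_cells * halo_cost_s


if __name__ == "__main__":
    for n in (128, 512, 2048):
        compute_s, comm_s = phase_times(n, gpus=64)
        bound = "communication" if comm_s > compute_s else "computation"
        total = step_time(compute_s, comm_s, overlap=True)
        print(f"n={n}: step={total:.3e} s, limited by {bound}")
```

Under these assumed constants, small grids are limited by the exchange even with overlap, because the surface-to-volume ratio of each subdomain is high; only large grids give each GPU enough work to hide the communication.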
Published in: Computing in Science & Engineering (Volume: 14, Issue: 3)
Date of Publication: May-June 2012