High-Performance Reconfigurable Computing Technology and Applications (HPRCTA), 2010 Fourth International Workshop on

Date: 14 Nov. 2010

  • [Front matter]

    Page(s): i - iv
    Freely Available from IEEE
  • Investigating resilient high performance reconfigurable computing with minimally-invasive system monitoring

    Page(s): 1 - 8
    As researchers push for Exascale computing, one of the emerging challenges is system resilience. Unlike fault tolerance, which corrects errors, recent reports suggest that resilient systems will need to continue to make progress on an application despite faults. A first step in developing a resilient system is to have robust, scalable system monitoring. The work described here presents a novel, minimally-invasive system monitor that operates over a separate network. We analytically characterize the performance for an arbitrary set of nodes and demonstrate a working implementation of the design. We argue that the hardware approach is inherently superior to the ad hoc software techniques currently employed in practice.

  • An application of high-performance reconfigurable computing in radio astronomy signal processing

    Page(s): 1 - 7
    Reconfigurable computing has been making inroads in the front-end digital signal processing systems deployed at radio telescopes around the world. The National Radio Astronomy Observatory (NRAO) at Green Bank has developed a signal processing system expressly for pulsar search and timing observations. These observations are among the most demanding experiments in terms of real-time computational and data rate requirements. In this paper, we describe the application domain, the challenges in designing a system to meet these demands, and the resulting heterogeneous computing system.

  • Comparative analysis of HPC and accelerator devices: Computation, memory, I/O, and power

    Page(s): 1 - 10
    The computing market constantly experiences the introduction of new devices, architectures, and enhancements to existing ones. Due to the number and diversity of processor and accelerator devices available, it is important to be able to objectively compare them based upon their capabilities regarding computation, I/O, power, and memory interfacing. This paper presents an extension to our existing suite of metrics to quantify additional characteristics of devices and highlight tradeoffs that exist between architectures and specific products. These metrics are applied to a large group of modern devices to evaluate their computational density, power consumption, I/O bandwidth, internal memory bandwidth, and external memory bandwidth.

  • Optimization and performance study of large-scale biological networks for reconfigurable computing

    Page(s): 1 - 9
    Field-programmable gate arrays (FPGAs) can provide an efficient programmable resource for implementing hardware-based spiking neural networks (SNNs). In this paper, we present a hardware-software design that makes it possible to simulate large-scale (2 million neurons), biologically plausible SNNs on an FPGA-based system. We have chosen three SNN models from the various models available in the literature, the Hodgkin-Huxley (HH), Wilson, and Izhikevich models, for implementation on the SRC 7 H MAP FPGA-based system. The models have various computation and communication requirements, making them good candidates for a performance and optimization study of SNNs on an FPGA-based system. Significant acceleration of the SNN models using the FPGA is achieved: 38x for the HH model. This paper also provides insights into the factors affecting the speedup achieved, such as the FLOP:Byte ratio of the application, the problem size, and the optimization techniques available.

  • Towards production FPGA-accelerated molecular dynamics: Progress and challenges

    Page(s): 1 - 8
    Recent work in the FPGA acceleration of molecular dynamics simulation has shown that including on-the-fly neighbor list calculation (particle filtering) in the device has the potential for an 80× per-core speed-up over the CPU-based reference code, and thus for making the approach competitive with other computing technologies. In this paper, we report on progress and challenges in advancing this work towards the creation of a production system, especially one capable of running on a large-scale system such as the Novo-G. The current version consists of an FPGA-accelerated NAMD-lite running on a PC with a Gidel PROCStar III. The most important implementation issues include software integration, handling exclusion, and modifying the force pipeline. In the last of these, we have added support for Particle-Mesh-Ewald and augmented the Lennard-Jones calculation with a switching function. In experiments, we find that energy stability so far appears to be acceptable, but that longer simulations are needed. Due primarily to the added complexity of the force pipelines, performance is somewhat diminished from the previous study; we find, however, that porting to a newer (existing) device will more than compensate for this loss.

  • A parallel hardware architecture for information-theoretic adaptive filtering

    Page(s): 1 - 10
    Information-theoretic cost functions such as minimization of the error entropy (MEE) can extract more structure from the error signal, yielding better results in many realistic problems. However, adaptive filters (AFs) using MEE methods are more computationally intensive than the conventional mean-squared error (MSE) methods employed in the well-known least mean squares (LMS) algorithm. This paper presents a novel, parallel hardware architecture for MEE adaptive filtering. The design has been implemented and evaluated in real time on one of the servers of the Novo-G machine in the NSF CHREC Center at the University of Florida, believed to be the most powerful reconfigurable supercomputer in academia. By pipelining the design and parallelizing independent computations within the algorithm, our proposed hardware architecture successfully achieves a speedup of 5800 on one FPGA, 23200 on one quad-FPGA board, and 46400 on two quad-FPGA boards, as compared to the same algorithm running in software (optimized C program) on a single CPU core. Just as important, our results show that this reconfigurable design does not lose precision while converging to the optimum solution in the same number of steps as the software version. As a result, our approach makes it possible for AFs using the MEE cost function to adapt in real time for signals that require a sampling rate in excess of 400 kHz and thus can target a much wider range of applications.
