Innovative Architecture for Future Generation High-Performance Processors and Systems (IWIA), 2008 International Workshop on

Date 21-23 Jan. 2008

  • [Front cover]

    Publication Year: 2008, Page(s): C1
  • [Title page i]

    Publication Year: 2008, Page(s): i
  • [Title page iii]

    Publication Year: 2008, Page(s): iii
  • [Copyright notice]

    Publication Year: 2008, Page(s): iv
  • Table of contents

    Publication Year: 2008, Page(s): v - vi
  • Message from the Editors

    Publication Year: 2008, Page(s): vii
  • Reviewing Committee

    Publication Year: 2008, Page(s): viii
  • The Shape of Things to Come: Future Potential of "Heavy Node" Multi-Core HPC Architectures

    Publication Year: 2008, Page(s): 3 - 10

    The Top 500 list has been tracking supercomputers since the early 1990s. The bulk of those systems, especially recently, have been built from leading edge commodity microprocessors. This paper analyzes potential future characteristics of such systems in the light of the advent of power-constrained multi-core microprocessors. The resulting predictions indicate that such systems will not be able, by...
  • Effect of Reordering Internal Messages in MPI Broadcast According to the Load Imbalance

    Publication Year: 2008, Page(s): 11 - 16

    To achieve higher scalability of parallel programs on large-scale parallel computers, reducing the time spent on collective communications is one of the most important issues. This paper introduces a dynamic optimization method that adjusts the implementation of the Broadcast operation, one of the most popular collective communications. Though there have been many attempts to speed up this operati...
  • Acceleration for MPI Derived Datatypes Using an Enhancer of Memory and Network

    Publication Year: 2008, Page(s): 17 - 23

    This paper presents a support function for MPI derived datatypes on the DIMMnet-3 network interface with multi-banked extended memory, which is under development. Semi-hardwired derived datatype communication based on RDMA with hardwired gather and scatter is proposed. This mechanism and an MPI implementation using it are implemented and validated on DIMMnet-2, an earlier prototype that operates in a DDR DIMM slot. The...
  • Unified Programming Environment for Heterogeneous Distributed Parallel Systems

    Publication Year: 2008, Page(s): 24 - 31

    Parallel execution environments, such as multi-core CPUs, clusters, and grids, have become increasingly widespread. The shift from homogeneous-core CPUs with shared memory to heterogeneous-core CPUs with distributed memory is making system architectures more complicated. Each parallel execution environment uses a different programming interface and programming model. Sinc...
  • A PLD Architecture for High Performance Computing

    Publication Year: 2008, Page(s): 35 - 42
    Cited by: Papers (1)

    In recent years, Field Programmable Gate Arrays (FPGAs) have been used for High Performance Computing (HPC). Because there is a significant difference between the configuration speed of an FPGA and the execution speed of a Central Processing Unit (CPU), performance degrades. To resolve this problem, we proposed MPLD as a new Programmable Logic Device (PLD) architecture with high s...
  • Automatic Application of Last-Touch Instructions for Leakage Energy Reduction

    Publication Year: 2008, Page(s): 43 - 50

    Energy dissipation in microprocessors has recently been growing, which poses a serious problem for the allowable temperature and performance improvement of future microprocessors. Cache memory is effective in bridging the growing speed gap between a processor and relatively slow external main memory, and has increased in size. However, energy dissipation in the cache memory will approa...
  • Design and Power Performance Evaluation of On-Chip Memory Processor with Arithmetic Accelerators

    Publication Year: 2008, Page(s): 51 - 57
    Cited by: Papers (3)

    In this paper, we design an on-chip memory processor with arithmetic accelerators, which are expected to improve power consumption. In addition, we evaluate the power performance of the processor. We propose implementing vector-type arithmetic accelerators and SIMD-type arithmetic accelerators in the on-chip memory processor. The evaluation results obtained using our simulator indicate that the pe...
  • Register File Reliability Analysis Through Cycle-Accurate Thermal Emulation

    Publication Year: 2008, Page(s): 61 - 66

    Continuous transistor scaling due to improvements in CMOS devices and manufacturing technologies is increasing processor power densities and temperatures, thus creating challenges in maintaining manufacturing yield rates and in producing devices that remain reliable throughout their lifetime. New microarchitectures require new reliability-aware design methods that can face these challenges without ...
  • Low-Power and High-Performance Communication Mechanism for Dependable Embedded Systems

    Publication Year: 2008, Page(s): 67 - 73
    Cited by: Papers (5)

    Recently, multi-core processors have been used to improve performance and reduce power consumption. To achieve higher performance, multiprocessors connected by a network can increase processing power. Dependability is also important for embedded systems, to protect against faults and failures. We develop a parallel platform for dependable embedded systems, and investigate t...
  • Introspection-Based Fault Tolerance for COTS-Based High-Capability Computation in Space

    Publication Year: 2008, Page(s): 74 - 83
    Cited by: Papers (1)

    Future missions of deep space exploration face the challenge of designing, building, and operating progressively more capable autonomous spacecraft and planetary rovers. Given the communication latencies and bandwidth limitations for such missions, the need for increased autonomy becomes mandatory, along with the requirement for enhanced on-board computational capabilities while in deep space or ti...
  • Author index

    Publication Year: 2008, Page(s): 84
  • [Publishers information]

    Publication Year: 2008, Page(s): 86