
High Performance Computing and Communications (HPCC), 2010 12th IEEE International Conference on

Date 1-3 Sept. 2010


Displaying Results 1 - 25 of 121
  • [Front cover]

    Publication Year: 2010 , Page(s): C1
    PDF (1206 KB)
    Freely Available from IEEE
  • [Title page i]

    Publication Year: 2010 , Page(s): i
    PDF (35 KB)
    Freely Available from IEEE
  • [Title page iii]

    Publication Year: 2010 , Page(s): iii
    PDF (71 KB)
    Freely Available from IEEE
  • [Copyright notice]

    Publication Year: 2010 , Page(s): iv
    PDF (118 KB)
    Freely Available from IEEE
  • Table of contents

    Publication Year: 2010 , Page(s): v - xiii
    PDF (248 KB)
    Freely Available from IEEE
  • Message from the General Chairs

    Publication Year: 2010 , Page(s): xiv
    PDF (88 KB) | HTML
    Freely Available from IEEE
  • Message from the Program Chairs

    Publication Year: 2010 , Page(s): xv
    PDF (67 KB) | HTML
    Freely Available from IEEE
  • Message from the Steering Committee Chairs

    Publication Year: 2010 , Page(s): xvi
    PDF (68 KB) | HTML
    Freely Available from IEEE
  • Message from the Workshop/Symposium Chairs

    Publication Year: 2010 , Page(s): xvii
    PDF (102 KB) | HTML
    Freely Available from IEEE
  • Organizing Committee

    Publication Year: 2010 , Page(s): xviii - xix
    PDF (71 KB)
    Freely Available from IEEE
  • Program Committee

    Publication Year: 2010 , Page(s): xx - xxv
    PDF (87 KB)
    Freely Available from IEEE
  • Message from the AHPCN-10 Symposium Chairs

    Publication Year: 2010 , Page(s): xxvi
    PDF (98 KB) | HTML
    Freely Available from IEEE
  • Message from the IDCS-10 Workshop Chairs

    Publication Year: 2010 , Page(s): xxvii
    PDF (62 KB) | HTML
    Freely Available from IEEE
  • IDCS-10 Organizing and Program Committees

    Publication Year: 2010 , Page(s): xxviii
    PDF (64 KB)
    Freely Available from IEEE
  • Message from the DSMP-10 Workshop Chairs

    Publication Year: 2010 , Page(s): xxix
    PDF (59 KB) | HTML
    Freely Available from IEEE
  • DSMP-10 Organizing and Program Committees

    Publication Year: 2010 , Page(s): xxx
    PDF (61 KB)
    Freely Available from IEEE
  • Keynote: Big Science on DEISA and PRACE- A European HPC Ecosystem

    Publication Year: 2010 , Page(s): xxxi
    PDF (66 KB) | HTML

    Summary form only given. This keynote discusses compute clusters and grid and cloud infrastructures, with the Distributed European Infrastructure for Supercomputing Applications (DEISA) and the Partnership for Advanced Computing in Europe (PRACE) as examples. It also describes the DEISA system architecture, service layers, UNICORE access infrastructure, distributed user management, and production environment, which together form a virtual European supercomputer center.

  • Keynote: HPCC with Grids and Clouds

    Publication Year: 2010 , Page(s): xxxii
    PDF (81 KB) | HTML

    Summary form only given. The talk discusses the impact of cloud and grid technology on HPCC, using examples from a variety of fields, especially the life sciences. It covers the growing importance of data analysis and notes that it is better suited to these modern architectures than the large simulations (particle dynamics and partial differential equation solution) that are the mainstream use of large-scale "massively parallel" supercomputers. The importance of grids lies in their support for distributed data collection and archiving, while clouds should replace grids for large-scale analysis of the data. The talk discusses the structure of applications that will run on current clouds, using either the basic "on-demand" computing paradigm or higher-level frameworks based on MapReduce and its extensions. Current MapReduce implementations run well on algorithms that are a "Map" followed by a "Reduce", but perform poorly on algorithms that iterate over many such phases; several important algorithms, including parallel linear algebra, fall into the latter class. One can define MapReduce extensions that accommodate iterative map and reduce, but these have less fault tolerance than basic MapReduce. Both clouds and exascale computing suggest research into a new generation of runtimes that lie between MapReduce and MPI and trade off performance, fault tolerance, and asynchronicity. The talk concludes with a description of FutureGrid, a TeraGrid system for prototyping new middleware and applications.

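The abstract's point about iteration can be made concrete with a minimal sketch (ours, not from the talk): each iteration of an algorithm such as k-means is a complete Map→Reduce pass, so basic MapReduce implementations pay the full per-phase startup and data-reload cost on every iteration.

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """One Map->Reduce phase: map every record, group by key, reduce each group."""
    groups = defaultdict(list)
    for rec in records:
        for key, value in mapper(rec):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}

def kmeans(points, centroids, iterations=10):
    """1-D k-means expressed as repeated Map->Reduce phases.

    Every loop iteration is a full pass over the input, which is exactly
    the pattern the abstract says basic MapReduce handles poorly.
    """
    for _ in range(iterations):
        nearest = lambda p: min(range(len(centroids)),
                                key=lambda i: abs(p - centroids[i]))
        assigned = map_reduce(
            points,
            mapper=lambda p: [(nearest(p), p)],            # point -> nearest centroid
            reducer=lambda k, vs: sum(vs) / len(vs),       # new centroid = cluster mean
        )
        centroids = [assigned.get(i, centroids[i]) for i in range(len(centroids))]
    return centroids
```

Iteration-aware extensions (and runtimes between MapReduce and MPI) aim to keep the grouped data resident across phases instead of re-running the whole pipeline each time.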
  • Towards Online Application Cache Behaviors Identification in CMPs

    Publication Year: 2010 , Page(s): 1 - 8
    Cited by:  Papers (2)
    PDF (518 KB) | HTML

    On chip multiprocessor (CMP) platforms, multiple co-scheduled applications can severely degrade performance and quality of service (QoS) when they contend for last-level cache (LLC) resources. Whether an application imposes destructive interference on co-scheduled applications depends largely on its own inherent cache access behavior. In this work, we first present case studies showing how inter-application interference leads to undesirable performance in both shared and private LLC designs. We then propose a new online approach for identifying application cache behavior, based on detailed simulation and analysis with the SPEC CPU2006 benchmarks. We demonstrate that our approach identifies application cache behaviors more precisely. Moreover, the proposed approach can be implemented directly in hardware to identify application cache behaviors dynamically at runtime. Finally, we show with two case studies how the proposed approach can be adopted by both shared- and private-cache sharing mechanisms, i.e., cache partitioning algorithms (CPAs) and cache spilling techniques, for more precise cache resource management.

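The flavor of cache-behavior identification can be illustrated with a toy classifier over an application's miss-rate curve (the categories and thresholds below are our illustration, not the paper's method): how the LLC miss rate changes as an application is given more cache capacity reveals whether the application benefits from protection or can safely share.

```python
def classify_cache_behavior(miss_rates):
    """Toy classifier over a miss-rate curve.

    miss_rates[i] is the measured LLC miss rate when the application is
    given i+1 ways of the cache. Categories and thresholds are
    illustrative only.
    """
    few_ways, all_ways = miss_rates[0], miss_rates[-1]
    if all_ways > 0.8:                 # misses regardless of capacity
        return "streaming"             # gains nothing from more cache
    if few_ways - all_ways > 0.3:      # misses drop sharply with capacity
        return "cache-sensitive"       # worth protecting via partitioning
    return "cache-insensitive"         # small footprint; can spill/share

print(classify_cache_behavior([0.9, 0.88, 0.85, 0.85]))  # streaming
print(classify_cache_behavior([0.7, 0.4, 0.2, 0.1]))     # cache-sensitive
print(classify_cache_behavior([0.15, 0.12, 0.1, 0.1]))   # cache-insensitive
```

A cache partitioning algorithm could then prioritize "cache-sensitive" applications for LLC ways, while "streaming" applications are confined so they do not pollute the shared cache.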
  • Flexible Clusters for High-Performance Computing

    Publication Year: 2010 , Page(s): 9 - 16
    PDF (837 KB) | HTML

    High-performance computational clusters are, in general, rather rigid objects that present their users with a limited number of degrees of freedom, usually related only to the specification of the requested resources and the selection of specific applications and libraries. While this is reasonable and indeed desirable in standard production environments, it can become a hindrance when one needs a dynamic and flexible computational environment, for instance for experiments and evaluation, where very different computational approaches, e.g., map-reduce, standard parallel jobs, and virtual HPC clusters, need to coexist on the same physical facility. In this paper we present our efforts to address some of these challenges while maintaining a unified cluster management environment.

  • Sim-spm: A SimpleScalar-Based Simulator for Multi-level SPM Memory Hierarchy Architecture

    Publication Year: 2010 , Page(s): 17 - 23
    PDF (473 KB) | HTML

    As fast on-chip SRAM managed by software (the application and/or compiler), Scratchpad Memory (SPM) is widely used in many fields. This paper presents Sim-spm, a SimpleScalar-based simulator for multi-level SPM memory hierarchy architectures. We simulate the hardware of the multi-level SPM memory hierarchy by extending Sim-outorder, the out-of-order simulator from SimpleScalar. Through this memory simulation method, the simulation framework for the multi-level SPM memory hierarchy is built on the existing ISA (Instruction Set Architecture), which largely removes the need to modify the existing compiler. The experimental results show that Sim-spm can accurately simulate the running state of a processor with a multi-level SPM memory hierarchy, making it a promising tool for research on multi-level SPM memory hierarchy architectures.

  • MN-Mate: Resource Management of Manycores with DRAM and Nonvolatile Memories

    Publication Year: 2010 , Page(s): 24 - 34
    Cited by:  Papers (2)
    PDF (1236 KB) | HTML

    The advent of the manycore era breaks through the performance wall, but at the cost of severe energy consumption. Using NVRAM as main memory can be a good way to reduce the energy consumed by large amounts of DRAM. In this paper, we propose MN-MATE, a novel architecture and set of management techniques for allocating a large number of cores and large amounts of DRAM and NVRAM. In MN-MATE, a hypervisor partitions and allocates cores and memory to guest OSes dynamically. Optimized matching of heterogeneous cores, DRAM, and NVRAM enhances system performance, and selectively placing data in a main memory composed of DRAM and NVRAM significantly reduces energy consumption. Preliminary results show that integrating dynamic resource partitioning with this selective memory allocation scheme in MN-MATE reduces energy usage significantly and suppresses the performance loss caused by NVRAM's characteristics.

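A hybrid DRAM/NVRAM placement policy of the kind the abstract describes can be sketched as follows (the specific policy is our illustration, not MN-MATE's): since NVRAM writes are slow and costly, the most write-intensive pages are kept in DRAM and read-mostly pages are pushed to NVRAM.

```python
def place_pages(pages, dram_capacity):
    """Toy selective-placement policy for a hybrid DRAM/NVRAM main memory.

    `pages` maps page-id -> write count over the last sampling interval.
    The `dram_capacity` most write-intensive pages go to DRAM; the rest
    (read-mostly data) go to NVRAM. Illustrative only.
    """
    by_writes = sorted(pages, key=pages.get, reverse=True)
    dram = set(by_writes[:dram_capacity])
    nvram = set(by_writes[dram_capacity:])
    return dram, nvram

dram, nvram = place_pages({"a": 100, "b": 3, "c": 57, "d": 0}, dram_capacity=2)
# write-hot pages "a" and "c" land in DRAM; read-mostly "b" and "d" in NVRAM
```

In a real system such a policy would run periodically inside the hypervisor, re-sampling write counts and migrating pages between the two memory types.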
  • A Scheduling Heuristic to Handle Local and Remote Memory in Cluster Computers

    Publication Year: 2010 , Page(s): 35 - 42
    PDF (621 KB) | HTML

    In cluster computers, RAM is spread among the motherboards hosting the running applications. In these systems, it is common to constrain the memory address space of a given processor to its local motherboard; constraining the system in this way is much cheaper than a full-fledged shared-memory implementation across motherboards. However, memory usage may then differ widely among motherboards, depending on the memory requirements of the applications running on each one. In this context, if an application requires a huge amount of RAM, the only feasible solution is to increase the memory installed in its local motherboard, even if the remaining ones are underused; beyond a certain memory size, this increase in the memory budget becomes prohibitive. In this paper, we assume that the Remote Memory Access hardware used in a HyperTransport-based system allows applications to allocate memory from remote motherboards. We also analyze how the distribution of memory accesses among different memory locations (local or remote) impacts performance. Finally, a heuristic is devised to schedule local and remote memory among applications according to their requirements, while considering quality-of-service constraints.

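A heuristic of the kind described can be sketched in a few lines (a toy version under our own assumptions, not the paper's algorithm): QoS-critical applications are served from fast local memory first, and once local memory is exhausted the remaining demand spills to slower remote motherboards.

```python
def schedule_memory(apps, local_capacity):
    """Toy local/remote memory placement heuristic.

    `apps` is a list of (name, demand_gb, qos_critical) tuples.
    QoS-critical applications are granted local memory first; any demand
    that does not fit locally is satisfied from remote motherboards.
    Illustrative only.
    """
    plan, free = {}, local_capacity
    # QoS-critical applications first (False sorts before True), then the rest.
    for name, demand, critical in sorted(apps, key=lambda a: not a[2]):
        local = min(demand, free)
        free -= local
        plan[name] = {"local": local, "remote": demand - local}
    return plan

plan = schedule_memory([("sim", 8, False), ("db", 6, True)], local_capacity=10)
# db (QoS-critical) gets all 6 GB locally; sim gets 4 GB local + 4 GB remote
```

The real heuristic would also weigh the measured local/remote access-latency gap when deciding how much remote memory each application can tolerate.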
  • Adding an Expressway to Accelerate the Neighborhood Communication

    Publication Year: 2010 , Page(s): 43 - 48
    PDF (524 KB) | HTML

    Blade systems are very popular in high-performance computing. In a blade system, the blade is the fundamental element, each blade being a symmetric multiprocessor (SMP). About ten blades constitute a blade box, several blade boxes constitute a cabinet, and some cabinets constitute a blade system. The blades in a blade box are neighbors, since the distances between them are relatively short, and programmers always try to place tightly related processes into the same blade box. However, hardware optimizations that accelerate communication within a blade box are rare. Thus, a single-chip design called the hyper-node controller is presented, providing ultra-low latency and high bandwidth; it resembles an expressway between neighbors. Using the hyper-node controller, all the nodes in a blade box can act as a single hyper node. The additional controller is a useful supplement that efficiently enhances communication within a blade box and thereby the entire blade system. An FPGA prototype of the hyper-node controller has been implemented; it can connect five blades simultaneously. In a preliminary performance evaluation, the latency for an 8-byte payload between two blades is less than 1 us, and 1.33 GB/s, nearly 94% of the peak effective bandwidth, is obtained when transferring messages with a payload of only 256 bytes.

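The quoted figures can be sanity-checked with back-of-the-envelope arithmetic (GB taken as 10^9 bytes, which is our assumption; the paper does not state its convention):

```python
# Measured: 1.33 GB/s at 256-byte payloads, quoted as ~94% of peak.
measured_bw = 1.33e9          # bytes/s
fraction_of_peak = 0.94
peak_bw = measured_bw / fraction_of_peak
print(f"implied peak bandwidth ~ {peak_bw / 1e9:.2f} GB/s")

# Time to push one 256-byte payload at the measured bandwidth.
payload = 256                 # bytes
time_per_msg = payload / measured_bw
print(f"time per 256 B message ~ {time_per_msg * 1e9:.0f} ns")
```

This implies a peak effective bandwidth of roughly 1.41 GB/s, and shows that reaching near-peak bandwidth at such small payloads is consistent with the sub-microsecond latency claim.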
  • Client Based Data Isolation of Blue Whale File System in Non-linear Edit Field

    Publication Year: 2010 , Page(s): 49 - 54
    Cited by:  Papers (1)
    PDF (474 KB) | HTML

    We have designed and implemented the Blue Whale File System (BWFS), a scalable distributed file system for large distributed data-intensive applications. Sharing many features with previous distributed file systems, BWFS has successfully met our storage needs and is widely deployed in many fields. In BWFS, as in most traditional file systems, a client can only access data within one file system, where the data accesses interfere with each other and lead to performance degradation. In this paper we propose a client-based data isolation mechanism for the Blue Whale File System in the non-linear edit (NLE) field, which separates different data into different file systems. Test results have verified the effectiveness of data isolation: by separating different data into different file systems, application I/O bandwidth improves by 111% on average, and peak performance improves by a factor of four.

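The core idea of client-side data isolation reduces to routing each class of data to its own file system so that their I/O streams do not contend; a minimal sketch (the data classes and mount points are hypothetical, not BWFS's actual layout):

```python
# Toy client-side routing table: each I/O class gets its own file system,
# so, e.g., high-bandwidth video streams never contend with metadata I/O.
ROUTES = {
    "video": "/mnt/bwfs_video",      # large sequential streams
    "audio": "/mnt/bwfs_audio",      # smaller sequential streams
    "metadata": "/mnt/bwfs_meta",    # small random I/O
}

def placement(kind, name):
    """Return the path a file of the given class should be written to."""
    return f"{ROUTES[kind]}/{name}"

print(placement("video", "clip01.mxf"))  # /mnt/bwfs_video/clip01.mxf
```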