2010 Ninth International Symposium on Parallel and Distributed Computing

7-9 July 2010

  • [Front cover]

    Publication Year: 2010, Page(s): C1
  • [Title page i]

    Publication Year: 2010, Page(s): i
  • [Title page iii]

    Publication Year: 2010, Page(s): iii
  • [Copyright notice]

    Publication Year: 2010, Page(s): iv
  • Table of contents

    Publication Year: 2010, Page(s): v - vii
  • Message from the ISPDC 2010 Chairs

    Publication Year: 2010, Page(s): viii
  • ISPDC 2010 Committees

    Publication Year: 2010, Page(s): ix - xi
  • Optimizing the Reliability of Pipelined Applications under Throughput Constraints

    Publication Year: 2010, Page(s): 1 - 8
    Cited by: Papers (2)

    Mapping a pipelined application onto a distributed and parallel platform is a challenging problem. The problem becomes even more difficult when multiple optimization criteria are involved, and when the target resources are heterogeneous (processors and communication links) and subject to failures. This paper investigates the problem of mapping pipelined applications, consisting of a linear chain o...
  • Algorithm for Mapping Multilayer BP Networks onto the SpiNNaker Neuromorphic Hardware

    Publication Year: 2010, Page(s): 9 - 16
    Cited by: Papers (1)

    This paper demonstrates the feasibility and evaluates the performance of using the SpiNNaker neuromorphic hardware to simulate traditional non-spiking multi-layer perceptron networks with the back propagation learning rule. In addition to investigating the mapping of the checker-boarding partitioning scheme onto SpiNNaker, we propose a new algorithm called pipelined checker-boarding partitioning which...
  • Decomposition Based Algorithm for State Prediction in Large Scale Distributed Systems

    Publication Year: 2010, Page(s): 17 - 24
    Cited by: Papers (4)

    Prediction represents an important component of resource management, providing information about the future state, utilization and availability of resources. We propose a new prediction algorithm inspired by the decomposition of a complex wave into simpler waves with fixed frequencies (similar to Fourier decomposition). The partial results obtained from this decomposition stage are combined usin...
  • Operational Semantics of the Marte Repetitive Structure Modeling Concepts for Data-Parallel Applications Design

    Publication Year: 2010, Page(s): 25 - 32
    Cited by: Patents (2)

    This paper presents an operational semantics of the repetitive model of computation, which is the basis for the repetitive structure modeling (RSM) package defined in the standard UML Marte profile. It also deals with the semantics of an RSM extension for control-oriented design. The goal of this semantics is to serve as a formal support for i) reasoning about the behavioral properties of models s...
  • Butterfly Automorphisms and Edge Faults

    Publication Year: 2010, Page(s): 33 - 40
    Cited by: Papers (1)

    This paper obtains all the automorphisms of a wrapped Butterfly network of degree n using an algebraic model. It also investigates the translation of butterfly edges by automorphisms. It proposes a new strategy for algorithm mappings on an architecture with faulty edges. This strategy essentially consists of finding an automorphism that would map the faulty edges to the free edges in the graph. Ha...
  • Cost Performance Analysis in Multi-level Tree Networks

    Publication Year: 2010, Page(s): 41 - 48

    A monetary network cost problem involving a homogeneous multi-level tree of processors and links is discussed. The monetary network cost of processing a divisible load, which is linearly dependent on the amount of divisible workload, is basically composed of a communication cost and a computing cost. A monetary network analysis is performed by aggregating the network speed parameters and network c...
  • Pretty Good Accuracy in Matrix Multiplication with GPUs

    Publication Year: 2010, Page(s): 49 - 55

    With systems such as Road Runner, there is a trend in supercomputing to offload parallel tasks to special purpose co-processors, composed of many relatively simple scalar processors. The cheaper commodity class equivalent of such a processor would be the graphics card, potentially offering supercomputer power within the confines of a desktop PC. Graphics cards, however, are not without problems, t...
  • Exploiting the Power of GPUs for Multi-gigabit Wireless Baseband Processing

    Publication Year: 2010, Page(s): 56 - 62
    Cited by: Papers (1)

    In this paper, we explore the feasibility of achieving gigabit baseband throughput using the vast computational power offered by the graphics processors (GPUs). One of the most computationally intensive functions commonly used in baseband communications, the Fast Fourier Transform (FFT) algorithm, is implemented on an NVIDIA GPU using their general-purpose computing platform called the Compute Uni...
  • NQueens on CUDA: Optimization Issues

    Publication Year: 2010, Page(s): 63 - 70
    Cited by: Papers (5)

    Today's commercial off-the-shelf computer systems are multicore computing systems that combine CPUs, graphics processors (GPUs) and custom devices. In comparison with CPU cores, graphics cards are capable of running hundreds to thousands of compute units in parallel. To benefit from these GPU computing resources, applications have to be parallelized and adapted to the target architecture. In this...
  • Parallel Cycle Based Logic Simulation Using Graphics Processing Units

    Publication Year: 2010, Page(s): 71 - 78
    Cited by: Papers (9)

    Graphics Processing Units (GPUs) are gaining popularity for parallelization of general purpose applications. GPUs are massively parallel processors with huge performance in a small and readily available package. At the same time, the emergence of general purpose programming environments for GPUs such as CUDA shortens the learning curve of GPU programming. We present a GPU-based parallelization of l...
  • Parallel ID Shadow-Map Decompression on GPU

    Publication Year: 2010, Page(s): 79 - 84
    Cited by: Papers (1)

    ID shadow-maps are used for robust real-time rendering of shadows. The primary disadvantage of using shadow-maps is their excessive size for large scenes when high-quality shadows are needed. To eliminate large memory requirements and texture-size limitations of the current generation GPUs, texture compression is an important tool. We present a framework where compressed ID-shadow-maps are used...
  • Realizing Optimization Opportunities for Distributed Applications in the Middleware Layer by Utilizing InDiGO Framework

    Publication Year: 2010, Page(s): 85 - 92

    The InDiGO framework provides an infrastructure to develop generic but customizable middleware services. It also provides tools to customize the middleware algorithms for specific applications. Such customization allows one to optimize algorithms by removing communication which is redundant in the context of a specific application. In this paper, we apply the InDiGO framework to study a class of bidding d...
  • Practical Uniform Peer Sampling under Churn

    Publication Year: 2010, Page(s): 93 - 100
    Cited by: Papers (1)

    Providing independent uniform samples from a system population poses considerable problems in highly dynamic settings, like P2P systems, where the number of participants and their unpredictable behavior (e.g., churn and crashes) may introduce relevant bias. Current implementations of the Peer Sampling Service are designed to provide uniform samples only in static settings and do not consider tha...
  • Improving Grid Fault Tolerance by Means of Global Behavior Modeling

    Publication Year: 2010, Page(s): 101 - 108

    Grid systems have proved to be one of the most important new alternatives to face challenging problems but, to exploit their benefits, dependability and fault tolerance are key aspects. However, the vast complexity of these systems limits the efficiency of traditional fault tolerance techniques. It seems necessary to distinguish between resource-level fault tolerance (focused on every machine) and s...
  • Toward a Reliable Distributed Data Management System

    Publication Year: 2010, Page(s): 109 - 116

    Modern collaborative science has placed an increasing burden on data management infrastructure to handle the increasingly large data archives generated. Besides functionality, reliability and availability are also key factors in delivering a data management system that can efficiently and effectively meet the challenges posed and compounded by the unbounded increase in the size of data archive generat...
  • Early Performance Evaluation of New Six-Core Intel® Xeon® 5600 Family Processors for HPC

    Publication Year: 2010, Page(s): 117 - 124
    Cited by: Papers (4)

    In this paper we take a look at what the newest member of the Intel Xeon Processor family, code-named Westmere, brings to high performance computing. We compare three generations of Intel Xeon based systems and present a performance evolution based on 16-node clusters built on these CPUs respectively. We compare CPU generations utilizing dual socket platforms and a cluster across a number of HPC b...
  • Energy Minimization on Thread-Level Speculation in Multicore Systems

    Publication Year: 2010, Page(s): 125 - 132
    Cited by: Papers (1)

    Thread-Level Speculation (TLS) has shown great promise as an automatic parallelization technique to achieve high level performance by partitioning a sequential program into threads, which are expected to be optimistically executed in parallel. In this paper, we propose a load-balancing approach to save energy using dynamic voltage scaling. By scaling the voltage of processors running short threads...
  • Resource-Aware Compiler Prefetching for Many-Cores

    Publication Year: 2010, Page(s): 133 - 140
    Cited by: Papers (2)

    Super-scalar, out-of-order processors that can have tens of read and write requests in the execution window place significant demands on Memory Level Parallelism (MLP). Multi- and many-cores with shared parallel caches further increase MLP demand. Current cache hierarchies, however, have been unable to keep up with this trend, with modern designs allowing only 4-16 concurrent cache misses. This disco...