9-11 Feb. 2011
-
[Front cover]
Publication Year: 2011, Page(s): C1
-
[Title page i]
Publication Year: 2011, Page(s): i
-
[Title page iii]
Publication Year: 2011, Page(s): iii
-
[Copyright notice]
Publication Year: 2011, Page(s): iv
-
Table of contents
Publication Year: 2011, Page(s): v - xii
-
Preface from the Program Chairs
Publication Year: 2011, Page(s): xiii
-
Preface from the Organizing Chair
Publication Year: 2011, Page(s): xiv
-
Program Committee
Publication Year: 2011, Page(s): xv
-
Additional Reviewers
Publication Year: 2011, Page(s): xvi
-
A Fast and Verified Algorithm for Proving Store-and-Forward Networks Deadlock-Free
Publication Year: 2011, Page(s): 3 - 10
Cited by: Papers (2)
Deadlocks are an important issue in the design of interconnection networks. A successful approach is to restrict the routing function such that it satisfies a necessary and sufficient condition for deadlock-free routing. Typically, such a condition states that some (extended) dependency graph must be acyclic. Defining and proving such a condition is complex. Proving that a routing function satisf...
-
Dynamic I/O Reconfiguration for a NFS-Based Parallel File System
Publication Year: 2011, Page(s): 11 - 18
The large gap between the speed at which data can be processed and the performance of I/O devices makes the shared storage infrastructure of a cluster a major bottleneck. Parallel File Systems try to smooth this difference by distributing data onto several servers, increasing the system's available bandwidth. However, most implementations use a fixed number of I/O servers, defined during the init...
-
Reliability Study of Coding Schemes for Wide-Area Distributed Storage Systems
Publication Year: 2011, Page(s): 19 - 23
Cited by: Papers (1)
Distributed storage systems comprise a large number of commodity hardware components distributed across several data centers. Even in the presence of failures (permanent failures), the system should provide reliable storage. While replication has advantages because of its simplicity, there exist coding techniques that provide adaptable reliability properties with an optimal redundancy ratio at the same time e....
-
A Redundant Communication Approach to Scalable Fault Tolerance in PGAS Programming Models
Publication Year: 2011, Page(s): 24 - 31
Cited by: Papers (9)
Recent trends in high-performance computing point toward increasingly large machines with millions of processing, storage, and networking elements. Unfortunately, the reliability of these machines is inversely proportional to their size, resulting in a system-wide mean time between failures (MTBF), ranging from a few days to a few hours. As such, for long-running applications, the ability to effic...
-
Quantifying Thread Vulnerability for Multicore Architectures
Publication Year: 2011, Page(s): 32 - 39
Cited by: Papers (2)
Continuously reducing transistor sizes and aggressive low power operating modes employed by modern architectures tend to increase transient error rates. Concurrently, multicore machines are dominating the architectural spectrum in various application domains. These two trends require a fresh look at resiliency of multithreaded applications against transient errors from a software perspective. In t...
-
In Situ Power Analysis of General Purpose Graphical Processing Units
Publication Year: 2011, Page(s): 40 - 44
Cited by: Papers (4)
In this paper, an in situ power analysis profiling over time for general purpose graphics processing units (GPGPU) is presented. Based on this method, the power consumption of different modes of operation, such as data transfer between GPU and host CPU, basic single precision floating point arithmetic operations (addition, subtraction, multiplication) on the multiprocessor units and instructions for s...
-
Job Scheduling with License Reservation: A Semantic Approach
Publication Year: 2011, Page(s): 47 - 54
Cited by: Papers (1)
License management is one of the main concerns when Independent Software Vendors (ISV) try to distribute their software on computing platforms such as Clouds. They want to be sure that customers use their software according to their license terms. The work presented in this paper tries to solve part of this problem by extending a semantic resource allocation approach for supporting the scheduling...
-
A Deadline Satisfaction Enhanced Workflow Scheduling Algorithm
Publication Year: 2011, Page(s): 55 - 61
Cited by: Papers (1)
Meeting users' deadline constraints is usually the most important goal of workflow scheduling in a Grid environment. In order to consider the dynamism of Grid resources, we adopted a stochastic model to describe dynamic workloads of Grid resources. A concept called Deadline Satisfaction Degree of Workflow (DSDW) was defined to represent the probability that a workflow could be completed before its dea...
-
Distributed Load Balancing for Parallel Agent-Based Simulations
Publication Year: 2011, Page(s): 62 - 69
Cited by: Papers (16)
We focus on agent-based simulations where a large number of agents move in space, obeying some simple rules. Since such simulations are computationally intensive, it is challenging, in such a context, to let the number of agents grow and to increase the quality of the simulation. A fascinating way to answer this need is by exploiting parallel architectures. In this paper, we pr...
-
A Failure Handling Framework for Distributed Data Mining Services on the Grid
Publication Year: 2011, Page(s): 70 - 79
Cited by: Papers (1)
Fault tolerance is an important issue in Grid computing, where many heterogeneous machines are used. In this paper we present a flexible failure handling framework which extends a previously proposed service-oriented architecture for Distributed Data Mining, addressing the requirements for fault tolerance in the Grid. The framework allows users to achieve failure recovery whenever a crash can o...
-
Balancing Workloads of Servers Maintaining Scalable Distributed Data Structures
Publication Year: 2011, Page(s): 80 - 84
Cited by: Papers (2)
A new architecture of Scalable Distributed Data Structures (SDDS) is presented and evaluated. It applies to SDDS files with overactive servers. Every bucket of the file is supplemented with a reference counter. The number of references to a bucket is counted. It reflects the activity of the bucket and is used for selecting the most active and most often used buckets (overactive servers). Workloads...
-
High Performance Matrix Inversion on a Multi-core Platform with Several GPUs
Publication Year: 2011, Page(s): 87 - 93
Cited by: Papers (6)
Inversion of large-scale matrices appears in a few scientific applications like model reduction or optimal control. Matrix inversion requires a substantial computational effort and, therefore, the application of high performance computing techniques and architectures for matrices with dimension in the order of thousands. Following the recent rise of graphics processors (GPUs), we present and eval...
-
Parallization of Adaboost Algorithm through Hybrid MPI/OpenMP and Transactional Memory
Publication Year: 2011, Page(s): 94 - 100
Cited by: Papers (2)
This paper proposes a parallelization of the Adaboost algorithm through hybrid usage of MPI, OpenMP, and transactional memory. After detailed analysis of the Adaboost algorithm, we show that multiple levels of parallelism exist in the algorithm. We develop the lower level of parallelism through OpenMP and the higher level of parallelism through MPI. Software transactional memory is used to facilitate t...
-
Scaleable Sparse Matrix-Vector Multiplication with Functional Memory and GPUs
Publication Year: 2011, Page(s): 101 - 108
Sparse matrix-vector multiplication on GPUs faces a serious problem when the vector length is too large to be stored in the GPU's device memory. To solve this problem, we propose a novel software-hardware hybrid method for a heterogeneous system with GPUs and functional memory modules connected by PCI Express. The functional memory contains a huge memory capacity and provides scatter/gather operat...
-
Accelerating Parameter Sweep Applications Using CUDA
Publication Year: 2011, Page(s): 111 - 118
This paper proposes a parallelization scheme for parameter sweep (PS) applications using the compute unified device architecture (CUDA). Our scheme focuses on PS applications with irregular access patterns, which usually result in lower performance on the GPU. The key idea to resolve this irregularity is to exploit the similarity of data accesses between different parameters. That is, the scheme s...
-
FFT Implementation on a Streaming Architecture
Publication Year: 2011, Page(s): 119 - 126
Cited by: Papers (2)
Fast Fourier Transform (FFT) is a useful tool for applications requiring signal analysis and processing. However, its high computational cost requires efficient implementations, especially in real-time applications, where response time is a decisive factor. Thus, the computational cost of the FFT and the wide range of applications that require it have motivated research on efficient implement...