
2007 IEEE International Parallel and Distributed Processing Symposium (IPDPS 2007)

Date: 26–30 March 2007


Displaying results 1–25 of 499
  • [Front cover]

    Page(s): C1
    PDF (258 KB)
    Freely Available from IEEE
  • [Copyright notice]

    Page(s): ii
    PDF (332 KB)
    Freely Available from IEEE
  • 21st International Parallel and Distributed Processing Symposium

    Page(s): iii
    PDF (342 KB)
    Freely Available from IEEE
  • Summary of Contents

    Page(s): v
    PDF (619 KB)
    Freely Available from IEEE
  • Detailed Table of Contents

    Page(s): vii - xxxv
    PDF (19157 KB)
    Freely Available from IEEE
  • Message from the General Chair

    Page(s): 2 - 3
    PDF (96 KB)
    Freely Available from IEEE
  • Message from the Program Chair

    Page(s): 4 - 5
    PDF (539 KB)
    Freely Available from IEEE
  • Message from the Workshops Chair

    Page(s): 6
    PDF (101 KB)
    Freely Available from IEEE
  • Message from the Steering Co-Chairs

    Page(s): 7
    PDF (108 KB)
    Freely Available from IEEE
  • IPDPS 2007 Organization

    Page(s): 8 - 11
    PDF (102 KB)
    Freely Available from IEEE
  • TCPP Presentation and Invited Speech: Reinventing Computing

    Page(s): 12
    PDF (92 KB)

    The many-core inflection point presents a new challenge for our industry, namely general-purpose parallel computing. Unless this challenge is met, the continued growth and importance of computing itself and of the businesses engaged in it are at risk. We must make parallel programming easier and more generally applicable than it is now, and build hardware and software that will execute arbitrary parallel programs on whatever scale of system the user has. The changes needed to accomplish this are significant and affect computer architecture, the entire software development tool chain, and the army of application developers that will rely on those tools to develop parallel applications. This talk will point out a few of the hard problems that face us and some prospects for addressing them.

  • Keynote Speech: Large-Scale Bioimaging and Visualization

    Page(s): 12 - 13
    PDF (95 KB)

    The next decades will see an explosion in the use and the scope of medical imaging, fueled by advanced computing and visualization techniques. In my opinion, advanced, multimodal imaging and visualization techniques, powered by new computational methods, will change the face of biology and medicine and provide comprehensive views of the human body in progressively greater depth and detail. As the resolution of imaging devices continues to increase, image sizes grow accordingly. Multi-modal and/or longitudinal imaging studies result in large-scale data sets requiring parallel computing and visualization. In this presentation, I will discuss the state of the art in large-scale biomedical imaging and visualization research, present examples of their vital roles in neuroscience, neurosurgery, radiology, and biology, and discuss future challenges.

  • Symposium Evening Tutorial: High-performance Computing Methods for Computational Genomics

    Page(s): 13
    PDF (70 KB)

    As biomolecular sequence data continue to be amassed at unprecedented rates, the design of effective computational methods and capabilities that can derive biologically significant information from them has become both increasingly challenging and imperative. In this tutorial, the audience will first be introduced to the different types of biomolecular sequence data and the wealth of information they encode. Following this technical grounding, high-performance computing approaches developed to address some of the most computationally challenging problems in genomics will be described. The contents will be presented in three parts: (i) In the first part, we will describe methods designed to query a sequence against a large sequence database. Two popular parallel approaches implementing the NCBI BLAST suite of programs, mpiBLAST and ScalaBLAST, will be described. (ii) Next, we will describe PaCE, a parallel DNA sequence clustering algorithm. As direct applications, we will discuss the clustering of large-scale Expressed Sequence Tag data and the assembly of complex genomes. (iii) Finally, we will describe GRAPPA, a high-performance software suite developed for phylogenetic reconstruction of a collection of genomes or genes. Throughout the tutorial, the emphasis will be on both scalability and effectiveness in exploiting large-scale, state-of-the-art supercomputing technologies. The intended audience is academic and industry researchers, educators, and commercial application developers with a computational background. No background in biology is assumed.

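    A minimal sketch of the database-segmentation idea behind mpiBLAST-style parallel search: partition the sequence database across workers, scan each fragment independently, and merge the per-fragment hits. The k-mer-overlap scoring below is a hypothetical stand-in for the real BLAST heuristics, and all names are illustrative.

        # Sketch of database-segmented parallel sequence search (the
        # strategy behind mpiBLAST-style tools). The toy scoring counts
        # shared k-mers; real BLAST heuristics are far more involved.
        from concurrent.futures import ProcessPoolExecutor

        K = 4  # hypothetical k-mer length

        def kmers(seq, k=K):
            return {seq[i:i + k] for i in range(len(seq) - k + 1)}

        def scan_fragment(args):
            # Score every sequence in one database fragment against the query.
            query, fragment = args
            qk = kmers(query)
            return [(name, len(qk & kmers(seq))) for name, seq in fragment]

        def parallel_search(query, database, workers=4):
            # Partition the database, scan fragments in parallel, merge hits.
            fragments = [database[i::workers] for i in range(workers)]
            with ProcessPoolExecutor(max_workers=workers) as pool:
                parts = pool.map(scan_fragment, [(query, f) for f in fragments])
            hits = [h for part in parts for h in part]
            return sorted(hits, key=lambda h: h[1], reverse=True)

        if __name__ == "__main__":
            db = [("seq%d" % i, "ACGT" * (10 + i % 5)) for i in range(100)]
            print(parallel_search("ACGTACGTAC", db)[:3])
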
  • Keynote Speech: Avoiding the Memory Bottleneck through Structured Arrays

    Page(s): 13 - 14
    PDF (98 KB)

    Dealing with memory bandwidth requirements is basic to parallel program speedup. One solution is an architectural arrangement that streams data across multiple processing elements before storing the result in memory. This MISD type of configuration provides multiple operations per data item fetched from memory. One realization of this streamed approach uses FPGAs. We'll discuss both the general memory problem and some results based on work at Maxeler using FPGAs for acceleration.

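    The talk's central point, getting multiple operations per data item fetched from memory, can be sketched in software as operator fusion: rather than several full passes that each read and write a large array, each element streams through the whole operation chain once. The sketch below is purely illustrative; the FPGA pipelines the talk describes realize the same idea in hardware.

        # Contrast a memory-bound multi-pass computation with a fused
        # streaming version that touches each element exactly once.
        def multi_pass(data):
            # Three passes: each intermediate array travels to and from memory.
            a = [x * 2.0 for x in data]
            b = [x + 1.0 for x in a]
            return [x * x for x in b]

        def fused_stream(data):
            # One pass: each element is fetched once and flows through
            # the whole operation chain before the result is stored.
            out = []
            for x in data:
                y = x * 2.0 + 1.0
                out.append(y * y)
            return out

        assert multi_pass([1.0, 2.0]) == fused_stream([1.0, 2.0])
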
  • IPDPS Panel: Is the Multi-Core Roadmap Going to Live Up to Its Promises?

    Page(s): 14
    PDF (96 KB)

    Multi-cores are here to stay, whether we like it or not. With a quadrupling of the core count every three years, chips with hundreds of processor cores are projected in the next decade. The question is how much of their computational power can be unleashed, what it will take to unleash it, and how research can best accelerate progress. Several decades of research in multiprocessing have not really made the case. On the other hand, now that coarse-grain parallelism seems to be our only hope and the computing landscape is arguably different, opportunities may arise. The following cross-cutting issues will be debated in this panel with the hope of distilling new avenues for parallelism exploitation: Is the computing landscape (technology, applications, and market) today sufficiently different from what it was in the past to exploit multiprocessors? If yes, in what sense? If not, why? Do we need more research in multiprocessing given past work? If yes, what are the biggest challenges? If not, state the reasons. Will progress in software/architecture make it possible for sequential languages to prevail? If yes, what are the top priorities in research to make that happen? If not, what are the visions for a parallel-language paradigm shift, and what are the major challenges in software/architecture research to accelerate uptake in the programming community? Would multi-disciplinary research (across the applications, algorithms, software, and architecture areas) be a good way to accelerate developments? If so, what areas should interact more closely, and with what goals in mind?

  • Banquet and Invited Speech: Why Peta-Scale is Different: An Ecosystem Approach to Predictive Scientific and Engineering Simulation

    Page(s): 14 - 15
    PDF (100 KB)

    With the recent advent of hundreds-of-teraFLOP/s simulation capability at Lawrence Livermore National Laboratory and other sites, it has become clear that the scientific method has changed. This transition has taken us from theory and experiment to theory and experiment tightly integrated by simulation. With peta-scale simulations on the horizon, it is appropriate to take stock of the recent advances and to look forward to the coming wave of future systems. In this talk we focus on some areas of science that open up with peta-scale systems and how this is VERY different from the science one can accomplish with a single workstation (giga-scale simulation). In fact, the science enabled by tera-scale and peta-scale systems requires a whole new approach to the scientific method. One of the things we are starting to realize, being at the leading edge of applying this new technology, is that with the coming onset of peta-scale simulations (systems, visualization, and applications) we may be headed for huge scientific breakthroughs enabled…

  • Keynote Speech: Quantum Physics and the Nature of Computation

    Page(s): 15 - 16
    PDF (97 KB)

    Quantum physics is a fascinating area from a computational viewpoint. The features that make quantum systems prohibitively hard to simulate classically are precisely the aspects exploited by quantum computation to obtain exponential speedups over classical computers. In this talk I will survey our current understanding of the power (and limits) of quantum computers, and prospects for experimentally realizing them in the near future. I will also touch upon insights from quantum computation that have resulted in new classical algorithms for efficient simulation of certain important quantum systems.

  • IPDPS 2007 Reviewers

    Page(s): 17 - 18
    PDF (84 KB)
    Freely Available from IEEE
  • High-performance Computing Methods for Computational Genomics

    Page(s): 1 - 143
    PDF (62960 KB)

    The article consists of a PowerPoint presentation on high-performance computing methods for computational genomics. The areas discussed include: sequence alignment; database querying; EST clustering; genome assembly; evolutionary history reconstruction.

  • VoroNet: A scalable object network based on Voronoi tessellations

    Page(s): 1 - 10
    PDF (476 KB) | HTML

    In this paper, we propose the design of VoroNet, an object-based peer-to-peer overlay network relying on Voronoi tessellations, along with its theoretical analysis and experimental evaluation. VoroNet differs from previous overlay networks in that peers are application objects themselves and get identifiers reflecting the semantics of the application instead of relying on hashing functions. This enables scalable support for efficient search in large collections of data. In VoroNet, objects are organized in an attribute space according to a Voronoi diagram. VoroNet is inspired by Kleinberg's small-world model, in which each peer is connected to close neighbours and maintains an additional pointer to a long-range neighbour. VoroNet improves upon the original proposal in that it deals with general object topologies and therefore copes with skewed data distributions. We show that VoroNet can be built and maintained in a fully decentralized way. The theoretical analysis of the system proves that routing in VoroNet can be achieved in a number of hops poly-logarithmic in the size of the system. The analysis is fully confirmed by our experimental evaluation by simulation.

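    The poly-logarithmic routing bound rests on Kleinberg-style greedy forwarding: each hop moves to the neighbor, local or long-range, whose coordinates are closest to the target in the attribute space. Below is a minimal sketch of that greedy step, with a hypothetical Node type and Euclidean attribute distance; in VoroNet itself the neighbor sets come from the Voronoi tessellation.

        # Minimal sketch of greedy routing in an attribute space, the
        # mechanism behind VoroNet's poly-logarithmic hop count. The Node
        # layout is a hypothetical placeholder; VoroNet derives neighbor
        # sets from the Voronoi diagram of object attributes.
        import math

        class Node:
            def __init__(self, coords):
                self.coords = coords
                self.neighbors = []  # close neighbors plus one long-range link

        def greedy_route(start, target, max_hops=100):
            # Forward to the neighbor closest to the target until no
            # neighbor improves on the current node.
            current = start
            for hops in range(max_hops):
                best = min(current.neighbors,
                           key=lambda n: math.dist(n.coords, target),
                           default=None)
                if best is None or (math.dist(best.coords, target)
                                    >= math.dist(current.coords, target)):
                    return current, hops  # local minimum: owner of the target's region
                current = best
            return current, max_hops
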
  • Almost Peer-to-Peer Clock Synchronization

    Page(s): 1 - 10
    PDF (2389 KB) | HTML

    In this paper, an almost peer-to-peer (AP2P) clock synchronization protocol is proposed. AP2P is almost peer-to-peer in the sense that it provides the desirable features of a purely hierarchical (client/server) clock synchronization protocol while avoiding the undesirable consequences of a purely peer-to-peer one. In AP2P, a unique node is elected as a leader in a distributed manner. Each non-leader node adjusts its clock rate based on message exchanges with its neighbors, with neighbors closer to the leader having more effect on the adjustment than those further away. We compare the performance of AP2P with that of the server time protocol (STP), a purely hierarchical clock synchronization protocol. Simulations conducted on several network topologies show that AP2P can provide clock synchronization accuracy indistinguishable from that of STP. Furthermore, AP2P is more fault-tolerant because it can recover from certain types of failures that STP cannot.

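    The abstract's key rule, that neighbors closer to the leader carry more weight in the rate adjustment, can be rendered as a weighted average of neighbor rate estimates. This is a hypothetical sketch of the weighting structure only, not the paper's actual update equations; the 1/(1+d) weights and the gain are illustrative choices.

        # Hypothetical sketch of AP2P-style weighting: a node nudges its
        # clock rate toward a weighted average of its neighbors' rates,
        # with neighbors nearer the leader weighted more heavily.
        def adjust_rate(own_rate, neighbor_rates, leader_dists, gain=0.5):
            # leader_dists[i] is neighbor i's hop distance to the leader.
            weights = [1.0 / (1 + d) for d in leader_dists]  # closer => heavier
            target = (sum(w * r for w, r in zip(weights, neighbor_rates))
                      / sum(weights))
            return own_rate + gain * (target - own_rate)

        # Example: the neighbor one hop from the leader dominates the average.
        print(adjust_rate(1.000, [1.004, 0.996], [1, 5]))
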
  • Locality-Aware Consistency Maintenance for Heterogeneous P2P Systems

    Page(s): 1 - 10
    PDF (348 KB) | HTML

    Replication and caching have been deployed widely in current P2P systems. In update-allowed P2P systems, a consistency maintenance mechanism is strongly demanded. Several solutions have been proposed to maintain consistency in P2P systems, but they either use too many redundant update messages or ignore the heterogeneous nature of P2P systems. Moreover, they propagate updated contents over a locality-ignorant structure, which can consume unnecessary backbone bandwidth and delay the convergence of consistency maintenance. This paper presents a locality-aware consistency maintenance scheme for heterogeneous P2P systems. To account for this heterogeneity, we organize the replica nodes into a locality-aware hierarchical structure: the upper layer is DHT-based, and each node in the lower layer attaches to a physically close node in the upper layer. An efficient update tree is built dynamically upon the upper layer to propagate updated contents. Theoretical analyses and simulation results demonstrate the effectiveness of our scheme. In particular, experimental results show that, compared with a gossip-based scheme, our approach reduces the cost by about one order of magnitude.

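    A sketch of how the two-layer propagation described above might work: push the update down a tree over the upper (DHT-based) layer, and let each upper-layer node forward to the physically close lower-layer replicas attached to it. Node structure and tree shape here are illustrative, not the paper's construction.

        # Hypothetical two-layer update propagation: an update tree over
        # the upper (DHT) layer, each upper node forwarding to its
        # attached, physically close lower-layer replicas.
        class UpperNode:
            def __init__(self, name):
                self.name = name
                self.children = []  # children in the dynamically built update tree
                self.attached = []  # lower-layer replicas attached to this node

        def propagate(node, update, deliver):
            deliver(node.name, update)
            for replica in node.attached:   # cheap, physically local hops
                deliver(replica, update)
            for child in node.children:     # recurse down the update tree
                propagate(child, update, deliver)

        root, u1, u2 = UpperNode("u0"), UpperNode("u1"), UpperNode("u2")
        root.children = [u1, u2]
        u1.attached = ["r1a", "r1b"]
        propagate(root, "version-2", lambda node, upd: print(node, "applies", upd))
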
  • Benefits of Targeting in Trusted Gossiping for Peer-to-Peer Information Sharing

    Page(s): 1 - 10
    PDF (322 KB) | HTML

    In a recent study, we proposed a trusted gossip protocol for rumor-resistant information sharing in peer-to-peer networks. While trust-aware gossiping significantly reduced the rumor spread on the network, we observed that the random message spraying in trusted gossip creates too many redundant messages, increasing the message overhead and error rate. In this paper, we propose a message targeting scheme that can significantly improve the performance of trusted gossip. Our targeting scheme can be easily implemented in a social network setting. We performed large-scale simulations using traces collected from the Flickr social network and other data sets to estimate the performance of targeting in trusted gossip. Our experiments show that significant performance gains can be achieved.

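    A targeting step of the kind described might look like the following: instead of spraying the message to random peers, rank candidates by a trust score and pick the most trusted peers that have not yet seen it. Purely illustrative; the paper derives its targets from social-network structure rather than a bare score table.

        # Illustrative targeted gossip: rank not-yet-informed peers by a
        # trust score and select the top few, instead of random spraying.
        import random

        def random_spray(peers, seen, fanout=3):
            candidates = [p for p in peers if p not in seen]
            return random.sample(candidates, min(fanout, len(candidates)))

        def targeted_gossip(peers, trust, seen, fanout=3):
            candidates = [p for p in peers if p not in seen]
            candidates.sort(key=lambda p: trust.get(p, 0.0), reverse=True)
            return candidates[:fanout]

        trust = {"a": 0.9, "b": 0.2, "c": 0.7, "d": 0.4}
        print(targeted_gossip(list(trust), trust, seen={"d"}, fanout=2))  # ['a', 'c']
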
  • Building the Tree of Life on Terascale Systems

    Page(s): 1 - 10
    PDF (11702 KB) | HTML

    Bayesian phylogenetic inference is an important alternative to maximum likelihood-based phylogenetic methods. However, inferring large trees using the Bayesian approach is computationally demanding, requiring huge amounts of memory and months of computational time. With a combination of novel parallel algorithms and the latest system technology, terascale phylogenetic tools give biologists the computational power necessary to conduct experiments on very large datasets and thus aid construction of the tree of life. In this work we evaluate the performance of PBPI, a parallel application that reconstructs phylogenetic trees using MCMC-based Bayesian methods, on two terascale systems: Blue Gene/L at IBM Rochester and System X at Virginia Tech. Our results confirm that, for a benchmark dataset with 218 taxa and 10000 characters, PBPI achieves linear speedup on 1024 or more processors on both systems.

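    One standard way such likelihood-based tools scale is data decomposition: the alignment's characters (columns) are split across processes, each computes a partial log-likelihood, and a reduction forms the total. The sketch below shows only that decomposition, with a toy stand-in for the per-site computation; PBPI's actual parallelization is described in the paper.

        # Sketch of data-decomposed phylogenetic likelihood: split the
        # alignment columns across ranks and sum partial log-likelihoods.
        # The per-site function is a toy stand-in; real codes evaluate it
        # with Felsenstein's pruning algorithm over the tree.
        from math import log

        def site_log_likelihood(column, tree):
            return log(0.25) * len(column)  # hypothetical placeholder

        def partial_loglik(columns, tree, rank, nprocs):
            # Rank r handles columns r, r + nprocs, r + 2*nprocs, ...
            return sum(site_log_likelihood(c, tree) for c in columns[rank::nprocs])

        # With MPI the per-rank partials would meet in an allreduce; here
        # we just sum over simulated ranks.
        columns = [list("ACGT")] * 10000  # 10000 characters, as in the benchmark
        total = sum(partial_loglik(columns, None, r, 4) for r in range(4))
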
  • Inverse Space-Filling Curve Partitioning of a Global Ocean Model

    Page(s): 1 - 10
    PDF (1581 KB) | HTML

    In this paper, we describe how inverse space-filling curve partitioning is used to increase the simulation rate of a global ocean model. Space-filling curve partitioning eliminates the load imbalance in the computational grid due to land points. Improved load balance, combined with code modifications within the conjugate gradient solver, significantly increases the simulation rate of the Parallel Ocean Program at high resolution. The simulation rate for a high-resolution model nearly doubled, from 4.0 to 7.9 simulated years per day on 28,972 IBM Blue Gene/L processors. We also demonstrate that our techniques increase the simulation rate on 7545 Cray XT3 processors from 6.3 to 8.1 simulated years per day. Our results demonstrate how minor code modifications can have a significant impact on performance at very large processor counts.

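    The core idea, ordering only the ocean points along a space-filling curve and cutting the curve into equal pieces so that land never causes imbalance, can be sketched with a Morton (Z-order) curve. The paper's curve and grid details differ; this is only a hypothetical illustration of the mechanism.

        # Illustrative inverse space-filling-curve partitioning: order the
        # active (ocean) cells along a Z-order curve and cut the curve into
        # equal pieces, so land cells never contribute to load imbalance.
        def morton(i, j, bits=16):
            # Interleave the bits of (i, j) to get the Z-order index.
            key = 0
            for b in range(bits):
                key |= ((i >> b) & 1) << (2 * b + 1) | ((j >> b) & 1) << (2 * b)
            return key

        def partition_ocean(is_ocean, nparts):
            # is_ocean maps (i, j) -> bool; only ocean cells are assigned.
            ocean = sorted((c for c, wet in is_ocean.items() if wet),
                           key=lambda c: morton(*c))
            size = -(-len(ocean) // nparts)  # ceiling division
            return {c: k // size for k, c in enumerate(ocean)}

        # Toy 4x4 grid with a 2x2 land corner: 12 ocean cells, 3 per part.
        grid = {(i, j): not (i < 2 and j < 2) for i in range(4) for j in range(4)}
        parts = partition_ocean(grid, 4)
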