IEEE Transactions on Parallel and Distributed Systems

Issue 5 • Date May 2010

  • [Front cover]

    Page(s): c1
    PDF (110 KB) | Freely Available from IEEE
  • [Inside front cover]

    Page(s): c2
    PDF (138 KB) | Freely Available from IEEE
  • Editor's Note

    Page(s): 577 - 578
    PDF (143 KB) | Freely Available from IEEE
  • Energy-Efficient Protocol for Deterministic and Probabilistic Coverage in Sensor Networks

    Page(s): 579 - 593
    PDF (1804 KB) | HTML

    Various sensor types, e.g., temperature, humidity, and acoustic, sense physical phenomena in different ways, and thus are expected to have different sensing models. Even for the same sensor type, the sensing model may need to change in different environments. Designing and testing a different coverage protocol for each sensing model is a costly task. To address this challenge, we propose a new probabilistic coverage protocol (denoted by PCP) that can employ different sensing models. We show that PCP works with the common disk sensing model as well as probabilistic sensing models, with minimal changes. We analyze the complexity of PCP and prove its correctness. In addition, we conduct an extensive simulation study of large-scale sensor networks to rigorously evaluate PCP and compare it against other deterministic and probabilistic protocols in the literature. Our simulations demonstrate that PCP is robust and can function correctly in the presence of random node failures, inaccuracies in node locations, and imperfect time synchronization of nodes. Our comparisons with other protocols indicate that PCP outperforms them in several aspects, including the number of activated sensors, total energy consumed, and network lifetime.

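As background for the abstract above, a "sensing model" maps a point's distance from a sensor to a detection probability. A minimal illustrative sketch, in which the function names, radii, and decay law are invented for illustration and not taken from the paper:

```python
import math

def detection_probability(distance, r_certain=5.0, r_max=15.0, alpha=0.5):
    """Illustrative probabilistic sensing model: detection is certain within
    r_certain, impossible beyond r_max, and decays exponentially in between.
    All names and constants here are assumptions, not the paper's model."""
    if distance <= r_certain:
        return 1.0
    if distance >= r_max:
        return 0.0
    return math.exp(-alpha * (distance - r_certain))

def disk_model(distance, r=10.0):
    """The classic disk model is the special case where the probability
    drops from 1 to 0 at a single radius."""
    return 1.0 if distance <= r else 0.0
```

A coverage protocol parameterized over such a function, rather than hard-coded to the disk model, is what lets one protocol serve several sensing models.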
  • An Efficient Superpeer Overlay Construction and Broadcasting Scheme Based on Perfect Difference Graph

    Page(s): 594 - 606
    PDF (1667 KB) | HTML

    Two-layer hierarchical unstructured peer-to-peer (P2P) systems, comprising an upper layer of superpeers and an underlying layer of ordinary peers, are commonly used to improve the performance of large-scale P2P systems. However, optimal superpeer network design involves several requirements, including superpeer degree, network diameter, scalability, load balancing, and flooding performance. A perfect difference graph has desirable properties that satisfy these design requirements for a superpeer overlay network. This paper proposes a two-layer hierarchical unstructured P2P system in which a perfect difference graph (PDG) is used to dynamically construct and maintain the superpeer overlay topology. In addition, the broadcasting performance of the P2P system is enhanced through a PDG-based forwarding algorithm, which ensures that each superpeer receives exactly one lookup query flooding message. Theoretical results show that the proposed system improves on existing superpeer hierarchical unstructured P2P systems in terms of a smaller network diameter, fewer lookup flooding messages, and a reduced average delay, and experimental results show that the proposed system performs very well in dynamic network environments.

  • Coupling-Based Internal Clock Synchronization for Large-Scale Dynamic Distributed Systems

    Page(s): 607 - 619
    PDF (1992 KB) | HTML

    This paper studies the problem of realizing a common software clock among a large set of nodes without an external time reference (i.e., internal clock synchronization) or any centralized control, and where nodes can join and leave the distributed system at will. The paper proposes an internal clock synchronization algorithm that combines the gossip-based paradigm with a nature-inspired approach, derived from the coupled-oscillators phenomenon, to cope with scale and churn. The algorithm works on top of an overlay network and uses a uniform peer-sampling service to fill each node's local view. Thus, unlike clock synchronization protocols for small-scale and static distributed systems, each node synchronizes regularly with only the neighbors in its local view rather than with the whole system. An evaluation of the convergence speed and the synchronization error of the coupling-based internal clock synchronization algorithm has been carried out, showing how the convergence time and the synchronization error depend on the coupling factor and the local view size. Moreover, the variation of the synchronization error with respect to churn and the impact of a sudden variation in the number of nodes have been analyzed to show the stability of the algorithm. In all these contexts, the algorithm shows good performance and very good self-organizing properties. Finally, we show how the assumption of a uniform peer-sampling service is instrumental to the good behavior of the algorithm and how, in system models where network delays are unbounded, a mean-based convergence function reaches a lower synchronization error than median-based convergence functions by exploiting the number of averaged clock values.

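The coupling idea described above, where each node repeatedly nudges its clock toward values sampled from a few peers, can be sketched in a toy simulation. The node count, view size, and coupling factor below are arbitrary illustrative choices, not the paper's parameters, and random sampling stands in for the uniform peer-sampling service:

```python
import random

def gossip_sync_round(clocks, view_size=3, coupling=0.5, rng=random):
    """One gossip round: each node reads the clocks of `view_size` random
    peers (its local view) and moves its own clock toward their mean by
    the coupling factor. Names and defaults are illustrative assumptions."""
    snapshot = list(clocks)
    n = len(clocks)
    for i in range(n):
        neighbors = rng.sample([j for j in range(n) if j != i], view_size)
        mean = sum(snapshot[j] for j in neighbors) / view_size
        clocks[i] += coupling * (mean - clocks[i])
    return clocks

rng = random.Random(42)
clocks = [rng.uniform(0.0, 100.0) for _ in range(50)]  # widely scattered clocks
for _ in range(30):
    gossip_sync_round(clocks, rng=rng)
spread = max(clocks) - min(clocks)  # the synchronization error shrinks each round
```

Each node only ever talks to its sampled view, never the whole system, yet the spread contracts toward a common value, which is the internal-synchronization property the abstract describes.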
  • Efficient Algorithms for Global Snapshots in Large Distributed Systems

    Page(s): 620 - 630
    PDF (2003 KB) | HTML

    Existing algorithms for global snapshots in distributed systems are not scalable when the underlying topology is complete. There are primarily two classes of existing algorithms for computing a global snapshot. Algorithms in the first class use control messages of size O(1) but require O(N) space and O(N) messages per processor in a network with N processors. Algorithms in the second class use control messages of size O(N) (such as a rotating token with the vector counter method), use multiple control messages per channel, or require recording of message history. As a result, algorithms in both classes are not efficient in large systems when the logical topology of the communication layer, such as MPI, is complete. In this paper, we propose three scalable algorithms for global snapshots: a grid-based, a tree-based, and a centralized algorithm. The grid-based algorithm uses O(N) space but only O(√N) messages per processor, each of size O(√N). The tree-based and centralized algorithms use only O(1)-size messages. The tree-based algorithm requires O(1) space and O(log N log(W/N)) messages per processor, where W is the total number of messages in transit. The centralized algorithm requires O(1) space and O(log(W/N)) messages per processor. We also give a matching lower bound for this problem. In addition, we present hybrids of the centralized and tree-based algorithms that allow a trade-off between decentralization and message complexity. Our algorithms have applications in checkpointing, detecting stable predicates, and implementing synchronizers.

  • Quality of Trilateration: Confidence-Based Iterative Localization

    Page(s): 631 - 640
    PDF (2018 KB) | HTML

    The proliferation of wireless and mobile devices has fostered demand for context-aware applications, in which location is one of the most significant contexts. Multilateration, a basic building block of localization, however, has not yet overcome the challenges of 1) poor ranging measurements; 2) dynamic and noisy environments; and 3) fluctuations in wireless communications. Hence, multilateration-based approaches often suffer from poor accuracy and can hardly be employed in practical applications. In this study, we propose the Quality of Trilateration (QoT), which quantifies the geometric relationship of objects and ranging noise. Based on QoT, we design a confidence-based iterative localization scheme in which nodes dynamically select the trilaterations with the highest quality for location computation. To validate this design, a prototype network based on wireless sensor motes is deployed; the results show that QoT well represents trilateration accuracy and that the proposed scheme significantly improves localization accuracy.

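Trilateration itself, the building block whose quality the paper above scores, can be shown with the standard textbook linearization; this is generic background, not the QoT metric or the paper's scheme:

```python
def trilaterate(anchors, dists):
    """Position from three anchors and range measurements: subtract the
    first range equation from the other two, which cancels the quadratic
    terms and leaves a 2x2 linear system (standard textbook derivation)."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = dists
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21  # zero when anchors are collinear
    if abs(det) < 1e-12:
        raise ValueError("anchors are (nearly) collinear: poor geometry")
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det)

# Example: anchors at (0,0), (10,0), (0,10); true position (3,4).
est = trilaterate([(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)],
                  (5.0, 65.0 ** 0.5, 45.0 ** 0.5))
```

The `det` check hints at why geometry matters: nearly collinear anchors make the system ill-conditioned, so small ranging noise produces large position error, which is exactly the kind of effect a quality metric over candidate trilaterations must capture.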
  • The Signal Synchronous Multiclock Approach to the Design of Distributed Embedded Systems

    Page(s): 641 - 657
    PDF (2831 KB) | HTML

    This paper presents the design of distributed embedded systems using the synchronous multiclock model of the SIGNAL language. It proposes a methodology that ensures a correct-by-construction functional implementation of these systems from high-level models, and it shows the capability of the synchronous approach to apply formal techniques and tools that guarantee the reliability of the designed systems. Such a capability is necessary and highly valuable when dealing with safety-critical systems. The proposed methodology is demonstrated through a case study consisting of a simple avionic application, which pragmatically helps the reader understand the formal concepts manipulated and apply them easily to solve system correctness issues encountered in practice. The application's functionality is modeled first, together with its distribution over a generic hardware architecture. This relies on the endochrony and endo-isochrony properties of SIGNAL specifications, defined previously. The considered architectures include asynchronous communication mechanisms, which are also modeled in SIGNAL and proved to achieve message exchanges correctly. Furthermore, the synchronizability of the different parts of the resulting system is addressed after its deployment on a specific execution platform with multirate clocks. After all these steps, distributed code can be automatically generated.

  • PowerPack: Energy Profiling and Analysis of High-Performance Systems and Applications

    Page(s): 658 - 671
    PDF (3653 KB) | HTML

    Energy efficiency is a major concern in modern high-performance computing system design. In the past few years, there has been mounting evidence that power usage limits system scale and computing density, and thus, ultimately, system performance. However, despite the impact of power and energy on the computer systems community, few studies provide insight into where and how power is consumed on high-performance systems and applications. In previous work, we designed a framework called PowerPack, the first tool to isolate the power consumption of devices including disks, memory, NICs, and processors in a high-performance cluster and to correlate these measurements with application functions. In this work, we extend the framework to support systems with multicore, multiprocessor-based nodes, and then provide in-depth analyses of the energy consumption of parallel applications on clusters of these systems. These analyses include the impact of chip multiprocessing on power and energy efficiency and its interaction with application execution. In addition, we use PowerPack to study the power dynamics and energy efficiency of dynamic voltage and frequency scaling (DVFS) techniques on clusters. Our experiments reveal conclusively how intelligent DVFS scheduling can enhance system energy efficiency while maintaining performance.

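Why voltage scaling, and not frequency scaling alone, saves energy for compute-bound code can be seen from the first-order CMOS dynamic-power model. This is a textbook approximation, not PowerPack's measurement methodology, and all numbers below are illustrative:

```python
def dvfs_energy(freq_ghz, volt, work_cycles, capacitance=1.0):
    """First-order dynamic-energy model: P = C * V^2 * f, and a
    compute-bound task of `work_cycles` cycles runs for t = work_cycles / f,
    so E = P * t = C * V^2 * work_cycles -- frequency cancels out, and
    dynamic energy depends on voltage alone."""
    power = capacitance * volt**2 * freq_ghz
    time = work_cycles / freq_ghz
    return power * time

# Lowering frequency alone saves no dynamic energy for compute-bound work
# (the task just runs longer), but lowering voltage along with it does:
e_high = dvfs_energy(3.0, 1.2, 1e9)
e_scaled = dvfs_energy(1.5, 0.9, 1e9)  # the lower frequency permits a lower voltage
```

This is also why DVFS scheduling pays off mainly in phases that are not compute-bound (e.g., waiting on memory or communication), where frequency can drop without extending runtime proportionally.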
  • Real-Time Modeling of Wheel-Rail Contact Laws with System-On-Chip

    Page(s): 672 - 684
    PDF (5362 KB) | HTML

    This paper presents the development and implementation of a multiprocessor system-on-chip solution for fast, real-time simulation of complex and nonlinear wheel-rail contact mechanics. The paper makes two main contributions. First, the wheel-rail contact laws (including the Hertz and Fastsim algorithms), which are widely used in the study of railway vehicle dynamics, are restructured to take advantage of rapidly developing multiprocessor technology. Second, the complex algorithms for the contact laws are successfully implemented on a medium-sized Field-Programmable Gate Array (FPGA) device using six NiosII processors, where the executions of the Hertz and Fastsim parts are pipelined to achieve further enhancement for multiple contacts and the operation scheduling is optimized. In the Fastsim part, the floating-point units, with a buffering mechanism, are efficiently shared by five processors connected in a token-ring topology. The FPGA design shows good flexibility in utilizing the logic elements and on-chip memory resources of the device, and scalability for a significant speedup on a larger device in future work.

  • Toward Systematical Data Scheduling for Layered Streaming in Peer-to-Peer Networks: Can We Go Farther?

    Page(s): 685 - 697
    PDF (2609 KB) | HTML

    Layered streaming in P2P networks has recently become a hot topic. However, the "layered" feature makes data scheduling quite different from that for nonlayered streaming, and it has not yet been systematically studied. In this paper, according to the unique characteristics introduced by layered coding, we first present four objectives that scheduling should address: throughput, layer delivery ratio, useless-packets ratio, and subscription-jitter prevention. We then design a three-stage scheduling approach, LayerP2P, to request data, in which a min-cost flow model, a probabilistic decision mechanism, and a multiwindow remedy mechanism are used in the Free Stage, Decision Stage, and Remedy Stage, respectively, to collaboratively achieve these objectives. Building on the basic version of LayerP2P and the corresponding experimental results from our previous work, this paper focuses on its mechanism details and an analysis of its unique features; in addition, to further guarantee performance under sharp bandwidth variation, we propose an enhanced approach that improves the Decision Stage strategy. Extensive experiments in simulation and a real network implementation indicate that LayerP2P outperforms other schemes. LayerP2P has also been deployed in the PDEPS Project in China, which is expected to be the first practical layered streaming system for education in P2P networks.

  • Self-Consistent MPI Performance Guidelines

    Page(s): 698 - 709
    PDF (771 KB) | HTML

    Message passing using the Message-Passing Interface (MPI) is at present the most widely adopted framework for programming parallel applications for distributed memory and clustered parallel systems. For reasons of (universal) implementability, the MPI standard does not state any specific performance guarantees, but users expect MPI implementations to deliver good and consistent performance in the sense of efficient utilization of the underlying parallel (communication) system. For performance portability reasons, users also naturally desire communication optimizations performed on one parallel platform with one MPI implementation to be preserved when switching to another MPI implementation on another platform. We address the problem of ensuring performance consistency and portability by formulating performance guidelines and conditions that are desirable for good MPI implementations to fulfill. Instead of prescribing a specific performance model (which may be realistic on some systems, under some MPI protocol and algorithm assumptions, etc.), we formulate these guidelines by relating the performance of various aspects of the semantically strongly interrelated MPI standard to each other. Common-sense expectations, for instance, suggest that no MPI function should perform worse than a combination of other MPI functions that implement the same functionality, no specialized function should perform worse than a more general function that can implement the same functionality, no function with weak semantic guarantees should perform worse than a similar function with stronger semantics, and so on. Such guidelines may enable implementers to provide higher quality MPI implementations, minimize performance surprises, and eliminate the need for users to make special, nonportable optimizations by hand. 
We introduce and semiformalize the concept of self-consistent performance guidelines for MPI, and provide a (nonexhaustive) set of such guidelines in a form that could be automatically verified by benchmarks and experiment management tools. We present experimental results that show cases where guidelines are not satisfied in common MPI implementations, thereby indicating room for improvement in today's MPI implementations.

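A guideline of the kind described above, e.g., "no specialized function should perform worse than a more general one that can emulate it," can be checked mechanically against benchmark output. A minimal sketch in which the helper and all timing numbers are invented for illustration; only the two relations checked (MPI_Allreduce vs. MPI_Reduce followed by MPI_Bcast, and MPI_Scatter vs. MPI_Bcast) are of the kind the paper formulates:

```python
def check_guideline(timings, specialized, general, tolerance=1.10):
    """One self-consistent performance guideline on measured timings: the
    specialized operation should not be slower than the more general one
    that can emulate it (with `tolerance` slack for measurement noise).
    `timings` maps operation name -> measured seconds; names are illustrative."""
    return timings[specialized] <= tolerance * timings[general]

# Hypothetical benchmark results for one message size (made-up numbers):
timings = {
    "MPI_Allreduce": 0.012,
    "MPI_Reduce+MPI_Bcast": 0.015,  # emulates Allreduce, so should not be faster
    "MPI_Scatter": 0.004,
    "MPI_Bcast": 0.006,             # Bcast can emulate Scatter by sending everything
}

ok_allreduce = check_guideline(timings, "MPI_Allreduce", "MPI_Reduce+MPI_Bcast")
ok_scatter = check_guideline(timings, "MPI_Scatter", "MPI_Bcast")
```

Running such checks over a full sweep of message sizes and process counts is how a benchmark harness would flag the guideline violations the paper reports.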
  • Connectivity-Based Skeleton Extraction in Wireless Sensor Networks

    Page(s): 710 - 721
    PDF (2802 KB) | HTML

    Many sensor network applications are tightly coupled with the geometric environment in which the sensor nodes are deployed. Topological skeleton extraction has a great impact on the performance of services such as localization, routing, and path planning in wireless sensor networks. Nonetheless, current studies focus on using skeleton extraction for various applications in wireless sensor networks; how to achieve better skeleton extraction has not been thoroughly investigated. There are studies on skeleton extraction in the computer vision community; their centralized algorithms for continuous space, however, are not immediately applicable to the discrete and distributed setting of wireless sensor networks. In this paper, we present a novel Connectivity-bAsed Skeleton Extraction (CASE) algorithm to compute a skeleton graph that is robust to noise and accurately preserves the original topology. In addition, CASE is distributed, as no centralized operation is required, and scalable, as both its time complexity and its message complexity are linearly proportional to the network size. The skeleton graph is extracted by partitioning the boundary of the sensor network to identify the skeleton points, then generating the skeleton arcs, connecting these arcs, and finally refining the coarse skeleton graph. We believe that CASE has broad applications and present a skeleton-assisted segmentation algorithm as an example. Our evaluation shows that CASE is able to extract a well-connected skeleton graph in the presence of significant noise and shape variations, and that it outperforms state-of-the-art algorithms.

  • Inverting Systems of Embedded Sensors for Position Verification in Location-Aware Applications

    Page(s): 722 - 736
    PDF (4292 KB) | HTML

    Wireless sensor networks are typically deployed to monitor phenomena that vary over the spatial region the network covers. The sensor readings may also be put to dual use for additional purposes. In this paper, we propose using the inherent spatial variability of physical phenomena, such as temperature or ambient acoustic energy, to support localization and position verification. We first present the problem of localization using general spatial information fields, and then propose a theory for exploiting this spatial variability for localization. Our Spatial Correlation Weighting Mechanism (SCWM) uses spatial correlation across different phenomena to isolate an appropriate subset of environmental parameters for better location accuracy. We then develop an array of algorithms employing environmental parameters using a two-level approach: first, we develop strategies for how the subset of parameters should be chosen, and second, we derive mapping functions for position estimation. Our algorithms support our theoretical model for performing localization using environmental properties. Finally, we provide an experimental evaluation of our approach using a collection of physical phenomena measured across 100 locations inside a building. Our results provide strong evidence of the viability of using general sensor readings for location-aware applications.

  • TPDS Information for authors

    Page(s): c3
    PDF (138 KB) | Freely Available from IEEE
  • [Back cover]

    Page(s): c4
    PDF (110 KB) | Freely Available from IEEE

Aims & Scope

IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers.

Full Aims & Scope

Meet Our Editors

Editor-in-Chief
David Bader
College of Computing
Georgia Institute of Technology