
IEEE Transactions on Parallel and Distributed Systems

Issue 10 • Date Oct. 2007


Displaying Results 1 - 14 of 14
  • [Front cover]

    Publication Year: 2007 , Page(s): c1
    PDF (99 KB)
    Freely Available from IEEE
  • [Inside front cover]

    Publication Year: 2007 , Page(s): c2
    PDF (82 KB)
    Freely Available from IEEE
  • A Quorum-Based Group Mutual Exclusion Algorithm for a Distributed System with Dynamic Group Set

    Publication Year: 2007 , Page(s): 1345 - 1360
    Cited by:  Papers (6)
    PDF (3356 KB) | HTML

    The group mutual exclusion problem extends the traditional mutual exclusion problem by associating a type (or a group) with each critical section. In this problem, processes requesting critical sections of the same type can execute their critical sections concurrently. However, processes requesting critical sections of different types must execute their critical sections in a mutually exclusive manner. We present a distributed algorithm for solving the group mutual exclusion problem based on the notion of surrogate-quorum. Intuitively, our algorithm uses the quorum that has been successfully locked by a request as a surrogate to service other compatible requests for the same type of critical section. Unlike the existing quorum-based algorithms for group mutual exclusion, our algorithm achieves a low message complexity of O(q) and a low (amortized) bit-message complexity of O(bqr), where q is the maximum size of a quorum, b is the maximum number of processes from which a node can receive critical section requests, and r is the maximum size of a request, while maintaining both synchronization delay and waiting time at two message hops. As opposed to some existing quorum-based algorithms, our algorithm can adapt without performance penalties to dynamic changes in the set of groups. Our simulation results indicate that our algorithm outperforms the existing quorum-based algorithms for group mutual exclusion by as much as 45 percent in some cases. We also discuss how our algorithm can be extended to satisfy certain desirable properties such as concurrent entry and freedom from unnecessary blocking.

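The O(q) message complexity above is governed by quorum size. As a point of reference (this is not the paper's surrogate-quorum construction, only an illustration of quorum intersection), a classic grid quorum over n nodes yields quorums of size O(√n) whose pairwise intersection provides the serialization that mutual exclusion needs; a minimal sketch:

```python
import math

def grid_quorum(node, n):
    """Return a quorum for `node` in an r x r grid of n = r*r nodes:
    the node's full row plus the node's column entry in every other row.
    Any two such quorums intersect, which is what mutual exclusion needs."""
    r = int(math.isqrt(n))
    assert r * r == n, "sketch assumes a perfect-square number of nodes"
    row, col = divmod(node, r)
    row_members = {row * r + c for c in range(r)}   # the whole row
    col_members = {rr * r + col for rr in range(r)} # the whole column
    return row_members | col_members

# Any two quorums share at least one node, so conflicting requests
# are serialized at whichever shared node grants its lock first.
q0, q5 = grid_quorum(0, 9), grid_quorum(5, 9)
assert q0 & q5  # non-empty intersection
```

The surrogate-quorum idea layers group compatibility on top of such basic quorum machinery: a quorum already locked for one request can service other requests of the same type.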
  • Distributed Selfish Caching

    Publication Year: 2007 , Page(s): 1361 - 1376
    Cited by:  Papers (6)
    PDF (2173 KB) | HTML

    Although cooperation generally increases the amount of resources available to a community of nodes, thus improving individual and collective performance, it also allows for the appearance of potential mistreatment problems through the exposure of one node's resources to others. We study such concerns by considering a group of independent, rational, self-aware nodes that cooperate using online caching algorithms, where the exposed resource is the storage at each node. Motivated by content networking applications - including Web caching, content delivery networks (CDNs), and peer-to-peer (P2P) systems - this paper extends our previous work on the offline version of the problem, which was conducted under a game-theoretic framework and limited to object replication. We identify and investigate two causes of mistreatment: 1) cache state interactions (due to the cooperative servicing of requests) and 2) the adoption of a common scheme for cache management policies. Using analytic models, numerical solutions of these models, and simulation experiments, we show that online cooperation schemes using caching are fairly robust to mistreatment caused by state interactions. For mistreatment to appear in a substantial manner, the interaction through the exchange of miss streams has to be very intense, making it feasible for the mistreated nodes to detect and react to exploitation. This robustness ceases to exist when nodes fetch and store objects in response to remote requests, that is, when they operate as level-2 caches (or proxies) for other nodes. Regarding mistreatment due to a common scheme, we show that this can easily take place when the "outlier" characteristics of some of the nodes are overlooked. This finding underscores the importance of allowing cooperative caching nodes the flexibility of choosing from a diverse set of schemes to fit the peculiarities of individual nodes. To that end, we outline an emulation-based framework for the development of mistreatment-resilient distributed selfish caching schemes.

  • High-Performance Reduction Circuits Using Deeply Pipelined Operators on FPGAs

    Publication Year: 2007 , Page(s): 1377 - 1392
    Cited by:  Papers (26)  |  Patents (1)
    PDF (4636 KB) | HTML

    Field-programmable gate arrays (FPGAs) have become an attractive option for accelerating scientific applications. Many scientific operations such as matrix-vector multiplication and dot product involve the reduction of a sequentially produced stream of values. Unfortunately, because of the pipelining in FPGA-based floating-point units, data hazards may occur during these sequential reduction operations. Improperly designed reduction circuits can adversely impact the performance, impose unrealistic buffer requirements, and consume a significant portion of the FPGA. In this paper, we identify two basic methods for designing serial reduction circuits: the tree-traversal method and the striding method. Using accumulation as an example, we analyze the design trade-offs among the number of adders, buffer size, and latency. We then propose high-performance and area-efficient designs using each method. The proposed designs reduce multiple sets of sequentially delivered floating-point values without stalling the pipeline or imposing unrealistic buffer requirements. Using a Xilinx Virtex-II Pro FPGA as the target device, we implemented our designs and present performance and area results.

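The data hazard described above arises because an α-stage pipelined adder cannot consume its own result for α cycles. A common workaround, and roughly the intuition behind a striding-style design, is to keep α independent partial sums so consecutive inputs never depend on a result still in flight, then fold the partials at the end. A software sketch of the idea (the actual designs operate on hardware floating-point pipelines, which this model only mimics):

```python
def striding_accumulate(values, alpha):
    """Model of hazard-free accumulation with an alpha-stage pipelined adder:
    maintain alpha partial sums, each touched only every alpha inputs, so no
    addition depends on a result still inside the pipeline."""
    partials = [0.0] * alpha
    for i, v in enumerate(values):
        partials[i % alpha] += v  # round-robin across the partial sums
    total = 0.0
    for p in partials:  # final fold of the alpha partial sums
        total += p
    return total

assert striding_accumulate([1.0] * 10, 4) == 10.0
```

In hardware, the final fold itself must also be scheduled around the adder latency; the paper's contribution is doing all of this for multiple back-to-back input sets without stalls.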
  • A Comprehensive Framework for Enhancing Security in InfiniBand Architecture

    Publication Year: 2007 , Page(s): 1393 - 1406
    Cited by:  Papers (4)  |  Patents (1)
    PDF (1991 KB) | HTML

    The InfiniBand architecture (IBA) is a promising communication standard for building clusters and system area networks. However, the IBA specification has left out security aspects, resulting in potential security vulnerabilities, which could be exploited with moderate effort. In this paper, we view these vulnerabilities from three classical security aspects - confidentiality, authentication, and availability - and investigate the following security issues. First, as groundwork for secure services in IBA, we present partition-level and queue-pair-level key management schemes, both of which can be easily integrated into IBA. Second, for confidentiality and authentication, we present a method to incorporate a scalable encryption and authentication algorithm into IBA, with little performance overhead. Third, for better availability, we propose a stateful ingress filtering mechanism to block denial-of-service (DoS) attacks. Finally, to further improve the availability, we provide a scalable packet marking method for tracing back DoS attacks. Simulation results of an IBA network show that the security performance overhead due to encryption/authentication on network latency ranges from 0.7 percent to 12.4 percent. Since the stateful ingress filtering is enabled only when a DoS attack is active, there is no performance overhead in a normal situation.

  • Worst-Case Delay Control in Multigroup Overlay Networks

    Publication Year: 2007 , Page(s): 1407 - 1419
    Cited by:  Papers (3)
    PDF (1289 KB) | HTML

    This paper proposes a novel and simple adaptive control algorithm for the effective delay control and resource utilization of end host multicast (EMcast) when the traffic load becomes heavy in a multigroup network with real-time flows constrained by (σ, ρ) regulators. The control algorithm is implemented at the overlay network and provides further regulation through a novel (σ, ρ, λ) regulator at each group end host that suffers from heavy input traffic. To our knowledge, this is the first work to incorporate traffic regulators into end host multicast to control heavy traffic output. Our further contributions include a theoretical analysis and a set of results. We prove the existence of, and calculate the value of, a rate threshold ρ* such that, for a given set of K groups, when the average rate ρ̄ of traffic entering the group end hosts exceeds ρ*, the ratio of the worst-case multicast delay bound of the proposed (σ, ρ, λ) regulator over the traditional (σ, ρ) regulator is O(1/K^n) for any integer n. We also demonstrate through computer simulations the efficiency of the novel algorithm and regulator in decreasing worst-case delays.

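A (σ, ρ) regulator constrains the traffic admitted over any interval of length t to at most σ + ρt, which is exactly the behavior of a token bucket with burst allowance σ and long-term rate ρ. A minimal discrete-time sketch of that baseline (the paper's (σ, ρ, λ) regulator adds a third parameter that is not modeled here):

```python
class SigmaRhoRegulator:
    """Token-bucket view of a (sigma, rho) regulator: over any window of
    t steps, the admitted traffic is bounded by sigma + rho * t."""
    def __init__(self, sigma, rho):
        self.sigma, self.rho = sigma, rho
        self.tokens = sigma  # start with the full burst allowance

    def admit(self, amount):
        # Refill rho tokens per step, capped at the burst size sigma.
        self.tokens = min(self.sigma, self.tokens + self.rho)
        if amount <= self.tokens:
            self.tokens -= amount
            return amount          # conforming traffic passes untouched
        admitted = self.tokens     # excess is shaped (held back)
        self.tokens = 0.0
        return admitted

reg = SigmaRhoRegulator(sigma=5.0, rho=1.0)
# A sustained offered load of 3 units/step is squeezed down toward rho:
print([reg.admit(3.0) for _ in range(4)])  # prints [3.0, 3.0, 1.0, 1.0]
```

The burst of 3+3 units is absorbed by σ; after that, output settles at the sustainable rate ρ = 1, which is the delay-versus-throughput trade-off the paper's heavier regulation targets.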
  • Adaptive Allocation of Independent Tasks to Maximize Throughput

    Publication Year: 2007 , Page(s): 1420 - 1435
    Cited by:  Papers (17)  |  Patents (1)
    PDF (1398 KB) | HTML

    In this paper, we consider the task allocation problem for computing a large set of equal-sized independent tasks on a heterogeneous computing system where the tasks initially reside on a single computer (the root) in the system. This problem represents the computation paradigm for a wide range of applications such as SETI@home and Monte Carlo simulations. We consider the scenario where the systems have a general graph-structured topology and the computers are capable of concurrent communications and overlapping communications with computation. We show that the maximization of system throughput reduces to a standard network flow problem. We then develop a decentralized adaptive algorithm that solves a relaxed form of the standard network flow problem and maximizes the system throughput. This algorithm is then approximated by a simple decentralized protocol to coordinate the resources adaptively. Simulations are conducted to verify the effectiveness of the proposed approach. For both uniformly distributed and power-law distributed systems, a close-to-optimal throughput is achieved, and improved performance over a bandwidth-centric heuristic is observed. The adaptivity of the proposed approach is also verified through simulations.

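The reduction to network flow mentioned above can be made concrete: treat the root as the source, cap each link by its bandwidth, and drain each worker into a virtual sink at its compute rate, so that the maximum flow equals the achievable task throughput. A sketch using a textbook Edmonds-Karp computation (node names and capacities are illustrative, not taken from the paper):

```python
from collections import deque, defaultdict

def max_flow(cap, s, t):
    """Edmonds-Karp: repeatedly push flow along BFS-shortest augmenting paths."""
    flow = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow  # no augmenting path left
        # Find the bottleneck on the path, then update residual capacities.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v].setdefault(u, 0)
            cap[v][u] += bottleneck  # reverse edge allows later rerouting
        flow += bottleneck

# Root dispatches tasks over links (capacities = bandwidth in tasks/s);
# each worker drains into a virtual sink at its compute rate.
cap = defaultdict(dict)
cap["root"].update({"w1": 4, "w2": 3})
cap["w1"].update({"sink": 2, "w2": 5})  # w1 can forward surplus tasks to w2
cap["w2"].update({"sink": 4})
print(max_flow(cap, "root", "sink"))  # prints 6
```

Note how w1 forwards one task/s it cannot process to w2, which is the kind of routing decision the paper's decentralized protocol discovers adaptively.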
  • The Design and Implementation of a Domain-Specific Language for Network Performance Testing

    Publication Year: 2007 , Page(s): 1436 - 1449
    Cited by:  Papers (10)
    PDF (2597 KB) | HTML

    CONCEPTUAL is a toolset designed specifically to help measure the performance of high-speed interconnection networks such as those used in workstation clusters and parallel computers. It centers around a high-level domain-specific language, which makes it easy for a programmer to express, measure, and report the performance of complex communication patterns. The primary challenge in implementing a compiler for such a language is that the generated code must be extremely efficient so as not to misattribute overhead costs to the messaging library. At the same time, the language itself must not sacrifice expressiveness for compiler efficiency, or there would be little point in using a high-level language for performance testing. This paper describes the CONCEPTUAL language and the CONCEPTUAL compiler's novel code-generation framework. The language provides primitives for a wide variety of idioms needed for performance testing and emphasizes a readable syntax. The core code-generation technique, based on unrolling CONCEPTUAL programs into sequences of communication events, is simple yet enables the efficient implementation of a variety of high-level constructs. The paper further explains how CONCEPTUAL implements time-bounded loops - even those that comprise blocking communication - in the absence of a time-out mechanism, a somewhat unique language/implementation feature.

  • Resource-Aware Distributed Scheduling Strategies for Large-Scale Computational Cluster/Grid Systems

    Publication Year: 2007 , Page(s): 1450 - 1461
    Cited by:  Papers (26)
    PDF (2688 KB) | HTML

    In this paper, we propose distributed algorithms referred to as resource-aware dynamic incremental scheduling (RADIS) strategies. Our strategies are specifically designed to handle large volumes of computationally intensive, arbitrarily divisible loads submitted for processing at cluster/grid systems involving multiple sources and sinks (processing nodes). We consider a real-life scenario wherein the buffer space (memory) available at the sinks (required for holding and processing the loads) varies over time and the loads have deadlines, and we propose efficient "pull-based" scheduling strategies with an admission control policy that ensures that the admitted loads are processed, satisfying their deadline requirements. The design of our proposed strategies adopts the divisible load paradigm, referred to as divisible load theory (DLT), which is shown to be efficient in handling large-volume loads. We demonstrate detailed workings of the proposed algorithms via a simulation study using real-life parameters obtained from a major physics experiment.

  • Gradient Boundary Detection for Time Series Snapshot Construction in Sensor Networks

    Publication Year: 2007 , Page(s): 1462 - 1475
    Cited by:  Papers (9)  |  Patents (2)
    PDF (3193 KB) | HTML

    In many applications of sensor networks, the sink needs to keep track of the history of sensed data of a monitored region for scientific analysis or supporting historical queries. We call these historical data a time series of value distributions, or snapshots. Obviously, building the time series snapshots by requiring all of the sensors to transmit their data to the sink periodically is not energy efficient. In this paper, we introduce the idea of gradient boundary and propose the gradient boundary detection (GBD) algorithm to construct these time series snapshots of a monitored region. In GBD, a monitored region is partitioned into a set of subregions and all sensed data in one subregion are within a predefined value range, namely, the gradient interval. Sensors located on the boundaries of the subregions are required to transmit the data to the sink and, then, the sink recovers all subregions to construct snapshots of the monitored area. In this process, only the boundary sensors transmit their data and, therefore, energy consumption is greatly reduced. The simulation results show that GBD is able to build snapshots with comparable accuracy and has up to 40 percent energy savings compared with the existing approaches for large gradient intervals.

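The gradient-interval idea above can be sketched in a few lines: quantize each reading into an interval of width g and report only those sensors that have a neighbor in a different interval (g and the toy topology below are assumptions for illustration, not values from the paper):

```python
def boundary_sensors(readings, neighbors, g):
    """Sensors whose gradient-interval index differs from a neighbor's lie on
    a subregion boundary; only they need to report to the sink."""
    band = {s: int(v // g) for s, v in readings.items()}  # interval index
    return {s for s in readings
            if any(band[s] != band[n] for n in neighbors[s])}

# Toy 1-D network: values rise left to right, gradient interval width g = 10.
readings = {0: 3.0, 1: 7.0, 2: 12.0, 3: 14.0, 4: 25.0}
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(sorted(boundary_sensors(readings, neighbors, 10)))  # prints [1, 2, 3, 4]
```

Sensor 0 stays silent because its whole neighborhood sits in the same interval; the sink reconstructs each subregion's values to within g from the reported boundaries, which is the accuracy-versus-energy trade-off the abstract describes.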
  • SenCar: An Energy-Efficient Data Gathering Mechanism for Large-Scale Multihop Sensor Networks

    Publication Year: 2007 , Page(s): 1476 - 1488
    Cited by:  Papers (70)
    PDF (3046 KB) | HTML

    In this paper, we propose a new data gathering mechanism for large-scale multihop sensor networks. A mobile data observer, called SenCar, which could be a mobile robot or a vehicle equipped with a powerful transceiver and battery, works like a mobile base station in the network. SenCar starts the data gathering tour periodically from the static data processing center, traverses the entire sensor network, gathers the data from sensors while moving, returns to the starting point, and, finally, uploads data to the data processing center. Unlike SenCar, sensors in the network are static and can be made very simple and inexpensive. They upload sensed data to SenCar when SenCar moves close to them. Since sensors can only communicate with others within a very limited range, packets from some sensors may need multihop relays to reach SenCar. We first show that the moving path of SenCar can greatly affect network lifetime. We then present heuristic algorithms for planning the moving path/circle of SenCar and balancing traffic load in the network. We show that, by driving SenCar along a better path and balancing the traffic load from sensors to SenCar, network lifetime can be prolonged significantly. Our path planning algorithm can be used in both connected networks and disconnected networks. In addition, SenCar can avoid obstacles while moving. Our simulation results demonstrate that the proposed data gathering mechanism can prolong network lifetime significantly compared to a network that has only a static observer or a network in which the mobile observer can only move along straight lines.

  • TPDS Information for authors

    Publication Year: 2007 , Page(s): c3
    PDF (82 KB)
    Freely Available from IEEE
  • [Back cover]

    Publication Year: 2007 , Page(s): c4
    PDF (99 KB)
    Freely Available from IEEE

Aims & Scope

IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers.

Full Aims & Scope

Meet Our Editors

Editor-in-Chief
David Bader
College of Computing
Georgia Institute of Technology