
IEEE Transactions on Parallel and Distributed Systems

Issue 8 • August 2011

  • [Front cover]

    Page(s): c1
    Freely Available from IEEE
  • [Inside front cover]

    Page(s): c2
    Freely Available from IEEE
  • Energy-Efficient Localized Routing in Random Multihop Wireless Networks

    Page(s): 1249 - 1257

    A number of energy-aware routing protocols have been proposed to find energy-efficient routes in multihop wireless networks. Among them, several geographical localized routing protocols aim to make smarter routing decisions using only local information while reducing routing overhead. However, none of the proposed localized routing methods can guarantee the energy efficiency of their routes. In this paper, we first give a simple localized routing algorithm, called Localized Energy-Aware Restricted Neighborhood routing (LEARN), which guarantees the energy efficiency of any route it successfully finds. We then theoretically study its critical transmission radius in random networks, which guarantees that LEARN finds a route for any source-destination pair asymptotically almost surely. We also extend the proposed routing to three-dimensional (3D) networks and derive its critical transmission radius in 3D random networks. Simulation results confirm our theoretical analysis of LEARN routing and demonstrate its energy efficiency in large-scale random networks.
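
    The restricted-neighborhood idea lends itself to a compact sketch: consider as candidates only those neighbors lying within a given angle of the direction to the destination that make positive progress, then pick the candidate with the best progress-per-energy ratio. The angle threshold, power-law energy model, and scoring rule below are illustrative assumptions, not LEARN's exact definitions.

        import math

        ALPHA = math.pi / 3  # illustrative restriction angle, not LEARN's exact parameter

        def energy_cost(a, b, k=2, c=1e-9):
            # Assumed power-law energy model: cost grows as distance^k plus a constant
            return math.dist(a, b) ** k + c

        def next_hop(current, dest, neighbors):
            """Forward to the neighbor inside the restricted neighborhood (within
            angle ALPHA of the direction to dest) with the best progress/energy."""
            best, best_score = None, 0.0
            for n in neighbors:
                v1 = (n[0] - current[0], n[1] - current[1])        # toward neighbor
                v2 = (dest[0] - current[0], dest[1] - current[1])  # toward destination
                norm = math.hypot(*v1) * math.hypot(*v2)
                if norm == 0 or (v1[0] * v2[0] + v1[1] * v2[1]) / norm < math.cos(ALPHA):
                    continue  # outside the restricted neighborhood
                progress = math.dist(current, dest) - math.dist(n, dest)
                if progress <= 0:
                    continue  # no advance toward the destination
                score = progress / energy_cost(current, n)
                if score > best_score:
                    best, best_score = n, score
            return best  # None: no admissible neighbor, so routing fails locally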

  • Impact of Traffic Influxes: Revealing Exponential Intercontact Time in Urban VANETs

    Page(s): 1258 - 1266

    Intercontact time between moving vehicles is one of the key metrics in vehicular ad hoc networks (VANETs) and is central to forwarding algorithms and end-to-end delay. Due to prohibitive costs, little experimental work has studied intercontact time in urban vehicular environments. In this paper, we carry out an extensive experiment involving thousands of operational taxis in Shanghai. Studying the taxi trace data on the frequency and duration of transfer opportunities between taxis, we observe that the tail distribution of the intercontact time, that is, the time gap separating two contacts of the same pair of taxis, exhibits an exponential decay over a large range of timescales. This observation is in sharp contrast to recent empirical studies based on human mobility, in which the distribution of the intercontact time obeys a power law. By analyzing a simplified mobility model that captures the effect of hot areas in the city, we rigorously prove that common traffic influxes, where large volumes of traffic converge, play a major role in generating the exponential tail of the intercontact time. Our results thus provide fundamental guidelines for the design of new vehicular mobility models in urban scenarios and of new data forwarding protocols and their performance analysis.
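
    The exponential-versus-power-law distinction has a simple empirical diagnostic: an exponential tail is linear on a log-linear plot of the complementary CDF, while a power law is linear on a log-log plot. Below is a minimal sketch of that check, with synthetic exponential samples standing in for the real traces (an assumption for illustration).

        import math
        import random

        # Synthetic intercontact times standing in for real trace data (assumption)
        samples = sorted(random.expovariate(1 / 300.0) for _ in range(10000))

        def ccdf(xs):
            n = len(xs)
            return [(x, 1.0 - i / n) for i, x in enumerate(xs)]  # xs must be sorted

        def corr(pairs):
            # Pearson correlation: near -1 indicates a good straight decaying fit
            n = len(pairs)
            mx = sum(x for x, _ in pairs) / n
            my = sum(y for _, y in pairs) / n
            sxy = sum((x - mx) * (y - my) for x, y in pairs)
            sxx = sum((x - mx) ** 2 for x, _ in pairs)
            syy = sum((y - my) ** 2 for _, y in pairs)
            return sxy / math.sqrt(sxx * syy)

        tail = [(x, p) for x, p in ccdf(samples) if p > 1e-3]
        r_exp = corr([(x, math.log(p)) for x, p in tail])            # log-linear
        r_pow = corr([(math.log(x), math.log(p)) for x, p in tail])  # log-log
        print(f"exponential fit r={r_exp:.3f}, power-law fit r={r_pow:.3f}")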

  • Minimum-Delay Service Provisioning in Opportunistic Networks

    Page(s): 1267 - 1275

    Opportunistic networks are created dynamically by exploiting contacts between pairs of mobile devices that come within communication range. While forwarding in opportunistic networks has been explored, the investigation of asynchronous service provisioning on top of opportunistic networks is a unique contribution of this paper. Mobile devices are typically heterogeneous, possess disparate physical resources, and can provide a variety of services. During opportunistic contacts, each of the pairing peers can provide its services to, and avail itself of the services of, the other. This service provisioning paradigm is a key feature of the emerging field of opportunistic computing. We develop an analytical model to study the behaviors of service-seeking nodes (seekers) and service-providing nodes (providers), which spawn and execute service requests, respectively. The model considers the case in which seekers can spawn parallel executions on multiple providers for any given request, and it determines: 1) the delays at different stages of service provisioning; and 2) the optimal number of parallel executions that minimizes the expected execution time. The analytical model is validated through simulations and exploited to investigate the performance of service provisioning over a wide range of parameters.
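
    To see why an interior optimum exists for the number of parallel executions, consider a toy model (our assumption, not the paper's model): handing the request to k providers costs roughly c per contact, while the expected minimum of k independent exponential execution times with rate mu is 1/(k*mu). The total c*k + 1/(k*mu) first falls and then rises in k.

        def expected_completion(k, mu=0.01, c=5.0):
            # c*k: time spent handing the request to k providers over successive contacts
            # 1/(k*mu): expected minimum of k i.i.d. exponential execution times (rate mu)
            return c * k + 1.0 / (k * mu)

        best_k = min(range(1, 21), key=expected_completion)
        print(best_k, expected_completion(best_k))  # optimum near sqrt(1/(c*mu)) ≈ 4.5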

  • Parallel Implementation of the Irregular Terrain Model (ITM) for Radio Transmission Loss Prediction Using GPU and Cell BE Processors

    Page(s): 1276 - 1283

    The Irregular Terrain Model (ITM), also known as the Longley-Rice model, predicts the long-range average transmission loss of a radio signal based on atmospheric and geographic conditions. Because variable terrain effects and constantly changing atmospheric conditions can dramatically influence radio wave propagation, there is a pressing need for computational resources capable of running hundreds of thousands of transmission loss calculations per second. Multicore processors, such as the NVIDIA Graphics Processing Unit (GPU) and the IBM Cell Broadband Engine (BE), offer improved performance over mainstream microprocessors for ITM. We study architectural features of the Tesla C870 GPU and the Cell BE and evaluate the effectiveness of architecture-specific optimizations and parallelization strategies for ITM on these platforms. We assess GPU implementations that utilize both global and shared memories along with fine-grained parallelism, and Cell BE implementations that utilize direct memory access, double buffering, and SIMDization. With these optimization strategies, we achieve computation times of less than a second on each platform, which is not feasible with a general-purpose processor, and we observe that the GPU delivers better performance than the Cell BE in terms of total execution time and performance per watt, by factors of 2.3x and 1.6x, respectively.

  • Integrating Caching and Prefetching Mechanisms in a Distributed Transactional Memory

    Page(s): 1284 - 1298

    We present a distributed transactional memory system that exploits a new opportunity to automatically hide network latency by speculatively prefetching and caching objects. The system includes an object caching framework, language extensions to support our approach, and symbolic prefetches. To our knowledge, this is the first prefetching approach that can prefetch objects whose addresses have not been computed or predicted. Our approach makes aggressive use of both prefetching and caching of remote objects to hide network latency while relying on the transaction commit mechanism to preserve the simple transactional consistency model that we present to the developer. We have evaluated this approach on three distributed benchmarks, five scientific benchmarks, and several microbenchmarks. We have found that our approach enables our benchmark applications to effectively utilize multiple machines and benefit from prefetching and caching. We have observed a speedup of up to 7.26× for distributed applications on our system using prefetching and caching and a speedup of up to 5.55× for parallel applications on our system.
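
    The key mechanism, symbolic prefetch, can be pictured as shipping a field path rather than an address: the remote side walks the path and returns every object it touches in one round trip, so the client never needs the intermediate addresses. The sketch below is a minimal, non-transactional illustration under that assumption; names such as RemoteStore and resolve are ours, not the system's API.

        class RemoteStore:
            """Stands in for a remote machine's object store (illustrative)."""
            def __init__(self, objects):
                self.objects = objects  # oid -> {field: oid or value}

            def resolve(self, root_oid, path):
                # Walk the symbolic path remotely; one round trip returns every
                # object along the path, even though the client knew no addresses.
                out, oid = {}, root_oid
                for field in path.split("."):
                    obj = self.objects.get(oid)
                    if obj is None:
                        return out
                    out[oid] = obj
                    oid = obj.get(field)
                if oid in self.objects:
                    out[oid] = self.objects[oid]  # include the path's endpoint
                return out

        class Cache:
            def __init__(self, store):
                self.store, self.local = store, {}

            def get(self, oid, prefetch_path=None):
                if oid not in self.local and prefetch_path:
                    self.local.update(self.store.resolve(oid, prefetch_path))
                return self.local.get(oid)

        store = RemoteStore({1: {"left": 2}, 2: {"left": 3}, 3: {"val": 42}})
        cache = Cache(store)
        cache.get(1, prefetch_path="left.left")  # caches objects 1, 2, and 3 at once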

  • Data Replication in Data Intensive Scientific Applications with Performance Guarantee

    Page(s): 1299 - 1306

    Data replication has been widely adopted in data-intensive scientific applications to reduce data file transfer time and bandwidth consumption. However, the problem of data replication in Data Grids, an enabling technology for data-intensive applications, has been proven NP-hard and even nonapproximable, making it difficult to solve. Meanwhile, most previous research in this field is either theoretical investigation without practical consideration, or heuristics-based with little or no theoretical performance guarantee. In this paper, we propose a data replication algorithm that not only has a provable theoretical performance guarantee, but can also be implemented in a distributed and practical manner. Specifically, we design a polynomial-time centralized replication algorithm that reduces the total data file access delay by at least half of the reduction achieved by the optimal replication solution. Based on this centralized algorithm, we also design a distributed caching algorithm, which can be easily adopted in a distributed environment such as Data Grids. Extensive simulations are performed to validate the efficiency of our proposed algorithms. Using our own simulator, we show that our centralized replication algorithm performs comparably to the optimal algorithm and other intuitive heuristics under different network parameters. Using GridSim, a popular distributed Grid simulator, we demonstrate that the distributed caching technique significantly outperforms an existing popular file caching technique in Data Grids, and that it is more scalable and adaptive to dynamic changes in file access patterns.
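
    Factor-of-two guarantees of this kind are typically obtained by greedy placement over a monotone delay-reduction objective. The sketch below is a generic greedy replicator, not the authors' algorithm: starting from each file's origin server, it repeatedly adds the single replica that most reduces total access delay until a replica budget is exhausted.

        def total_delay(placement, requests, dist):
            # Each request (node u, file f) is served by u's nearest replica of f
            return sum(min(dist[u][v] for v in placement[f]) for u, f in requests)

        def greedy_replicate(nodes, origins, requests, dist, budget):
            """Generic greedy placement (illustrative): origins maps each file
            to its origin server; dist is a node-to-node delay table."""
            placement = {f: {origin} for f, origin in origins.items()}
            for _ in range(budget):
                base = total_delay(placement, requests, dist)
                best, best_gain = None, 0
                for f in placement:
                    for v in nodes - placement[f]:
                        placement[f].add(v)
                        gain = base - total_delay(placement, requests, dist)
                        placement[f].remove(v)
                        if gain > best_gain:
                            best, best_gain = (f, v), gain
                if best is None:
                    break  # no replica yields further improvement
                placement[best[0]].add(best[1])
            return placement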

  • Timely Result-Data Offloading for Improved HPC Center Scratch Provisioning and Serviceability

    Page(s): 1307 - 1322

    Modern High-Performance Computing (HPC) centers are facing a data deluge from emerging scientific applications. Supporting large data entails a significant commitment of the high-throughput center storage system, the scratch space. However, scratch space is typically managed using simple “purge policies,” without sophisticated end-user data services that balance resource consumption and user serviceability. End-user data services such as offloading are performed using point-to-point transfers that cannot reconcile the center's purge deadlines with users' delivery deadlines, cannot adapt to changing dynamics in the end-to-end data path, and are not fault-tolerant. Such inefficiencies can be prohibitive to sustaining high performance. In this paper, we address these issues by designing a framework for the timely, decentralized offload of application result data. Our framework uses an overlay of user-specified intermediate and landmark sites to orchestrate a decentralized, fault-tolerant delivery. We have implemented our techniques within a production job scheduler (PBS) and data transfer tool (BitTorrent). Our evaluation, using both a real implementation and supercomputer job-log-driven simulations, shows that offloading times can be significantly reduced (by 90.4 percent for a 5 GB data transfer) and that the exposure window can be minimized while also meeting center-user service level agreements.

  • Association Control for Vehicular WiFi Access: Pursuing Efficiency and Fairness

    Page(s): 1323 - 1331

    Deploying roadside WiFi access points has made Internet access possible from vehicles; nevertheless, it is challenging to maintain client performance at vehicular speeds, especially when multiple mobile users are present. This paper considers the association control problem for vehicular WiFi access in the drive-thru Internet scenario. In particular, we aim to improve efficiency and fairness for all users. We design efficient algorithms that achieve these objectives through several techniques, including approximation. Our simulation results demonstrate that our algorithms achieve significantly better performance than conventional approaches.

  • Call Admission Control Performance Analysis in Mobile Networks Using Stochastic Well-Formed Petri Nets

    Page(s): 1332 - 1341

    (This work is an extension of our conference paper at Valuetools 2006 [18].) Stochastic Well-formed Petri Nets (SWNs) are a powerful tool for modeling complex systems with concurrency, synchronization, and cooperation. Call Admission Control (CAC) is an important mechanism in mobile networks. While several studies have addressed GSM/GPRS and UMTS systems with mixed voice and data, to the best of our knowledge the few CAC models for mobile networks proposed in the literature are represented with a unidimensional Markov chain, in which the communication system is based mainly on voice calls so as to reduce the state space of the Markov chain. Another drawback of those studies is the lack of a clear synchronization between mobile nodes and the servers. In this paper, we propose an efficient CAC scheme for mobile networks that takes into account voice connections as well as synchronous and asynchronous data connections. Furthermore, we use SWNs to model the system interaction, which involves several mobile nodes, gateways, cells, and servers. We describe our scheme and present its analytical performance results using the WNSIM symbolic simulator of the GreatSPN tool.
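
    For orientation, a CAC policy is ultimately an admission predicate evaluated per request class. The toy guard-channel rule below is a textbook illustration, not the paper's SWN-based scheme: it reserves a few channels for handoffs while admitting new voice and data calls only up to the remaining capacity.

        CAPACITY = 30  # channels in a cell (illustrative)
        GUARD = 4      # channels reserved for handoff calls (illustrative)

        def admit(kind, in_use):
            # Handoffs may use the full capacity; new voice/data calls may not
            # dip into the reserved guard channels.
            limit = CAPACITY if kind == "handoff" else CAPACITY - GUARD
            return in_use < limit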

  • Churn-Resilient Protocol for Massive Data Dissemination in P2P Networks

    Page(s): 1342 - 1349

    Massive data dissemination is often disrupted by the frequent join, departure, or failure of client nodes in a peer-to-peer (P2P) network. We propose a new churn-resilient protocol (CRP) that assures alternative paths and data proximity to accelerate the data dissemination process under network churn. CRP enables the construction of proximity-aware P2P content delivery systems. We present new data dissemination algorithms using this proximity-aware overlay design, and we simulated P2P networks of up to 20,000 nodes to validate the claimed advantages. Specifically, we make four technical contributions: 1) the CRP scheme promotes proximity awareness, dynamic load balancing, and resilience to node failures and network anomalies; 2) the proximity-aware overlay network achieves a 28-50 percent speed gain in massive data dissemination compared with scope-flooding or epidemic-tree schemes in unstructured P2P networks; 3) a CRP-enabled network requires only 1/3 of the control messages used in a large CAM-Chord network; and 4) even with 40 percent node failures, the CRP network guarantees atomic broadcast of all data items. These results clearly demonstrate the scalability and robustness of CRP networks under churn. The scheme appeals especially to web-scale applications in digital content delivery, network worm containment, and customer relationship management over hundreds of datacenters in cloud computing services.

  • Fast and Cost-Effective Online Load-Balancing in Distributed Range-Queriable Systems

    Page(s): 1350 - 1364

    Distributed systems such as peer-to-peer overlays have been shown to efficiently support the processing of range queries over large numbers of participating hosts. In such systems, uneven load allocation must be effectively tackled in order to minimize the number of overloaded peers and optimize performance. In this work, we identify the two basic methodologies used to achieve load balancing, iterative key redistribution between neighbors and node migration, and describe their relative advantages and disadvantages. Based on this analysis, we propose NIXMIG, a hybrid method that adaptively utilizes these two extremes to achieve both fast and cost-effective load balancing in distributed systems that support range queries. We theoretically prove its convergence, and as a case study we offer an implementation on top of a Skip Graph, in which we thoroughly validate our findings under a variety of static, dynamic, and realistic workloads. We compare NIXMIG with an existing load-balancing algorithm proposed by Karger and Ruhl [1], and our experimental analysis shows that NIXMIG can be as much as three times faster while requiring only one sixth of the message exchanges and one third of the item exchanges to bring the system to a balanced state.
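
    The two mechanisms NIXMIG hybridizes can be caricatured in a few lines: shedding keys to a neighbor is cheap but slow to resolve a hotspot, while migrating a remote underloaded node is costly but decisive. The policy below is our illustration of that trade-off, not NIXMIG's actual decision rule: it tries the cheap move first.

        def balance_step(ring, i, threshold=2.0):
            """ring: per-node loads ordered by key range. Resolve an overload at
            node i with the cheaper neighbor exchange when possible, otherwise
            with a node migration (a caricature of the two mechanisms)."""
            succ = (i + 1) % len(ring)
            if ring[i] <= threshold * ring[succ]:
                return "no-op"                      # i not overloaded vs. its successor
            if ring[succ] < sum(ring) / len(ring):  # neighbor has spare capacity
                move = (ring[i] - ring[succ]) / 2
                ring[i] -= move                     # shed keys to the successor
                ring[succ] += move
                return "neighbor-exchange"
            j = min(range(len(ring)), key=ring.__getitem__)  # least-loaded node
            ring[(j + 1) % len(ring)] += ring[j]  # j hands its old range to its successor
            ring[j] = ring[i] = ring[i] / 2       # j re-joins, splitting i's hot range
            return "migration"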

  • Passive Network Performance Estimation for Large-Scale, Data-Intensive Computing

    Page(s): 1365 - 1373

    Distributed computing applications are increasingly utilizing distributed data sources. However, the unpredictable cost of data access in large-scale computing infrastructures can lead to severe performance bottlenecks. Providing predictability in data access is thus essential to accommodate the large set of newly emerging large-scale, data-intensive computing applications, and accurate estimation of network performance is crucial to meeting their performance goals. Passive estimation based on past measurements is attractive for its relatively small overhead compared to explicit probing. In this paper, we take a passive approach to network performance estimation. Our approach differs from existing passive techniques, which rely either on past direct measurements between pairs of nodes or on topological similarities; instead, we exploit secondhand measurements collected by other nodes, without any topological restrictions. We present Overlay Passive Estimation of Network performance (OPEN), a scalable framework providing end-to-end network performance estimation based on secondhand measurements, and discuss how OPEN achieves cost-effective estimation in a large-scale infrastructure. Our extensive experimental results show that OPEN estimation is applicable to the replica and resource selection commonly used in distributed computing.

  • Energy Conscious Scheduling for Distributed Computing Systems under Different Operating Conditions

    Page(s): 1374 - 1381

    Traditionally, the primary performance goal of computer systems has been to reduce the execution time of applications while increasing throughput. This goal has mostly been achieved by the development of high-density computer systems, which, as witnessed recently, provide very powerful processing capability and capacity. They often consist of tens or hundreds of thousands of processors and other resource-hungry devices, and their energy consumption has become a major concern. In this paper, we address the problem of scheduling precedence-constrained parallel applications on multiprocessor computer systems and present two energy-conscious scheduling algorithms that use dynamic voltage scaling (DVS). A number of recent commodity processors are capable of DVS, which enables a processor to operate at different voltage supply levels at the expense of sacrificing clock frequency. In the context of scheduling, this multiple-voltage facility implies a trade-off between the quality of schedules and energy consumption. To effectively balance these two performance goals, we have devised a novel objective function and a variant of it; the main difference between the two algorithms is in their measurement of energy consumption. The extensive comparative evaluations conducted as part of this work show that the performance of our algorithms is very compelling in terms of both application completion time and energy consumption.
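
    The DVS trade-off can be made concrete with the usual CMOS approximations (our assumptions, not the paper's objective function): dynamic power scales roughly as V^2*f and execution time as 1/f, so a lower voltage/frequency pair saves energy at the cost of a longer schedule. A sketch of picking, per task, the level that minimizes a weighted sum of the two goals:

        # Available (voltage, relative frequency) pairs of a DVS-capable CPU (illustrative)
        LEVELS = [(1.5, 1.0), (1.2, 0.8), (1.0, 0.6), (0.8, 0.4)]

        def pick_level(cycles, alpha=0.5):
            """Choose the level minimizing alpha*time + (1 - alpha)*energy.
            time ~ cycles / f; dynamic energy ~ V^2 * f * time = V^2 * cycles
            (classic CMOS approximation); alpha also absorbs unit normalization."""
            def cost(level):
                V, f = level
                return alpha * (cycles / f) + (1 - alpha) * (V * V * cycles)
            return min(LEVELS, key=cost)

        print(pick_level(1e9, alpha=0.7))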

  • Satisfiability Modulo Graph Theory for Task Mapping and Scheduling on Multiprocessor Systems

    Page(s): 1382 - 1389

    Task graph scheduling on multiprocessor systems is a representative multiprocessor scheduling problem. A solution consists of a mapping of tasks to processors and a schedule of the tasks on each processor. An optimal solution can be obtained by exploring the entire design space of all possible mapping and scheduling choices, but since the problem is NP-hard, scalability becomes the main concern in solving it optimally. In this paper, a SAT-based optimization framework is proposed to address this problem, in which a SAT solver is enhanced by integrating it with a scheduling analysis tool in a branch-and-bound manner to prune the solution space efficiently. Performance evaluation results show that our technique achieves an average performance improvement of more than an order of magnitude over state-of-the-art techniques. We further build a cycle-accurate network-on-chip simulator based on SystemC to verify the effectiveness of the proposed technique on realistic multiprocessor systems.
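
    The mapping half of the problem has a standard propositional encoding (a generic sketch, not necessarily the paper's exact encoding): one Boolean variable per task/processor pair, with exactly-one clauses per task. The resulting CNF can be handed to any off-the-shelf SAT solver, which the framework then interleaves with scheduling analysis.

        def mapping_cnf(num_tasks, num_procs):
            """CNF clauses (DIMACS convention: ints, negative = negated) asserting
            that every task is mapped to exactly one processor."""
            def var(t, p):
                return t * num_procs + p + 1  # variable for "task t on processor p"
            clauses = []
            for t in range(num_tasks):
                clauses.append([var(t, p) for p in range(num_procs)])  # at least one
                for p in range(num_procs):
                    for q in range(p + 1, num_procs):
                        clauses.append([-var(t, p), -var(t, q)])       # at most one
            return clauses

        # e.g., pass mapping_cnf(4, 2) plus timing clauses to an off-the-shelf SAT solver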

  • A Generic Framework for Three-Factor Authentication: Preserving Security and Privacy in Distributed Systems

    Page(s): 1390 - 1397

    As part of the security within distributed systems, various services and resources need protection from unauthorized use. Remote authentication is the most commonly used method to determine the identity of a remote client. This paper investigates a systematic approach for authenticating clients by three factors, namely password, smart card, and biometrics. A generic and secure framework is proposed to upgrade two-factor authentication to three-factor authentication. The conversion not only significantly improves information assurance at low cost but also protects client privacy in distributed systems. In addition, our framework retains several practice-friendly properties of the underlying two-factor authentication, which we believe is of independent interest.
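
    A toy composition of the three factors is sketched below. It is purely illustrative and much weaker than the paper's generic transformation, which, among other things, must tolerate noisy biometric readings (e.g., via fuzzy extractors) rather than the exact-match hashing assumed here.

        import hashlib, hmac, os

        def register(password, biometric_template):
            card_key = os.urandom(32)  # factor 2: secret held only by the smart card
            record = {  # verifier state; exact-match biometric hashing is a toy assumption
                "pw": hashlib.sha256(password.encode()).hexdigest(),
                "bio": hashlib.sha256(biometric_template).hexdigest(),
            }
            return card_key, record

        def authenticate(password, biometric_template, card_key, record, nonce):
            factors_ok = (
                hashlib.sha256(password.encode()).hexdigest() == record["pw"]        # factor 1
                and hashlib.sha256(biometric_template).hexdigest() == record["bio"]  # factor 3
            )
            # Factor 2: the card proves possession by MACing the verifier's fresh nonce
            proof = hmac.new(card_key, nonce, hashlib.sha256).hexdigest()
            return factors_ok, proof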

  • Dealing with Nonuniformity in Data Centric Storage for Wireless Sensor Networks

    Page(s): 1398 - 1406

    In-network storage of data in Wireless Sensor Networks (WSNs) is considered a promising alternative to external storage, since it helps reduce the communication overhead inside the network. Recent approaches to data storage rely on Geographic Hash Tables (GHTs) for efficient data storage and retrieval. These approaches, however, assume that sensors are uniformly distributed in the sensor field, which is seldom true in real applications, and they do not allow tuning the redundancy level of the storage according to the importance of the data to be stored. To deal with these issues, we propose an approach based on two mechanisms: the first estimates the real network distribution; the second exploits a data dispersal method based on the estimated distribution. Simulation experiments show that our approach approximates the real distribution of sensors quite closely and that our dispersal protocol considerably reduces data losses due to unbalanced data load.
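
    For reference, the baseline GHT mechanism that the paper refines hashes a datum's key to a location and stores the datum at the sensor closest to that point; under a nonuniform deployment, keys hashed into sparse regions overload the few nodes there, which is the imbalance the proposed dispersal method targets. A minimal sketch of the baseline (field dimensions and hashing scheme are illustrative):

        import hashlib
        import math

        def ght_location(key, width, height):
            # Hash the datum's key to a point in the deployment field
            h = hashlib.sha256(key.encode()).digest()
            x = int.from_bytes(h[:8], "big") / 2**64 * width
            y = int.from_bytes(h[8:16], "big") / 2**64 * height
            return x, y

        def home_node(key, sensors, width, height):
            # The datum is stored at the sensor nearest the hashed point; with a
            # nonuniform deployment this concentrates load on isolated sensors.
            target = ght_location(key, width, height)
            return min(sensors, key=lambda s: math.dist(s, target))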

  • Sensor Placement Algorithms for Fusion-Based Surveillance Networks

    Page(s): 1407 - 1414

    Mission-critical target detection imposes stringent performance requirements on wireless sensor networks, such as high detection probabilities and low false alarm rates. Data fusion has been shown to be an effective technique for improving system detection performance by enabling efficient collaboration among sensors with limited sensing capability. Due to the high cost of network deployment, it is desirable to place sensors at optimal locations to achieve maximum detection performance. However, for sensor networks employing data fusion, optimal sensor placement is a nonlinear and nonconvex optimization problem with prohibitively high computational complexity. In this paper, we present fast sensor placement algorithms based on a probabilistic data fusion model. Simulation results show that our algorithms can meet the desired detection performance with a small number of sensors while achieving up to seven-fold speedup over the optimal algorithm.
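
    Under a typical probabilistic fusion model, the detection probability at a point is a closed-form function of the sensor placement, which is what placement search optimizes. The sketch below assumes signal power decaying with distance, i.i.d. Gaussian noise per sensor, and energies summed and compared to a threshold; all of these modeling choices are our illustrative assumptions, not the paper's exact model.

        import math

        def detection_probability(sensors, target, threshold,
                                  s0=100.0, alpha=2.0, sigma=1.0):
            """P(fused energy exceeds threshold | target present).
            Signal at distance d: s0 / (1 + d**alpha); per-sensor noise ~ N(0, sigma^2)
            (all illustrative assumptions)."""
            mean = sum(s0 / (1.0 + math.dist(s, target) ** alpha) for s in sensors)
            std = sigma * math.sqrt(len(sensors))  # std of the summed Gaussian noise
            z = (threshold - mean) / std
            return 0.5 * math.erfc(z / math.sqrt(2))  # Gaussian tail Q(z)

        # A placement is then scored, e.g., by its worst-case detection probability
        # over candidate target points; fast heuristics search this score directly.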

  • Traffic-Aware Relay Node Deployment: Maximizing Lifetime for Data Collection Wireless Sensor Networks

    Page(s): 1415 - 1423

    Wireless sensor networks have been widely used for ambient data collection in diverse environments. While in many such networks the nodes are randomly deployed in massive quantities, a broad range of applications advocates manual deployment. A typical example is structural health monitoring, where the sensors have to be placed at critical locations to fulfill civil engineering requirements. The raw data collected by the sensors can then be forwarded to a remote base station (the sink) through a series of relay nodes. In the wireless communication context, the operation time of a battery-limited relay node depends on its traffic volume and communication range. Hence, although not bound by civil-engineering-like requirements, the locations of the relay nodes have to be carefully planned to achieve maximum network lifetime. The deployment must not only ensure connectivity between the data sources and the sink, but also accommodate the heterogeneous traffic flows from different sources and the dominant many-to-one traffic pattern. Inspired by the uniqueness of such application scenarios, in this paper we present an in-depth study of the traffic-aware relay node deployment problem. We develop optimal solutions for the simple case of one source node, with both single and multiple traffic flows. We show, however, that the general form of the deployment problem is difficult, and that existing solutions, which guarantee only connectivity, cannot be directly applied here. We then transform our problem into a generalized version of the Euclidean Steiner Minimum Tree (ESMT) problem. Nevertheless, we face further challenges, as its solution lies in continuous space and may yield fractional numbers of relay nodes, and simple rounding can lead to poor performance. We thus develop algorithms for discrete relay node assignment, together with local adjustments, that yield high-quality practical solutions. Our solution has been evaluated through both numerical analysis and ns-2 simulations and compared with state-of-the-art approaches. The results show that for all test cases where the continuous-space optimal solution can be computed within an acceptable time frame, the network lifetime achieved by our solution is very close to the upper bound of the optimal solution (the difference is less than 13.5 percent). Moreover, it achieves up to a 6-14 times improvement over existing traffic-oblivious strategies.

  • IEEE Computer Society OnlinePlus Coming Soon to TPDS

    Page(s): 1424
    Freely Available from IEEE
  • TPDS Information for authors

    Page(s): c3
    Freely Available from IEEE
  • [Back cover]

    Page(s): c4
    Freely Available from IEEE

Aims & Scope

IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers.


Meet Our Editors

Editor-in-Chief
David Bader
College of Computing
Georgia Institute of Technology