IEEE Transactions on Parallel and Distributed Systems

Issue 8 • August 2013

  • A 3.42-Approximation Algorithm for Scheduling Malleable Tasks under Precedence Constraints

    Page(s): 1479 - 1488

    Scheduling malleable tasks under general precedence constraints involves finding a feasible allotment that minimizes the makespan (maximum completion time). Based on the monotonous penalty assumptions of Blayo et al. [2], this work adopts two assumptions concerning malleable tasks: the processing time of a malleable task is nonincreasing in the number of processors, while its work is nondecreasing in the number of processors. Additionally, the work function is assumed to be convex in the processing time. The proposed algorithm reformulates the linear program of [11], and the algorithm and its proofs are inspired by those of [11]. This work describes a novel polynomial-time approximation algorithm that achieves an approximation ratio of 2+√2≈3.4142. It further demonstrates that the algorithm yields an approximation ratio of 2.9549 when the processing time is strictly decreasing in the number of processors allocated to the task, improving upon the previous best approximation ratio of 100/63+100(√6469+137)/5481≈3.2920 [12] achieved under the same assumptions.

  • Aging-Aware Energy-Efficient Workload Allocation for Mobile Multimedia Platforms

    Page(s): 1489 - 1499

    Multicore platforms are characterized by increasing variability and aging effects that imply heterogeneity in core performance, energy consumption, and reliability. In particular, wear-out effects such as negative bias temperature instability require runtime adaptation of system resource utilization to time-varying and uneven platform degradation, so as to prevent premature chip failure. In this context, task allocation techniques can be used to deal with heterogeneous cores and extend chip lifetime while minimizing energy and preserving quality of service. We propose a new formulation of the task allocation problem for variability-affected platforms, which manages per-core utilization to achieve a target lifetime while minimizing energy consumption during the execution of rate-constrained multimedia applications. We devise an adaptive solution that can be applied online and approximates the result of an optimal, offline version. Our allocator has been implemented and tested on real-life functional workloads running on a timing-accurate simulator of a next-generation industrial multicore platform. We extensively assess the effectiveness of the online strategy against both the optimal solution and alternative state-of-the-art policies. The proposed policy outperforms state-of-the-art strategies in terms of lifetime preservation, while saving up to 20 percent of energy consumption without impacting timing constraints.

  • An Efficient Penalty-Aware Cache to Improve the Performance of Parity-Based Disk Arrays under Faulty Conditions

    Page(s): 1500 - 1513

    The buffer cache plays an essential role in smoothing the gap between the upper-level computational components and the lower-level storage devices. A good buffer cache management scheme should benefit not only the computational components, but also the storage components, by reducing disk I/Os. Existing cache replacement algorithms are well optimized for disks in normal mode, but inefficient under faulty scenarios, such as a parity-based disk array with faulty disk(s). To address this issue, we propose a novel penalty-aware buffer cache replacement strategy, named Victim Disk(s) First (VDF) cache, to improve the reliability and performance of a storage system consisting of a buffer cache and disk arrays. VDF cache gives higher priority to caching the blocks on the faulty disks when the disk array fails, thus reducing the I/Os addressed directly to the faulty disks. To verify the effectiveness of the VDF cache, we have integrated VDF into the popular cache algorithms least frequently used (LFU) and least recently used (LRU), named VDF-LFU and VDF-LRU, respectively. We have conducted extensive simulations as well as a prototype implementation for disk arrays that tolerate one disk failure (RAID-5) and two disk failures (RAID-6). The simulation results show that VDF-LFU can reduce disk I/Os to surviving disks by up to 42.3 percent in RAID-5 and 50.7 percent in RAID-6, and VDF-LRU can reduce those by up to 36.2 percent in RAID-5 and 48.9 percent in RAID-6. Our measurement results also show that VDF-LFU can speed up online recovery by up to 46.3 percent in RAID-5 and 47.2 percent in RAID-6 under spare-rebuilding mode, or improve the maximum system service rate by up to 47.7 percent in RAID-5 under degraded mode without a reconstruction workload. Similarly, VDF-LRU can speed up online recovery by up to 34.6 percent in RAID-5 and 38.2 percent in RAID-6, or improve the system service rate by up to 28.4 percent in RAID-5.
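
    A minimal sketch of the victim-selection idea behind VDF-LRU, assuming a toy cache that maps each block to its disk and an externally supplied set of faulty disks; the full policy and its integration with LFU are described in the paper.

        from collections import OrderedDict

        class VDFLRUCache:
            """Toy VDF-LRU: prefer evicting blocks that live on surviving disks."""
            def __init__(self, capacity, faulty_disks):
                self.capacity = capacity
                self.faulty_disks = set(faulty_disks)   # disks currently in degraded mode
                self.blocks = OrderedDict()             # block_id -> disk_id, kept in LRU order

            def access(self, block_id, disk_id):
                if block_id in self.blocks:
                    self.blocks.move_to_end(block_id)   # refresh LRU position
                    return
                if len(self.blocks) >= self.capacity:
                    self._evict()
                self.blocks[block_id] = disk_id

            def _evict(self):
                # First pass: oldest block that does NOT map to a faulty disk.
                for block_id, disk_id in self.blocks.items():
                    if disk_id not in self.faulty_disks:
                        del self.blocks[block_id]
                        return
                # All cached blocks belong to faulty disks: fall back to plain LRU.
                self.blocks.popitem(last=False)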

  • DoMaIN: A Novel Dynamic Location Management Solution for Internet-Based Infrastructure Wireless Mesh Networks

    Page(s): 1514 - 1524

    Wireless mesh networks (WMNs) have been deployed in many areas, and there is an increasing demand for supporting a large number of mobile users in WMNs. As one of the key components of mobility management, location management serves the purpose of tracking mobile users and locating them prior to establishing new communications. Previous dynamic location management schemes proposed for cellular and wireless local area networks (WLANs) cannot be directly applied to WMNs due to the existence of multihop wireless links in WMNs. Moreover, new design challenges arise when applying location management to silently roaming mobile users in the mesh backbone. Considering the number of wireless hops, an important factor affecting the performance of WMNs, we propose the DoMaIN framework, which helps mobile users decide whether an intra- or intergateway location update (LU) is needed to ensure the best location management performance (i.e., packet delivery) among dynamic location management solutions. In addition, by dynamically guiding mobile users to perform LUs to a desirable location entity, the proposed DoMaIN framework can minimize the location management protocol overhead in terms of LU overhead in the mesh backbone. Furthermore, DoMaIN brings the extra benefit of supporting a dynamic hop-based LU triggering method that differs from previous dynamic LU triggering schemes proposed for cellular networks and WLANs. We evaluate the performance of DoMaIN in different case studies using OPNET simulations. Comprehensive simulation results demonstrate that DoMaIN outperforms other location management schemes and is a satisfactory location management solution for a large number of mobile users silently and arbitrarily roaming in the wireless mesh backbone.

  • Efficient Computation of Robust Average of Compressive Sensing Data in Wireless Sensor Networks in the Presence of Sensor Faults

    Page(s): 1525 - 1534

    Wireless sensor networks (WSNs) enable the collection of physical measurements over a large geographic area. It is often the case that we are interested in computing and tracking the spatial average of the sensor measurements over a region of the WSN. Unfortunately, the standard average operation is not robust because it is highly susceptible to sensor faults and heterogeneous measurement noise. In this paper, we propose a computationally efficient method to compute a weighted average (which we call the robust average) of sensor measurements, which appropriately takes sensor faults and sensor noise into consideration. We assume that the sensors in the WSN use random projections to compress the data and send the compressed data to the data fusion centre. Computational efficiency of our method is achieved by having the data fusion centre work directly with the compressed data streams. The key advantage of our proposed method is that the data fusion centre only needs to perform decompression once to compute the robust average, thus greatly reducing the computational requirements. We apply our proposed method to the data collected from two WSN deployments to demonstrate its efficiency and accuracy.
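
    The weighting idea can be illustrated on uncompressed readings. The sketch below uses an illustrative weighting rule (inverse noise variance scaled by fault probability); it does not reproduce the paper's method, which additionally operates directly on the compressed (random-projection) streams.

        import numpy as np

        def robust_average(readings, noise_var, fault_prob):
            """Weighted average that downweights noisy and likely-faulty sensors.

            readings   : sensor measurements
            noise_var  : per-sensor noise variance estimates
            fault_prob : per-sensor probability of being faulty (0..1)
            """
            readings = np.asarray(readings, dtype=float)
            # Reliability weight: inversely proportional to noise, scaled by (1 - fault probability).
            w = (1.0 - np.asarray(fault_prob)) / np.asarray(noise_var)
            return float(np.dot(w, readings) / np.sum(w))

        # Example: the third sensor is noisy and probably faulty, so it barely moves the estimate.
        print(robust_average([20.1, 19.8, 35.0], noise_var=[0.1, 0.1, 4.0], fault_prob=[0.01, 0.02, 0.9]))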

  • E-SmallTalker: A Distributed Mobile System for Social Networking in Physical Proximity

    Page(s): 1535 - 1545

    Small talk is an important social lubricant that helps people, especially strangers, initiate conversations and make friends with each other in physical proximity. However, due to difficulties in quickly identifying significant topics of common interest, real-world small talk tends to be superficial. The mass popularity of mobile phones can help improve the effectiveness of small talk. In this paper, we present E-SmallTalker, a distributed mobile communication system that facilitates social networking in physical proximity. It automatically discovers and suggests topics, such as common interests, for more significant conversations. We build on the Bluetooth Service Discovery Protocol (SDP) to exchange potential topics by customizing service attributes to publish non-service-related information without establishing a connection. We propose a novel iterative Bloom filter protocol that encodes topics to fit in SDP attributes and achieves a low false-positive rate. We have implemented the system in Java ME for ease of deployment. Our experiments on real-world phones show that it is efficient enough at the system level to facilitate social interactions among strangers in physical proximity. To the best of our knowledge, E-SmallTalker is the first distributed mobile system to achieve this purpose.
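
    A plain (non-iterative) Bloom filter conveys the core encoding idea; the sketch below is illustrative only, with the filter size, hash count, and SHA-256-based hashing chosen arbitrarily rather than taken from the paper's protocol.

        import hashlib

        class BloomFilter:
            """Tiny Bloom filter for topic strings (m bits, k hash functions)."""
            def __init__(self, m=256, k=4):
                self.m, self.k, self.bits = m, k, 0

            def _positions(self, item):
                for i in range(self.k):
                    digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
                    yield int.from_bytes(digest[:4], "big") % self.m

            def add(self, item):
                for pos in self._positions(item):
                    self.bits |= 1 << pos

            def might_contain(self, item):
                return all((self.bits >> pos) & 1 for pos in self._positions(item))

        # Two phones exchange filters of their topics and test for likely common interests.
        mine = BloomFilter()
        for topic in ["hiking", "jazz", "distributed systems"]:
            mine.add(topic)
        print(mine.might_contain("jazz"), mine.might_contain("opera"))  # True, (almost certainly) False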

  • Formal Specification and Runtime Detection of Dynamic Properties in Asynchronous Pervasive Computing Environments

    Page(s): 1546 - 1555

    Formal specification and runtime detection of contextual properties is one of the primary approaches to enabling context awareness in pervasive computing environments. Due to the intrinsic dynamism of the pervasive computing environment, dynamic properties, which delineate concerns of context-aware applications about the temporal evolution of the environment state, are of great importance. However, detection of dynamic properties is challenging, mainly due to the intrinsic asynchrony among computing entities in the pervasive computing environment. Moreover, the detection must be conducted at runtime in pervasive computing scenarios, which renders existing schemes inapplicable. To address these challenges, we propose the property detection for asynchronous context (PDAC) framework, which consists of three essential parts: 1) logical time is employed to model the temporal evolution of the environment state as a lattice, and the active surface of the lattice is introduced as the key notion to model the runtime evolution of the environment state; 2) specification of dynamic properties is viewed as a formal language defined over the trace of environment state evolution; and 3) the SurfMaint algorithm is proposed to achieve runtime maintenance of the active surface of the lattice, which further enables runtime detection of dynamic properties. A case study is conducted to demonstrate how the PDAC framework enables context awareness in asynchronous pervasive computing scenarios. The SurfMaint algorithm is implemented and evaluated over MIPA, the open-source context-aware middleware we developed. Performance measurements show the accuracy and cost-effectiveness of SurfMaint, even when faced with dynamic changes in the asynchronous pervasive computing environment.

  • GPUs as Storage System Accelerators

    Page(s): 1556 - 1566

    Massively multicore processors, such as graphics processing units (GPUs), provide, at a comparable price, an order of magnitude higher peak performance than traditional CPUs. As with any order-of-magnitude drop in the cost per unit of performance for a class of system components, this drop in the cost of computation creates the opportunity to redesign systems and to explore new ways to engineer them to recalibrate the cost-to-performance relation. This work explores the feasibility of harnessing GPUs' computational power to improve the performance, reliability, or security of distributed storage systems. In this context, we present the design of a storage system prototype that uses GPU offloading to accelerate a number of computationally intensive primitives based on hashing, and introduce techniques to efficiently leverage the processing power of GPUs. We evaluate the performance of this prototype under two configurations: as a content-addressable storage system that facilitates online similarity detection between successive versions of the same file, and as a traditional system that uses hashing to preserve data integrity. Further, we evaluate the impact of offloading to the GPU on competing applications' performance. Our results show that this technique can bring tangible performance gains without negatively impacting the performance of concurrently running applications.
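
    As a rough illustration of the kind of hashing primitive such a system offloads, the sketch below chunks two file versions, hashes the chunks, and reports how many chunks of the new version are already stored. The chunk size, hash function, and fixed-size chunking are illustrative assumptions, not the paper's design.

        import hashlib

        def chunk_hashes(data, chunk_size=4096):
            """Fixed-size chunking plus per-chunk hashing."""
            return [hashlib.sha1(data[i:i + chunk_size]).hexdigest()
                    for i in range(0, len(data), chunk_size)]

        def similarity(old, new):
            """Fraction of the new version's chunks already present in the old version."""
            seen = set(chunk_hashes(old))
            hashes = chunk_hashes(new)
            return sum(h in seen for h in hashes) / len(hashes)

        v1 = b"A" * 20000
        v2 = v1[:16384] + b"B" * 3616        # only the last chunk changed between versions
        print(similarity(v1, v2))            # 0.8 (4 of 5 chunks unchanged)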

  • High-Accuracy TDOA-Based Localization without Time Synchronization

    Page(s): 1567 - 1576

    Localization is of great importance in mobile and wireless network applications. Time Difference of Arrival (TDOA) is one of the widely used localization schemes, in which the target (source) emits a signal and a number of anchors (receivers) record the arrival time of the source signal. By calculating the time differences among receivers, the location of the target is estimated. In such a scheme, receivers must be precisely time synchronized. However, time synchronization adds computational cost and introduces errors that may lower localization accuracy. Previous studies have shown that existing time synchronization approaches using low-cost devices are insufficiently accurate, or even infeasible when high accuracy is required. In our scheme (called Whistle), several asynchronous receivers record a target signal and a successive signal that is generated artificially. Through two-signal sensing and sample-counting techniques, the time synchronization requirement can be removed, while high time resolution can be achieved. This design fundamentally changes TDOA in the sense of relaxing the synchronization requirement and avoiding many sources of error caused by time synchronization. We implement Whistle on commercial off-the-shelf (COTS) cell phones with acoustic signals and perform simulations with UWB signals. In particular, we use Whistle to localize nodes of large-scale wireless networks and achieve desirable results. The extensive real-world experiments and simulations show that Whistle can be widely used with good accuracy.
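
    The sample-counting idea can be sketched as follows: each unsynchronized receiver counts the samples between the two signal arrivals, and differencing the counts across receivers cancels the unknown clock offsets. The sampling rate, sound speed, and the simple differencing below are illustrative assumptions; the paper's derivation also accounts for the known position of the second (artificial) source.

        SOUND_SPEED = 343.0   # m/s, speed of sound in air (assumed)
        SAMPLE_RATE = 44100   # Hz, audio sampling rate (assumed)

        def range_difference(samples_between_signals_a, samples_between_signals_b):
            """Convert per-receiver sample counts into a range difference for TDOA.

            Each receiver counts the samples between the target's signal and a second,
            artificially generated reference signal; differencing the counts cancels the
            unknown (unsynchronized) local clock offsets.
            """
            dt = (samples_between_signals_a - samples_between_signals_b) / SAMPLE_RATE
            return dt * SOUND_SPEED   # metres of path-length difference to the target

        # Example: receiver A counts 4410 samples between the two arrivals, receiver B counts 4851.
        print(range_difference(4410, 4851))   # -> -3.43 m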

  • Intelligent Sensor Placement for Hot Server Detection in Data Centers

    Page(s): 1577 - 1588

    Recent studies have shown that a significant portion of the total energy consumption of many data centers is caused by the inefficient operation of their cooling systems. Without effective thermal monitoring with accurate location information, the cooling systems often use unnecessarily low temperature set points to overcool the entire room, resulting in excessive energy consumption. Sensor network technology has recently been adopted for data-center thermal monitoring because of its nonintrusive nature for the already complex data center facilities and robustness to instantaneous CPU or disk activities. However, existing solutions place sensors in a simplistic way without considering the thermal dynamics in data centers, resulting in unnecessarily degraded hot server detection probability. In this paper, we first formulate the problems of sensor placement for hot server detection in a data center as constrained optimization problems in two different scenarios. We then propose a novel placement scheme based on computational fluid dynamics (CFD) that takes various factors, such as cooling systems and server layout, as inputs to analyze the thermal conditions of the data center. Based on the CFD analysis in various server overheating scenarios, we apply data fusion and advanced optimization techniques to find a near-optimal sensor placement solution, such that the probability of detecting hot servers is significantly improved. Our empirical results in a real server room demonstrate the detection performance of our placement solution. Extensive simulation results in a large-scale data center with 32 racks also show that the proposed solution outperforms several commonly used placement solutions in terms of detection probability.

  • ITA: Innocuous Topology Awareness for Unstructured P2P Networks

    Page(s): 1589 - 1601

    One of the most appealing characteristics of unstructured P2P overlays is their enhanced self-* properties, which result from their loose, random structure. In addition, most of the algorithms that make searching in unstructured P2P systems scalable, such as dynamic querying and 1-hop replication, rely on the random nature of the overlay to function efficiently. The underlying communications network (i.e., the Internet), however, is not as randomly constructed. This leads to a mismatch between the distance between two peers on the overlay and the distance between the hosts they reside on at the IP layer, which in turn leads to misuse of the underlying network. The crux of the problem is that any effort to provide a better match between the overlay and the IP layer will inevitably reduce the randomness of the P2P overlay, with many adverse results. With this in mind, we propose ITA, an algorithm that creates a random overlay of randomly connected neighborhoods, providing topology awareness to P2P systems while having no negative effect on the self-* properties or the operation of other P2P algorithms. Using extensive simulations, both at the IP router level and the autonomous system level, we show that ITA reduces communication latencies by as much as 50 percent. Furthermore, it not only reduces the number of IP network messages by 20 percent, which is critical for ISPs carrying the burden of transporting P2P traffic, but also distributes the traffic load more evenly on the routers of the IP network layer.

  • K-Means for Parallel Architectures Using All-Prefix-Sum Sorting and Updating Steps

    Page(s): 1602 - 1612

    We present an implementation of parallel K-means clustering, called Kps-means, that achieves high performance with near-full occupancy compute kernels without imposing limits on the number of dimensions and data points permitted as input, thus combining flexibility with high degrees of parallelism and efficiency. As a key element to performance improvement, we introduce parallel sorting as data preprocessing and updating steps. Our final implementation for Nvidia GPUs achieves speedups of up to 200-fold over CPU reference code and of up to three orders of magnitude when compared with popular numerical software packages.
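
    A sequential reference for the all-prefix-sum (exclusive scan) primitive named in the title, shown here only to make the sorting/updating role concrete; the counting-sort usage below is an illustrative assumption, and on a GPU the scan itself would be computed in parallel (e.g., with a work-efficient up-sweep/down-sweep).

        def exclusive_prefix_sum(values):
            """Exclusive scan: out[i] = sum(values[:i])."""
            total, out = 0, []
            for v in values:
                out.append(total)
                total += v
            return out

        # Counting-sort flavour: the scan of per-cluster counts gives the offset at which
        # each cluster's points start, which lets points be grouped by cluster in one pass.
        assignments = [2, 0, 1, 2, 0, 0]
        counts = [assignments.count(c) for c in range(3)]   # [3, 1, 2]
        print(exclusive_prefix_sum(counts))                 # [0, 3, 4]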

  • LU Factorization with Partial Pivoting for a Multicore System with Accelerators

    Page(s): 1613 - 1621

    LU factorization with partial pivoting is a canonical numerical procedure and the main component of the high-performance LINPACK benchmark. This paper presents an implementation of the algorithm for a hybrid, shared-memory system with standard CPU cores and GPU accelerators. The difficulty of implementing the algorithm for such a system lies in the disproportion between the computational power of the CPUs and the GPUs, and in the meager bandwidth of the communication link between their memory systems. An additional challenge comes from the memory-bound and synchronization-rich nature of the panel factorization component of the block LU algorithm, imposed by the use of partial pivoting. These challenges are tackled with a data layout geared toward complex memory hierarchies, autotuning of GPU kernels, fine-grain parallelization of memory-bound CPU operations, and dynamic scheduling of tasks to different devices. Performance in excess of one TeraFLOPS is achieved using four AMD Magny Cours CPUs and four NVIDIA Fermi GPUs.
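
    For reference, an unblocked LU factorization with partial pivoting is sketched below; it is only the textbook procedure, not the paper's blocked, hybrid CPU/GPU implementation, where the panel would be factorized on the CPUs and the trailing-matrix update offloaded to the GPUs.

        import numpy as np

        def lu_partial_pivot(a):
            """Unblocked LU with partial pivoting: A[perm] = L @ U."""
            a = np.array(a, dtype=float)
            n = a.shape[0]
            perm = np.arange(n)
            for k in range(n - 1):
                p = k + np.argmax(np.abs(a[k:, k]))            # pivot row for column k
                if p != k:
                    a[[k, p]] = a[[p, k]]                      # swap rows
                    perm[[k, p]] = perm[[p, k]]
                a[k+1:, k] /= a[k, k]                          # multipliers (column of L)
                a[k+1:, k+1:] -= np.outer(a[k+1:, k], a[k, k+1:])   # trailing update
            return perm, np.tril(a, -1) + np.eye(n), np.triu(a)

        A = np.array([[2., 1., 1.], [4., 3., 3.], [8., 7., 9.]])
        perm, L, U = lu_partial_pivot(A)
        print(np.allclose(L @ U, A[perm]))   # True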

  • LvtPPP: Live-Time Protected Pseudopartitioning of Multicore Shared Caches

    Page(s): 1622 - 1632

    The partition enforcement policy is essential in cache partitioning; its main function is to protect cache lines and retain the cache quota of each core. This paper focuses on protecting a cache line based on its generation time rather than on the ID of the CPU core it belongs to or its position in the replacement stack. The basic idea is that while a line is live, it must be protected and retained in the cache; once the line is "dead," it should be evicted as early as possible. Therefore, a four-bit live-time protection counter (LvtP) is added to trace each line's live time. Moreover, dead blocks are predicted according to the access event sequence. This paper presents a pseudopartitioning approach, LvtPPP, and proposes a two-cascade victim selection mechanism to evict dead blocks early, based on the LRU replacement policy and the LvtP counter. LvtPPP also supports flexible handling of allocation deviation by introducing a parameter λ to adjust the generation time of a line. Evaluation results based on Simics show significant improvements in performance and fairness for LvtPPP over PIPP and UCP.
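
    A minimal sketch of the two-cascade victim choice, under the simplifying assumption that a line whose LvtP counter has reached zero is predicted dead; the actual counter update rules, the λ adjustment, and the partition bookkeeping are described in the paper.

        def pick_victim(lvtp, lru_order):
            """Two-cascade victim selection: predicted-dead lines first, then plain LRU.

            lvtp      : dict mapping line tag -> LvtP counter value (0 = predicted dead)
            lru_order : line tags from least to most recently used
            """
            for tag in lru_order:
                if lvtp[tag] == 0:        # first cascade: evict a predicted-dead line
                    return tag
            return lru_order[0]           # second cascade: fall back to the LRU line

        print(pick_victim({"a": 3, "b": 0, "c": 5}, ["a", "b", "c"]))   # "b": dead line goes first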

  • Modeling Propagation Dynamics of Social Network Worms

    Page(s): 1633 - 1643

    Social network worms, such as email worms and Facebook worms, pose a critical security threat to the Internet. Modeling their propagation dynamics is essential to predict their potential damage and develop countermeasures. Although several analytical models have been proposed for modeling the propagation dynamics of social network worms, two critical problems remain unsolved: temporal dynamics and spatial dependence. First, previous models have not taken into account the different time periods at which Internet users check emails or social messages, namely, temporal dynamics. Second, the problem of spatial dependence results from the improper assumption that the states of neighboring nodes are independent. These two problems seriously affect the accuracy of previous analytical models. To address them, we propose a novel analytical model. This model implements a spatial-temporal synchronization process, which is able to capture the temporal dynamics. Additionally, we find that the essence of spatial dependence lies in the spreading cycles. By eliminating the effect of these cycles, our model overcomes the computational challenge of spatial dependence and provides a closer approximation to the propagation dynamics. To evaluate our susceptible-infectious-immunized (SII) model, we conduct both theoretical analysis and extensive simulations. Compared with previous epidemic models and the spatial-temporal model, the experimental results show that our SII model achieves greater accuracy. We also compare our model with the susceptible-infectious-susceptible and susceptible-infectious-recovered models. The results show that our model is more suitable for modeling the propagation of social network worms.
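
    To make the role of temporal dynamics concrete, the following toy simulation gives each user an individual message-checking period in a discrete-time susceptible-infectious-immunized process; the graph, parameters, and update rules are illustrative assumptions, not the paper's analytical model.

        import random

        def simulate_sii(neighbors, check_period, p_infect, p_immunize, steps, seeds):
            """Toy SII simulation: users only risk infection when they check messages."""
            state = {u: "S" for u in neighbors}
            for u in seeds:
                state[u] = "I"
            history = []
            for t in range(steps):
                for u, period in check_period.items():
                    if state[u] != "S" or t % period:
                        continue                      # user u checks messages every `period` steps
                    if any(state[v] == "I" for v in neighbors[u]) and random.random() < p_infect:
                        state[u] = "I"
                for u in list(state):
                    if state[u] == "I" and random.random() < p_immunize:
                        state[u] = "M"                # patched / immunized
                history.append(sum(s == "I" for s in state.values()))
            return history

        net = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
        periods = {0: 1, 1: 3, 2: 2, 3: 6}
        print(simulate_sii(net, periods, p_infect=0.8, p_immunize=0.1, steps=20, seeds=[0]))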

  • Online Balancing Two Independent Criteria upon Placements and Deletions

    Page(s): 1644 - 1650

    We study the online bicriteria load balancing problem in this paper. We choose a system of distributed homogeneous file servers located in a cluster as the scenario and propose an online approximate solution for balancing their loads and required storage spaces upon placements and deletions. By placement (resp. deletion), we mean inserting a document into (resp. removing a document from) the server system. The main technique is to keep two global quantities large enough. To the best of our knowledge, the technique is novel, and the result is the first of its kind in the literature. Our result works for any sequence of document placements and deletions. For each deletion, a limited number of documents are reallocated. The load and storage space bounds are 1.5 to 4 times those in the best existing result for sole placements, where by sole placements we refer to placement algorithms that do not allow any reallocation or replication. The time complexity for each operation is O(log MN), where M is the number of servers and N is the number of existing documents in the servers, plus the reallocation cost for document deletion. The price for handling document deletion is almost entirely reflected in the reallocation cost and the higher bounds on load and storage space, while the O(log N) additive term in the time complexity serves as the remainder.

  • Parallel K-Clique Community Detection on Large-Scale Networks

    Page(s): 1651 - 1660

    The analysis of real-world complex networks has been the focus of recent research. Detecting communities helps in uncovering their structural and functional organization. Valuable insight can be obtained by analyzing the dense, overlapping, and highly interwoven k-clique communities. However, their detection is challenging due to extensive memory requirements and execution time. In this paper, we present a novel parallel k-clique community detection method, based on an innovative technique that enables the connected components of a network to be obtained from those of its subnetworks. The novel method has an unbounded, user-configurable, and input-independent maximum degree of parallelism, and hence is able to make full use of computational resources. Theoretical tight upper bounds on its worst-case time and space complexities are given as well. Experiments on real-world networks such as the Internet and the World Wide Web confirmed the almost optimal use of parallelism (i.e., a linear speedup). Comparisons with other state-of-the-art k-clique community detection methods show dramatic reductions in execution time and memory footprint. An open-source implementation of the method is also made publicly available.
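
    The subnetwork-merging idea can be illustrated with a union-find structure: given the connected components computed independently on two subnetworks, the components of their union follow by merging. This is a generic sketch of that idea, not the paper's specific data structure or parallel algorithm.

        class DisjointSet:
            """Union-find used to merge the connected components of two subnetworks."""
            def __init__(self):
                self.parent = {}

            def find(self, x):
                self.parent.setdefault(x, x)
                while self.parent[x] != x:
                    self.parent[x] = self.parent[self.parent[x]]   # path halving
                    x = self.parent[x]
                return x

            def union(self, x, y):
                self.parent[self.find(x)] = self.find(y)

        def merge_components(components_a, components_b):
            """Components of the combined network from two per-subnetwork component lists."""
            ds = DisjointSet()
            for comp in components_a + components_b:
                for node in comp[1:]:
                    ds.union(comp[0], node)
            groups = {}
            for comp in components_a + components_b:
                for node in comp:
                    groups.setdefault(ds.find(node), set()).add(node)
            return list(groups.values())

        # Components computed independently on two halves of a network...
        print(merge_components([[1, 2], [3]], [[2, 3], [4]]))   # [{1, 2, 3}, {4}]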

  • Scalable Hypergrid k-NN-Based Online Anomaly Detection in Wireless Sensor Networks

    Page(s): 1661 - 1670

    Online anomaly detection (AD) is an important technique for monitoring wireless sensor networks (WSNs), protecting them from cyberattacks and random faults. As a scalable and parameter-free unsupervised AD technique, the k-nearest neighbor (kNN) algorithm has attracted a lot of attention for its applications in computer networks and WSNs. However, its lazy-learning nature makes kNN-based AD schemes difficult to use in an online manner, especially when communication cost is constrained. In this paper, a new kNN-based AD scheme based on a hypergrid intuition is proposed for WSN applications to overcome the lazy-learning problem. By redefining the anomaly criterion from a hypersphere detection region (DR) to a hypercube DR, the computational complexity is reduced significantly. At the same time, an attached coefficient is used to convert the hypergrid structure into a positive coordinate space in order to retain redundancy for online updates and to tailor the scheme for bit operations. In addition, distributed computing is taken into account, and the position of a hypercube is encoded with only a few bits using bit operations. As a result, the new scheme is able to work in any environment without human intervention. Finally, experiments with a real WSN data set demonstrate that the proposed scheme is effective and robust.
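
    The hypersphere-to-hypercube switch can be sketched as follows: a hypercube DR test only needs a per-dimension bound check (Chebyshev distance), avoiding the squared-sum of a Euclidean hypersphere test. The parameters and data are illustrative; the paper's scheme additionally encodes hypercube positions with bit operations for distributed online updates.

        import numpy as np

        def is_anomaly_hypercube(x, data, k, radius):
            """Flag x as anomalous if fewer than k points fall inside a hypercube DR around x."""
            cheb = np.max(np.abs(np.asarray(data) - np.asarray(x)), axis=1)   # Chebyshev distances
            return int(np.count_nonzero(cheb <= radius)) < k

        data = np.random.default_rng(0).normal(size=(500, 3))
        print(is_anomaly_hypercube([0.0, 0.0, 0.0], data, k=5, radius=0.5))   # False: dense region
        print(is_anomaly_hypercube([6.0, 6.0, 6.0], data, k=5, radius=0.5))   # True: far from the data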

  • Task Allocation for Undependable Multiagent Systems in Social Networks

    Page(s): 1671 - 1681

    Task execution by multiagent systems in social networks (MAS-SN) can be described through agents' operations when accessing necessary resources distributed in the social network; thus, task allocation can be implemented based on the agents' access to the resources required for each task, with the aim of minimizing resource access time. In undependable MAS-SN, there are deceptive agents that may fabricate their resource status information during task allocation without actually contributing resources to task execution. Although some game-theory-based solutions exist for undependable MAS, they do not consider minimizing resource access time, which is crucial to the performance of task execution in social networks. To obtain dependable resources with the least access time for executing tasks in undependable MAS-SN, this paper presents a novel task allocation model based on a negotiation reputation mechanism, where an agent's past behavior in the resource negotiation of task execution influences its probability of being allocated new tasks in the future. In this model, an agent that contributes more dependable resources with less access time during task execution is rewarded with a higher negotiation reputation and may receive preferential allocation of new tasks. Through experiments, we determine that our task allocation model is superior to traditional resource-based allocation approaches and game-theory-based allocation approaches in terms of both task allocation success rate and task execution time, and that it usually performs close to the ideal approach (in which deceptive agents are fully detected) in terms of task execution time.

  • Two Blocks Are Enough: On the Feasibility of Using Network Coding to Ameliorate the Content Availability of BitTorrent Swarms

    Page(s): 1682 - 1694

    In this paper, we conduct an in-depth study on the feasibility of using network coding to ameliorate the content availability of BitTorrent swarms. We first perform a mathematical analysis of the potential improvement in content availability and bandwidth utilization induced by two existing network coding schemes. It is found that these two coding schemes either incur a very high coding complexity and disk operation overhead or cannot effectively realize the potential improvement in content availability. In this regard, we propose a simple sparse network coding scheme that avoids both drawbacks. To accommodate the proposed coding scheme in BitTorrent, a new block scheduling algorithm is also developed based on the original rarest-first block scheduling policy of BitTorrent. Through extensive simulations and performance evaluations, we show that the proposed coding scheme is very effective in improving the content availability of BitTorrent swarms when compared with some existing methods.
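
    As the title suggests, coded blocks are formed from only two original blocks. A minimal sketch of that idea over GF(2) (bytewise XOR) follows; the paper's actual coding scheme, coefficient field, and block scheduling are not reproduced here.

        import os

        def xor_blocks(a, b):
            """Combine two equal-length blocks over GF(2) (bytewise XOR)."""
            return bytes(x ^ y for x, y in zip(a, b))

        # A seed-like peer uploads a coded block instead of a raw one...
        block1, block2 = os.urandom(16), os.urandom(16)
        coded = xor_blocks(block1, block2)

        # ...and a downloader holding block1 plus the coded block recovers block2.
        assert xor_blocks(coded, block1) == block2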

  • Virtual Batching: Request Batching for Server Energy Conservation in Virtualized Data Centers

    Page(s): 1695 - 1705

    Many power management strategies have been proposed for enterprise servers based on dynamic voltage and frequency scaling (DVFS), but those solutions cannot further reduce the energy consumption of a server when the server processor is already at the lowest DVFS level and the server utilization is still low (e.g., 10 percent or lower). To achieve improved energy efficiency, request batching can be conducted to group received requests into batches and put the processor to sleep between batches. However, it is challenging to perform request batching on a virtualized server because different virtual machines on the same server may have different workload intensities, and putting the shared processor to sleep may severely impact the application performance of all the virtual machines. This paper proposes Virtual Batching, a novel request batching solution for virtualized servers with primarily light workloads. Our solution dynamically allocates CPU resources such that all the virtual machines can have approximately the same performance level relative to their allowed peak values. Based on this uniform level, Virtual Batching determines the time length for periodically batching incoming requests and putting the processor to sleep. When the workload intensity changes from light to moderate, request batching is automatically switched to DVFS to increase the processor frequency for performance guarantees. Virtual Batching is also extended to integrate with server consolidation for maximized energy conservation with performance guarantees for virtualized data centers. Empirical results based on a hardware testbed and real trace files show that Virtual Batching can achieve the desired performance with more energy conservation than several well-designed baselines, e.g., 63 percent more, on average, than a solution based on DVFS only.


Aims & Scope

IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers.


Meet Our Editors

Editor-in-Chief
David Bader
College of Computing
Georgia Institute of Technology