IEEE Transactions on Parallel and Distributed Systems

Issue 4 • April 2014

  • A Delaunay-Based Coordinate-Free Mechanism for Full Coverage in Wireless Sensor Networks

    Page(s): 828 - 839

    Recently, many schemes have been proposed for detecting and healing coverage holes to achieve full coverage in wireless sensor networks (WSNs). However, none of these schemes aims to find the shortest node movement paths for healing the coverage holes, which could significantly reduce the energy used for node movement. Also, current hole healing schemes require accurate knowledge of sensor locations, and obtaining this knowledge consumes considerable energy. In this paper, we propose a Delaunay-based coordinate-free mechanism (DECM) for full coverage. Based on rigorous mathematical analysis, DECM can detect coverage holes and find the locally shortest paths for healing holes in a distributed manner without requiring accurate node location information. DECM also incorporates a cooperative movement mechanism that prevents generating new holes during the node movements performed to heal holes. Simulation results and experimental results from the real-world GENI Orbit testbed show that DECM achieves superior performance in terms of energy efficiency, hole-healing effectiveness, energy consumption balance, and lifetime compared to previous schemes.

  • A High-Utilization Scheduling Scheme of Stream Programs on Clustered VLIW Stream Architectures

    Page(s): 840 - 850

    Stream architectures have emerged as a mainstream solution for computation-intensive applications due to their rich arithmetic units. This paper proposes a multithreading technique based on a scheduling scheme for stream programs on clustered VLIW stream architectures, which aims at optimal arithmetic unit utilization without increasing energy consumption. Its principle is to exploit more kernel-level parallelism for further compiler optimization by constructing homogeneous multiple threads from stream programs. The scheduling scheme comprises three phases. First, threads in stream programs are replicated to construct homogeneous multiple threads. Second, time step assignment for the homogeneous multithreaded stream programs is used to obtain an efficient kernel combination. Third, stream segmentation is presented to optimize both memory transfers and kernel startup overheads. A set of benchmarks is used to evaluate the effectiveness of the proposed technique. Experimental results show that, by exploiting the kernel-level software pipeline, the proposed technique improves performance by 20.9 percent on average while decreasing energy by 7.6 percent. Average utilizations of adders and multipliers reach 77.4 and 75.8 percent, increases of 17.0 and 13.3 percent, respectively. Moreover, the proposed technique achieves an average improvement of 12.5 percent over CSMT4 while decreasing energy by 12.0 percent.

  • ADAPT-POLICY: Task Assignment in Server Farms when the Service Time Distribution of Tasks is Not Known A Priori

    Page(s): 851 - 861

    The service time distribution of certain computing workloads, such as static web content, is well known. However, for many other computing workloads (e.g., dynamic web content, scientific workloads) the service time distribution is not well understood, and it is not correct to assume that these tasks follow a particular distribution. In this paper, we consider task assignment in server farms when both the service time distribution and the (actual) sizes of tasks are not known a priori. We propose an adaptive task assignment policy, called ADAPT-POLICY, which is based on the concept of multiple static task assignment policies. ADAPT-POLICY defines a set of policies for a given system, taking into account the specific properties of the system. These policies are selected in such a way that they have different performance characteristics under different workload conditions (i.e., service time distributions, etc.). The objective is to assign tasks using the policy with the best performance, i.e., the one with the least expected waiting time. Which task assignment policy performs best depends on the traffic conditions, which vary over time. ADAPT-POLICY determines the best task assignment policy using the service time distribution of tasks (and various other traffic properties) estimated on-line, and then adaptively changes the task assignment policy to suit the most recent traffic conditions. The experimental results show that ADAPT-POLICY can yield significant performance improvements over both static and dynamic task assignment policies.

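The adaptive policy-switching idea described in the abstract above can be illustrated with a small sketch. The code below is only a hedged illustration, not the authors' ADAPT-POLICY implementation: it estimates the service-time variability on-line from completed tasks and switches between two ordinary static policies accordingly; the class name, window size, and threshold are assumptions.

```python
# A minimal, illustrative sketch (not the authors' implementation):
# estimate service-time variability on-line and switch between two
# static assignment policies accordingly, in the spirit of ADAPT-POLICY.
from collections import deque

class AdaptiveDispatcher:
    def __init__(self, n_servers, window=500, cv2_threshold=1.0):
        self.outstanding = [0] * n_servers     # tasks currently queued per server
        self.samples = deque(maxlen=window)    # recently observed service times
        self.cv2_threshold = cv2_threshold
        self.rr = 0

    def assign(self):
        """Pick a server for the next task under the currently best policy."""
        if self._cv2() > self.cv2_threshold:
            # Highly variable service times: join the shortest queue to avoid
            # waiting behind very long tasks.
            server = min(range(len(self.outstanding)),
                         key=lambda i: self.outstanding[i])
        else:
            # Low variability: round-robin spreads tasks evenly and cheaply.
            server = self.rr
            self.rr = (self.rr + 1) % len(self.outstanding)
        self.outstanding[server] += 1
        return server

    def complete(self, server, service_time):
        """Feed back the measured service time of a finished task."""
        self.outstanding[server] -= 1
        self.samples.append(service_time)

    def _cv2(self):
        # Squared coefficient of variation of recent service times.
        n = len(self.samples)
        if n < 2:
            return 1.0                          # neutral guess until enough data
        mean = sum(self.samples) / n
        var = sum((x - mean) ** 2 for x in self.samples) / (n - 1)
        return var / (mean * mean) if mean > 0 else 0.0
```
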
  • An Efficient and Trustworthy Resource Sharing Platform for Collaborative Cloud Computing

    Page(s): 862 - 875

    Advancements in cloud computing are leading to a promising future for collaborative cloud computing (CCC), where globally-scattered distributed cloud resources belonging to different organizations or individuals (i.e., entities) are collectively used in a cooperative manner to provide services. Due to the autonomous features of entities in CCC, the issues of resource management and reputation management must be jointly addressed in order to ensure the successful deployment of CCC. However, these two issues have typically been addressed separately in previous research efforts, and simply combining the two systems generates double overhead. Moreover, previous resource and reputation management methods are not sufficiently efficient or effective. By providing a single reputation value for each node, they cannot reflect a node's reputation in providing individual types of resources; by always selecting the highest-reputed nodes, they fail to exploit node reputation in resource selection to fully and fairly utilize resources in the system and to meet users' diverse QoS demands. We propose a CCC platform, called Harmony, which integrates resource management and reputation management in a harmonious manner. Harmony incorporates three key innovations: integrated multi-faceted resource/reputation management, multi-QoS-oriented resource selection, and price-assisted resource/reputation control. The trace data we collected from an online trading platform show the importance of multi-faceted reputation and the drawbacks of always selecting the highest-reputed nodes. Simulations and trace-driven experiments on the real-world PlanetLab testbed show that Harmony outperforms existing resource management and reputation management systems in terms of QoS, efficiency, and effectiveness.

  • Automated and Agile Server Parameter Tuning by Coordinated Learning and Control

    Page(s): 876 - 886

    Automated server parameter tuning is crucial to the performance and availability of Internet applications hosted in cloud environments. It is challenging due to the high dynamics and burstiness of workloads, multi-tier service architectures, and virtualized server infrastructures. In this paper, we investigate automated and agile server parameter tuning for maximizing the effective throughput of multi-tier Internet applications. A recent study proposed a reinforcement learning based server parameter tuning approach for minimizing the average response time of multi-tier applications. Reinforcement learning is a trial-and-error decision-making process that determines the direction of parameter tuning rather than quantitative values, which limits agility; it relies on a predefined adjustment value for each tuning action, and it is nontrivial or even infeasible to find an optimal value under highly dynamic and bursty workloads. We design a neural fuzzy control based approach that combines the strengths of fast online learning and the self-adaptiveness of neural networks and fuzzy control. Because it is model independent, the approach is robust to highly dynamic and bursty workloads, and its quantitative control outputs make server parameter tuning agile. We implemented the new approach on a testbed of a virtualized data center hosting the RUBiS and WikiBench benchmark applications. Experimental results demonstrate that the new approach significantly outperforms the reinforcement learning based approach in both improving effective system throughput and minimizing average response time.

  • Cloning, Resource Exchange, and Relation Adaptation: An Integrative Self-Organisation Mechanism in a Distributed Agent Network

    Page(s): 887 - 897

    Self-organisation provides a suitable paradigm for developing self-managed complex distributed systems, such as grid computing and sensor networks. In this paper, an integrative self-organisation mechanism is proposed. Unlike current related studies, which adopt only a single principle of self-organisation, this mechanism synthesises three principles of self-organisation: cloning/spawning, resource exchange, and relation adaptation. Based on this mechanism, an agent can autonomously generate new agents when it is overloaded, exchange resources with other agents if necessary, and modify its relations with other agents to achieve a better agent network structure. In this way, agents can adapt to dynamic environments. The proposed mechanism is evaluated through a comparison with three other approaches, each of which represents state-of-the-art research on one of the three self-organisation principles. Experimental results demonstrate that the proposed mechanism outperforms the three approaches in terms of the profit of individual agents and of the entire agent network, the load balancing among agents, and the time needed to finish a simulation run.

  • Composing Kerberos and Multimedia Internet KEYing (MIKEY) for Authenticated Transport of Group Keys

    Page(s): 898 - 907

    We motivate and present two designs for the composition of the authentication protocol Kerberos and the key transport protocol Multimedia Internet KEYing (MIKEY) for authenticated transport of cryptographic keys for secure group communication in enterprise and public-safety settings. A technical challenge, and our main contribution, is the analysis of the security of the composition. Towards this, we design our compositions to have intuitive appeal and thereby be less prone to security vulnerabilities. We then employ protocol composition logic (PCL), a state-of-the-art approach, to analyze our composition. For this, we first articulate two properties of interest. Both properties concern the group key that is transported; we call them Group Key Confidentiality and Group Key Acquisition. Group Key Confidentiality is the property that if a principal possesses the key, then it is an authorized member of the group. Group Key Acquisition is the property that if a principal is a member of the group, then it is able to acquire the group key. In the course of our rigorous analysis, we discovered a flaw in our first design, which we point out and which led us to our second design. We have implemented both designs, starting from the publicly available reference implementation of Kerberos and an open-source implementation of MIKEY; our implementations are available as open source. We discuss our experience from the implementation and present empirical results.

  • Connectivity-Based Boundary Extraction of Large-Scale 3D Sensor Networks: Algorithm and Applications

    Page(s): 908 - 918

    Sensor networks are invariably coupled tightly with the geometric environment in which the sensor nodes are deployed. The network boundary is one of the key features that characterize such environments. While significant advances have been made for 2D cases, boundary extraction for 3D sensor networks has so far not been thoroughly studied. We present CABET, a novel Connectivity-Based Boundary Extraction scheme for large-scale 3D sensor networks. To the best of our knowledge, CABET is the first 3D-capable and purely connectivity-based solution for detecting sensor network boundaries. It is fully distributed and highly scalable, requiring overall message cost linear in the network size. A highlight of CABET is its non-uniform critical node sampling, called r'-sampling, which selects landmarks to form boundary surfaces with a bias toward nodes embodying salient topological features. Simulations show that CABET is able to extract a well-connected boundary in the presence of holes and shape variation, with performance superior to that of some state-of-the-art alternatives. In addition, we show how CABET benefits a range of sensor network applications including 3D skeleton extraction, 3D segmentation, and 3D localization.

  • Constructing Limited Scale-Free Topologies over Peer-to-Peer Networks

    Page(s): 919 - 928

    The overlay network topology, together with peer/data organization and the search algorithm, is a crucial component of unstructured peer-to-peer (P2P) networks, as it directly affects the efficiency of search on such networks. Scale-free (power-law) overlay network topologies are among the structures that offer high performance for these networks. A key problem for these topologies is the existence of hubs, nodes with high connectivity. Yet the peers in a typical unstructured P2P network may not be willing or able to cope with such high connectivity and its associated load. Therefore, hard cutoffs are often imposed on the number of edges each peer can have, restricting feasible overlays to limited or truncated scale-free networks. In this paper, we analyze the growth of such limited scale-free networks and propose two different algorithms for constructing perfect scale-free overlay network topologies at each instance of such growth. Our algorithms allow the user to define the desired scale-free exponent (γ). They also incur low communication overhead when the network grows from one size to another. Using extensive simulations, we demonstrate that these algorithms indeed generate perfect scale-free networks (at each step of network growth) that provide better search efficiency under various search algorithms than the networks generated by existing solutions.

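As a rough illustration of the "limited scale-free" degree distributions discussed above, the following sketch samples node degrees from a power law P(k) ∝ k^(-γ) truncated at a hard cutoff. It is not the authors' overlay-construction algorithm; function names and parameter values are assumptions, and the resulting degree sequence could then be realized with, e.g., a configuration-model wiring.

```python
# Illustrative sketch only: sample node degrees from a truncated power law
# P(k) proportional to k^(-gamma) with a hard cutoff k_max, the kind of
# "limited scale-free" degree sequence discussed in the paper.
import random

def sample_degrees(n_nodes, gamma=2.5, k_min=1, k_max=20, seed=0):
    rng = random.Random(seed)
    ks = list(range(k_min, k_max + 1))
    weights = [k ** (-gamma) for k in ks]          # unnormalised P(k)
    degrees = rng.choices(ks, weights=weights, k=n_nodes)
    if sum(degrees) % 2:                           # total degree must be even
        degrees[0] += 1
    return degrees

if __name__ == "__main__":
    degs = sample_degrees(1000, gamma=2.5, k_max=16)
    print("max degree:", max(degs), "mean degree:", sum(degs) / len(degs))
```
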
  • Constructing Sub-Arrays with Short Interconnects from Degradable VLSI Arrays

    Page(s): 929 - 938

    Reducing the interconnection length of VLSI arrays leads to less capacitance, power dissipation, and dynamic communication cost between the processing elements (PEs). This paper develops efficient algorithms for constructing tightly-coupled subarrays from mesh-connected VLSI arrays with faulty PEs. For a given size r×s of the target (logical) array, the proposed algorithm searches and reroutes a physical r×s subarray that has the fewest faults, resulting in an approximate target array, which is subsequently extended to the desired target array. Experimental results show that over 65 percent of redundant interconnects can be eliminated for a 64×64 target array on a 512×512 host array with no more than 1 percent faults. In addition, we propose a recursive divide-and-conquer algorithm for constructing the maximum target array (MTA), and we establish a lower bound on the total interconnection length of the MTA. Experimental results show that the proposed algorithm is capable of reducing the long interconnects by over 33 percent for the MTA derived from a 512×512 host array with no more than 1 percent faults. Moreover, the total interconnection length of the constructed target array is close to the lower bound for cases with relatively few faults.

  • Detecting Movements of a Target Using Face Tracking in Wireless Sensor Networks

    Page(s): 939 - 949

    Target tracking is one of the key applications of wireless sensor networks (WSNs). Existing work mostly requires organizing groups of sensor nodes with measurements of a target's movements or accurate distance measurements from the nodes to the target, and predicting those movements. These are, however, often difficult to achieve accurately in practice, especially in the case of unpredictable environments, sensor faults, etc. In this paper, we propose a new tracking framework, called FaceTrack, which employs the nodes of a spatial region surrounding a target, called a face. Instead of predicting the target location separately within a face, we estimate the target's movement toward the next face. We introduce an edge detection algorithm to generate each face further, in such a way that the nodes can prepare ahead of the target's movement, which greatly helps track the target in a timely fashion and recover from special cases, e.g., sensor faults or loss of tracking. Also, we develop an optimal selection algorithm to select which sensors of faces to query and to forward the tracking data. Simulation results, compared with existing work, show that FaceTrack achieves better tracking accuracy and energy efficiency. We also validate its effectiveness via a proof-of-concept system on the Imote2 sensor platform.

  • Distributed Detection in Mobile Access Wireless Sensor Networks under Byzantine Attacks

    Page(s): 950 - 959

    This paper explores reliable data fusion in mobile access wireless sensor networks under Byzantine attacks. We consider the q-out-of-m rule, which is popular in distributed detection and can achieve a good tradeoff between the miss detection probability and the false alarm rate. A major limitation, however, is that its optimal scheme parameters can only be obtained through exhaustive search, making it infeasible for large networks. In this paper, first, by exploiting the linear relationship between the scheme parameters and the network size, we propose simple but effective sub-optimal linear approaches. Second, for better flexibility and scalability, we derive a near-optimal closed-form solution based on the central limit theorem. Third, subject to a miss detection constraint, we prove that the false alarm rate of q-out-of-m diminishes exponentially as the network size increases, even if the percentage of malicious nodes remains fixed. Finally, we propose an effective malicious node detection scheme for adaptive data fusion under time-varying attacks; the proposed scheme is analyzed using an entropy-based trust model and shown to be optimal from an information-theoretic point of view. Simulation examples illustrate the performance of the proposed approaches under both static and dynamic attacks.

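For readers unfamiliar with the q-out-of-m fusion rule mentioned above, the following back-of-the-envelope sketch computes its detection and false-alarm probabilities for independent, identical sensors, a simplification that ignores the Byzantine reports the paper focuses on. It illustrates how the fused false-alarm rate shrinks as the network grows while q/m is held fixed.

```python
# Back-of-the-envelope sketch of the q-out-of-m fusion rule: declare a
# detection when at least q of the m reporting sensors vote "present".
# Assumes independent sensors with identical hit/false-alarm rates; the
# paper's analysis additionally handles Byzantine (malicious) reports.
from math import comb

def binom_tail(m, q, p):
    """P[at least q successes out of m trials with success probability p]."""
    return sum(comb(m, i) * p**i * (1 - p)**(m - i) for i in range(q, m + 1))

def q_out_of_m(m, q, p_detect=0.8, p_false=0.05):
    detection = binom_tail(m, q, p_detect)   # prob. the fused decision is a hit
    false_alarm = binom_tail(m, q, p_false)  # prob. of a fused false alarm
    return detection, false_alarm

if __name__ == "__main__":
    for m, q in [(10, 4), (50, 20), (100, 40)]:
        d, f = q_out_of_m(m, q)
        print(f"m={m:3d} q={q:3d}  detection={d:.4f}  false alarm={f:.2e}")
```
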
  • Efficiently Representing Membership for Variable Large Data Sets

    Page(s): 960 - 970

    Cloud computing has raised new challenges for the membership representation scheme of storage systems that manage very large data sets. This paper proposes DBA, a dynamic Bloom filter array aimed at representing membership for variable large data sets in storage systems in a scalable way. DBA consists of dynamically created groups of space-efficient Bloom filters (BFs) to accommodate changes in set sizes. Within a group, BFs are homogeneous and the data layout is optimized at the bit level to enable parallel access and thus achieve high query performance. DBA can effectively control its query accuracy by partially adjusting the error rate of its constituent BFs, where each BF only represents an independent subset to help locate elements and confirm membership. Further, DBA supports element deletion by introducing a lazy update policy. We prototype and evaluate our DBA scheme as a scalable fast index in the MAD2 deduplication storage system. Experimental results reveal that DBA (with 64 BFs per group) shows significantly higher query performance than the state-of-the-art approach while scaling up to 160 BFs. DBA is also shown to excel in scalability, query accuracy, and space efficiency by theoretical analysis and experimental evaluation.

    Open Access
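
A minimal sketch of the idea behind a dynamic Bloom filter array follows: fixed-capacity Bloom filters are appended as the set grows, and a membership query probes every filter. This is only an assumption-laden illustration; it omits DBA's grouping of homogeneous BFs, its bit-level layout for parallel access, and its lazy deletion policy.

```python
# Minimal sketch of a dynamic array of Bloom filters: append a fresh
# fixed-capacity filter when the current one fills up; a query probes
# every filter. Not DBA's actual layout or deletion mechanism.
import hashlib

class BloomFilter:
    def __init__(self, n_bits=8192, n_hashes=4):
        self.bits = bytearray(n_bits // 8)
        self.n_bits, self.n_hashes = n_bits, n_hashes

    def _positions(self, item):
        for i in range(self.n_hashes):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.n_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

class DynamicBloomArray:
    def __init__(self, capacity_per_filter=1000):
        self.capacity = capacity_per_filter
        self.filters, self.count = [BloomFilter()], 0

    def add(self, item):
        if self.count >= self.capacity:          # current filter is "full"
            self.filters.append(BloomFilter())
            self.count = 0
        self.filters[-1].add(item)
        self.count += 1

    def __contains__(self, item):
        return any(item in bf for bf in self.filters)
```
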
  • Guarantee Strict Fairness and Utilize Prediction Better in Parallel Job Scheduling

    Page(s): 971 - 981

    As the most widely used parallel job scheduling strategy, EASY backfilling has achieved great success, not only because it can balance fairness and performance, but also because it is universally applicable to most HPC systems. However, unfairness still exists in EASY: our simulations show that a blocked job can be delayed by later jobs for more than 90 hours on real workloads. Additionally, directly employing runtime prediction techniques in EASY leads to a serious situation called reservation violation. In this paper, we aim to guarantee strict fairness (no job is delayed by any job of lower priority) while achieving attractive performance, and to employ prediction without causing reservation violations in parallel job scheduling. We propose two novel strategies, namely shadow load preemption (SLP) and venture backfilling (VB), which are integrated into EASY to construct preemptive venture EASY backfilling (PV-EASY). Experimental results on three real HPC workloads demonstrate that PV-EASY is more attractive than EASY for parallel job scheduling, from both academic and industrial perspectives.

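For context, the sketch below shows the textbook EASY backfilling rule that PV-EASY extends: the blocked head-of-queue job receives a reservation at the earliest time enough processors free up, and later jobs may start early only if they do not delay that reservation. This is a simplified illustration of the baseline, not of SLP, VB, or PV-EASY; the data structures and names are assumptions.

```python
# Textbook sketch of EASY backfilling (the baseline the paper extends).
# Queued jobs are (name, procs, est_runtime); running jobs are (finish_time, procs).
def easy_schedule(now, free, queue, running):
    """Return the names of queued jobs that may start at time `now`."""
    started = []
    # Start jobs from the head of the queue while they fit.
    while queue and queue[0][1] <= free:
        name, procs, est = queue.pop(0)
        free -= procs
        running.append((now + est, procs))
        started.append(name)
    if not queue:
        return started
    # Head job is blocked: compute its reservation ("shadow" time).
    head_procs = queue[0][1]
    avail, shadow, extra = free, None, 0
    for finish, procs in sorted(running):
        avail += procs
        if avail >= head_procs:
            shadow = finish                 # earliest time the head job can start
            extra = avail - head_procs      # processors left over at shadow time
            break
    if shadow is None:
        return started                      # head job can never fit (sketch guard)
    # Backfill later jobs only if they do not delay the reservation.
    for job in list(queue[1:]):
        name, procs, est = job
        fits_now = procs <= free
        harmless = (now + est <= shadow) or (procs <= min(free, extra))
        if fits_now and harmless:
            queue.remove(job)
            free -= procs
            if now + est > shadow:
                extra -= procs              # uses processors reserved as "extra"
            running.append((now + est, procs))
            started.append(name)
    return started
```
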
  • How to Conduct Distributed Incomplete Pattern Matching

    Page(s): 982 - 992

    In this paper, we first propose a very interesting and practical problem: pattern matching in a distributed mobile environment. Pattern matching is a well-known problem, and extensive research has been conducted on performing effective and efficient search. However, previously proposed approaches assume that data are centrally stored, which is not the case in a mobile environment (e.g., mobile phone networks), where one person's pattern may be stored separately across a number of different stations, and such a local pattern is incomplete compared with the global pattern. A simple solution to pattern matching in a mobile environment is to collect all the data distributed over base stations at a data center and then conduct pattern matching there. Clearly, such a solution generates a huge amount of communication traffic, which could further worsen the communication bottleneck caused by the limited wireless bandwidth. Therefore, a communication-efficient and search-effective solution is necessary. In our work, we present a novel solution based on a carefully designed weighted Bloom filter (WBF), called Distributed Incomplete pattern matching (DI-matching), to find target patterns in a distributed mobile environment. Specifically, to save communication cost and ensure pattern matching over distributed incomplete patterns, we use a WBF to encode a query pattern and disseminate the encoded data to each base station. Each base station conducts a local pattern search according to the received WBF. Only qualified IDs and the corresponding weights at each base station are sent to the data center for aggregation and verification. Through non-trivial theoretical analysis and extensive empirical experiments on a real city-scale mobile network data set, we demonstrate the effectiveness and efficiency of the proposed solutions.

  • Process Placement in Multicore Clusters: Algorithmic Issues and Practical Techniques

    Page(s): 993 - 1002

    Current generations of NUMA node clusters feature multicore or manycore processors. Programming such architectures efficiently is a challenge because numerous hardware characteristics have to be taken into account, especially the memory hierarchy. One appealing idea to improve the performance of parallel applications is to decrease their communication costs by matching the communication pattern to the underlying hardware architecture. In this paper, we detail the algorithm and techniques proposed to achieve such a result: first, we gather both the communication pattern information and the hardware details. Then we compute a relevant reordering of the various process ranks of the application. Finally, those new ranks are used to reduce the communication costs of the application.

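The placement objective sketched in the abstract above, mapping process ranks to cores so that heavily communicating pairs end up close in the memory hierarchy, can be written as minimizing the sum of comm[i][j] * dist[core(i)][core(j)] over all rank pairs. The toy code below solves it by exhaustive search, which is only feasible for a handful of ranks; the paper's techniques compute such reorderings at realistic scales, and the matrices used here are made-up examples.

```python
# Toy illustration of the rank-to-core placement objective: minimise
# sum over pairs of comm[i][j] * dist[core(i)][core(j)]. Exhaustive search
# only works for tiny cases; it is not the paper's algorithm.
from itertools import permutations

def placement_cost(mapping, comm, dist):
    n = len(mapping)
    return sum(comm[i][j] * dist[mapping[i]][mapping[j]]
               for i in range(n) for j in range(n))

def best_placement(comm, dist):
    n = len(comm)
    return min(permutations(range(n)),
               key=lambda m: placement_cost(m, comm, dist))

if __name__ == "__main__":
    # 4 ranks; ranks 0-1 and 2-3 communicate heavily (example data).
    comm = [[0, 9, 1, 1], [9, 0, 1, 1], [1, 1, 0, 9], [1, 1, 9, 0]]
    # Cores 0-1 share one socket, cores 2-3 the other (example distances).
    dist = [[0, 1, 4, 4], [1, 0, 4, 4], [4, 4, 0, 1], [4, 4, 1, 0]]
    print("rank -> core:", best_placement(comm, dist))
```
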
  • QoF: Towards Comprehensive Path Quality Measurement in Wireless Sensor Networks

    Page(s): 1003 - 1013

    Due to its large scale and constrained communication radius, a wireless sensor network mostly relies on multi-hop transmissions to deliver a data packet along a sequence of nodes. It is essential to measure the forwarding quality of multi-hop paths, and such information should be exploited when designing efficient routing strategies. Existing metrics such as ETX and ETF mainly focus on quantifying the link performance between nodes while overlooking the forwarding capabilities inside the sensor nodes. Our experience operating GreenOrbs, a large-scale sensor network with 330 nodes, reveals that the forwarding quality inside each sensor node is at least an equally important factor contributing to the path quality of data delivery. In this paper we propose QoF, Quality of Forwarding, a new metric which explores the performance of the gray zone inside a node, left unattended in previous studies. By combining the QoF measurements within a node and over a link, we are able to comprehensively measure the complete path quality when designing efficient multi-hop routing protocols. We implement QoF and build a modified Collection Tree Protocol (CTP). We evaluate the data collection performance on a testbed consisting of 50 TelosB nodes and compare it with the original CTP protocol. The experimental results show that our approach takes both transmission cost and forwarding reliability into consideration, thus achieving high throughput for data collection.

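For reference, the link-level ETX baseline that QoF augments can be computed as follows: the expected number of transmissions on a link is 1/(df·dr), where df and dr are the forward and reverse packet delivery ratios, and a path's ETX is the sum over its links. The sketch deliberately omits the in-node forwarding quality that QoF adds, and its delivery ratios are made-up examples.

```python
# Conventional link-level ETX baseline (not QoF itself): expected number
# of transmissions on a link is 1/(df*dr); a path's ETX is the sum over
# its links. QoF additionally measures forwarding quality inside nodes.
def link_etx(df, dr):
    if df <= 0 or dr <= 0:
        return float("inf")        # link never delivers (or never acks)
    return 1.0 / (df * dr)

def path_etx(links):
    """links: iterable of (df, dr) delivery-ratio pairs along the route."""
    return sum(link_etx(df, dr) for df, dr in links)

if __name__ == "__main__":
    route = [(0.9, 0.95), (0.8, 0.85), (0.99, 0.99)]
    print(f"path ETX = {path_etx(route):.2f} expected transmissions")
```
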
  • Rateless Codes and Random Walks for P2P Resource Discovery in Grids

    Page(s): 1014 - 1023

    Peer-to-peer (P2P) resource location techniques for grid systems have recently been investigated to obtain scalability, reliability, efficiency, fault-tolerance, security, and robustness. Resolving queries to locate resources and updating information on the status of one's own resources in these systems can be abstracted as the problem of allowing one peer to obtain a local view of global information defined over all peers of an unstructured P2P network. In this paper, the system is represented as a set of nodes connected to form a P2P network, where each node holds a piece of information that must be communicated to all participants. Moreover, we assume that the information can change dynamically and that each peer periodically needs to access the values held by all other peers. A novel approach is proposed, based on a continuous flow of control packets exchanged among the nodes using the random walk principle and rateless coding. An innovative rateless decoding mechanism that is able to cope with asynchronous information updates is also proposed. The performance of the proposed system is evaluated both analytically and experimentally by simulation. The analytical results show that the proposed strategy guarantees quick diffusion of the information and scales well to large networks. Simulations show that the technique is also effective in the presence of network and information dynamics.

  • Reducing Peak Power Consumption in Multi-Core Systems without Violating Real-Time Constraints

    Page(s): 1024 - 1033

    The potential of multi-core chips for high performance and reliability at low cost has made them ideal computing platforms for embedded real-time systems. As a result, power management of a multi-core chip has become an important issue in the design of embedded real-time systems. Most existing approaches have been designed to regulate the behavior of average power consumption, such as minimizing the total energy consumption or the chip temperature. However, little attention has been paid to the worst-case behavior of instantaneous power consumption on a chip, called chip-level peak power consumption, an important design parameter that determines the cost and/or size of chip design/packaging and the underlying power supply. We address this problem by reducing the chip-level peak power consumption at design time without violating any real-time constraints. We achieve this by carefully scheduling real-time tasks, without relying on any additional hardware support for power management, such as dynamic voltage and frequency scaling. Specifically, we propose a new scheduling algorithm, FPΘ, that restricts the concurrent execution of tasks assigned to different cores, and we perform its schedulability analysis. Using this analysis, we develop a method that finds a set of concurrently executable tasks such that the design-time chip-level peak power consumption is minimized and all timing requirements are met. We demonstrate via simulation that the proposed method not only keeps the design-time chip-level peak power consumption as low as the theoretical lower bound for trivial cases, but also reduces the peak power consumption for non-trivial cases by up to 12.9 percent compared to the case with no restriction on concurrent task execution.

  • Reliability of Heterogeneous Distributed Computing Systems in the Presence of Correlated Failures

    Page(s): 1034 - 1043

    While the reliability of distributed-computing systems (DCSs) has been widely studied under the assumption that computing elements (CEs) fail independently, the impact of correlated failures of CEs on the reliability remains an open question. Here, the problem of modeling and assessing the impact of stochastic, correlated failures on the service reliability of applications running on DCSs is tackled. The service reliability is modeled using an integrated analytical and Monte-Carlo (MC) approach. The analytical component of the model comprises a generalization of a previously developed model for reliability of non-Markovian DCSs to a setting where specific patterns of simultaneous failures in CEs are allowed. The analytical model is complemented by a MC-based procedure to draw correlated-failure patterns using the recently reported concept of probabilistic shared risk groups (PSRGs). The reliability model is further utilized to develop and optimize a novel class of dynamic task reallocation (DTR) policies that maximize the reliability of DCSs in the presence of correlated failures. Theoretical predictions, MC simulations, and results from an emulation testbed show that the reliability can be improved when DTR policies correctly account for correlated failures. The impact of correlated failures of CEs on the reliability and the key dependence of DTR policies on the type of correlated failures are also investigated.

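A hedged Monte-Carlo sketch of correlated failures via shared risk groups is given below: each group fails independently with some probability, and when it does, each member computing element (CE) fails with a conditional probability. The reliability measure used here (enough CEs survive) is a simple stand-in rather than the paper's service-reliability model, and all probabilities are made-up examples.

```python
# Monte-Carlo sketch of correlated failures drawn from probabilistic
# shared risk groups (PSRGs). Illustrative only; not the paper's model.
import random

def simulate_survivors(n_ces, groups, rng):
    """groups: list of (member_ids, p_group_fails, p_ce_fails_given_group)."""
    alive = [True] * n_ces
    for members, p_group, p_cond in groups:
        if rng.random() < p_group:            # the shared risk materialises
            for ce in members:
                if rng.random() < p_cond:     # member fails given the group failed
                    alive[ce] = False
    return sum(alive)

def reliability(n_ces, groups, k_required, trials=100_000, seed=1):
    rng = random.Random(seed)
    ok = sum(simulate_survivors(n_ces, groups, rng) >= k_required
             for _ in range(trials))
    return ok / trials

if __name__ == "__main__":
    groups = [([0, 1, 2], 0.05, 0.9),   # e.g., CEs sharing a rack power supply
              ([3, 4], 0.02, 1.0),
              ([5], 0.10, 1.0)]
    print("P[at least 4 CEs survive] ~", reliability(6, groups, k_required=4))
```
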
  • Resource Availability Characteristics and Node Selection in Cooperatively Shared Computing Platforms

    Page(s): 1044 - 1054

    The focus of our work is on studying the resource availability characteristics of large-scale, cooperatively pooled, shared computing platforms. We consider platforms in which resources at a node are allocated to competing users on a fair-share basis, without any reserved resource capacities for any user, and there is no platform-wide resource manager for the placement of users on different nodes; the users independently select nodes for their applications. Our study is focused on the PlanetLab system, which exemplifies such platforms. The goal of our study is to develop heuristics, based on the observed resource availability characteristics, for selecting nodes on which to deploy applications. Our approach uses the notion of an eligibility period, which represents a contiguous duration for which a node satisfies a given resource requirement. We study the characteristics of the eligibility periods of PlanetLab nodes for various resource capacity requirements. Based on this study, we develop heuristics for identifying nodes that are likely to satisfy a given requirement for long durations. We also develop an online model for predicting the idle resource capacity that is likely to be available on a node over the short term. We evaluate and demonstrate the performance benefits of the node selection techniques and the prediction model using PlanetLab node utilization data traces collected at different intervals over an extended period of several months.

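The eligibility-period notion used above is easy to compute from a utilization trace: it is the length of a contiguous run of samples during which the node's free capacity meets the requirement. The sketch below assumes a fixed sampling interval and a simple list-of-samples trace format, which are illustrative assumptions rather than the paper's data format.

```python
# Sketch of extracting eligibility periods from a node's utilisation trace:
# contiguous durations during which free capacity meets a requirement.
def eligibility_periods(free_capacity, requirement, sample_interval=300):
    """free_capacity: per-sample free-capacity values (e.g., free CPU %).
    Returns the eligibility-period lengths in seconds."""
    periods, run = [], 0
    for value in free_capacity:
        if value >= requirement:
            run += 1                           # still eligible, extend the run
        elif run:
            periods.append(run * sample_interval)
            run = 0
    if run:
        periods.append(run * sample_interval)
    return periods

if __name__ == "__main__":
    trace = [40, 35, 50, 10, 5, 45, 60, 55, 20, 70]   # free CPU % every 5 minutes
    ps = eligibility_periods(trace, requirement=30)
    print("eligibility periods (s):", ps, " longest:", max(ps))
```
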
  • Secure Time Synchronization in Wireless Sensor Networks: A Maximum Consensus-Based Approach

    Page(s): 1055 - 1065

    Time synchronization is a fundamental requirement for the wide spectrum of applications of wireless sensor networks (WSNs). However, most existing time synchronization protocols are likely to deteriorate or even fail when the WSN is attacked by malicious intruders. This paper is concerned with secure time synchronization for WSNs under message manipulation attacks. Specifically, theoretical analysis and simulation results are first provided to demonstrate that the maximum consensus based time synchronization (MTS) protocol becomes invalid under message manipulation attacks. Then, a novel secured maximum consensus based time synchronization (SMTS) protocol is proposed to detect and invalidate message manipulation attacks. Furthermore, we prove that SMTS is guaranteed to converge with simultaneous compensation of both clock skew and offset. Extensive numerical results show the effectiveness of the proposed protocol.

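The (unsecured) maximum-consensus idea behind MTS can be sketched for clock offsets alone: in every round each node adopts the largest clock value among itself and its neighbours, so the whole network converges to the maximum within a number of rounds bounded by the network diameter. The code below shows only this baseline; SMTS additionally compensates clock skew and rejects implausible values under message manipulation, which the sketch does not attempt.

```python
# Sketch of the plain maximum-consensus idea (offsets only), not SMTS:
# every round, each node adopts the largest clock value it hears.
def max_consensus_round(clocks, neighbours):
    """clocks: dict node -> current logical clock value.
    neighbours: dict node -> list of neighbouring nodes."""
    return {node: max([clocks[node]] + [clocks[nb] for nb in neighbours[node]])
            for node in clocks}

if __name__ == "__main__":
    clocks = {"a": 10.0, "b": 12.5, "c": 11.2, "d": 9.8}
    neighbours = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    for _ in range(3):                      # a few rounds (>= network diameter)
        clocks = max_consensus_round(clocks, neighbours)
    print(clocks)                           # all nodes converge to 12.5
```
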
  • SOS: A Distributed Mobile Q&A System Based on Social Networks

    Page(s): 1066 - 1077

    Recently, emerging research efforts have focused on question and answer (Q&A) systems based on social networks. Social-based Q&A systems can answer non-factual questions, which cannot easily be resolved by web search engines. These systems either rely on a centralized server for identifying friends based on social information or broadcast a user's questions to all of its friends. Mobile Q&A systems, in which mobile nodes access the Q&A system through the Internet, are very promising considering the rapid increase of mobile users and the convenience of practical use. However, such systems cannot directly use the previous centralized or broadcasting methods, which would incur high mobile Internet access costs, node overload, and high server bandwidth costs given the tremendous number of mobile users. We propose a distributed Social-based mObile Q&A System (SOS) with low overhead and system cost as well as quick responses to question askers. SOS enables mobile users to forward questions to potential answerers in their friend lists in a decentralized manner for a number of hops before resorting to the server. It leverages lightweight knowledge engineering techniques to accurately identify friends who are able and willing to answer questions, thus reducing the search and computation costs of mobile nodes. Trace-driven simulation results show that SOS can achieve high query precision and recall, short response latency, and low overhead. We have also deployed a pilot version of SOS for use by a small group at Clemson University. The feedback from the users shows that SOS can provide high-quality answers.

  • The Generalized Loneliness Detector and Weak System Models for k-Set Agreement

    Page(s): 1078 - 1088

    This paper presents two weak partially synchronous system models Manti(n-k) and Msink(n-k), which are just strong enough for solving k-set agreement: We introduce the generalized (n-k)-loneliness failure detector L(k), which we first prove to be sufficient for solving k-set agreement, and show that L(k) but not L(k-1) can be implemented in both models. Manti(n-k) and Msink(n-k) are hence the first message passing models that lie between models where Ω (and therefore consensus) can be implemented and the purely asynchronous model. We also address k-set agreement in anonymous systems, that is, in systems where (unique) process identifiers are not available. Since our novel k-set agreement algorithm using L(k) also works in anonymous systems, it turns out that the loneliness failure detector L=L(n-1) introduced by Delporte et al. is also the weakest failure detector for set agreement in anonymous systems. Finally, we analyze the relationship between L(k) and other failure detectors suitable for solving k-set agreement.

    Open Access
  • Trajectory Improves Data Delivery in Urban Vehicular Networks

    Page(s): 1089 - 1100

    Efficient data delivery is of great importance, yet highly challenging, for vehicular networks because of frequent network disruption, fast topological change, and mobility uncertainty. Vehicular trajectory knowledge plays a key role in data delivery. Existing algorithms have largely predicted trajectories using coarse-grained patterns such as the spatial distribution and/or the inter-meeting time distribution, which leads to poor data delivery performance. In this paper, we mine extensive data sets of vehicular traces from two large cities in China, Shanghai and Shenzhen; through conditional entropy analysis, we find that vehicle mobility exhibits strong spatiotemporal regularity. By extracting mobility patterns from historical vehicular traces, we develop accurate trajectory predictions using multiple-order Markov chains. Based on an analytical model, we derive the packet delivery probability with the predicted trajectories. We then propose routing algorithms that take full advantage of the predicted probabilistic vehicular trajectories. Finally, we carry out extensive simulations based on three large data sets of real GPS vehicular traces: the Shanghai taxi data set, the Shanghai bus data set, and the Shenzhen taxi data set. The results demonstrate that our proposed routing algorithms achieve a significantly higher delivery ratio at lower cost compared with existing algorithms.

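The multiple-order Markov prediction mentioned above can be sketched as follows: learn transition counts from historical traces (sequences of road-segment IDs) and predict the most likely next segment given the last k segments. This is an illustration only; the trace representation is assumed, and the paper couples such predictions with an analytical packet-delivery model and routing algorithms.

```python
# Sketch of order-k Markov trajectory prediction from historical traces.
# Illustrative only; segment IDs and trace format are assumptions.
from collections import defaultdict, Counter

class MarkovPredictor:
    def __init__(self, order=2):
        self.order = order
        self.counts = defaultdict(Counter)   # context (k segments) -> next-segment counts

    def train(self, traces):
        for trace in traces:
            for i in range(len(trace) - self.order):
                context = tuple(trace[i:i + self.order])
                self.counts[context][trace[i + self.order]] += 1

    def predict(self, recent):
        """Return (most likely next segment, its estimated probability)."""
        context = tuple(recent[-self.order:])
        nxt = self.counts.get(context)
        if not nxt:
            return None, 0.0
        seg, c = nxt.most_common(1)[0]
        return seg, c / sum(nxt.values())

if __name__ == "__main__":
    traces = [["s1", "s2", "s3", "s4"], ["s1", "s2", "s3", "s5"],
              ["s1", "s2", "s3", "s4"], ["s2", "s3", "s4", "s6"]]
    mp = MarkovPredictor(order=2)
    mp.train(traces)
    print(mp.predict(["s2", "s3"]))          # ('s4', 0.75)
```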

Aims & Scope

IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers.


Meet Our Editors

Editor-in-Chief
David Bader
College of Computing
Georgia Institute of Technology