IEEE Transactions on Computers

Issue 6 • June 2014

  • A Geometric Deployment and Routing Scheme for Directional Wireless Mesh Networks

    Page(s): 1323 - 1335

    This paper first envisions the advent of wireless mesh networks with multiple radios and directional antennas. Then, based on the observation that simplicity induces efficiency and scalability, it proposes a joint geometric deployment and routing strategy for such mesh networks and gives a concrete approach under this strategy. The main idea of this strategy is to deploy the mesh network as a certain kind of geometric graph and then design a geometric routing protocol that exploits the routing properties of this graph. The proposed concrete approach comprises two parts: (1) a topology generation algorithm based on Delaunay triangulations and (2) a geometric routing protocol based on the greedy forwarding algorithm. Both parts are characterized by simplicity and appealing properties, with formal proofs provided where possible. Simulation results validate the proposed approach. (A short illustrative sketch of greedy forwarding follows.)

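    The greedy forwarding primitive this abstract builds on is simple enough to sketch. The minimal Python sketch below assumes an arbitrary graph given as an adjacency map plus node coordinates; the toy topology and the failure handling are illustrative assumptions, not the authors' protocol (on their Delaunay-based topologies, greedy forwarding provably never gets stuck, which a generic graph does not guarantee).

    import math

    def greedy_forward(neighbors, coords, src, dst):
        """Relay hop by hop, always moving to the neighbor closest
        (in Euclidean distance) to the destination. Returns the path,
        or None when stuck at a local minimum on a generic graph."""
        def dist(a, b):
            return math.hypot(coords[a][0] - coords[b][0],
                              coords[a][1] - coords[b][1])
        path, cur = [src], src
        while cur != dst:
            nxt = min(neighbors[cur], key=lambda v: dist(v, dst))
            if dist(nxt, dst) >= dist(cur, dst):
                return None  # no neighbor makes progress
            path.append(nxt)
            cur = nxt
        return path

    # Toy 4-node topology (coordinates and links are made up).
    coords = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (2, 1)}
    neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(greedy_forward(neighbors, coords, 0, 3))  # [0, 1, 2, 3]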
  • A Stochastic Computational Approach for Accurate and Efficient Reliability Evaluation

    Page(s): 1336 - 1350

    Reliability is fast becoming a major concern due to the nanometric scaling of CMOS technology. Accurate analytical approaches for the reliability evaluation of logic circuits, however, have a computational complexity that generally increases exponentially with circuit size, which makes the reliability analysis of large circuits intractable. This paper first presents novel computational models based on stochastic computation; using these stochastic computational models (SCMs), a simulation-based analytical approach is then proposed for the reliability evaluation of logic circuits. In this approach, signal probabilities are encoded in the statistics of random binary bit streams, and non-Bernoulli sequences of random permutations of binary bits are used for the initial input and gate error probabilities. By leveraging the bit-wise dependencies of random binary streams, the proposed approach accounts for signal correlations and evaluates the joint reliability of multiple outputs. It therefore accurately determines the reliability of a circuit, with precision limited only by the random fluctuations inherent in the stochastic sequences. Combining simulation and analysis, the SCM approach offers both ease of implementation and accuracy of evaluation. The use of non-Bernoulli sequences as initial inputs further increases the evaluation efficiency and accuracy compared with the conventional use of Bernoulli sequences, making the proposed stochastic approach scalable to large circuits. It can further account for various fault models as well as calculate the soft error rate (SER). These results are supported by extensive simulations and detailed comparison with existing approaches. (A short illustrative sketch follows.)

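    To make the bit-stream idea concrete, here is a minimal Python sketch of reliability estimation for a single noisy NAND gate. The Bernoulli streams and the gate-error model (output flipped with probability eps) are illustrative assumptions; the paper's SCM construction additionally uses non-Bernoulli permutation-based sequences for better accuracy.

    import random

    def bernoulli_stream(p, n):
        """Random bit stream whose fraction of 1s encodes probability p."""
        return [1 if random.random() < p else 0 for _ in range(n)]

    def noisy_nand(a, b, eps, n):
        """Bit-wise NAND whose output is flipped with probability eps."""
        flips = bernoulli_stream(eps, n)
        return [(1 - (x & y)) ^ f for x, y, f in zip(a, b, flips)]

    n = 100_000                    # stream length bounds the precision
    a = bernoulli_stream(0.8, n)   # P(a = 1) = 0.8
    b = bernoulli_stream(0.6, n)   # P(b = 1) = 0.6
    z = noisy_nand(a, b, eps=0.05, n=n)
    # Analytic output probability: 0.52*(1-eps) + 0.48*eps = 0.518
    print(f"stochastic estimate: {sum(z)/n:.3f} (analytic: 0.518)")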
  • Aggregation Capacity of Wireless Sensor Networks: Extended Network Case

    Page(s): 1351 - 1364

    A critical function of wireless sensor networks (WSNs) is data gathering. Often one is interested in collecting only a specific function of the sensor measurements at a sink node, rather than downloading all the raw data from all the sensors. In this paper, we study the capacity of computing and transporting such specific functions of sensor measurements to the sink node, called the aggregation capacity, for WSNs. We focus on random WSNs, which can be classified into two types: random extended WSNs and random dense WSNs. All existing results on aggregation capacity concern dense WSNs, including random and arbitrary cases, under the protocol model (ProM) or physical model (PhyM). In this paper, we propose the first aggregation capacity scaling laws for random extended WSNs. We point out that, unlike for random dense WSNs, the assumption made in ProM and PhyM that each successful transmission can sustain a constant rate is overly optimistic and impractical for random extended WSNs due to transmit power limitations. We derive the first result on aggregation capacity for random extended WSNs under the generalized physical model. In particular, we prove that, for type-sensitive divisible perfectly compressible functions and type-threshold divisible perfectly compressible functions, the aggregation capacities for random extended WSNs with n nodes are of order Θ((log n)^(-α/2-1)) and Θ((log n)^(-α/2) / log log n), respectively, where α > 2 denotes the power attenuation exponent in the generalized physical model. Furthermore, we improve the aggregation throughput for general divisible perfectly compressible functions to Ω((log n)^(-α/2)) by choosing Θ(log n) sensors from a small region (relative to the whole region) as sink nodes.

  • An Efficient Memetic Algorithm for the Max-Bisection Problem

    Page(s): 1365 - 1376

    The max-bisection problem consists of partitioning the vertices of a weighted undirected graph into two equally sized subsets so as to maximize the sum of the weights of the crossing edges. It is an NP-hard combinatorial optimization problem that arises in many applications. In this paper, we present a memetic algorithm for the max-bisection problem that integrates a new fast local search procedure, a crossover operator, and a pool updating strategy, which together balance intensification and diversification. Extensive experiments were performed on benchmark instances from the literature with 800 to 10,000 vertices. The proposed memetic algorithm improved the best known solutions for all benchmark instances tested in this paper; the improvement in cut value over CirCut by Burer et al. ranges from 0.02 to 4.15 percent, while the average running time of the memetic algorithm is much lower than that of CirCut. This shows that the proposed memetic algorithm finds high-quality solutions in acceptable running time. (A short illustrative sketch of the local search step follows.)

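    The flavor of the local search component can be sketched in a few lines. The pairwise-swap move below (which keeps the bisection balanced by exchanging two vertices across the partition) is a generic stand-in under assumed data structures, not the authors' fast local search procedure.

    from collections import defaultdict

    def cut_value(edges, side):
        """Total weight of edges crossing the bisection."""
        return sum(w for (u, v), w in edges.items() if side[u] != side[v])

    def swap_gain(adj, side, u, v):
        """Change in cut value if u and v (on opposite sides) swap."""
        g = 0
        for a, b in ((u, v), (v, u)):
            for x, w in adj[a]:
                if x != b:
                    g += w if side[x] == side[a] else -w
        return g

    def local_search(edges, side):
        """First-improvement pairwise-swap local search; each swap
        preserves the balance of the two subsets."""
        adj = defaultdict(list)
        for (u, v), w in edges.items():
            adj[u].append((v, w)); adj[v].append((u, w))
        nodes, improved = list(side), True
        while improved:
            improved = False
            for u in nodes:
                for v in nodes:
                    if side[u] != side[v] and swap_gain(adj, side, u, v) > 0:
                        side[u], side[v] = side[v], side[u]
                        improved = True
        return side

    edges = {(0, 1): 3, (1, 2): 1, (2, 3): 3, (3, 0): 1, (0, 2): 2}
    side = {0: 0, 1: 1, 2: 0, 3: 1}          # a balanced start
    print(cut_value(edges, local_search(edges, side)))   # 8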
  • Analytical Leakage-Aware Thermal Modeling of a Real-Time System

    Page(s): 1378 - 1392

    We consider a firm real-time system with a single processor working in two power modes, depending on whether it is idle or executing a job. The system is equipped with dynamic thermal management through a cooling subsystem that can switch between two cooling modes. Real-time jobs arriving at the system have stochastic properties and are prone to soft errors. A successful job is one that enters the system and completes its execution with no timing or soft error. The system is evaluated based on its performance, temperature behavior, reliability, and energy consumption. These criteria interact: the stochastic nature of the system affects the success ratio of jobs as well as the dynamic power; leakage and dynamic power together determine the processor temperature; and the temperature in turn affects the leakage power, the cooling subsystem power, and the soft error rate, which itself impacts system reliability and the success ratio of jobs. This paper proposes an analytical evaluation method that takes a Markovian view of the system and captures these reciprocal effects. A number of simulation experiments are carried out to validate the accuracy of the proposed method. (A short illustrative sketch of the feedback loop follows.)

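    The power-temperature feedback loop described above is easy to illustrate with a toy discrete-time simulation. All constants below (power modes, cooling resistances, leakage fit) are invented placeholders, and the simulation is only a conceptual sketch, not the paper's Markovian analysis.

    import random

    P_DYN = {"idle": 0.5, "busy": 3.0}            # dynamic power (W)
    T_AMB, C = 45.0, 0.05                         # ambient temp, thermal capacitance
    R_COOL = {"low": 2.0, "high": 0.8}            # two cooling modes (K/W)

    def leakage(temp):
        """Leakage power grows with temperature (toy exponential fit)."""
        return 0.2 * 1.03 ** (temp - T_AMB)

    def step(temp, busy_prob, cooling, dt=0.01):
        """One step of the feedback loop: power heats the chip, and
        the temperature feeds back into the leakage term."""
        mode = "busy" if random.random() < busy_prob else "idle"
        power = P_DYN[mode] + leakage(temp)
        r = R_COOL[cooling]
        # Lumped RC thermal model: dT/dt = (P*r - (T - T_amb)) / (r*C)
        return temp + dt * (power * r - (temp - T_AMB)) / (r * C)

    temp = T_AMB
    for _ in range(20_000):
        cooling = "high" if temp > 70 else "low"  # thermal management
        temp = step(temp, busy_prob=0.6, cooling=cooling)
    print(f"steady-state temperature around {temp:.1f} C")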
  • BIRDS: A Bare-Metal Recovery System for Instant Restoration of Data Services

    Page(s): 1392 - 1407

    We propose BIRDS, a bare-metal recovery system for instant restoration of data services, focusing on a general-purpose automatic backup-and-recovery approach that protects data and resumes data services from scratch instantly after a disaster. BIRDS is designed to possess two appealing features: full automation of the backup-and-recovery process, and instant data service resumption after disasters. BIRDS achieves the former through automatic whole-system replication and restoration, taking the backup process outside of the protected system with the help of a novel non-intrusive, light-weight physical-to-virtual conversion method. The latter is enabled by a novel pipelined parallel recovery mechanism, which allows data services to be resumed instantly while data recovery between the backup data center and the production site is still in progress. We implemented a BIRDS prototype and evaluated it using standard benchmarks. We show that BIRDS outperforms existing disaster recovery techniques in terms of recovery efficiency while introducing relatively small runtime overhead. Furthermore, BIRDS can be applied directly to any existing system in a plug-and-protect fashion, without re-installation or any modification of the existing system.

  • Countering Power Analysis Attacks Using Reliable and Aggressive Designs

    Page(s): 1408 - 1420

    Recent events have indicated that attackers are banking on side-channel attacks, such as differential power analysis (DPA) and correlation power analysis (CPA), to exploit information leaks from physical devices. Random dynamic voltage frequency scaling (RDVFS) has been proposed to prevent such attacks and incurs very little area, power, and performance overhead. However, because of the one-to-one mapping between the voltage and frequency of DVFS voltage-frequency pairs, RDVFS cannot prevent power attacks. In this paper, we propose a novel countermeasure that uses reliable and aggressive designs to break this one-to-one mapping. Our experiments show that our technique significantly reduces the correlation for the actual key and also reduces the risk of power attacks by increasing the probability that incorrect keys exhibit maximum correlation. Moreover, our scheme enables systems to operate beyond worst-case estimates for improved power and performance. For experiments conducted on an AES S-box implemented in 45 nm CMOS technology, our approach increases performance by 22 percent over worst-case estimates, decreases the correlation for the correct key by an order of magnitude, and increases by almost 3.5 times the probability that a wrong key, rather than the correct one, exhibits maximum correlation. (A short illustrative sketch of CPA follows.)

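    For readers unfamiliar with CPA, the attack itself fits in a short script. The sketch below runs CPA against simulated Hamming-weight leakage of key XOR plaintext; the leakage model, the noise level, and the toy target are illustrative assumptions, not the paper's AES S-box setup. Raising the noise, as RDVFS-style countermeasures effectively do, lowers the winning correlation.

    import random

    HW = [bin(x).count("1") for x in range(256)]   # Hamming weights

    def pearson(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        vx = sum((x - mx) ** 2 for x in xs) ** 0.5
        vy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (vx * vy)

    random.seed(1)
    secret = 0x3A
    pts = [random.randrange(256) for _ in range(2000)]
    # Simulated traces: leakage = HW(key ^ plaintext) + Gaussian noise.
    traces = [HW[secret ^ p] + random.gauss(0, 1.0) for p in pts]
    # CPA: the key guess whose predicted leakage correlates best wins.
    guess = max(range(256),
                key=lambda g: pearson([HW[g ^ p] for p in pts], traces))
    print(hex(guess))   # recovers 0x3a at this noise level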
  • E-Shadow: Lubricating Social Interaction Using Mobile Phones

    Page(s): 1422 - 1433

    In this paper, we propose E-Shadow, a distributed mobile-phone-based local social networking system. E-Shadow has two main components: (1) local profiles, which enable E-Shadow users to record and share their names, interests, and other information with fine-grained privacy controls, and (2) mobile-phone-based local social interaction tools: E-Shadow provides mobile phone software that maps nearby users' local profiles to their human owners and enables rich social interactions, user communication, and content sharing. We have designed and implemented E-Shadow on mobile phones. The system lets users perform dynamic and layered information publishing, exploiting interpersonal relevance in space and time, and it provides a mechanism that helps users perform direction-driven localization of an E-Shadow and match it with its owner. Experiments on real-world Windows Mobile phones and large-scale simulations show that our system disseminates information efficiently and helps receivers find the direction of a specific E-Shadow accurately. We believe the E-Shadow concept and system can lead to more tightly-knit temporary communities in one's physical vicinity.

  • Exploiting Implementation Diversity and Partial Connection of Routers in Application-Specific Network-on-Chip Topology Synthesis

    Page(s): 1434 - 1445

    This paper proposes a novel application-specific network-on-chip (NoC) topology synthesis method that exploits the partial connection and the implementation diversity of routers. NoC has emerged as a promising solution for future systems-on-chip (SoCs), and many researchers have focused on the automatic synthesis of NoC topologies. We observe that NoC topology synthesis resembles logic synthesis in the following sense: both determine the connections among components, where the components are routers in the former and logic cells in the latter. An outstanding difference, however, is that existing NoC topology synthesis methods consider only a single implementation for each router size, whereas modern logic synthesis tools use multiple implementations of a cell to produce better netlists through technology mapping. To tackle this drawback, we propose a novel NoC topology synthesis methodology in which the implementation diversity of routers is exploited to produce topologies that are optimal in terms of area and/or power consumption. Two approaches, a post-process approach and an in-process approach, are proposed for exploiting implementation diversity, providing flexibility between synthesis time and design quality. The proposed method for characterizing and modeling routers also makes it feasible to consider implementation diversity even when the partial connection of routers is considered during synthesis. The experimental results demonstrate that, compared to a method that exploits implementation diversity but not partial connection, the proposed method reduces power consumption by up to 67.8% (40.0% on average); compared to a method that exploits partial connection but not implementation diversity, power consumption is reduced by up to 12.0% (3.4% on average). (A short illustrative sketch of implementation selection follows.)

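    The technology-mapping analogy can be illustrated with a toy "router library". The implementation names and area/power numbers below are invented; the sketch only shows the kind of per-instance choice (akin to the post-process approach) that implementation diversity enables.

    # Several implementations per router port count: (name, area, power).
    LIBRARY = {
        3: [("r3_fast", 1.0, 0.9), ("r3_small", 0.7, 1.1)],
        4: [("r4_fast", 1.4, 1.2), ("r4_small", 1.0, 1.5)],
        5: [("r5_fast", 1.9, 1.6), ("r5_small", 1.3, 2.0)],
    }

    def map_routers(instances, area_weight=0.5):
        """For each router instance, pick the implementation minimizing
        a weighted area/power cost, as technology mapping does for
        logic cells."""
        return {
            name: min(LIBRARY[ports],
                      key=lambda i: area_weight * i[1]
                                    + (1 - area_weight) * i[2])[0]
            for name, ports in instances.items()
        }

    instances = {"R0": 4, "R1": 3, "R2": 5}          # ports per instance
    print(map_routers(instances, area_weight=0.2))   # favors low power
    print(map_routers(instances, area_weight=0.9))   # favors low area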
  • Floorplan Optimization of Fat-Tree-Based Networks-on-Chip for Chip Multiprocessors

    Page(s): 1446 - 1459

    Chip multiprocessors (CMPs) are becoming increasingly popular in the processor industry, and an efficient network-on-chip (NoC) whose performance matches that of the processor cores is important in CMP design. Fat-tree-based on-chip networks have many advantages over traditional mesh- or torus-based networks in terms of throughput, power efficiency, and latency, making them promising for CMP development. However, the floorplan design of a fat-tree-based NoC is very challenging because of the complexity of the topology: the large number of crossings and long interconnects cause severe performance degradation in the network. In electronic NoCs, the parasitic capacitance and inductance become significant; in optical ones, large crosstalk noise and power loss are introduced. The novel contribution of this paper is a method to optimize the fat-tree floorplan that effectively reduces the number of crossings and minimizes interconnect length. Two types of floorplans are proposed, applicable to fat-tree-based networks of arbitrary size. Compared with the traditional floorplan, ours reduce the number of crossings by more than 87%. Since the traversal distance for signals is related to the aspect ratio of the processor cores, we also present a method to calculate the optimum aspect ratio of the processor cores that minimizes the traversal distance.

  • Guaranteeing the End-to-End Latency of an IMA System with an Increasing Workload

    Page(s): 1460 - 1473

    New features are often added incrementally to avionics systems to minimize the need for redesign and recertification. However, it then becomes necessary to check that the timing constraints of existing as well as new applications are met. We facilitate these checks by introducing a new data switch that bounds the latency of end-to-end communications across a network. This switch runs a clock-driven switching algorithm that is throughput-optimal with a bounded worst-case delay for all feasible traffic. We propose associated heuristics that determine whether the timing constraints of an integrated modular avionics (IMA) system network that uses this switch are met, even if new features have caused traffic to increase, and then search for alternative network configurations if necessary. Virtual integration is used to make a combined analysis of the worst-case delay in the network and the local buses of individual computing modules. This analysis considers the shared network topology, local hardware architectures, and specified IMA configurations. Our approach can be used by a system architect as an effective method for quickly determining which possible system architectures should be pursued to meet timing constraints, and it allows the cascading effects of changes to be tracked and managed. We demonstrate how these heuristics work through an example in which changes are made to an environmental monitoring facility within an avionics system that uses our switch.

  • Logical Computation on Stochastic Bit Streams with Linear Finite-State Machines

    Page(s): 1474 - 1486

    Most digital systems operate on a positional representation of data, such as binary radix. An alternative is to operate on random bit streams in which the signal value is encoded by the probability of obtaining a one versus a zero. This representation is much less compact than binary radix, but complex operations can be performed with very simple logic. Furthermore, since the representation is uniform, with all bits weighted equally, it is highly tolerant of soft errors (i.e., bit flips). Both combinational and sequential constructs have been proposed for operating on stochastic bit streams. Prior work has shown that combinational logic can implement multiplication and scaled addition effectively, while linear finite-state machines (FSMs) can implement complex functions such as exponentiation and tanh; however, prior work on stochastic computation has largely been validated empirically. This paper provides a rigorous mathematical treatment of the stochastic implementation of complex functions such as exponentiation and tanh using linear FSMs. It also presents two new functions motivated by specific applications: an absolute value function and exponentiation based on an absolute value. Experimental results show that the linear FSM-based constructs for these functions have smaller area-delay products than the corresponding deterministic constructs and are much more tolerant of soft errors. (A short illustrative sketch of the FSM-based tanh follows.)

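    The linear-FSM tanh construction is a saturating up/down counter, a classic in the stochastic computing literature. The sketch below uses bipolar encoding, P(bit = 1) = (1 + x)/2, and 16 states, so the output stream approaches tanh(8x); the state count and stream length are illustrative choices.

    import math
    import random

    def stoch_tanh(x, n_states=16, length=200_000):
        """Saturating up/down counter driven by a stochastic bit
        stream; emitting 1 in the upper half of the states yields an
        output stream that approximates tanh(n_states * x / 2)."""
        p_one = (1 + x) / 2
        state, ones = n_states // 2, 0
        for _ in range(length):
            if random.random() < p_one:
                state = min(state + 1, n_states - 1)  # step right on 1
            else:
                state = max(state - 1, 0)             # step left on 0
            ones += state >= n_states // 2            # output bit
        return 2 * ones / length - 1                  # decode bipolar

    for x in (-0.4, -0.1, 0.1, 0.4):
        print(f"x={x:+.1f}  fsm={stoch_tanh(x):+.3f}"
              f"  tanh(8x)={math.tanh(8 * x):+.3f}")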
  • Low-Overhead Network-on-Chip Support for Location-Oblivious Task Placement

    Page(s): 1487 - 1500

    Many-core processors will have many processing cores with a network-on-chip (NoC) that provides access to shared resources such as main memory and on-chip caches. However, locally fair arbitration in a multi-stage NoC can lead to globally unfair access to shared resources and impact system-level performance depending on where each task is physically placed. In this work, we propose an arbitration scheme that provides equality-of-service (EoS) in the network and thereby supports location-oblivious task placement. We combine probabilistic arbitration with distance-based weights to achieve EoS and overcome the limitations of round-robin arbiters. However, the complexity of probabilistic arbitration results in high area cost and long latency, which negatively impact performance. To reduce the hardware complexity, we propose a hybrid arbiter that switches between a simple arbiter at low load and a complex arbiter at high load, enabled by the observation that arbitration only affects overall performance and global fairness at high load. We evaluate our arbitration scheme with synthetic traffic patterns and GPGPU benchmarks. Our results show that a hybrid arbiter combining round-robin arbitration with probabilistic distance-based arbitration reduces performance variation as task placement is varied and also improves average IPC. (A short illustrative sketch follows.)

    Open Access
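    The core idea, weighting grants by distance so far-travelled packets are not starved, plus a load-dependent switch between arbiters, can be sketched as follows. The weight function (raw hop count) and the switching threshold are invented placeholders, not the paper's design.

    import random

    def probabilistic_grant(requests):
        """Grant a request with probability proportional to its
        distance-based weight (here simply the hop count)."""
        total = sum(hops for _, hops in requests)
        pick = random.uniform(0, total)
        for req, hops in requests:
            pick -= hops
            if pick <= 0:
                return req
        return requests[-1][0]

    class HybridArbiter:
        """Cheap round-robin at low load, probabilistic distance-based
        arbitration at high load."""
        def __init__(self, threshold=3):
            self.rr, self.threshold = 0, threshold

        def arbitrate(self, requests):   # requests: [(input_id, hops)]
            if len(requests) < self.threshold:
                ids = sorted(i for i, _ in requests)
                choice = next((i for i in ids if i >= self.rr), ids[0])
                self.rr = choice + 1
                return choice
            return probabilistic_grant(requests)

    arb = HybridArbiter()
    print(arb.arbitrate([(0, 2), (2, 9)]))           # round-robin path
    print(arb.arbitrate([(0, 2), (1, 5), (3, 9)]))   # probabilistic path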
  • Mini-Rank: A Power-Efficient DDRx DRAM Memory Architecture

    Page(s): 1500 - 1512

    Memory power consumption has become a severe concern in multi-core computer platforms. As memory data rates, capacities, and bandwidths are pushed ever higher, the power consumption of the memory system becomes a significant part of the overall system power profile, yet conventional memory systems provide no efficient mechanism for managing the power-performance tradeoff. We propose a novel mini-rank architecture for DDRx memories that reduces memory power consumption by breaking each DRAM rank into multiple narrow mini-ranks and activating fewer devices for each request. We also propose a heterogeneous mini-rank design that further improves the performance-power tradeoff for each workload based on its memory access behavior and bandwidth requirements. The evaluation results show that homogeneous mini-rank significantly reduces memory power with a small performance loss: with four-core multiprogramming workloads, a x32 mini-rank configuration reduces memory power by 19.5 percent with a 1.3 percent performance loss on average for memory-intensive workloads. Heterogeneous mini-rank further improves the balance between performance and power saving: it reduces memory power by up to 38.0 percent with an average performance loss of 2.4 percent compared with a conventional memory system, whereas the x32 homogeneous mini-rank reduces memory power by up to 25.4 percent and the x8 homogeneous mini-rank incurs a performance loss of up to 19.3 percent. Furthermore, heterogeneous mini-rank achieves a consistently good performance-power tradeoff for workloads composed of programs with diverse memory access behaviors and bandwidth requirements.

  • Parallel Simulation of Pore Networks Using Multicore CPUs

    Page(s): 1513 - 1525

    Pore networks can be simulated in silico using the dual site-bond model. In this approach, a set of cavities (sites) is interconnected by a set of throats (bonds), under the constraint that each site must always be larger than any of its delimiting bonds. The NoMISS greedy algorithm has recently been implemented to address this task; nevertheless, even though this procedure is relatively fast, large memory consumption and long computing times become problematic as pore networks grow. Here, three parallel methods are proposed to allow the efficient construction of large pore networks. The first is a parallel Monte Carlo procedure, which applies a number of exchanges among pore sizes in order to obtain a valid pore network. The other two are parallel versions of the pioneering NoMISS greedy algorithm: the first uses static data partitioning to speed up the running time, while the second applies a dynamic data distribution policy to improve pore network quality. The results show the performance and quality of each proposed version on a 125-core Linux cluster. (A short illustrative sketch of the Monte Carlo exchange idea follows.)

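    The Monte Carlo exchange idea is easy to demonstrate on a 1-D chain of pores. The size ranges, chain topology, and accept/reject rule below are illustrative assumptions; the real method works on 3-D networks with prescribed size distributions and runs the exchanges in parallel.

    import random

    random.seed(0)
    n = 200
    sites = [random.uniform(10, 20) for _ in range(n)]      # cavities
    bonds = [random.uniform(5, 15) for _ in range(n - 1)]   # throats

    def violations():
        """Bonds violating the site-bond constraint: every bond must
        be smaller than both sites it delimits."""
        return sum(b >= min(sites[i], sites[i + 1])
                   for i, b in enumerate(bonds))

    # Exchange sizes within the bond population: swaps repair
    # violations while preserving the overall size distribution.
    cur = violations()
    while cur:
        i, j = random.randrange(n - 1), random.randrange(n - 1)
        bonds[i], bonds[j] = bonds[j], bonds[i]
        new = violations()
        if new > cur:
            bonds[i], bonds[j] = bonds[j], bonds[i]  # reject: worse
        else:
            cur = new
    print("valid pore network, violations =", cur)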
  • PC-DUOS+: A TCAM Architecture for Packet Classifiers

    Page(s): 1527 - 1540

    We propose algorithms for distributing the classifier rules to two ternary content addressable memories (TCAMs) and for incrementally updating the TCAMs. The performance of our scheme is compared against the prevalent scheme of storing classifier rules in a single TCAM in priority order. Our scheme results in an improvement in average lookup speed by up to 49% and an improvement in update performance by up to 3.84 times in terms of the number of TCAM writes.

  • Scalable Application-Dependent Diagnosis of Interconnects of SRAM-Based FPGAs

    Page(s): 1540 - 1550

    This paper presents a new method for diagnosing (detecting and locating) multiple faults in the application-dependent interconnect of an SRAM-based FPGA. For fault detection, the proposed technique retains the original interconnect configuration and modifies the function of the LUTs using a new LUT programming function, the 1-Bit Sum Function (1-BSF); in addition, it utilizes features such as branches in the nets as well as the primary (unused) I/Os of the FPGA. The proposed method detects all possible stuck-at and bridging faults of all cardinalities in a single configuration; locating multiple stuck-at faults requires 1 + log2 k test configurations, and locating more than one pair-wise bridging fault requires 2 + 2 log2 k additional test configurations, where k denotes the maximum combinational depth of the FPGA circuit. Following detection, the locations of multiple faults are identified hierarchically using the walking-1 test set and an adaptive approach for the interconnect structure. Independence from net ordering is accomplished by utilizing the presence of paths of nets that are either disjoint or joint between the primary inputs and at least one primary output. As validated by simulation on benchmark circuits, the proposed method scales extremely well across different Virtex FPGA families, yielding a significant reduction in the number of configurations needed to diagnose multiple faults. (A short illustrative sketch of the walking-1 test set follows.)

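    The walking-1 test set mentioned above is a standard pattern family and takes two lines to generate; the width below is an arbitrary example.

    def walking_1(width):
        """One pattern per net: a single 1 sweeps across otherwise-0
        inputs, so each net is driven high in exactly one pattern and
        bridges or stuck-ats produce distinct response signatures."""
        return [1 << i for i in range(width)]

    for p in walking_1(4):
        print(f"{p:04b}")   # 0001, 0010, 0100, 1000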
  • Synthesis of Processor Instruction Sets from High-Level ISA Specifications

    Page(s): 1552 - 1566

    As processors continue to get exponentially cheaper for end users following Moore's law, the costs involved in their design keep growing, also at an exponential rate. The reason is the ever-increasing complexity of processors, which modern EDA tools struggle to keep up with. This paper focuses on the design of the instruction set architecture (ISA), a significant part of the whole processor design flow. The optimal design of an instruction set for a particular combination of available hardware resources and software requirements is crucial for building processors with high performance and energy efficiency, and it is a challenging task involving many heuristics and high-level design decisions. This paper presents a new compositional approach to the formal specification and synthesis of ISAs. The approach is based on a new formalism, called Conditional Partial Order Graphs, capable of capturing common behavioural patterns shared by processor instructions and therefore providing a very compact and efficient way to represent and manipulate ISAs. The Event-B modelling framework is used as a formal specification and verification back-end to guarantee the correctness of ISA specifications. We demonstrate the benefits of the presented methodology on several examples, including the Intel 8051 microcontroller.

  • Toward a Scalable Working Set Size Estimation Method and Its Application for Chip Multiprocessors

    Page(s): 1567 - 1579

    It is essential to accurately estimate the working set size (WSS) of an application for various optimizations, such as partitioning cache among virtual machines or reducing the leakage power dissipated in an over-allocated cache by switching part of it off. However, state-of-the-art heuristics such as average memory access latency (AMAL) or cache miss ratio (CMR) correlate poorly with the WSS of an application due to 1) over-sized caches and 2) their dispersed nature. Past studies focus on estimating the WSS of an application executing on a uniprocessor platform; estimating the same for a chip multiprocessor (CMP) with a large dispersed cache is challenging due to the presence of concurrently executing threads and processes. Hence, we propose a scalable, highly accurate method to estimate the WSS of an application, which we call the tagged WSS (TWSS) estimation method. We demonstrate the use of TWSS to switch off over-allocated cache ways in static and dynamic non-uniform cache architectures (SNUCA, DNUCA) on a tiled CMP. In our implementation of adaptable-way SNUCA and DNUCA caches, the decision to alter associativity is taken by each L2 controller; hence, the approach scales with the number of cores on a CMP. Overall (geometric mean), it delivers 26% and 19% higher energy-delay product savings on SNUCA than the AMAL and CMR heuristics, respectively. (A short illustrative sketch of window-based WSS estimation follows.)

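    As a software analogue of the idea, working set size can be estimated by counting the distinct cache lines touched per window of accesses. The line size, window length, and toy trace below are illustrative; TWSS itself is a hardware tagging mechanism.

    def working_set_size(addresses, line_bytes=64, window=100_000):
        """Distinct cache lines touched in each window of accesses,
        reported in bytes."""
        sizes, lines = [], set()
        for i, addr in enumerate(addresses, 1):
            lines.add(addr // line_bytes)
            if i % window == 0:
                sizes.append(len(lines) * line_bytes)
                lines.clear()
        return sizes

    # Toy trace: a loop repeatedly sweeping a 256 KiB array.
    trace = [(i * 64) % (256 * 1024) for i in range(400_000)]
    print(working_set_size(trace))   # [262144, 262144, 262144, 262144]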
  • Toward Wireless Security without Computational Assumptions—Oblivious Transfer Based on Wireless Channel Characteristics

    Page(s): 1580 - 1593

    Wireless security has been an active research area over the last decade. Many studies of wireless security use cryptographic tools, but traditional cryptographic tools are normally based on computational assumptions, which may turn out to be invalid in the future. It is therefore very desirable to build cryptographic tools that do not rely on computational assumptions. In this paper, we focus on a crucial cryptographic tool, 1-out-of-2 oblivious transfer, which plays a central role in cryptography because a cryptographic protocol for any polynomial-time computable function can be built on top of it. We present a novel 1-out-of-2 oblivious transfer protocol based on wireless channel characteristics that does not rely on any computational assumption. We also illustrate the potential breadth of its applications by giving two: one on private communications and the other on privacy-preserving password verification. We have fully implemented the protocol on wireless devices and conducted experiments in real environments to evaluate it; the experimental results demonstrate that it has reasonable efficiency. (A short illustrative sketch of the oblivious transfer pattern follows.)

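    The paper's protocol derives its security from channel measurements, which cannot be reproduced here; what can be sketched is the standard structure of 1-out-of-2 oblivious transfer from pre-shared correlated randomness (Beaver's precomputed OT), with a trusted setup standing in for the physical-layer step. All names and values below are illustrative.

    import secrets

    def xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    # Setup (stand-in for channel-derived correlated secrets):
    # the sender holds pads (r0, r1); the receiver holds d and r_d.
    r0, r1 = secrets.token_bytes(16), secrets.token_bytes(16)
    d = 1
    r_d = (r0, r1)[d]

    # Transfer: the receiver wants m_c and reveals only e = c XOR d,
    # which leaks nothing about c since d is uniformly random.
    c = 0
    e = c ^ d
    m0, m1 = b"msg-zero-0123456", b"msg-one--0123456"
    # The sender masks m_i with r_(i XOR e); the receiver can strip
    # exactly one mask (r_d) and learns nothing about the other message.
    pads = (r0, r1) if e == 0 else (r1, r0)
    c0, c1 = xor(m0, pads[0]), xor(m1, pads[1])
    print(xor((c0, c1)[c], r_d))   # b'msg-zero-0123456'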
  • On 3-Extra Connectivity and 3-Extra Edge Connectivity of Folded Hypercubes

    Page(s): 1594 - 1600

    Given a graph G and a non-negative integer g, the g-extra connectivity (resp. g-extra edge connectivity) of G is the minimum cardinality of a set of vertices (resp. edges) in G, if it exists, whose deletion disconnects G and leaves each remaining component with more than g vertices. This study shows that the 3-extra connectivity (resp. 3-extra edge connectivity) of an n-dimensional folded hypercube is 4n - 5 for n ≥ 6 (resp. 4n - 4 for n ≥ 5). This study also provides an upper bound for the g-extra connectivity of folded hypercubes for g ≥ 6. (A short illustrative sketch of the folded hypercube follows.)

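    The folded hypercube itself is simple to construct, and small cases can be checked by brute force. The sketch below verifies only the ordinary connectivity of FQ_3 (known to be n + 1 = 4) by deleting every 3-vertex subset; brute-forcing the 3-extra variant requires larger n and is beyond a sketch.

    from itertools import combinations

    def folded_hypercube(n):
        """Adjacency of FQ_n: an n-cube plus an edge between every
        pair of complementary vertices, an (n+1)-regular graph."""
        mask = 2 ** n - 1
        return {v: {v ^ (1 << i) for i in range(n)} | {v ^ mask}
                for v in range(2 ** n)}

    def connected_after_removal(adj, removed):
        """BFS over the vertices that survive the deletion."""
        alive = [v for v in adj if v not in removed]
        seen, stack = {alive[0]}, [alive[0]]
        while stack:
            for w in adj[stack.pop()] - removed:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return len(seen) == len(alive)

    adj = folded_hypercube(3)
    assert all(connected_after_removal(adj, set(s))
               for s in combinations(adj, 3))
    print("FQ_3 survives every 3-vertex deletion: connectivity >= 4")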

Aims & Scope

The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field.


Meet Our Editors

Editor-in-Chief
Paolo Montuschi
Politecnico di Torino
Dipartimento di Automatica e Informatica
Corso Duca degli Abruzzi 24 
10129 Torino - Italy
e-mail: pmo@computer.org