
Computers, IEEE Transactions on

Issue 1 • Date Jan. 2009


  • [Front cover]

    Page(s): c1
    Freely Available from IEEE
  • [Inside front cover]

    Page(s): c2
    Freely Available from IEEE
  • State of the Journal

    Page(s): 1 - 4
    Freely Available from IEEE
  • Wire-Speed TCAM-Based Architectures for Multimatch Packet Classification

    Page(s): 5 - 17

    Most conventional packet classifiers find only the highest priority filter that matches the arriving packet. However, new networking applications such as network intrusion detection systems and load balancers require all (or the first few) matching filters during classification. In this paper, two TCAM-based architectures for multi-match search are introduced. The first one is a renovated TCAM design that can find all or the first r matches in a packet filter set. The second architecture is a novel partitioning scheme based on filter intersection properties, allowing us to use off-the-shelf TCAMs for multi-match packet classification. Our classifier engine finds all matches in exactly one conventional TCAM cycle while reducing the power consumption by at least two orders of magnitude, which is far better than the existing hardware-based designs.

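To make the difference between single-match and multi-match classification concrete, here is a minimal software sketch of ternary matching. It is not the TCAM architecture from the paper: the `TernaryFilter` type, the `multi_match` function, and the 8-bit example filters are hypothetical names chosen for illustration, and the list order is assumed to encode priority.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TernaryFilter:
    """One ternary entry: mask bit 1 means 'care', 0 means 'don't care'."""
    value: int
    mask: int

    def matches(self, key: int) -> bool:
        # Compare only the bit positions the mask cares about.
        return (key & self.mask) == (self.value & self.mask)


def multi_match(filters: List[TernaryFilter], key: int,
                r: Optional[int] = None) -> List[int]:
    """Return indices of all matching filters in priority (list) order.

    If r is a positive integer, return only the first r matches.
    """
    hits = [i for i, flt in enumerate(filters) if flt.matches(key)]
    return hits if r is None else hits[:r]


# Hypothetical 8-bit filter set, highest priority first.
FILTERS = [
    TernaryFilter(value=0b10100000, mask=0b11110000),  # 1010****
    TernaryFilter(value=0b10000000, mask=0b11000000),  # 10******
    TernaryFilter(value=0b00000001, mask=0b00000001),  # *******1
]
```

A conventional classifier corresponds to `multi_match(FILTERS, key, r=1)`, while an intrusion detection system would call it without `r` to obtain every matching filter.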
  • FPC: A High-Speed Compressor for Double-Precision Floating-Point Data

    Page(s): 18 - 31

    Many scientific programs exchange large quantities of double-precision data between processing nodes and with mass storage devices. Data compression can reduce the number of bytes that need to be transferred and stored. However, data compression is only likely to be employed in high-end computing environments if it does not impede the throughput. This paper describes and evaluates FPC, a fast lossless compression algorithm for linear streams of 64-bit floating-point data. FPC works well on hard-to-compress scientific data sets and meets the throughput demands of high-performance systems. A comparison with five lossless compression schemes, BZIP2, DFCM, FSD, GZIP, and PLMI, on 4 architectures and 13 data sets shows that FPC compresses and decompresses one to two orders of magnitude faster than the other algorithms at the same geometric-mean compression ratio. Moreover, FPC provides a guaranteed throughput as long as the prediction tables fit into the L1 data cache. For example, on a 1.6-GHz Itanium 2 server, the throughput is 670 Mbytes/s regardless of what data are being compressed.

  • Networks-on-Chip in a Three-Dimensional Environment: A Performance Evaluation

    Page(s): 32 - 45

    The Network-on-Chip (NoC) paradigm has emerged as a revolutionary methodology for integrating a very high number of intellectual property (IP) blocks in a single die. The achievable performance benefit of adopting NoCs is constrained by the performance limitation imposed by the metal wire, which is the physical realization of communication channels. With technology scaling, material innovation alone will extend the lifetime of conventional interconnect systems by only a few technology generations. According to the International Technology Roadmap for Semiconductors (ITRS), new interconnect paradigms are needed for the longer term. The conventional two-dimensional (2D) integrated circuit (IC) has limited floor-planning choices, which in turn limits the performance enhancements arising out of NoC architectures. Three-dimensional (3D) ICs are capable of achieving better performance, functionality, and packaging density compared to more traditional planar ICs. On the other hand, NoC is an enabling solution for integrating large numbers of embedded cores in a single die. 3D NoC architectures combine the benefits of these two new domains to offer an unprecedented performance gain. In this paper we evaluate the performance of 3D NoC architectures and demonstrate their superior functionality in terms of throughput, latency, energy dissipation, and wiring area overhead compared to traditional 2D implementations.

  • Optimized Custom Precision Function Evaluation for Embedded Processors

    Page(s): 46 - 59

    Fixed-point processors are utilized in an enormous variety of applications, often for tasks that require the evaluation of mathematical functions. We present an automated method for mapping functions to such processors via polynomials that explicitly targets the native word-length of the processor, thereby significantly reducing the execution time relative to commonly used floating-point emulation approaches based on traditional mathematical libraries. The methods presented here also contrast with hand-tuned processor-specific code, which has the potential to deliver efficient implementations but at the cost of significant design time. We describe an automated design flow utilizing multi-word arithmetic to provide overflow protection and precision accurate to one unit in the last place (ulp). Analytical approaches are used to minimize the number of fixed-width operands required for each operation and to ensure that precision requirements are met. This allows automated generation of processor-optimized code and characterization of a design space representing a rich range of tradeoffs among precision, latency, and memory cost.

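As a rough illustration of evaluating a polynomial with integer arithmetic only, the sketch below applies Horner's rule in a fixed-point format. It is not the automated design flow from the paper: the Q15 format, the helper names `to_fixed`, `to_float`, and `horner_fixed`, and the single-word accumulator are all assumptions, and the sketch makes no claim of 1-ulp accuracy or overflow protection.

```python
FRAC_BITS = 15  # assumed Q15 format: real value = integer / 2**15


def to_fixed(x: float) -> int:
    """Quantize a real number to the assumed fixed-point format."""
    return int(round(x * (1 << FRAC_BITS)))


def to_float(q: int) -> float:
    """Convert a fixed-point integer back to a real number."""
    return q / (1 << FRAC_BITS)


def horner_fixed(coeffs_q, x_q):
    """Evaluate a polynomial by Horner's rule using integers only.

    coeffs_q holds fixed-point coefficients, highest degree first.
    Each product carries 2 * FRAC_BITS fractional bits, so it is
    shifted right by FRAC_BITS before the next coefficient is added.
    """
    acc = 0
    for c in coeffs_q:
        acc = ((acc * x_q) >> FRAC_BITS) + c
    return acc
```

For example, `to_float(horner_fixed([to_fixed(0.25), to_fixed(0.5)], to_fixed(0.5)))` evaluates 0.25x + 0.5 at x = 0.5. On a real fixed-point processor the intermediate product would need a double-width register or multi-word arithmetic, which Python's unbounded integers hide.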
  • Limit on the Addressability of Fault-Tolerant Nanowire Decoders

    Page(s): 60 - 68

    Although prone to fabrication error, the nanowire crossbar is a promising candidate component for next-generation nanometer-scale circuits. In the nanowire crossbar architecture, nanowires are addressed by controlling voltages on the mesowires. For area efficiency, we are interested in the maximum number of nanowires N(m,e) that can be addressed by m mesowires, in the face of up to e fabrication errors. Asymptotically tight bounds on N(m,e) are established in this paper. In particular, it is shown that $N(m,e) = \Theta(2^m / m^{e+1/2})$. Interesting observations are made on the equivalence between this problem and the problem of constructing optimal EC/AUED codes, superimposed distance codes, pooling designs, and diffbounded set systems. Results in this paper also improve upon those in the EC/AUED codes literature.

  • Sensitivity-Based Optimization of Disk Architecture

    Page(s): 69 - 81

    Storage plays a pivotal role in the performance of many applications. Many applications, especially those that run on servers, are I/O intensive and therefore require high-performance storage systems. These high-end storage systems consume a large amount of power, the bulk of which is due to the disk drives. Optimizing disk architectures is a design-time as well as a run-time issue and requires balancing performance against power. There are different figures of merit, such as performance and energy, and a large space of design and runtime "knobs" that can be used to optimize disk drive behavior. Given such a large space, it is desirable to have a systematic methodology to optimally set these knobs to satisfy our figures of merit as efficiently as possible. In this paper we present the sensitivity-based optimization methodology for disk architectures (SODA), which leverages results previously obtained in digital circuit design optimization scenarios. Using detailed models of the electro-mechanical behavior of disk drives and a suite of realistic workloads, we show how SODA can aid in design and runtime optimization of disk drive architectures.

  • Delay-Constrained Multicast Routing Using the Noisy Chaotic Neural Networks

    Page(s): 82 - 89

    We present a method to compute the delay-constrained multicast routing tree by employing chaotic neural networks. Experimental results show that the noisy chaotic neural network (NCNN) provides the optimal solution more often than the transiently chaotic neural network (TCNN) and the Hopfield neural network (HNN). Furthermore, compared with the bounded shortest multicast algorithm (BSMA), the noisy chaotic neural network is able to find multicast trees with lower cost.

  • Draco: Efficient Resource Management for Resource-Constrained Control Tasks

    Page(s): 90 - 105

    In many application areas, including control systems, careful management of system resources is key to providing the best application performance. Traditional control systems with multiple control loops statically allocate a fixed portion of the system resources to each controller based on its average or worst-case resource requirements. However, controllers' resource needs vary depending on the jobs they perform and the state of the systems they control. A controller of a plant operating close to its equilibrium requires fewer resources than a controller of a plant operating far from its equilibrium point. The Draco dynamic rate control system exploits this fact by dynamically allocating resources to control systems based on system state. Our research demonstrates that Draco provides significantly better overall control performance with far fewer resources than static controllers. Our experimental evaluation shows that in the control scenarios we examined Draco provides up to 25 percent better control performance with 30 percent fewer resources.

  • Complexities of Graph-Based Representations for Elementary Functions

    Page(s): 106 - 119

    This paper analyzes complexities of decision diagrams for elementary functions such as polynomial, trigonometric, logarithmic, square root, and reciprocal functions. These real functions are converted into integer-valued functions by using fixed-point representation. This paper presents the numbers of nodes in decision diagrams representing the integer-valued functions. First, complexities of decision diagrams for polynomial functions are analyzed, since elementary functions can be approximated by polynomial functions. A theoretical analysis shows that binary moment diagrams (BMDs) have low complexity for polynomial functions. Second, this paper analyzes complexity of edge-valued binary decision diagrams (EVBDDs) for monotone functions, since many common elementary functions are monotone. It introduces a new class of integer functions, Mp-monotone increasing function, and derives an upper bound on the number of nodes in an EVBDD for the Mp-monotone increasing function. A theoretical analysis shows that EVBDDs have low complexity for Mp-monotone increasing functions. This paper also presents the exact number of nodes in the smallest EVBDD for the n-bit multiplier function, and a variable order for the smallest EVBDD.

  • Localized Minimum-Energy Broadcasting for Wireless Multihop Networks with Directional Antennas

    Page(s): 120 - 131

    We propose several localized algorithms to achieve energy-efficient broadcasting in wireless multihop networks using directional antennas. Each node needs to know only the geographic positions of itself and its neighbors. Our first protocol, called DRBOP, follows the one-to-one communication model to reach all nodes in the relative neighborhood graph (RNG). Each node that receives a message for the first time from one of its RNG neighbors will rebroadcast it to each of its remaining RNG neighbors separately. The transmission power is adjusted for each transmission to the minimum necessary for reaching the particular neighbor. Next, we describe DLBOP, where the RNG is replaced by the localized minimum spanning tree (LMST) graph, which is a localized topology resembling the minimum spanning tree. We then observe that, for very dense networks, it is more energy-efficient to reach more than one neighbor at a time. A one-to-many protocol efficient for dense networks is proposed. We then describe an efficient localized protocol which adaptively switches (without any threshold) between one-to-one and one-to-many communication models and is efficient for both sparse and dense networks. Our simulation results show that for different energy models, the adaptive protocol is able to achieve performance competitive with globalized algorithms while having a fully localized operation.

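The relative neighborhood graph mentioned in this abstract can be sketched as follows, assuming the standard definition: nodes u and v are RNG neighbors unless some third node w is strictly closer to both of them than they are to each other. The function name `relative_neighborhood_graph` is hypothetical, and for brevity the sketch scans all points globally, whereas a localized protocol would have each node test only its own neighbors.

```python
from itertools import combinations
from math import dist


def relative_neighborhood_graph(points):
    """Return the RNG edge set over 2D points as index pairs (u, v), u < v.

    Edge (u, v) is kept unless a witness w exists with
    max(d(u, w), d(v, w)) < d(u, v).
    """
    edges = set()
    for u, v in combinations(range(len(points)), 2):
        d_uv = dist(points[u], points[v])
        blocked = any(
            max(dist(points[u], points[w]), dist(points[v], points[w])) < d_uv
            for w in range(len(points))
            if w != u and w != v
        )
        if not blocked:
            edges.add((u, v))
    return edges
```

For three collinear nodes at (0, 0), (1, 0), and (2, 0), the long edge between the two end nodes is dropped because the middle node is closer to both, so a message travels over two short hops instead of one long one.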
  • An Improved Search Method for Accumulator-Based Test Set Embedding

    Page(s): 132 - 138

    In this paper we present a new search method for test set embedding using an accumulator driven with an additive constant C. We formulate the problem of finding the location of a test pattern in the generated sequence in terms of a linear Diophantine equation with two variables, which is known to be solvable in linear time. We show that only one Diophantine equation needs to be solved per test set irrespective of its size. Next we show how to find the starting state, for a given constant C and test set T, such that the generated sequence can reproduce T with minimum length. Finally, we show that the best constant $C_{opt}$ (in terms of shortest test length) for the embedding of T using an accumulator of size n can be found in $O(2 \cdot n + F \cdot |T|)$ steps, instead of the $O(n \cdot 2^n \cdot |T|)$ steps of a previous approach, where F depends on the particular test set and can be significantly smaller than its worst-case value of $2^{n-2}$. The value of F can also be further reduced while providing a guaranteed approximation bound of the shortest test length. Experimental results show the computational improvements.

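The link between pattern location and a linear Diophantine equation can be sketched under one assumption that the abstract does not spell out: that the n-bit accumulator updates as state = (state + C) mod 2^n. Under that reading, a pattern t appears at step k when k*C - j*2^n = t - s0 for some integer j, where s0 is the starting state. The function name `pattern_position` is hypothetical, and this is an illustration of the equation rather than the search method of the paper.

```python
from math import gcd


def pattern_position(c, start, target, n):
    """Smallest k >= 0 with (start + k*c) mod 2**n == target, else None.

    Solves k*c - j*2**n = target - start for integers k and j.
    """
    modulus = 1 << n
    diff = (target - start) % modulus
    g = gcd(c, modulus)
    if diff % g != 0:
        return None  # the Diophantine equation has no integer solution
    reduced_mod = modulus // g
    if reduced_mod == 1:
        return 0  # every step yields the same state, so k = 0 is smallest
    # After dividing through by g, c // g is invertible modulo reduced_mod.
    inverse = pow((c // g) % reduced_mod, -1, reduced_mod)
    return (diff // g) * inverse % reduced_mod
```

With a 4-bit accumulator, `pattern_position(3, 0, 7, 4)` asks at which step the sequence 0, 3, 6, 9, ... first equals 7. An even constant such as C = 2 can never reach an odd pattern from an even start, in which case the function returns `None`. The modular inverse via three-argument `pow` requires Python 3.8 or later.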
  • 2008 TC Reviewers List

    Page(s): 139 - 144
    Freely Available from IEEE
  • TC Information for authors

    Page(s): c3
    Freely Available from IEEE
  • [Back cover]

    Page(s): c4
    Freely Available from IEEE

Aims & Scope

The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field.


Meet Our Editors

Editor-in-Chief
Albert Y. Zomaya
School of Information Technologies
Building J12
The University of Sydney
Sydney, NSW 2006, Australia
http://www.cs.usyd.edu.au/~zomaya
albert.zomaya@sydney.edu.au