IEEE Transactions on Computers

Issue 12 • Dec. 2004

  • [Front cover]

    Page(s): c1
  • [Inside front cover]

    Page(s): c2
  • Editor's note

    Page(s): 1505 - 1507
  • A combined approach to high-level synthesis for dynamically reconfigurable systems

    Page(s): 1508 - 1522

    The growing complexity of programmable hardware platforms calls for efficient high-level synthesis tools, which allow more effective exploration of the design space while predicting the effects of technology-specific tools on the design. Much previous work, however, neglects the delay of interconnects (e.g., multiplexers), which can heavily influence the overall performance of the design. In addition, for dynamically reconfigurable logic circuits, unless an appropriate design methodology is followed, an unnecessarily large number of configurable logic blocks may end up being used for communication between contexts rather than for implementing function units. This paper presents a new technique for interconnect-sensitive synthesis targeting dynamically reconfigurable circuits. The proposed technique also exploits multiple hardware contexts to achieve efficient designs. Experimental results on several benchmarks, run on our DRL LSI circuit [M. Meribout et al. [200]], [M. Meribout et al. (1997)], demonstrate that, by jointly optimizing interconnect communication and function-unit cost, we achieve higher-quality designs than previous techniques such as Force-Directed Scheduling.

  • A Gaussian noise generator for hardware-based simulations

    Page(s): 1523 - 1534

    Hardware simulation offers the potential of improving code evaluation speed by orders of magnitude over workstation- or PC-based simulation. We describe a hardware-based Gaussian noise generator used as a key component in a hardware simulation system for exploring channel code behavior at very low bit error rates (BERs), in the range of 10^-9 to 10^-10. The main novelty is the design and use of nonuniform piecewise linear approximations for computing trigonometric and logarithmic functions. The parameters of the approximation are chosen carefully to enable rapid computation of coefficients from the inputs while retaining high fidelity to the modeled functions. The output of the noise generator accurately models a true Gaussian probability density function (PDF) even at very high σ values. Its properties are explored using 1) several statistical tests, including the chi-square test and the Anderson-Darling test, and 2) an application decoding low-density parity-check (LDPC) codes. An implementation at 133 MHz on a Xilinx Virtex-II XC2V4000-6 FPGA produces 133 million samples per second, seven times faster than a 2.6 GHz Pentium-IV PC; another implementation, on a Xilinx Spartan-IIE XC2S300E-7 FPGA at 62 MHz, achieves a three-times speedup. The performance can be improved further by exploiting parallelism: an XC2V4000-6 FPGA running nine parallel instances of the noise generator at 105 MHz is 50 times faster than a 2.6 GHz Pentium-IV PC. We also illustrate how clock speed deteriorates as the number of instances increases.

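    The logarithmic and trigonometric evaluations at the heart of such a generator come from the Box-Muller transform; these are the functions a hardware design approximates with piecewise linear segments. A minimal floating-point sketch of the underlying transform (not the paper's fixed-point hardware architecture):

```python
import math
import random

def box_muller(u1: float, u2: float) -> tuple[float, float]:
    """Turn two uniform samples in (0, 1] into two independent
    N(0, 1) Gaussian samples via the Box-Muller transform."""
    r = math.sqrt(-2.0 * math.log(u1))   # log: piecewise-linearly approximated in hardware
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)  # sin/cos: likewise approximated

# Generate samples and run a coarse sanity check on mean and variance.
random.seed(0)
samples = []
for _ in range(50_000):
    g0, g1 = box_muller(random.random() or 1e-12, random.random())
    samples.extend((g0, g1))

mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
```

    The hardware version replaces the calls to `math.log`, `math.cos`, and `math.sin` with table-driven piecewise linear segments whose coefficients can be computed quickly from the input bits.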
  • Lower bounds on the loading of multiple bus networks for binary tree algorithms

    Page(s): 1535 - 1546

    A multiple bus network (MBN) connects a set of processors via a set of buses. Two important parameters of an MBN are its loading (the largest number of connections on a bus) and its degree (the largest number of connections to a processor). These parameters determine the cost, speed, and implementability of the MBN. The smallest degree that any useful MBN can have is 2. In this paper, we study the relationship between the running time, degree, and loading of degree-2 MBNs running a fundamental class of algorithms called binary tree algorithms. (A binary tree algorithm reduces 2^n inputs at the leaves of a balanced binary tree to a single result at the root of the tree.) Specifically, we establish a nontrivial Ω(n/log n) loading lower bound for any degree-2 MBN running a 2^n-input binary tree algorithm optimally in n steps. We show that this bound does not hold if the restriction on the degree or the running time is relaxed: optimal-time, degree-3, constant-loading MBNs and suboptimal-time, degree-2, constant-loading MBNs exist for binary tree algorithms. We also derive a lower bound on the additional time (beyond the optimal) needed to run binary tree algorithms on a degree-2, loading-L MBN, for any L > 3.

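    The binary tree algorithm defined in the abstract is simply a pairwise reduction: 2^n leaf values are halved at each of n levels. A small sketch that simulates the n parallel steps sequentially:

```python
def tree_reduce(values, op):
    """Reduce 2**n leaf values to one result in n levels of pairwise
    combination, as a binary tree algorithm does in n parallel steps."""
    assert len(values) & (len(values) - 1) == 0, "need a power-of-two input count"
    level = list(values)
    steps = 0
    while len(level) > 1:
        # One parallel step: each pair of partial results is combined.
        level = [op(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        steps += 1
    return level[0], steps

total, steps = tree_reduce([1, 2, 3, 4, 5, 6, 7, 8], lambda a, b: a + b)
# total == 36, steps == 3 (n = 3 levels for 2**3 inputs)
```

    The paper's question is how cheaply the bus connections carrying the values at each level can be laid out, not the reduction itself.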
  • Designing WDM optical interconnects with full connectivity by using limited wavelength conversion

    Page(s): 1547 - 1556

    Optical communication, and in particular the wavelength-division multiplexing (WDM) technique, has become a promising networking choice to meet the ever-increasing bandwidth demands of emerging bandwidth-intensive computing and networking applications. A major challenge in designing WDM optical interconnects is providing maximum connectivity at minimum hardware cost. The overall hardware cost of a WDM optical interconnect includes not only the cost of switching elements but also the cost of wavelength conversion. Previous work mainly focused on minimizing hardware cost without taking into account the type of wavelength converters used. In this paper, we design WDM optical interconnects with full connectivity by using low-cost limited wavelength converters. We present optimal WDM optical interconnects for both permutation and multicast, in single-stage and multistage implementations, and discuss how the relationship between the number of fibers and the number of wavelengths per fiber affects the optimal design. The newly designed WDM optical interconnects have minimum hardware cost in terms of both the number of crosspoints and the wavelength conversion cost.

  • Susceptibility of commodity systems and software to memory soft errors

    Page(s): 1557 - 1568

    It is widely understood that most system downtime is accounted for by programming errors and administration time. However, a growing body of work indicates that an increasing share of downtime may stem from transient errors in computer system hardware due to external factors such as cosmic rays, and that the move to denser semiconductor technologies at lower voltages has the potential to increase these transient errors. In this paper, we investigate the susceptibility of commodity operating systems and applications on commodity PC processors to these soft errors, and we introduce ideas for improved recovery from such transient errors in software. Our results indicate that, for the Linux kernel and a Java virtual machine running sample workloads, many errors are never activated, mostly because they are overwritten. In addition, given current and upcoming microprocessor support, our results indicate that activated errors, which would normally lead to a system reboot, need not be fatal if software knowledge is used for simple software recovery. Together, these results indicate the benefits of simple memory soft error recovery handling in commodity processors and software.

  • Static test compaction for full-scan circuits based on combinational test sets and nonscan input sequences and a lower bound on the number of tests

    Page(s): 1569 - 1581

    A new class of static compaction procedures is described that generates test sets with reduced test application times for scan circuits. The proposed class combines the advantages of two earlier static compaction procedures: one that tends to generate a large number of tests with a short primary input sequence included in every test, and one that tends to generate a small number of tests with a long primary input sequence included in one of the tests. A procedure of the proposed class starts from an initial test set that has a large number of tests and long primary input sequences, and selects a subset of the tests and subsequences of their primary input sequences. It thus has the flexibility to find an appropriate balance between the number of tests and the lengths of the primary input sequences in order to minimize the test application time. Several ways of computing the primary input sequences for the initial test set are considered. The most compact test sets are obtained when a test sequence for the nonscan circuit is available and is used as part of every test in the initial test set. However, it is shown that high levels of compaction can also be achieved without the overhead of test generation for the nonscan circuit. Specifically, we show that the industry practice of holding a primary input vector constant between scan operations can be accommodated. We estimate the ability of the procedure to achieve optimum test sets by computing a lower bound on the number of tests and demonstrating that the procedure achieves or approaches this bound.

  • Diagnosability of t-connected networks and product networks under the comparison diagnosis model

    Page(s): 1582 - 1590

    Diagnosability is an important factor in measuring the reliability of an interconnection network, while (node) connectivity is used to measure its fault tolerance. We observe a close relationship between connectivity and diagnosability: according to our results, a t-regular, t-connected network with at least 2t + 3 nodes is t-diagnosable. Furthermore, this work also investigates the diagnosability of product networks, which include such important classes of interconnection networks as hypercubes, meshes, and tori. Herein, different combinations of t-diagnosability and t-connectivity are employed to study the diagnosability of product networks.

  • Task synchronization in reservation-based real-time systems

    Page(s): 1591 - 1601

    In this paper, we present the BandWidth Inheritance (BWI) protocol, a new strategy for scheduling real-time tasks in dynamic systems that extends the resource reservation framework to systems in which tasks can interact through shared resources. The proposed protocol provides temporal isolation between independent groups of tasks and enables a schedulability analysis for guaranteeing the performance of hard real-time tasks. We show that BWI is the natural extension of the well-known priority inheritance protocol to dynamic reservation systems. A formal analysis of the protocol is presented, and a guarantee test for hard real-time tasks is proposed that accounts for the case in which hard real-time tasks interact with soft real-time tasks.

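    BWI generalizes classic priority inheritance, in which a lock holder that blocks a higher-priority task temporarily runs at the blocked task's priority. A toy sketch of that underlying mechanism (not the BWI protocol itself, whose inheritance applies to bandwidth reservations rather than priorities):

```python
class Task:
    def __init__(self, name: str, priority: int):
        self.name = name
        self.base_priority = priority
        self.active_priority = priority

class Mutex:
    """Shared resource with priority inheritance: while a low-priority
    holder blocks a higher-priority waiter, the holder runs at the
    waiter's priority, bounding the duration of priority inversion."""
    def __init__(self):
        self.holder = None

    def acquire(self, task: Task) -> bool:
        if self.holder is None:
            self.holder = task
            return True
        # Blocked: donate the waiter's priority to the current holder.
        self.holder.active_priority = max(self.holder.active_priority,
                                          task.active_priority)
        return False

    def release(self):
        # Holder reverts to its base priority when it gives up the resource.
        self.holder.active_priority = self.holder.base_priority
        self.holder = None

low, high = Task("low", priority=1), Task("high", priority=10)
m = Mutex()
m.acquire(low)    # low-priority task holds the resource
m.acquire(high)   # high-priority task blocks, so the holder is boosted to 10
```

    Under BWI, it is the blocked task's reserved CPU bandwidth, not its priority, that the holder inherits, preserving temporal isolation between reservation groups.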
  • CoPTUA: Consistent Policy Table Update Algorithm for TCAM without locking

    Page(s): 1602 - 1614

    Owing to its deterministic and fast lookup performance, ternary content addressable memory (TCAM) has recently been gaining popularity for general policy filtering (PF) in packet classification for high-speed networks. However, PF table updates pose significant challenges for the efficient use of TCAM. To avoid erroneous and inconsistent rule matching, the traditional approach is to lock the PF table during the rule update period, but table locking has a negative impact on data path processing. In this paper, we propose a novel scheme, called the Consistent Policy Table Update Algorithm (CoPTUA), for TCAM. Instead of minimizing the number of rule moves to reduce locking time, CoPTUA maintains a consistent PF table throughout the update process, eliminating the need to lock the PF table while ensuring correct rule matching. Our analysis and simulation show that, even for a PF table with 100,000 rules, an arbitrary number of rules can be updated simultaneously within 1 second in the worst case, provided that 2 percent of the PF table entries are empty. Thus, CoPTUA enforces any new rule in less than 1 second for practical PF table sizes, with high memory utilization and without impacting data path processing.

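    The key invariant behind lock-free consistent updating can be illustrated with a single rule move: copy the rule to its new slot before invalidating the old one, so a concurrent lookup never misses the rule. This toy sketch shows only that invariant, not CoPTUA's full move-scheduling algorithm:

```python
def move_rule(table: list, src: int, dst: int) -> None:
    """Move a rule from slot src to an empty slot dst without ever leaving
    the table in a state where the rule is absent: copy first, then delete.
    A lookup racing with the move sees the rule at src, at dst, or at both,
    but never at neither, so matching stays consistent without locking."""
    assert table[dst] is None, "destination slot must be empty"
    table[dst] = table[src]   # step 1: duplicate (rule now matchable twice)
    table[src] = None         # step 2: invalidate the old copy

# Hypothetical 4-entry priority-ordered table with one empty slot.
table = ["acl-10", "acl-20", None, "acl-40"]
move_rule(table, src=1, dst=2)
```

    Duplicating before deleting is safe in a priority-ordered TCAM because both copies of the rule produce the same match result; the hard part CoPTUA solves is scheduling chains of such moves so every intermediate table remains consistent.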
  • Enhanced interval trees for dynamic IP router-tables

    Page(s): 1615 - 1628

    We develop an enhanced interval tree data structure suitable for representing dynamic IP router tables, and propose several refinements of this structure for a variety of router-table types. For example, the data structure called BOB (binary tree on binary tree) is developed for dynamic router tables in which the rule filters are nonintersecting ranges and ties are broken by selecting the highest-priority rule that matches a destination address. Prefix filters are a special case of nonintersecting ranges, and the commonly used longest-prefix tie breaker is a special case of the highest-priority tie breaker. When an n-rule router table is represented using BOB, the highest-priority rule matching a destination address may be found in O(log^2 n) time; a new rule may be inserted and an old one deleted in O(log n) time each. For general ranges, the data structure CBOB (compact BOB) is proposed. When all rule filters are prefixes, the data structure PBOB (prefix BOB) permits highest-priority matching, rule insertion, and rule deletion, each in O(W) time, where W is the length of the longest prefix. When all rule filters are prefixes and longest-prefix matching is to be done, the data structure LMPBOB (longest matching-prefix BOB) permits longest-prefix matching in O(W) time; rule insertion and deletion each take O(log n) time. On practical rule tables, BOB and PBOB perform each of the three dynamic-table operations in O(log n) time and with O(log n) cache misses; the number of cache misses incurred by LMPBOB is also O(log n). Experimental results are also presented.

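    For comparison, the O(W) longest-prefix lookup bound can be illustrated with a plain binary trie, which inspects at most W bits of the address. This is a generic sketch, not the LMPBOB structure itself:

```python
class TrieNode:
    __slots__ = ("children", "value")
    def __init__(self):
        self.children = {}   # '0'/'1' -> TrieNode
        self.value = None    # next hop if a prefix ends at this node

class PrefixTable:
    """Binary trie with longest-matching-prefix lookup in O(W) time,
    where W is the length of the longest stored prefix."""
    def __init__(self):
        self.root = TrieNode()

    def insert(self, prefix: str, nexthop: str) -> None:
        node = self.root
        for bit in prefix:
            node = node.children.setdefault(bit, TrieNode())
        node.value = nexthop

    def lookup(self, addr: str):
        # Walk the address bits, remembering the last prefix seen.
        node, best = self.root, None
        for bit in addr:
            if node.value is not None:
                best = node.value
            node = node.children.get(bit)
            if node is None:
                break
        else:
            if node.value is not None:
                best = node.value
        return best

t = PrefixTable()
t.insert("1", "A")
t.insert("101", "B")
hop = t.lookup("1011")   # longest matching prefix is "101" -> "B"
```

    LMPBOB achieves the same O(W) lookup while also supporting O(log n) insertion and deletion, which a plain trie does not guarantee.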
  • Topology control of ad hoc wireless networks for energy efficiency

    Page(s): 1629 - 1635

    In ad hoc wireless networks, computing the transmission power of each wireless node such that the resulting network is connected and the total energy consumption is minimized is known as the Minimum Energy Network Connectivity (MENC) problem, which is NP-complete. In this paper, we consider approximate solutions to the MENC problem in ad hoc wireless networks. We present a theorem that reveals the relation between the energy consumption of an optimal solution and that of a spanning tree, and we propose an optimization algorithm that can improve the result of any spanning-tree-based topology. Two polynomial-time approximation heuristics are provided that can be used to compute the power assignment of wireless nodes in both static and low-mobility ad hoc wireless networks. Both heuristics are implemented, and the numerical results verify the theoretical analysis.

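    The spanning-tree connection can be made concrete: build a minimum spanning tree on link costs d^α and give each node just enough power to reach its farthest MST neighbor, which guarantees a connected topology. A sketch of this standard heuristic (not necessarily the paper's algorithms; a path-loss exponent of α = 2 is assumed):

```python
import math

def mst_power_assignment(nodes, alpha=2.0):
    """Connect all nodes via a minimum spanning tree (Prim's algorithm) on
    link costs d**alpha, then set each node's power to the cost of its
    longest incident MST edge, so every MST link is reachable."""
    def cost(a, b):
        return math.dist(a, b) ** alpha

    n = len(nodes)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge crossing from the tree to an unreached node.
        u, v = min(((i, j) for i in in_tree for j in range(n) if j not in in_tree),
                   key=lambda e: cost(nodes[e[0]], nodes[e[1]]))
        edges.append((u, v))
        in_tree.add(v)

    power = [0.0] * n
    for u, v in edges:
        c = cost(nodes[u], nodes[v])
        power[u] = max(power[u], c)
        power[v] = max(power[v], c)
    return edges, power

pts = [(0, 0), (1, 0), (3, 0)]
edges, power = mst_power_assignment(pts)
# MST edges (0,1) and (1,2); node 1 needs power 4.0 to reach node 2
```

    Because broadcast transmissions are omnidirectional, a node's single power level covers all of its MST neighbors at once, which is the slack the paper's optimization algorithm exploits to improve on any spanning-tree-based assignment.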
  • Annual Index

    Page(s): 1636 - 1648
  • TC Information for authors

    Page(s): c3
  • [Back cover]

    Page(s): c4

Aims & Scope

The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field.


Meet Our Editors

Editor-in-Chief
Albert Y. Zomaya
School of Information Technologies
Building J12
The University of Sydney
Sydney, NSW 2006, Australia
http://www.cs.usyd.edu.au/~zomaya
albert.zomaya@sydney.edu.au