Computers, IEEE Transactions on

Issue 8 • Date Aug. 2006

  • [Front cover]

    Page(s): c1
    Freely Available from IEEE
  • [Inside front cover]

    Page(s): c2
    Freely Available from IEEE
  • Efficient m-ary balanced codes which are invariant under symbol permutation

    Page(s): 929 - 946

    A symbol permutation invariant balanced (SPI-balanced) code over the alphabet $\mathbb{Z}_m = \{0, 1, \ldots, m-1\}$ is a block code over $\mathbb{Z}_m$ such that each alphabet symbol occurs as many times as any other symbol in every codeword. For this reason, every permutation among the symbols of the alphabet changes an SPI-balanced code into an SPI-balanced code. This means that SPI-balanced words are "the most balanced" among all possible m-ary balanced word types, and this property makes them very attractive from the application perspective. In particular, they can be used to achieve m-ary DC-free communication, to detect/correct asymmetric/unidirectional errors on the m-ary asymmetric/unidirectional channel, to achieve delay-insensitive communication, to maintain data integrity in digital optical disks, and so on. This paper gives some efficient methods to convert (encode) m-ary information sequences into m-ary SPI-balanced codes whose redundancy is equal to roughly double the minimum possible redundancy $r_{\min}$. It is proven that $r_{\min} \simeq [(m-1)/2]\log_m n - (1/2)[1 - (1/\log_{2\pi} m)]\,m - (1/\log_{2\pi} m)$ for any code which converts $k$ information digits into an SPI-balanced code of length $n = k + r$. For example, the first method given in the paper encodes $k$ information digits into an SPI-balanced code of length $n = k + r$, with $r = (m-1)\log_m k + O(m \log_m \log_m k)$. A second method is a recursive method, which uses the first as base code and encodes $k$ digits into an SPI-balanced code of length $n = k + r$, with $r \simeq (m-1)\log_m n - \log_m[(m-1)!]$.

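    The definition in this abstract can be checked directly on small cases. The following is a minimal sketch, not the paper's encoding method: it tests whether a word is SPI-balanced, counts the SPI-balanced words of length n by the multinomial coefficient n!/((n/m)!)^m, and derives the exact minimum redundancy for small n by brute arithmetic. The function names are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import factorial


def is_spi_balanced(word, m):
    """True if every symbol 0..m-1 occurs exactly len(word)/m times."""
    n = len(word)
    if n % m != 0:
        return False
    counts = Counter(word)
    return all(counts[s] == n // m for s in range(m))


def count_spi_balanced(n, m):
    """Number of SPI-balanced words of length n over an m-symbol alphabet."""
    if n % m != 0:
        return 0
    return factorial(n) // factorial(n // m) ** m


def min_redundancy(n, m):
    """Smallest r = n - k such that m**k information words fit among the
    SPI-balanced words of length n (exact, by integer arithmetic)."""
    if m < 2 or n % m != 0:
        raise ValueError("need m >= 2 and n divisible by m")
    count = count_spi_balanced(n, m)
    k = 0
    while m ** (k + 1) <= count:
        k += 1
    return n - k


def permute_symbols(word, perm):
    """Apply a permutation of the alphabet (perm[s] replaces symbol s)."""
    return [perm[s] for s in word]


word = [0, 1, 2, 2, 1, 0]
print(is_spi_balanced(word, 3))
print(is_spi_balanced(permute_symbols(word, [2, 0, 1]), 3))
print(count_spi_balanced(6, 3), min_redundancy(6, 3))
```

    Permuting the alphabet maps the balanced word to another balanced word, which is the invariance property the abstract describes.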
  • DPPC-RE: TCAM-based distributed parallel packet classification with range encoding

    Page(s): 947 - 961

    Packet classification has been a critical data path function for many emerging networking applications. An interesting approach is the use of ternary content addressable memory (TCAM) to achieve deterministic, high-speed packet classification performance. However, apart from its high cost and power consumption, the traditional single TCAM-based solution generally has difficulty keeping up with fast-growing line rates because memory clock rates grow slowly. Moreover, the TCAM storage efficiency is largely affected by the need to support rules with ranges, or range matching. In this paper, a distributed TCAM scheme that exploits chip-level parallelism is proposed to greatly improve the throughput performance. This scheme seamlessly integrates with a range encoding scheme which not only solves the range matching problem, but also ensures a balanced high throughput performance. A thorough theoretical worst-case analysis of throughput, processing delay, and power consumption, as well as the experimental results, shows that the proposed solution can achieve scalable throughput performance matching up to OC768 line rate or higher. The added TCAM storage overhead is found to be reasonably small for the five real-world classifiers studied.

  • Design trade-offs for user-level I/O architectures

    Page(s): 962 - 973

    To address the growing I/O bottleneck, next-generation distributed I/O architectures employ scalable point-to-point interconnects and minimize operating system overhead by providing user-level access to the I/O subsystem. Reduced I/O overhead allows I/O intensive applications to efficiently employ latency hiding techniques for improved throughput. This paper presents the design of a novel scalable user-level I/O architecture and evaluates the impact of various architectural mechanisms in terms of overall performance improvement. Results demonstrate that eliminating data movement across protection domains is the dominant contributor to improved scalability. Eliminating system call and interrupt overhead only has a small additional benefit that may not justify the additional hardware support required. While this evaluation is based on one specific design, the conclusions can be generalized to other user-level I/O architectures.

  • Software multiplication using Gaussian normal bases

    Page(s): 974 - 984

    Fast algorithms for multiplication in finite fields are required for several cryptographic applications, in particular for implementing elliptic curve operations over binary fields $\mathbb{F}_{2^m}$. In this paper, we present new software algorithms for efficient multiplication over $\mathbb{F}_{2^m}$ that use a Gaussian normal basis representation. Two approaches are presented: direct normal basis multiplication and a method that exploits a mapping to a ring where fast polynomial-based techniques can be employed. Our analysis, including experimental results on an Intel Pentium family processor, shows that the new algorithms are faster and can use memory more efficiently than previous methods. Despite significant improvements, we conclude that the penalty in multiplication is still sufficiently large to discourage the use of normal bases in software implementations of elliptic curve systems.

  • Minimizing sum of completion times and makespan in master-slave systems

    Page(s): 985 - 999

    We consider scheduling problems in the master-slave model. In this model, each job has to be processed sequentially in three stages. In the first stage, a preprocessing task runs on a master machine; in the second stage, a slave task runs on a dedicated slave machine; and, in the last stage, a postprocessing task again runs on a master machine, possibly different from the master machine in the first stage. It has been shown that the problem of minimizing the makespan or the sum of completion times is NP-hard in the strong sense even if preemption is allowed. In this paper, we design efficient approximation algorithms to minimize the sum of completion times in various settings. These are the first general results for the minsum problem in the master-slave model. We also show that these algorithms generate schedules with small makespan as well.

  • Pipelined computation of scalar multiplication in elliptic curve cryptosystems (extended version)

    Page(s): 1000 - 1010

    In the current work, we propose a pipelining scheme for implementing elliptic curve cryptosystems (ECC). Scalar multiplication is the dominant operation in ECC. It is computed by a series of point additions and doublings. The pipelining scheme is based on a key observation: to start the subsequent operation, one need not wait until the current one exits. The next operation can begin while a part of the current operation is still being processed. To our knowledge, this is the first attempt to compute the scalar multiplication in such a pipelined manner. Also, the proposed scheme can be made resistant to side-channel attacks (SCA). Our scheme compares favorably with all SCA-resistant sequential and parallel methods.

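    For readers unfamiliar with the "series of point additions and doublings" mentioned in this abstract, here is a minimal sketch of the plain sequential double-and-add loop, which is the baseline that a pipelined scheme would overlap; it is not the paper's pipelined method and is not side-channel resistant. The toy curve y^2 = x^3 + 2x + 2 over GF(17) with base point (5, 1), and all names below, are assumptions chosen only so the arithmetic is easy to follow.

```python
# Toy curve y^2 = x^3 + A*x + B over GF(P_MOD); illustrative parameters only.
P_MOD, A, B = 17, 2, 2
INFINITY = None  # the point at infinity (group identity)


def point_add(p1, p2):
    """Add two affine points on the toy curve (handles doubling and identity)."""
    if p1 is INFINITY:
        return p2
    if p2 is INFINITY:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INFINITY  # p2 is the negative of p1
    if p1 == p2:
        # Doubling: slope of the tangent line.
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        # Addition: slope of the chord through p1 and p2.
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k*point by right-to-left double-and-add (k >= 0)."""
    result = INFINITY
    addend = point
    while k > 0:
        if k & 1:
            result = point_add(result, addend)  # point addition
        addend = point_add(addend, addend)      # point doubling
        k >>= 1
    return result


G = (5, 1)
print(scalar_mult(2, G), scalar_mult(3, G))
```

    In this loop each addition or doubling starts only after the previous one has finished; the abstract's observation is that the next operation can begin while part of the current one is still in progress.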
  • Simultaneous interconnect delay and crosstalk noise optimization through gate sizing using game theory

    Page(s): 1011 - 1023

    The continuous scaling trends of interconnect wires in deep submicron (DSM) circuits result in increased interconnect delay and crosstalk noise. In this work, we develop a new postlayout gate sizing algorithm for simultaneous optimization of interconnect delay and crosstalk noise. The problem of postlayout gate sizing is modeled as a normal form game and solved using Nash equilibrium. The crosstalk noise induced on a net depends on the size of its driver gate and the size of the gates driving its coupled nets. Increasing the gate size of the driver increases the noise induced by the net on its coupled nets, whereas increasing the size of the drivers of coupled nets increases the noise induced on the net itself, resulting in a cyclic order dependency leading to a conflicting situation. It is pointed out that solving the postroute gate sizing problem for crosstalk noise optimization is difficult due to its conflicting nature. Game theory provides a natural framework for handling such conflicting situations and allows optimization of multiple parameters. By utilizing this property of game theory, the cyclic dependency of crosstalk noise on gate sizes can be resolved, and the problem of gate sizing for simultaneous optimization of interconnect delay and crosstalk noise, whose objective function is again conflicting in nature, can be effectively modeled. We have implemented two different strategies in which games are ordered according to 1) the noise criticality and 2) the delay criticality of nets. The time and space complexities of the proposed gate sizing algorithm are linear in terms of the number of gates in the design. Experimental results for a noise-criticality-ordered game theoretic approach on several medium and large open core designs indicate average improvements of 15.48 percent and 18.56 percent with respect to Cadence place and route tools in terms of interconnect delay and crosstalk noise, respectively, without any area overhead or the need for rerouting. Further, the algorithm performs significantly better than simulated annealing and genetic search, as established through experimental results. A mathematical proof of existence for the Nash equilibrium solution for the proposed gate sizing formulation is also provided.

  • An effective visibility culling method based on cache block

    Page(s): 1024 - 1032

    As the complexity of 3D scenes increases, the search for an effective visibility culling method has become one of the most important issues to be addressed in the design of 3D rendering processors. In this paper, we propose a new rasterization pipeline with visibility culling; the proposed architecture performs the visibility culling at an early stage of the rasterization pipeline (especially at the traversal stage) by retrieving data in a pixel cache without any significant hardware logic such as the hierarchical z-buffer. If the data to be retrieved does not exist in the pixel cache, the proposed architecture performs a prefetch operation in order to reduce the miss penalty of the pixel cache. That is, the cache miss penalty can be reduced because the transfer of a missed cache block from the frame memory into the pixel cache can be handled simultaneously with the rasterization pipeline executions. Simulation results show that the proposed architecture can achieve a performance gain of about 32 percent compared with the conventional pretexturing architecture and about 7 percent compared with the hierarchical z-buffer visibility scheme.

  • Self-organizing sensor networks for integrated target surveillance

    Page(s): 1033 - 1047

    Self-organization is critical for a distributed wireless sensor network due to the spontaneous and random deployment of a large number of sensor nodes over a remote area. Such a network is often characterized by its ability to form an organizational structure without much centralized intervention. An important design goal for a smart sensor network is to have an energy-efficient, self-organized configuration of sensor nodes that can scan, detect, and track targets of interest in a distributed manner. In this paper, we propose a novel self-organization protocol and describe other relevant, indigenous building blocks that can be combined to build integrated surveillance applications for self-organized sensor networks. Experiments in both simulated and real-world platforms indicate that this protocol can be useful for tracking targets that follow a predictable course.

  • An online heuristic for maximum lifetime routing in wireless sensor networks

    Page(s): 1048 - 1056

    We show that the problem of routing messages in a wireless sensor network so as to maximize network lifetime is NP-hard. In our model, the online model, each message has to be routed without knowledge of future route requests. We also develop an online heuristic to maximize network lifetime. Our heuristic, which performs two shortest path computations to route each message, is superior to previously published heuristics for lifetime maximization: it results in greater lifetime, and its performance is less sensitive to the selection of heuristic parameters. Additionally, our heuristic is superior on the capacity metric.

  • Using indexing functions to reduce conflict aliasing in branch prediction tables

    Page(s): 1057 - 1061

    High-accuracy branch prediction is crucial for high-performance processors. Inspired by the work on indexing functions to eliminate conflict misses in the memory hierarchy, this paper explores different indexing approaches to reduce conflict aliasing in branch-prediction tables. Our results show that indexing functions provide a highly complexity-effective way to enhance prediction accuracy.

  • Integer multipliers with overflow detection

    Page(s): 1062 - 1066

    This paper presents a general approach for designing array and tree integer multipliers with overflow detection. The overflow detection techniques are based on an analysis of the magnitudes of the input operands. The overflow detection circuits operate in parallel with a simplified multiplier to reduce the overall area and delay.

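    The idea of detecting overflow from operand magnitudes can be illustrated in software. The sketch below is an assumption-laden illustration for unsigned n-bit operands, not the circuit design from the paper: it uses the bit lengths of the operands (equivalently, their leading-zero counts) to decide most cases without computing the product, and falls back to the full product only in the one ambiguous case. The function names are hypothetical.

```python
def overflow_by_magnitude(a, b, n):
    """Classify whether a*b overflows n bits using operand bit lengths only.

    Returns True (certain overflow), False (certainly no overflow), or
    None when the bit lengths alone cannot decide.
    """
    if not (0 <= a < (1 << n) and 0 <= b < (1 << n)):
        raise ValueError("operands must be unsigned n-bit values")
    if a == 0 or b == 0:
        return False
    total = a.bit_length() + b.bit_length()
    if total <= n:
        return False      # a*b < 2**total <= 2**n
    if total >= n + 2:
        return True       # a*b >= 2**(total - 2) >= 2**n
    return None           # total == n + 1: product may or may not fit


def mul_overflows(a, b, n):
    """Exact overflow test that only inspects the product when needed."""
    verdict = overflow_by_magnitude(a, b, n)
    if verdict is None:
        return a * b >= (1 << n)
    return verdict


print(overflow_by_magnitude(15, 15, 8))   # bit lengths 4 + 4 = 8
print(overflow_by_magnitude(16, 16, 8))   # bit lengths 5 + 5 = 10
print(overflow_by_magnitude(16, 8, 8))    # bit lengths 5 + 4 = 9, undecided
```

    The magnitude check depends only on the operands, which is why, as the abstract notes for the hardware version, it can proceed in parallel with the multiplication itself.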
  • A comment on "Boolean functions classification via fixed polarity Reed-Muller form"

    Page(s): 1067 - 1069

    A correction to the classification of the space of n-variable Boolean functions proposed by C.C. Tsai et al. (1997) is reported here and commented upon.

  • Call for papers - Special issue on Emergent Systems, Algorithms, and Architectures for Speech-Based Human-Machine Interaction

    Page(s): 1070
    Freely Available from IEEE
  • IEEE Computer Society celebrates two 60-year anniversaries

    Page(s): 1071
    Freely Available from IEEE
  • Join the IEEE Computer Society - Now with 800 Course Modules for Distance Learning!

    Page(s): 1072
    Freely Available from IEEE
  • TC Information for authors

    Page(s): c3
    Freely Available from IEEE
  • [Back cover]

    Page(s): c4
    Freely Available from IEEE

Aims & Scope

The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field.

Meet Our Editors

Editor-in-Chief
Albert Y. Zomaya
School of Information Technologies
Building J12
The University of Sydney
Sydney, NSW 2006, Australia
http://www.cs.usyd.edu.au/~zomaya
albert.zomaya@sydney.edu.au