
Computers, IEEE Transactions on

Issue 1 • Date Jan. 2003

  • 2002 reviewers list

    Publication Year: 2003 , Page(s): 93 - 96
    Freely Available from IEEE
  • Minimum register instruction sequencing to reduce register spills in out-of-order issue superscalar architectures

    Publication Year: 2003 , Page(s): 4 - 20
    Cited by:  Papers (7)  |  Patents (2)

    In this paper, we address the problem of generating an optimal instruction sequence S for a Directed Acyclic Graph (DAG), where S is optimal in terms of the number of registers used. We call this the Minimum Register Instruction Sequence (MRIS) problem. The motivation for revisiting the MRIS problem stems from several modern architecture innovations/requirements that have put the instruction sequencing problem in a new context. We develop an efficient heuristic solution for the MRIS problem. This solution is based on the notion of an instruction lineage, a set of instructions that can definitely share a single register. The formation of lineages exploits the structure of the dependence graph to facilitate the sharing of registers not only among instructions within a lineage, but also across lineages. Our efficient heuristics to "fuse" lineages further reduce the register requirement. This reduced register requirement results in a code sequence with fewer register spills. We have implemented our solution in the MIPSpro production compiler and measured its performance on the SPEC95 floating point benchmark suite. Our experimental results demonstrate that the proposed instruction sequencing method significantly reduces the number of spill loads and stores inserted in the code, by more than 50 percent in each of the benchmarks. Our approach reduces the average number of dynamic loads and stores executed by 10.4 percent and 3.7 percent, respectively. Further, our approach improves the execution time of the benchmarks by 3.2 percent on average. In order to evaluate how efficiently our heuristics find a near-optimal solution to the MRIS problem, we develop an elegant integer linear programming formulation for the MRIS problem. Using a commercial integer linear programming solver, we obtain the optimal solution for the MRIS problem. Comparing the optimal solution from the integer linear programming tool with our heuristic solution reveals that, in a very large majority (99.2 percent) of the cases, our heuristic solution is optimal. For this experiment, we used a set of 675 dependence graphs representing basic blocks extracted from scientific benchmark programs.
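
    To make the MRIS problem statement concrete, the following is a minimal sketch that measures the register need of a given instruction sequence and finds the minimum by brute force on a tiny DAG. It is not the lineage-based heuristic or the integer linear programming formulation from the paper. The function names, the example graph, and the register model (a value occupies a register from its definition until its last use, and a result may reuse the register of an operand that dies at that instruction) are all assumptions made for illustration.

```python
from itertools import permutations


def peak_registers(order, uses):
    """Peak number of simultaneously live values for the sequence `order`.

    uses[v] is the set of instructions that consume the value produced by v.
    Assumed model: v is live from its own position up to (not including) its
    last use, so the last consumer may overwrite v's register.
    """
    pos = {v: i for i, v in enumerate(order)}
    # A value with no consumers is treated as live only at its definition.
    end = {v: max((pos[u] for u in uses[v]), default=pos[v] + 1) for v in order}
    return max(
        sum(1 for v in order if pos[v] <= i < end[v])
        for i in range(len(order))
    )


def respects_dependences(order, uses):
    """True if every producer appears before all of its consumers."""
    pos = {v: i for i, v in enumerate(order)}
    return all(pos[v] < pos[u] for v in order for u in uses[v])


def minimum_registers(uses):
    """Exhaustive search over all valid sequences (tiny DAGs only)."""
    return min(
        peak_registers(order, uses)
        for order in permutations(uses)
        if respects_dependences(order, uses)
    )


# Hypothetical DAG: e = a + b, f = c + d, g = e + f (a..d are loads).
example = {
    "a": {"e"}, "b": {"e"}, "c": {"f"}, "d": {"f"},
    "e": {"g"}, "f": {"g"}, "g": set(),
}
```

    Under this model, issuing all four loads first (`a b c d e f g`) keeps four values live at once, while computing `e` before loading `c` and `d` (`a b e c d f g`) needs only three, which is also the minimum for this graph.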

  • Performance impact of coarse timer granularities on QoS guarantees in Unix-Based systems

    Publication Year: 2003 , Page(s): 51 - 58

    Owing to the Internet's rapid expansion and fast-advancing PC technology, there are now many PC-based network systems. For a growing number of applications running over the Internet, guaranteeing QoS on these PC-based systems has become an issue of some concern. In this paper, we investigate QoS failures that occur on PC-based systems and focus on one aspect of the problem that arises from coarse timer granularities. While it is usually assumed that packet schedulers in routers have sufficiently fine-grain timers, network systems frequently have timers of coarse granularity. Therefore, users cannot obtain the desired QoS even if they reserve the required bandwidth for transmission. Based on the investigation of QoS failures due to coarse timer granularities, we experiment with two methods to remedy the problem. We implement them in real PC-based Unix systems and show that they can satisfy the QoS requirements of TCP connections by helping them transmit traffic at the reserved bandwidth.
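
    One way to see why a coarse timer can undercut a bandwidth reservation is a simple pacing calculation. The sketch below assumes a hypothetical sender that releases exactly one packet per timer expiry and can only wake on tick boundaries; this model, the function name, and the numbers are illustrative assumptions and are not taken from the paper.

```python
def achieved_rate_bps(rate_bps: int, packet_bits: int, tick_us: int) -> float:
    """Rate achieved when one packet is sent per timer expiry.

    The ideal gap between packets is packet_bits / rate_bps seconds, but the
    timer can only fire on a tick boundary, so the gap is rounded up to a
    whole number of ticks (at least one).
    """
    num = packet_bits * 1_000_000      # ideal gap, scaled to microseconds
    den = rate_bps * tick_us
    ticks = max(1, -(-num // den))     # integer ceiling of ideal_gap / tick
    return packet_bits * 1_000_000 / (ticks * tick_us)


# Reserved 10 Mbit/s with 1500-byte (12000-bit) packets.
coarse = achieved_rate_bps(10_000_000, 12_000, 10_000)  # 10 ms tick
fine = achieved_rate_bps(10_000_000, 12_000, 100)       # 100 us tick
```

    With these assumed numbers, the ideal packet gap is 1.2 ms, so a 10 ms tick limits the sender to 1.2 Mbit/s, while a 100 microsecond tick lets it reach the reserved 10 Mbit/s.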

  • Evaluating integrated hardware-software optimizations using a unified energy estimation framework

    Publication Year: 2003 , Page(s): 59 - 76
    Cited by:  Papers (6)

    In embedded and portable applications, energy dissipation is a major design constraint. Designers must consider energy consumption. SimplePower evaluates the energy considering the system as a whole rather than just as a sum of parts, and concurrently supports both compiler and architectural experimentation. It includes a transition-sensitive, cycle-accurate datapath energy model that interfaces with analytical and transition-sensitive energy models for the memory, clock and bus subsystems, respectively. We analyzed the energy consumption of 10 codes from the multidimensional array domain and found datapath energy hotspots, bottlenecks and helpful features. Optimized codes saved 21 percent more energy using the most recently used way-prediction cache scheme as compared to executing unoptimized codes from the multidimensional array domain.

  • AQuA: an adaptive architecture that provides dependable distributed objects

    Publication Year: 2003 , Page(s): 31 - 50
    Cited by:  Papers (21)  |  Patents (12)

    Building dependable distributed systems from commercial off-the-shelf components is of growing practical importance. For both cost and production reasons, there is interest in approaches and architectures that facilitate building such systems. The AQuA architecture is one such approach; its goal is to provide adaptive fault tolerance to CORBA applications by replicating objects. The AQuA architecture allows application programmers to request desired levels of dependability during applications' runtimes. It provides fault tolerance mechanisms to ensure that a CORBA client can always obtain reliable services, even if the CORBA server object that provides the desired services suffers from crash failures and value faults. AQuA includes a replicated dependability manager that provides dependability management by configuring the system in response to applications' requests and changes in system resources due to faults. It uses Maestro/Ensemble to provide group communication services. It contains a gateway to intercept standard CORBA IIOP messages to allow any standard CORBA application to use AQuA. It provides different types of replication schemes to forward messages reliably to the remote replicated objects. All of the replication schemes ensure strong data consistency among replicas. This paper describes the AQuA architecture and presents, in detail, the active replication pass-first scheme. In addition, the interface to the dependability manager and the design of the dependability manager replication are also described. Finally, we describe performance measurements that were conducted for the active replication pass-first scheme, and we present results from our study of fault detection, recovery, and blocking times.

  • General models and a reduction design technique for FPGA switch box designs

    Publication Year: 2003 , Page(s): 21 - 30
    Cited by:  Papers (11)

    An FPGA switch box is said to be hyper-universal if it is detailed-routable for any set of multipin nets specifying a routing requirement over the switch box. Compared with the known "universal switch modules", where only 2-pin nets are considered, the hyper-universal switch box model is more general and powerful. This paper studies the generic problem and proposes a systematic design methodology for hyper-universal (k, W)-switch boxes, where k is the number of sides and W is the number of terminals on each side. We formulate this hyper-universal (k, W)-switch box design problem as a k-partite graph design problem and propose an efficient reduction design technique. Applying this technique, we can design hyper-universal (k, W)-switch boxes with low O(W) switches for any fixed k. For illustration, we provide optimum hyper-universal (2, W) and (3, W)-switch boxes and a hyper-universal (4, W)-switch box with switch number quite close to the lower bound 6W, which is used in a well-known commercial design without hyper-universal routability. We also conclude that the proposed reduction method can yield an efficient detailed routing algorithm for any given routing requirement as well.

  • Analysis of the statistical cipher feedback mode of block ciphers

    Publication Year: 2003 , Page(s): 77 - 92
    Cited by:  Papers (5)  |  Patents (4)

    In this paper, we examine a recently proposed mode of operation for block ciphers which we refer to as statistical cipher feedback (SCFB) mode. SCFB mode configures the block cipher as a keystream generator for use in a stream cipher such that it has the property of statistical self-synchronization, thereby allowing the stream cipher to recover from bit slips in the communication channel. Statistical self-synchronization involves feeding back ciphertext to the input of the block cipher similar to the conventional cipher feedback (CFB) mode, except that the feedback only occurs when a special synchronization pattern is recognized in the ciphertext. In the paper, we examine the efficiency, resynchronization, and error propagation characteristics of SCFB and compare these to conventional modes such as CFB and output feedback (OFB). In particular, we study these characteristics of SCFB as a function of the synchronization pattern size. As well, we examine implementation issues of SCFB, focusing on the buffer requirements and resulting delay for a practical realization of the cipher. We conclude that SCFB mode can be used to provide practical, efficient, self-synchronizing implementations for stream ciphers. In particular, SCFB mode is best used in circumstances where slips are a concern and where implementation efficiency is a high priority in comparison to encryption latency.
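
    The feedback rule described in this abstract can be sketched roughly as follows. This is a simplified, byte-oriented illustration and not the construction analyzed in the paper: the block size, the one-byte synchronization pattern, the class and function names, and the choice to reload the cipher input from the block of ciphertext that follows the pattern are all assumptions. `toy_block_encrypt` is a hash-based stand-in for the forward direction of a block cipher and must not be used as a real cipher.

```python
import hashlib

BLOCK = 16        # assumed block size in bytes
SYNC = b"\xa5"    # hypothetical synchronization pattern


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for a block cipher's forward direction (NOT a real cipher)."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


class ScfbSketch:
    def __init__(self, key: bytes, iv: bytes):
        self.key = key
        self.register = iv      # current block cipher input
        self.keystream = b""    # keystream bytes not yet consumed
        self.window = b""       # recent ciphertext bytes while scanning
        self.collected = None   # None while scanning, else new IV so far

    def _next_key_byte(self) -> int:
        if not self.keystream:
            # OFB-style step: the previous output becomes the next input.
            self.register = toy_block_encrypt(self.key, self.register)
            self.keystream = self.register
        k, self.keystream = self.keystream[0], self.keystream[1:]
        return k

    def _observe(self, c: int) -> None:
        """Track the ciphertext; both ends run this on the same bytes."""
        if self.collected is None:
            self.window = (self.window + bytes([c]))[-len(SYNC):]
            if self.window == SYNC:
                # Pattern seen: the next BLOCK ciphertext bytes form a new IV.
                self.collected = b""
                self.window = b""
        else:
            self.collected += bytes([c])
            if len(self.collected) == BLOCK:
                # Feed the ciphertext back and restart the keystream.
                self.register = self.collected
                self.keystream = b""
                self.collected = None

    def process(self, data: bytes, decrypt: bool = False) -> bytes:
        out = bytearray()
        for b in data:
            o = b ^ self._next_key_byte()
            out.append(o)
            self._observe(b if decrypt else o)
        return bytes(out)
```

    Because the feedback decision depends only on the ciphertext, a receiver that loses a ciphertext byte produces garbage only until the next synchronization pattern and the block following it have been seen by both ends; after that point the two keystreams agree again. A longer pattern would trigger feedback less often, which is the trade-off against pattern size that the abstract refers to.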


Aims & Scope

The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field.

Meet Our Editors

Editor-in-Chief
Paolo Montuschi
Politecnico di Torino
Dipartimento di Automatica e Informatica
Corso Duca degli Abruzzi 24 
10129 Torino - Italy
e-mail: pmo@computer.org