Computers, IEEE Transactions on

Issue 12 • Date Dec. 1999

  • Author index

    Page(s): 1380 - 1384
    Freely Available from IEEE
  • Subject index

    Page(s): 1384 - 1392
    Freely Available from IEEE
  • Distributed generation of weighted random patterns

    Page(s): 1364 - 1368

    This paper describes the design details, operation, cost, and performance of a distributed weighted pattern test approach at the chip level. The traditional LSSD SRLs are being replaced by WRP SRLs designed specifically to facilitate a weighted random pattern (WRP) test. A two-bit code is transmitted to each WRP SRL to determine its specific weight. The WRP test is then divided into groups, where each group is activated with a different set of weights. The weights are dynamically adjusted during the course of the test to “go after” the remaining untested faults. The cost and performance of this design system are explored on ten pilot chips. Results of this experiment are provided in the paper.
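
    The idea of per-latch weights selected by a two-bit code can be illustrated with a minimal sketch. The abstract does not state the actual weight values or how they are adjusted during the test, so the mapping `WEIGHT_BY_CODE` and the helper names below are assumptions made only for illustration.

```python
import random

# Hypothetical mapping from a two-bit weight code to P(bit == 1).
# The abstract does not give the real weight values; these are illustrative.
WEIGHT_BY_CODE = {0b00: 0.5, 0b01: 0.25, 0b10: 0.75, 0b11: 0.875}


def weighted_pattern(codes, rng):
    """Return one test pattern: one bit per latch, biased by that latch's code."""
    return [1 if rng.random() < WEIGHT_BY_CODE[c] else 0 for c in codes]


def run_groups(groups, patterns_per_group, seed=0):
    """Apply each group's set of weight codes in turn and collect the patterns."""
    rng = random.Random(seed)
    return [
        [weighted_pattern(codes, rng) for _ in range(patterns_per_group)]
        for codes in groups
    ]


if __name__ == "__main__":
    # Two groups over four latches: unbiased first, then a mixed weight set.
    groups = [[0b00, 0b00, 0b00, 0b00], [0b11, 0b01, 0b10, 0b00]]
    for codes, patterns in zip(groups, run_groups(groups, 3)):
        print(codes, patterns)
```

    The sketch fixes the groups up front; choosing the next group's codes from the faults still untested, as the abstract describes, is not modeled here.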

  • Diagnosability of hypercubes and enhanced hypercubes under the comparison diagnosis model

    Page(s): 1369 - 1374

    A. Sengupta and A. Dahbura (1992) discussed how to characterize a diagnosable system under the comparison diagnosis model proposed by J. Maeng and M. Malek (1981) and a polynomial algorithm was given to identify the faulty processors provided that the system's diagnosability is known. However, for a general system, the determination of its diagnosability is not algorithmically easy. This paper proves that, for the important hypercube structured multiprocessor systems (n-cubes), the diagnosability under the comparison model is n when n ⩾ 5. The paper also studies the diagnosability of the enhanced hypercube, which is obtained by adding $2^{n-1}$ more links to a regular hypercube of $2^n$ processors. It is shown that the augmented communication ability among processors also increases the system's diagnosability under the comparison model. We prove that the diagnosability is n+1 for an enhanced hypercube when n ⩾ 6.
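
    The link counts in this abstract can be checked with a small sketch. The abstract only says that $2^{n-1}$ links are added; which node pairs they join is not stated, so connecting each node to its bitwise complement is an assumption made here for illustration, and all function names are hypothetical.

```python
def hypercube_edges(n):
    """Edges of the n-cube: nodes are n-bit integers, neighbours differ in one bit."""
    return {
        (u, u ^ (1 << i))
        for u in range(1 << n)
        for i in range(n)
        if u < u ^ (1 << i)
    }


def complement_links(n):
    """Assumed extra links: each node joined to its bitwise complement."""
    mask = (1 << n) - 1
    return {(u, u ^ mask) for u in range(1 << n) if u < u ^ mask}


def degrees(edges, num_nodes):
    """Degree of every node for an undirected edge set."""
    deg = [0] * num_nodes
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


if __name__ == "__main__":
    n = 6
    base = hypercube_edges(n)
    extra = complement_links(n)
    print(len(base), len(extra))                    # n * 2^(n-1) and 2^(n-1)
    print(set(degrees(base, 1 << n)))               # every node has degree n
    print(set(degrees(base | extra, 1 << n)))       # every node has degree n + 1
```

    Under this assumed construction every node gains exactly one link, so the node degree rises from n to n+1, the same values the abstract reports for the diagnosability of the two structures.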

  • Two operand binary adders with threshold logic

    Page(s): 1324 - 1337

    The central topic of this paper is the implementation of binary adders with threshold logic using a new methodology that introduces two innovations: the use of the input and output carries of each bit for obtaining all the sum bits and a modification of the classic carry lookahead adder technique that allows us to obtain the expressions of the generation and propagation carries in a more appropriate way for threshold logic. In this way, it has been possible to systematize the process of design of a binary adder with threshold logic relating all its important parameters: number of bits of the operands, depth, size, maximum fan-in, and maximum weight. The results obtained are an improvement on those published to date and are summarized as follows: Depth 2 adder: $s=2n$, $w_{max}=2^n$, $f_{max}=2n+1$. Depth 3 adder: $s=4n-2[n/[\sqrt{n}]]$, $w_{max}=2^{[n/[\sqrt{n}]]}$, $f_{max}=2[n/[\sqrt{n}]]+1$. Depth $d$ adder (asymptotic behavior): $s=O(n)$, $w_{max}=O(2^{\sqrt[d-1]{n}})$, $f_{max}=O(\sqrt[d-1]{n})$. If the weights are bounded by $w_{max}$: $n_{max}=O(\log^{d-1} w_{max})$, $d_{min}=O(\log n/\log(\log w_{max}))$.
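
    For readers unfamiliar with threshold logic, a minimal sketch of a single threshold gate computing the carry-out of an n-bit addition may help. This is the textbook weighted-sum formulation, not the construction proposed in the paper, and all function names are illustrative.

```python
def threshold_gate(inputs, weights, threshold):
    """Output 1 when the weighted sum of the inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def to_bits(value, n):
    """Little-endian list of the n low-order bits of value."""
    return [(value >> i) & 1 for i in range(n)]


def carry_out(a_bits, b_bits, c_in):
    """Carry-out of a_bits + b_bits + c_in (bits LSB first) using one gate.

    Bit i of each operand gets weight 2^i and the carry-in gets weight 1,
    so the weighted sum equals a + b + c_in; the carry is 1 exactly when
    that sum reaches 2^n.
    """
    n = len(a_bits)
    weights = [1 << i for i in range(n)] * 2 + [1]
    return threshold_gate(a_bits + b_bits + [c_in], weights, 1 << n)


if __name__ == "__main__":
    n = 4
    print(carry_out(to_bits(9, n), to_bits(7, n), 0))   # 9 + 7 = 16, carry is 1
    print(carry_out(to_bits(9, n), to_bits(6, n), 0))   # 9 + 6 = 15, carry is 0
```

    The gate in this sketch has 2n+1 inputs and weights that grow exponentially in n, which gives a feel for why fan-in and maximum weight are the parameters the abstract trades off against depth.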

  • Distributed path reservation algorithms for multiplexed all-optical interconnection networks

    Page(s): 1355 - 1363

    In this paper, we study distributed path reservation protocols for multiplexed all-optical interconnection networks. The path reservation protocols negotiate the reservation and establishment of connections that arrive dynamically to the network. These protocols can be applied to both wavelength division multiplexing (WDM) and time division multiplexing (TDM) networks. Two classes of protocols are discussed: forward reservation protocols and backward reservation protocols. Simulations of multiplexed two-dimensional torus interconnection networks are used to evaluate and compare the performance of the protocols and to study the impact of system parameters, such as the multiplexing degree and the network size, speed, and load, on both network throughput and communication delay. The simulation results show that, in most cases, the backward reservation schemes provide better performance than their forward reservation counterparts.

  • Run-time cache bypassing

    Page(s): 1338 - 1354

    The growing disparity between processor and memory performance has made cache misses increasingly expensive. Additionally, data and instruction caches are not always used efficiently, resulting in large numbers of cache misses. Therefore, the importance of cache performance improvements at each level of the memory hierarchy will continue to grow. In numeric programs, there are several known compiler techniques for optimizing data cache performance. However, integer (nonnumeric) programs often have irregular access patterns that are more difficult for the compiler to optimize. In the past, cache management techniques such as cache bypassing were implemented manually at the machine-language-programming level. As the available chip area grows, it makes sense to spend more resources to allow intelligent control over the cache management. In this paper, we present an approach to improving cache effectiveness, taking advantage of the growing chip area, utilizing run-time adaptive cache management techniques, optimizing both performance and cost of implementation. Specifically, we are aiming to increase data cache effectiveness for integer programs. We propose a microarchitecture scheme where the hardware determines data placement within the cache hierarchy based on dynamic referencing behavior. This scheme is fully compatible with existing instruction set architectures. This paper examines the theoretical upper bounds on the cache hit ratio that cache bypassing can provide for integer applications, including several Windows applications with OS activity. Then, detailed trace-driven simulations of the integer applications are used to show that the implementation described in this paper can achieve performance close to that of the upper bound.

  • Synthesis for testability of highly complex controllers by functional redundancy removal

    Page(s): 1305 - 1323

    This paper presents a testable synthesis methodology applicable to any top-down design method based on hardware-description-language descriptions or graphical representations. The methodology targets control-dominated applications and is based on the identification and removal of a new class of redundant faults, called functionally redundant faults. The formal relation between functionally redundant faults and sequentially redundant faults is introduced. Moreover, the relation between functionally redundant faults and logic synthesis algorithms based on local don't cares is shown. Functionally redundant faults are identified and removed by comparing the implemented synchronous sequential circuit, which can be technology dependent, to its specification. The specification can be a single finite state machine (FSM), a set of interacting FSMs, or a hierarchical FSM that allows the description of highly complex controllers. The proposed methodology produces testable circuits, with area reduction, still mapped on the same technology library, and it manages circuits which cannot be handled by other methods presented in the literature.

  • A performance model for Duato's fully adaptive routing algorithm in k-ary n-cubes

    Page(s): 1297 - 1304

    Analytical models of deterministic routing in wormhole-routed k-ary n-cubes have been widely reported in the literature. Although many fully adaptive routing algorithms have been proposed to overcome the performance limitations of deterministic routing, there have been hardly any studies that describe analytical models for these algorithms. This paper proposes a new analytical model for obtaining latency measures in high-radix k-ary n-cubes with fully adaptive routing, based on Duato's algorithm (1998). The validity of the model is demonstrated by comparing analytical results with those obtained through simulation experiments.

  • Statistical prediction of task execution times through analytic benchmarking for scheduling in a heterogeneous environment

    Page(s): 1374 - 1379

    In this paper, a method for estimating task execution times is presented in order to facilitate dynamic scheduling in a heterogeneous metacomputing environment. Execution time is treated as a random variable and is statistically estimated from past observations. This method predicts the execution time as a function of several parameters of the input data and does not require any direct information about the algorithms used by the tasks or the architecture of the machines. Techniques based upon the concept of analytic benchmarking/code profiling are used to characterize the performance differences between machines, allowing observations from dissimilar machines to be used when making a prediction. Experimental results are presented which use actual execution time data gathered from 16 heterogeneous machines.
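
    One simple way to estimate an execution time from past observations, without any knowledge of the algorithm, is to average the times of the most similar earlier runs. The abstract does not name the estimator the paper uses, so the nearest-neighbour average below is a stand-in chosen for illustration, and `predict_time` and its parameters are hypothetical. The cross-machine correction via analytic benchmarking is not modeled.

```python
import math


def predict_time(history, params, k=3):
    """Estimate run time as the mean of the k past runs closest in parameter space.

    history: list of (parameter_tuple, observed_seconds) pairs.
    params:  parameter tuple of the task to be scheduled.
    """
    if not history:
        raise ValueError("no past observations to estimate from")
    # Rank past runs by Euclidean distance between their parameters and params.
    nearest = sorted(history, key=lambda obs: math.dist(obs[0], params))[:k]
    return sum(seconds for _, seconds in nearest) / len(nearest)


if __name__ == "__main__":
    # Made-up observations: (input size,) -> seconds.
    past_runs = [((100,), 1.1), ((200,), 2.0), ((400,), 4.2), ((800,), 8.1)]
    print(predict_time(past_runs, (250,), k=2))
```

    The estimate depends only on recorded parameter and time pairs, which mirrors the abstract's point that no direct information about the task's algorithm or the machine architecture is needed.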

Aims & Scope

The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field.

Meet Our Editors

Editor-in-Chief
Albert Y. Zomaya
School of Information Technologies
Building J12
The University of Sydney
Sydney, NSW 2006, Australia
http://www.cs.usyd.edu.au/~zomaya
albert.zomaya@sydney.edu.au