Computers, IEEE Transactions on

Issue 3 • Date Mar 1991

Displaying Results 1 - 16 of 16
  • A performance study of the ISO transport protocol

    Page(s): 253 - 262

    The performance of different implementations of the ISO transport protocol class 4 on top of a connectionless link-layer protocol in the presence of transmission errors is studied by means of simulation. A main goal of the study is to find robust protocol versions yielding good performance for all possible combinations of implementations. The performance measure investigated is the throughput that can be achieved in transferring files between two stations over a local area or long-distance network. The results are compared to those obtained for the ISO transport protocol class 2 on top of a connection-oriented link-layer protocol. It is assumed that data and acknowledgement frames could be lost or corrupted by transmission errors. The results show that, at the sender side, "retransmit all" always yields higher performance than "retransmit first", and, at the receiver side, "store" results in higher performance than "discard".

  • Designing storage efficient decision trees

    Page(s): 315 - 320

    The problem of designing storage-efficient decision trees from decision tables is examined. It is shown that for most cases, the construction of the storage-optimal decision tree is an NP-complete problem, and therefore a heuristic approach to the problem is necessary. A systematic procedure analogous to the information-theoretic heuristic is developed. The algorithm has low computational complexity and performs well experimentally.

  • The expected (not worst-case) throughput of the Ethernet protocol

    Page(s): 245 - 252

    The Ethernet protocol is analyzed using a space-time model in which the occurrence of a collision is determined not only by frame arrival times but also by locations of nodes. The space-time model is described, and the expected throughput of the Ethernet protocol is derived assuming that frames arrive by a Poisson process. Numerical results show that the expected throughput can be significantly higher than earlier time-domain results when the spatial dimension a > 0.01. This result is attributed to, among other factors, a significant reduction of the vulnerable area.

  • Arithmetic spectrum applied to fault detection for combinational networks

    Page(s): 320 - 324

    A method for the derivation of fault signatures for the detection of faults in single-output combinational networks is described. The approach uses the arithmetic spectrum instead of the Rademacher-Walsh spectrum. It is a form of data compression that serves to reduce the volume of the response data at test time. The price which is paid for the reduction in the storage requirements is that some of the knowledge of exact fault location is lost. The derived signatures are short and easily tested using very simple test equipment. The test circuitry could be included on the chip since the overhead involved is comparatively small. The test procedure requires a high-speed counter cycling at maximum speed through selected subsets of all input combinations. Hence, the network under test is exercised at speed, and a number of dynamic errors that are not testable by means of conventional test-set approaches will be detected.

  • Performance analysis of single stage interconnection networks

    Page(s): 357 - 365

    A single-stage interconnection network (SSIN) consisting of only one stage of switches and recirculation through processors is studied. An analytical probability model for SSINs using 2×2 switches is introduced, and results obtained with the model are compared with simulation results. Four SSINs with different network sizes, loading, and routing strategies are discussed. Processors with and without buffers are considered, and three different routing strategies are applied to resolve conflicts. The analytical model is seen to be in close agreement with the simulation results while providing at least an order of magnitude reduction in CPU time.

  • Generalization of min-cut partitioning to tree structures and its applications

    Page(s): 307 - 314

    A generalization of the min-cut partitioning problem, called min-cost tree partitioning, is introduced. In the generalized problem, the nodes of a hypergraph G are to be mapped onto the vertices of a tree structure T, and the cost function to be minimized is the cost of routing the hyperedges of G on the edges of T. The standard min-cut problem is the simple case in which the tree T is a single edge connecting two vertices. Several VLSI design applications for this problem are discussed. An iterative improvement heuristic for this problem in which nodes of the hypergraph are moved between the vertices of the tree is described. The running time of a single pass of the heuristic for the unweighted version of the problem is Θ(P×D×t^3), where P is the total number of pins in the hypergraph G, D is the maximum number of nodes in a hyperedge of G, and t is the number of vertices in the tree T. Several test results are discussed.

  • Enhanced hypercubes

    Page(s): 284 - 294

    A hypercube with extra connections added between pairs of nodes through otherwise unused links is investigated. The extra connections are made in a way that maximizes the improvement of the performance measure of interest under various traffic distributions. The resulting hypercube, called the enhanced hypercube, requires a simple routing algorithm and is guaranteed not to create any traffic-congested points or links. The enhanced hypercube achieves noticeable improvement in diameter, mean internode distance, and traffic density, and it also is more cost effective than a regular hypercube. An efficient broadcast algorithm that can considerably speed up the broadcast process in enhanced hypercubes is provided.

  • Improved algorithms for mapping pipelined and parallel computations

    Page(s): 295 - 306

    Recent work on the problem of mapping pipelined or parallel computations onto linear array, shared memory, and host-satellite systems is extended. It is shown how these problems can be solved even more efficiently when computation module execution times are bounded from below, intermodule communication times are bounded from above, and the processors satisfy certain homogeneity constraints. The improved algorithms have significantly lower time and space complexities than the more general algorithms: in one case, an O(nm^3) time algorithm for mapping m modules onto n processors is replaced with an O(nm log m) time algorithm, and the space requirements are reduced from O(nm^2) to O(m). Run-time complexity is reduced further with parallel mapping algorithms based on these improvements, which run on the architectures for which they create mappings.

  • Analysis of packet-switched multiple-bus multiprocessor systems

    Page(s): 352 - 357

    Performance analyses of packet-switched multiple-bus multiprocessor systems are presented. Approximate queuing network models are developed for both synchronous and asynchronous control schemes, and the results are shown to be in good agreement with simulation results. The analysis of the synchronous system is based on a decomposition technique, with each of the shared resources in the system being represented as a single-server queue. For asynchronous systems, the analysis is based on the flow equivalence technique. Numerical results obtained from the analyses indicate that packet-switched multiple-bus multiprocessors with only a few buses perform almost as well as crossbar-based multiprocessors.

  • Conflict-free vector access using a dynamic storage scheme

    Page(s): 276 - 283

    An approach whereby conflict-free access of any constant stride can be made by selecting a storage scheme for each vector based on the accessing patterns used with that vector is considered. By factoring the stride into two components, one a power of 2 and the other relatively prime to 2, a storage scheme that allows conflict-free access to the vector using the specified stride can be synthesized. All such schemes are based on a variation of the row rotation mechanism proposed by P. Budnik and D. Kuck (ibid., vol. C-20, no. 12, pp. 1566-9, Dec. 1971). Each storage scheme is based on two parameters, one describing the type of rotation to perform and the other describing the amount of memory to be rotated as a single block. The performance of the memory under access strides other than the stride used to specify the storage scheme is also considered. Modeling these other strides represents a vector being accessed with multiple strides as well as situations when the stride cannot be determined prior to initializing the vector. Simulation results show that if a single buffer is added to each memory port, then the average performance of the dynamic scheme surpasses that of the interleaved scheme for arbitrary stride accesses.
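
    The stride factoring mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's storage scheme: the function names are hypothetical, and the second function assumes plain low-order interleaving (word address a lives in module a mod M), which is only the baseline the abstract compares against. It shows why the power-of-2 component of a stride is the part that causes conflicts.

```python
from math import gcd


def factor_stride(stride):
    """Split a positive stride into (power_of_two, odd_part).

    The odd part is the component "relatively prime to 2".
    """
    if stride <= 0:
        raise ValueError("stride must be a positive integer")
    # stride & -stride isolates the lowest set bit, which is the
    # largest power of 2 that divides the stride.
    power_of_two = stride & -stride
    odd_part = stride // power_of_two
    return power_of_two, odd_part


def modules_touched(stride, num_modules):
    """Distinct modules visited by a constant-stride access stream.

    Assumption (not from the abstract): simple interleaving, where
    address a maps to module a % num_modules.
    """
    return num_modules // gcd(stride, num_modules)


if __name__ == "__main__":
    for s in (3, 8, 12):
        print(s, factor_stride(s), modules_touched(s, 8))
```

    Under that interleaving assumption with 8 modules, an odd stride reaches every module, while a stride of 12 (4 × 3) reaches only two of them.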

  • Optimal scheduling of signature analysis for VLSI testing

    Page(s): 336 - 341

    A simple algorithm that shows how to optimally schedule the test-application and the signature-analysis phases of VLSI testing is presented. The testing process is broken into subintervals, the signature is analyzed at the end of each subinterval, and future tests are aborted if the circuit is found to be faulty, thus saving test time. The mathematical proofs associated with the algorithm are given.

  • A decomposition procedure for the analysis of a closed fork/join queueing system

    Page(s): 365 - 370

    An iterative approximation algorithm for analyzing a closed queueing system with a K-sibling fork/join queue is presented. The iterative procedure is based on a combination of nearly complete decomposability and the Gauss-Seidel method. The approximation procedure gives good results for the mean response time and the system throughput. The iterative procedure converges to the exact solution in the case of the closed 3-sibling fork/join queue.

  • Heuristic technique for processor and link assignment in multicomputers

    Page(s): 325 - 333

    A graph-based solution to the mapping problem using the simulated annealing optimization heuristic is developed. An automated two-phase mapping strategy is formulated: process annealing assigns parallel processes to processing nodes, and connection annealing schedules traffic connections on network data links so that interprocess communication conflicts are minimized. To evaluate the quality of generated mappings, cost functions suitable for simulated annealing that accurately quantify communications overhead are derived. Communication efficiency is formulated to measure the quality of assignments when the optimal mapping is unknown. The mapping scheme is implemented using the hypercube as a host architecture, and results for several image graphs are presented.

  • Subcube allocation in hypercube computers

    Page(s): 341 - 352

    A precise characterization of the subcube allocation problem and a general methodology to solve it are presented. Subcube allocation and coalescing algorithms that have the goal of minimizing fragmentation are developed. The concept of a maximal set of subcubes (MSS), which is useful in making allocations that result in a tightly packed hypercube, is introduced. The problems of allocating subcubes and of forming an MSS are formulated as decision problems and shown to be NP-hard. It is proved analytically that the buddy strategy is optimal under restricted conditions, and it is shown using simulation that its performance is actually poor under more realistic conditions. A heuristic procedure for efficiently coalescing a released cube with the existing free cubes is suggested. This coalescing approach is coupled with a simple best-fit allocation scheme to form the basis of a class of MSS-based strategies that give a substantial performance (hit ratio) improvement over the buddy strategy. Simulation results comparing several different allocation and coalescing strategies, which show that the MSS-based schemes provide a marked performance improvement over previous techniques, are presented.
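
    To make the buddy strategy's weakness concrete, here is a minimal sketch of the buddy strategy as it is commonly described for hypercubes: a k-subcube request is satisfied only by an address-aligned block of 2^k free nodes. The class and method names are hypothetical, and this is not the paper's code or its MSS-based scheme.

```python
class BuddyAllocator:
    """Buddy-style subcube allocator for a hypercube of 2**dimension nodes."""

    def __init__(self, dimension):
        self.dimension = dimension
        self.busy = [False] * (1 << dimension)  # one allocation bit per node

    def allocate(self, k):
        """Return the first node of an aligned free k-subcube, or None."""
        if not 0 <= k <= self.dimension:
            raise ValueError("subcube dimension out of range")
        size = 1 << k
        # Only blocks starting at a multiple of 2**k are considered.
        for start in range(0, len(self.busy), size):
            if not any(self.busy[start:start + size]):
                for node in range(start, start + size):
                    self.busy[node] = True
                return start
        return None

    def release(self, start, k):
        """Free the k-subcube that was allocated at node `start`."""
        for node in range(start, start + (1 << k)):
            self.busy[node] = False


if __name__ == "__main__":
    cube = BuddyAllocator(3)                        # 8-node hypercube
    starts = [cube.allocate(1) for _ in range(4)]   # four 1-subcubes
    cube.release(0, 1)                              # frees nodes 0, 1
    cube.release(4, 1)                              # frees nodes 4, 5
    # Nodes 0, 1, 4, 5 (binary x0x) form a free 2-subcube, but neither
    # aligned block 0..3 nor 4..7 is entirely free.
    print(starts, cube.allocate(2))
```

    In this example a free 2-subcube exists (nodes 0, 1, 4, 5), yet the request fails because the allocator inspects only aligned blocks. This kind of unrecognized free subcube is one way fragmentation can hurt a buddy-style allocator.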

  • A special function unit for database operations (SFU-DB): design and performance evaluation

    Page(s): 263 - 275

    The design and analysis of a special function unit for database operations (SFU-DB) that uses a novel hardware sorting module, the automatic retrieval memory (ARM), are described. The SFU-DB is a functionally independent unit that efficiently performs certain nonnumeric operations. It can function as a coprocessor for a host CPU or as a special processing unit in a highly parallel processing system. The ARM implements in hardware a true distribution-based sort algorithm that requires no comparison operations. Without performing any comparison, the SFU-DB avoids the lower bound constraint on comparison-based sorting algorithms and achieves, for the worst case, a complexity of O(n) for both execution time and main memory size. Using the fundamental sort algorithm with slight modifications, the SFU-DB also uses the ARM as an engine for other primitive database operations such as relational join, elimination of duplicates, set union, set intersection, and set difference, also with complexity of O(n). The SFU-DB/ARM architecture is rather simple and requires only a modest amount of specialized hardware. The specialized hardware has been designed and simulated for fabrication using CMOS gate arrays, and the remainder of the SFU-DB has been simulated in software using Turbo Pascal running on an IBM-PC.
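
    As a rough software analogue of a distribution-based sort that never compares two keys, the sketch below uses counting sort over a bounded integer key range. This is an illustration only, not the ARM hardware algorithm: the function names and the key_range parameter are assumptions, and the running time is O(n + key_range), which is linear in n only when the key range is bounded.

```python
def distribution_sort(keys, key_range):
    """Sort integer keys without comparing keys to each other.

    Assumes every key k satisfies 0 <= k < key_range (not checked;
    negative keys would silently index from the end of the table).
    """
    counts = [0] * key_range
    for k in keys:              # distribute: the key is used as an address
        counts[k] += 1
    result = []
    for value, count in enumerate(counts):  # collect in address order
        result.extend([value] * count)
    return result


def eliminate_duplicates(keys, key_range):
    """Return the distinct keys in ascending order."""
    present = [False] * key_range
    for k in keys:
        present[k] = True
    return [value for value, flag in enumerate(present) if flag]


def set_intersection(left, right, key_range):
    """Return the keys that occur in both inputs, in ascending order."""
    in_left = [False] * key_range
    for k in left:
        in_left[k] = True
    in_both = [False] * key_range
    for k in right:
        if in_left[k]:
            in_both[k] = True
    return [value for value, flag in enumerate(in_both) if flag]


if __name__ == "__main__":
    print(distribution_sort([3, 1, 2, 3, 0], 4))
    print(eliminate_duplicates([3, 1, 3, 1], 4))
    print(set_intersection([1, 2, 3], [2, 3, 0], 4))
```

    The same address table serves sorting, duplicate elimination, and intersection with small changes, which loosely mirrors the abstract's point that one sort engine can drive several primitive database operations.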

  • A stabilized parallel algorithm for direct-form recursive filters

    Page(s): 333 - 336

    A stabilized parallel algorithm for direct-form recursive filters is obtained, using a method of derivation in the Z domain. The degree of parallelism, stability, and complexity of the algorithm is examined. It is shown how to reduce the number of multiplications compared to the number required in a naive implementation. The algorithm is regular and modular, so very efficient VLSI architectures can be constructed to implement it. The degree of parallelism in these implementations can be chosen freely and is not restricted to be a power of two.


Aims & Scope

The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field.

Meet Our Editors

Editor-in-Chief
Albert Y. Zomaya
School of Information Technologies
Building J12
The University of Sydney
Sydney, NSW 2006, Australia
http://www.cs.usyd.edu.au/~zomaya
albert.zomaya@sydney.edu.au