
Parallel and Distributed Systems, IEEE Transactions on

Issue 2 • Date Apr 1990


Displaying Results 1 - 13 of 13
  • IPS-2: the second generation of a parallel program measurement system

    Publication Year: 1990 , Page(s): 206 - 217
    Cited by:  Papers (46)  |  Patents (1)

    IPS, a performance measurement system for parallel and distributed programs, is currently running on its second implementation. IPS's model of parallel programs uses knowledge about the semantics of a program's structure to provide two important features. First, IPS provides a large amount of performance data about the execution of a parallel program, and this information is organized so that access to it is easy and intuitive. Second, IPS provides performance analysis techniques that help to guide the programmer automatically to the location of program bottlenecks. The first implementation of IPS was a testbed for the basic design concepts, providing experience with a hierarchical program and measurement model, interactive program analysis, and automatic guidance techniques. It was built on the Charlotte distributed operating system. The second implementation, IPS-2, extends the basic system with new instrumentation techniques, an interactive and graphical user interface, and new automatic guidance analysis techniques. This implementation runs on 4.3BSD UNIX systems, on the VAX, DECstation, Sun 4, and Sequent Symmetry multiprocessor.

  • Depth-first search approach for fault-tolerant routing in hypercube multicomputers

    Publication Year: 1990 , Page(s): 152 - 159
    Cited by:  Papers (63)  |  Patents (2)

    Using depth-first search, the authors develop and analyze the performance of a routing scheme for hypercube multicomputers in the presence of an arbitrary number of faulty components. They derive an exact expression for the probability of routing messages by way of optimal paths (of length equal to the Hamming distance between the corresponding pair of nodes) from the source node to an obstructed node. The obstructed node is defined as the first node encountered by the message that finds no optimal path to the destination node. It is noted that the probability of routing messages over an optimal path between any two nodes is a special case of the present results and can be obtained by replacing the obstructed node with the destination node. Numerical examples are given to illustrate the results, and they show that, in the presence of component failures, depth-first search routing can route a message to its destination by means of an optimal path with a very high probability.
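
    The routing idea in this abstract can be illustrated with a small sequential sketch. It assumes that nodes are labeled with dim-bit integers, that only node faults are modeled, and that the router can see the whole fault set. The names hamming and dfs_route are hypothetical, and the neighbor ordering (prefer neighbors closer to the destination) is an illustrative choice, not necessarily the authors' exact scheme.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two node labels differ."""
    return bin(a ^ b).count("1")


def dfs_route(src: int, dst: int, dim: int, faulty: set):
    """Depth-first search for a fault-free path in a dim-dimensional hypercube.

    Returns the path as a list of node labels, or None if dst is unreachable.
    Assumption: only nodes (not links) can be faulty.
    """
    if src in faulty or dst in faulty:
        return None
    visited = {src}
    path = [src]
    while path:
        node = path[-1]
        if node == dst:
            return path
        # Neighbors differ from node in exactly one bit; try the ones
        # closest to the destination (smallest Hamming distance) first.
        neighbors = sorted(
            (node ^ (1 << i) for i in range(dim)),
            key=lambda nb: hamming(nb, dst),
        )
        for nb in neighbors:
            if nb not in visited and nb not in faulty:
                visited.add(nb)
                path.append(nb)
                break
        else:
            path.pop()  # dead end: backtrack one hop
    return None
```

    With no faults, the path found has exactly hamming(src, dst) hops, which is the optimal length mentioned in the abstract. When faults block every optimal path, the search backtracks and returns a longer detour if one exists.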

  • Prefetching in file systems for MIMD multiprocessors

    Publication Year: 1990 , Page(s): 218 - 230
    Cited by:  Papers (19)  |  Patents (4)

    The question of whether prefetching blocks of the file into the block cache can effectively reduce overall execution time of a parallel computation, even under favorable assumptions, is considered. Experiments have been conducted with an interleaved file system testbed on the Butterfly Plus multiprocessor. Results of these experiments suggest that (1) the hit ratio, the accepted measure in traditional caching studies, may not be an adequate measure of performance when the workload consists of parallel computations and parallel file access patterns, (2) caching with prefetching can significantly improve the hit ratio and the average time to perform an I/O (input/output) operation, and (3) an improvement in overall execution time has been observed in most cases. In spite of these gains, prefetching sometimes results in increased execution times (a negative result, given the optimistic nature of the study). The authors explore why it is not trivial to translate savings on individual I/O requests into consistently better overall performance and identify the key problems that need to be addressed in order to improve the potential of prefetching techniques in this environment.

  • Iterative instructions in the Manchester Dataflow Computer

    Publication Year: 1990 , Page(s): 129 - 139
    Cited by:  Papers (5)

    The authors investigate the nature and extent of the benefits and adverse effects of iterative instructions in the prototype Manchester Dataflow Computer. Iterative instructions are shown to be highly beneficial in terms of the number of instructions executed and the number of tokens transferred between modules during a program run. This benefit is apparent at the hardware level, giving significantly reduced program execution times. However, the full benefits are not realized due to interference between lengthy iterative instructions. It is suggested that restructuring of buffers and the function unit array in the prototype hardware configuration can reduce this interference. Other possibilities for improvement are suggested. For example, the slowdown effect observed in hardware speedup curves could be tackled by treating iterative instructions differently from fine-grain instructions. An alternative structure for the processing element in which certain function units are specialized for executing iterative instructions is being investigated in this connection.

  • Error recovery in shared memory multiprocessors using private caches

    Publication Year: 1990 , Page(s): 231 - 240
    Cited by:  Papers (30)  |  Patents (12)

    The problem of recovering from processor transient faults in shared memory multiprocessor systems is examined. A user-transparent checkpointing and recovery scheme using private caches is presented. Processes can recover from errors due to faulty processors by restarting from the checkpointed computation state. Implementation techniques using checkpoint identifiers and recovery stacks are examined as a means of reducing performance degradation in processor utilization during normal execution. This cache-based checkpointing technique prevents rollback propagation, provides rapid recovery, and can be integrated into standard cache coherence protocols. An analytical model is used to estimate the relative performance of the scheme during normal execution. Extensions to take error latency into account are presented.

  • The banyan-hypercube networks

    Publication Year: 1990 , Page(s): 160 - 169
    Cited by:  Papers (26)

    The authors introduce a family of networks that are a synthesis of banyans and hypercubes and are called the banyan-hypercubes (BH). They combine the advantageous features of banyans and hypercubes and thus have better communication capabilities. The networks can be viewed as consisting of interconnecting hypercubes. It is shown that many hypercube features can be incorporated into BHs with regard to routing, embedding of rings and meshes, and partitioning, and that improvements over the hypercube are achieved. In particular, it is shown that BHs have better diameters and average distances than hypercubes, and they embed pyramids and multiple pyramids with dilation cost 1. An optimal routing algorithm for BHs and an efficient partitioning strategy are presented.

  • Deciding properties of timed transition models

    Publication Year: 1990 , Page(s): 170 - 183
    Cited by:  Papers (30)

    Real-time distributed systems are modeled by a timed transition model (TTM). For any finite-state TTM, decision procedures are provided for checking a small but important class of properties (specified in real-time temporal logic). The procedures are linear in the size of the system reachability graph. The class of properties includes invariance, precedence, eventuality, and real-time response specifications.

  • Design, analysis, and simulation of I/O architectures for hypercube multiprocessors

    Publication Year: 1990 , Page(s): 140 - 151
    Cited by:  Papers (9)

    Several issues concerning the design of an I/O (input/output) system for a multiprocessor such as a hypercube are examined. A methodology is proposed for connecting the I/O processors to such a system for efficient I/O access. The effect of I/O communication on the multiprocessor network is analyzed. Different disk organizations that can be employed within such a system are evaluated to see which organization performs best. It is observed that parallelism in serving an I/O request plays a dominant role in the scientific workload. The problem of mapping specific data structures such as matrices onto the disks so that the data can be accessed efficiently is considered.

  • Modelling speedup (n) greater than n

    Publication Year: 1990 , Page(s): 250 - 256
    Cited by:  Papers (13)

    A simple model of parallel computation which is capable of explaining speedups greater than n on n processors is presented. Necessary and sufficient conditions for these exceptional speedups are derived from the model. Several of the contradictory previous results relating to parallel speedup are resolved by using the model.

  • Parallel binary search

    Publication Year: 1990 , Page(s): 247 - 250
    Cited by:  Papers (5)  |  Patents (1)

    Two arrays of numbers sorted in nondecreasing order are given: an array A of size n and an array B of size m, where n < m. It is required to determine, for every element of A, the smallest element of B (if one exists) that is larger than or equal to it. It is shown how to solve this problem on the EREW PRAM (exclusive-read exclusive-write parallel random-access machine) in O(log m log n / log log m) time using n processors. The solution is then extended to the case in which fewer than n processors are available. This yields an EREW PRAM algorithm for the problem whose cost is O(n log m), which is O(m) for n ≤ m/log m. It is shown how the solution obtained leads to an improved parallel merging algorithm.
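
    To make the problem statement concrete, here is a sequential baseline in Python. It is not the EREW PRAM algorithm from the paper; it simply runs one ordinary binary search per element of A, using the standard-library bisect module. The function name successors is hypothetical.

```python
from bisect import bisect_left


def successors(a, b):
    """For each x in sorted list a, return the smallest element of sorted
    list b that is >= x, or None if no such element exists."""
    result = []
    for x in a:
        i = bisect_left(b, x)  # index of the first element of b that is >= x
        result.append(b[i] if i < len(b) else None)
    return result


print(successors([1, 4, 9], [2, 4, 6, 8]))  # [2, 4, None]
```

    This baseline performs n binary searches of O(log m) steps each, so its total work is O(n log m), the same cost figure quoted in the abstract for the parallel algorithm.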

  • Designing efficient parallel algorithms on mesh-connected computers with multiple broadcasting

    Publication Year: 1990 , Page(s): 241 - 246
    Cited by:  Papers (32)

    Semigroup and prefix computations on two-dimensional mesh-connected computers with multiple broadcasting (2-MCCMBs) are studied. Previously, only square 2-MCCMBs with N processing elements were considered for semigroup computations of N data items, and O(N^{1/6}) time was required. It is found that square machines are not the best form for semigroup computations, and an O(N^{1/8})-time algorithm is derived on an N^{5/8} × N^{3/8} rectangular 2-MCCMB. This time complexity can be further reduced to O(N^{1/9}) if fewer processing elements are used. Parallel algorithms for prefix computations with the same time complexities are derived.

  • Experimental application-driven architecture analysis of an SIMD/MIMD parallel processing system

    Publication Year: 1990 , Page(s): 195 - 205
    Cited by:  Papers (20)  |  Patents (1)

    An experimental analysis of the architecture of an SIMD/MIMD parallel processing system is presented. Detailed implementations of parallel fast Fourier transform (FFT) programs were used to examine the performance of the prototype of the PASM (Partitionable SIMD/MIMD) parallel processing system. Detailed execution-time measurements using specialized timing hardware were made for the complete FFT and for components of SIMD, MIMD, and barrier-synchronized MIMD implementations. The component measurements isolated the effects of floating-point arithmetic operations, interconnection network transfer operations, and program control overhead. The measurements allow an accurate extrapolation of the execution time, speedup, and efficiency of the MIMD, SIMD, and barrier-synchronized MIMD programs to a full 1024-processor PASM system. This constitutes one of the first results of this kind, in which controlled experiments on fixed hardware were used to make comparisons of these fundamental modes of computing. Overall, the experimental results demonstrate the value of mixed-mode SIMD/MIMD computing and its suitability for computationally intensive algorithms such as the FFT.

  • Efficient scheduling algorithms for real-time multiprocessor systems

    Publication Year: 1990 , Page(s): 184 - 194
    Cited by:  Papers (115)  |  Patents (26)

    Efficient scheduling algorithms based on heuristic functions are developed for scheduling a set of tasks on a multiprocessor system. The tasks are characterized by worst-case computation times, deadlines, and resource requirements. Starting with an empty partial schedule, each step of the search extends the current partial schedule by including one of the tasks yet to be scheduled. The heuristic functions used in the algorithm actively direct the search for a feasible schedule, i.e., they help choose the task that extends the current partial schedule. Two scheduling algorithms are evaluated by simulation. To extend the current partial schedule, one of the algorithms considers, at each step of the search, all the tasks that are yet to be scheduled as candidates. The second focuses its attention on a small subset of tasks with the shortest deadlines. The second algorithm is shown to be very effective when the maximum allowable scheduling overhead is fixed. This algorithm is hence appropriate for dynamic scheduling in real-time systems.
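
    The step-by-step construction of a partial schedule can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: resource requirements are ignored, all tasks are ready at time 0, there is no backtracking, and the heuristic (smallest slack between deadline and projected finish time) is an illustrative choice. The names Task, build_schedule, and window are hypothetical.

```python
from typing import NamedTuple


class Task(NamedTuple):
    name: str
    compute: int   # worst-case computation time
    deadline: int  # absolute deadline


def build_schedule(tasks, processors, window=None):
    """Greedily extend a partial schedule one task at a time.

    window=None considers every unscheduled task at each step; window=k
    considers only the k unscheduled tasks with the shortest deadlines.
    Returns a list of (name, processor, start, finish), or None if the
    chosen task would miss its deadline (this sketch does not backtrack).
    """
    remaining = sorted(tasks, key=lambda t: t.deadline)
    free_at = [0] * processors  # time at which each processor becomes idle
    schedule = []
    while remaining:
        # Use the processor that becomes idle first.
        p = min(range(processors), key=free_at.__getitem__)
        start = free_at[p]
        candidates = remaining if window is None else remaining[:window]
        # Heuristic (assumption): pick the candidate with the least slack.
        task = min(candidates, key=lambda t: t.deadline - (start + t.compute))
        finish = start + task.compute
        if finish > task.deadline:
            return None
        schedule.append((task.name, p, start, finish))
        free_at[p] = finish
        remaining.remove(task)
    return schedule
```

    Passing a small window mirrors the second variant described in the abstract, where only a few tasks with the shortest deadlines are examined at each step, which bounds the work done per scheduling decision.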


Aims & Scope

IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers.


Meet Our Editors

Editor-in-Chief
David Bader
College of Computing
Georgia Institute of Technology