Parallel and Distributed Systems, IEEE Transactions on

Issue 10 • Date Oct 1994

  • Efficient mappings of pyramid networks

    Page(s): 1009 - 1017

    We consider primarily the simulation of large networks by smaller ones, an important consideration because interconnection networks are typically of a fixed size, yet applications may employ networks of a larger size. Current research (Dingle and Sudborough, 1993) describes methods to simulate common data structures and network architectures on the pyramid. However, these simulations assume that the pyramid grows with the size of the network or data structure. Because unbounded growth is not feasible, we address the issue of mapping several points of the guest data structure or network to a single host processor. We determine how a small pyramid may efficiently simulate the computation of a larger pyramid as well as that of tree networks.

  • Unstructured tree search on SIMD parallel computers

    Page(s): 1057 - 1072

    We present new methods for load balancing of unstructured tree computations on large-scale SIMD machines, and analyze the scalability of these and other existing schemes. An efficient formulation of tree search on an SIMD machine consists of two major components: a triggering mechanism, which determines when the search space redistribution must occur to balance the search space over processors, and a scheme to redistribute the search space. We have devised a new redistribution mechanism and a new triggering mechanism. Either of these can be used in conjunction with triggering and redistribution mechanisms developed by other researchers. We analyze the scalability of these mechanisms and verify the results experimentally. The analysis and experiments show that our new load-balancing methods are highly scalable on SIMD architectures. Their scalability is shown to be no worse than that of the best load-balancing schemes on MIMD architectures. We verify our theoretical results by implementing the 15-puzzle problem on a CM-2 SIMD parallel computer.

  • Design of algorithm-based fault-tolerant multiprocessor systems for concurrent error detection and fault diagnosis

    Page(s): 1099 - 1106

    Algorithm-based fault tolerance (ABFT) is a low-overhead system-level concurrent error detection and fault location scheme for multiprocessor systems. We present new methods for the design of ABFT systems. Our design procedure is applicable to a wide range of systems in which processors share data elements. A feature of our design approach is that the type of checks to be used in the final system can be controlled by the system designer. We also present some new bounds on the number of checks needed in ABFT system design.

  • A reconfigurable modular fault-tolerant hypercube architecture

    Page(s): 1018 - 1032

    We propose a new fault-tolerant design of a hypercube system. We first build the fault-tolerant modules (FTM's), then we interconnect these FTM's as the modular hypercube. Finally, we obtain our proposed system by augmenting links, called the spare-sharing links (SSL's), in the modular hypercube, which forms a ring connection in our architecture. The characteristic of our system is that the spare nodes in an FTM can be used as local spares to replace the faulty nodes in the FTM, or as remote spares to replace the faulty nodes in other FTM's via the spare-sharing links in the architecture. Thus, the use of spare nodes in any FTM will increase, and the proposed system reliability will improve. In the system, the switch and link failures are also considered. The modular diagnosis and modular reconfiguration are proposed to identify and reconfigure the failure of nodes, switches, and links.

  • Real-time communication in multihop networks

    Page(s): 1044 - 1056

    Communication in real-time systems has to be predictable, because unpredictable delays in the delivery of messages can adversely affect the execution of tasks dependent on these messages. We develop a scheme for providing predictable interprocess communication in real-time systems with (partially connected) point-to-point interconnection networks, which provide guarantees on the maximum delivery time for messages. This scheme is based on the concept of a real-time channel, a unidirectional connection between source and destination. A real-time channel has parameters that describe the performance requirements of the source-destination communication, e.g., from a sensor station to a control site. Once such a channel is established, the communications subsystem guarantees that these performance requirements will be met. We concentrate on methods to compute guarantees for the delivery time of messages belonging to real-time channels. We also address problems associated with allocating buffers for these messages and develop a scheme that preserves delivery time guarantees.

  • Parallel architecture for fast transforms with trigonometric kernel

    Page(s): 1091 - 1099

    We present a unified parallel architecture for four of the most important fast orthogonal transforms with trigonometric kernel: Complex Valued Fourier (CFFT), Real Valued Fourier (RFFT), Hartley (FHT), and Cosine (FCT). Out of these, only the CFFT has a data flow coinciding with the one generated by the successive doubling method, which can be transformed on a constant geometry flow using perfect unshuffle or shuffle permutations. The other three require some type of hardware modification to guarantee the constant geometry of the successive doubling method. We have defined a generalized processing section (PS), based on a circular CORDIC rotator, for the four transforms. This PS permits the evaluation of the CFFT and FCT transforms in n data recirculations and the RFFT and FHT transforms in n-1 data recirculations, with n being the number of stages of a transform of length N = r^n. Also, the efficiency of the partitioned parallel architecture is optimum because there is no cycle loss in the systolic computation of all the butterflies for each of the four transforms.

  • The classification, fusion, and parallelization of array language primitives

    Page(s): 1113 - 1120

    We present a classification scheme for array language primitives that quantifies the variation in parallelism and data locality that results from the fusion of any two primitives. We also present an algorithm based on this scheme that efficiently determines when it is beneficial to fuse any two primitives. Experimental results show that five LINPACK routines report 50% performance improvement from the fusion of array operators.

  • A scalable parallel formulation of the backpropagation algorithm for hypercubes and related architectures

    Page(s): 1073 - 1090

    We present a new technique for mapping the backpropagation algorithm on hypercubes and related architectures. A key component of this technique is a network partitioning scheme called checkerboarding. Checkerboarding allows us to replace the all-to-all broadcast operation performed by the commonly used vertical network partitioning scheme with operations that are much faster on hypercubes and related architectures. Checkerboarding can be combined with the pattern partitioning technique to form a hybrid scheme that performs better than either one of these schemes. Theoretical analysis and experimental results on nCUBE and CM5 show that our scheme performs better than the other schemes, for both uniform and nonuniform networks.

  • Using Petri nets for the design of conversation boundaries in fault-tolerant software

    Page(s): 1106 - 1112

    Only a few mechanisms have been proposed for the design of fault-tolerant software. One of these is the conversation, which, though it has some drawbacks, is a potentially promising structure. One of the problems with conversations is that they must be defined and verified by the user. In this short note, a systematic method for generating the boundaries of conversations directly from the specification is proposed. This method can also be used to verify conversations selected by the user. The specification is described by a high-level modified Petri net which can easily be transformed into a state model called an action-ordered tree. The conversation boundaries are then determined from this tree. It is proved that the method proposed is complete in the sense that all of the possible boundaries can be determined, and it has the merit of simplicity. A robot arm control system is used to illustrate the idea. The proposed method can serve as the basis of a tool to assist in conversation designs.

  • The performance of cache-based error recovery in multiprocessors

    Page(s): 1033 - 1043

    Several variations of cache-based checkpointing for rollback error recovery from transient errors in shared-memory multiprocessors have been recently developed. By modifying the cache replacement policy, these techniques use the inherent redundancy in the memory hierarchy to periodically checkpoint the computation state. Three schemes, different in the manner in which they avoid rollback propagation, are evaluated in this paper. By simulation with address traces from parallel applications running on an Encore Multimax shared-memory multiprocessor, we evaluate the performance effect of integrating the recovery schemes in the cache coherence protocol. Our results indicate that the cache-based schemes can provide checkpointing capability with low performance overhead, but with uncontrollable high variability in the checkpoint interval.


Aims & Scope

IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers.

Meet Our Editors

Editor-in-Chief
David Bader
College of Computing
Georgia Institute of Technology