Parallel and Distributed Systems, IEEE Transactions on

Issue 1 • Date Jan 1991

  • An efficient modular spare allocation scheme and its application to fault tolerant binary hypercubes

    Publication Year: 1991, Page(s): 117-126
    Cited by: Papers (19)

    Consideration is given to fault tolerant systems that are built from modules called fault tolerant basic blocks (FTBBs), where each module contains some primary nodes and some spare nodes. Full spare utilization is achieved when each spare within an FTBB can replace any other primary or spare node in that FTBB. This, however, may be prohibitively expensive for larger FTBBs. Therefore, it is shown that for a given hardware overhead more reliable systems can be designed using bigger FTBBs without full spare utilization than using smaller FTBBs with full spare utilization. Sufficient conditions for maximizing the reliability of a spare allocation strategy in an FTBB for a given hardware overhead are presented. The proposed spare allocation strategy is applied to two fault tolerant reconfiguration schemes for binary hypercubes. One scheme uses hardware switches to replace a faulty node, and the other scheme uses fault tolerant routing to bypass faulty nodes in the system and deliver messages to the destination node.
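
As a rough illustration of the full-spare-utilization case described above, the sketch below computes the reliability of a single FTBB. It assumes independent node failures with a common node reliability `r`, perfect switching, and that a block survives as long as no more than `spares` of its nodes fail. These modeling assumptions and the function name `ftbb_reliability` are illustrative and are not taken from the paper.

```python
from math import comb


def ftbb_reliability(primaries: int, spares: int, r: float) -> float:
    """Reliability of one block under full spare utilization (assumed model).

    The block works if at most `spares` of its `primaries + spares` nodes
    have failed, with each node failing independently with probability 1 - r.
    """
    n = primaries + spares
    return sum(
        comb(n, failed) * (1 - r) ** failed * r ** (n - failed)
        for failed in range(spares + 1)
    )


def system_reliability(blocks: int, primaries: int, spares: int, r: float) -> float:
    """A system of identical, independent blocks works only if every block works."""
    return ftbb_reliability(primaries, spares, r) ** blocks
```

Under this model, comparing `system_reliability` for a few large blocks against many small blocks at the same spare-to-primary ratio gives a feel for the trade-off the abstract discusses. The partial-utilization case needs a more detailed model and is not covered here.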

  • Specifying graceful degradation

    Publication Year: 1991, Page(s): 93-104
    Cited by: Papers (10)

    A description is given of the relaxation lattice method, a new approach to specifying graceful degradation for a large class of programs. A relaxation lattice is a lattice of specifications parameterized by a set of constraints, where the stronger the set of constraints, the more restrictive the specification. While a program is able to satisfy its strongest set of constraints, it satisfies its preferred specification, but if changes to the environment force it to satisfy a weaker set, then it will permit additional weakly consistent computations which are undesired but tolerated. The use of relaxation lattices is illustrated by specifications for programs that tolerate (1) faults, such as site crashes and network partitions, (2) timing anomalies, such as attempting to read a value too soon after it was written, (3) synchronization conflicts, such as choosing the oldest unlocked item from a queue, and (4) security breaches, such as acquiring unauthorized capabilities.

  • Orthogonal graphs for the construction of a class of interconnection networks

    Publication Year: 1991, Page(s): 3-19
    Cited by: Papers (12)

    A graph theoretical representation for a class of interconnection networks is suggested. The idea is based on a definition of orthogonal binary vectors and leads to a construction rule for a class of orthogonal graphs. An orthogonal graph is first defined as a set of $2^m$ nodes, which in turn are linked by $2^{m-n}$ edges for every link mode defined in an integer set Q*. The degree and diameter of an orthogonal graph are determined in terms of the parameters n, m, and the number of link modes defined in Q*. Routing in orthogonal graphs is shown to reduce to the node covering problem in bipartite graphs. The proposed theory is applied to describe a number of well-known interconnection networks such as the binary m-cube and spanning-bus meshes. Multidimensional access (MDA) memories are also shown as examples of orthogonal shared memory multiprocessing systems. Finally, orthogonal graphs are applied to the construction of multistage interconnection networks. Connectivity and placement rules are given and shown to yield a number of well-known networks.

  • A top-down processor allocation scheme for hypercube computers

    Publication Year: 1991, Page(s): 20-30
    Cited by: Papers (56)

    An efficient processor allocation policy is presented for hypercube computers. The allocation policy is called free list since it maintains a list of free subcubes available in the system. An incoming request of dimension k ($2^k$ nodes) is allocated by finding a free subcube of dimension k or by decomposing an available subcube of dimension greater than k. This free list policy uses a top-down allocation rule in contrast to the bottom-up approach used by the previous bit-map allocation algorithms. This allocation scheme is compared to the buddy, gray code (GC), and modified buddy allocation policies reported for the hypercubes. It is shown that the free list policy is optimal in a static environment, as are the other policies, and it also gives better subcube recognition ability compared to the previous schemes in a dynamic environment. The performance of this policy, in terms of parameters such as average delay, system utilization, and time complexity, is compared to the other schemes to demonstrate its effectiveness. The extension of the algorithm for parallel implementation, noncubic allocation, and inclusion/exclusion allocation is also given.
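
The top-down rule described above (take a free subcube of dimension k, or decompose a larger one) can be sketched roughly as follows. This is a simplified illustration, not the paper's algorithm: subcubes are written as strings over `0`, `1`, and `*`, the choice of which `*` to fix first is arbitrary, and release and merging of subcubes are omitted. The name `allocate` and the `free_lists` layout are hypothetical.

```python
def allocate(free_lists, k):
    """Allocate a subcube of dimension k, or return None if none is available.

    free_lists[d] is a list of free subcubes of dimension d, each written as a
    string over '0', '1', '*' (a '*' marks a free coordinate).
    """
    n = len(free_lists) - 1
    for d in range(k, n + 1):
        if not free_lists[d]:
            continue
        cube = free_lists[d].pop()
        # Decompose top-down: each split fixes one '*' coordinate, returns one
        # half to the free list, and keeps splitting the other half.
        while d > k:
            i = cube.index('*')
            free_lists[d - 1].append(cube[:i] + '0' + cube[i + 1:])
            cube = cube[:i] + '1' + cube[i + 1:]
            d -= 1
        return cube
    return None


# Usage: a free 3-cube serving requests of dimension 1, 2, and 1.
free = [[], [], [], ['***']]
first = allocate(free, 1)   # splits '***' down to a 1-dimensional subcube
second = allocate(free, 2)  # served directly from the remaining 2-cube
third = allocate(free, 1)   # served directly from the remaining 1-cube
```

Splitting a dimension-d subcube for a request of dimension k leaves free subcubes of dimensions d-1, d-2, ..., k, so larger requests can still be served afterwards.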

  • A template-based approach to the generation of distributed applications using a network of workstations

    Publication Year: 1991, Page(s): 52-67
    Cited by: Papers (18) | Patents (4)

    A computational model and system for the generation of distributed applications in a workstation environment are presented. The well-known RPC model is modified by a novel concept known as template attachment. A computation consists of a network of sequential procedures which have been encapsulated in templates. A small selection of templates is available from which a distributed application with the desired communication behavior can be rapidly built. The system generates all the required low-level code for correct synchronization, communication, and scheduling. This results in a system that is easy to use and flexible and can provide a programmer with the desired amount of control in using idle processing power over a network of workstations. The practical feasibility of the model has been demonstrated by implementing it for Unix-based workstation environments.

  • Properties and performance of folded hypercubes

    Publication Year: 1991, Page(s): 31-42
    Cited by: Papers (62) | Patents (1)

    A new hypercube-type structure, the folded hypercube (FHC), which is basically a standard hypercube with some extra links established between its nodes, is proposed and analyzed. The hardware overhead is almost 1/n, n being the dimensionality of the hypercube, which is negligible for large n. For this new design, optimal routing algorithms are developed and proven to be remarkably more efficient than those of the conventional n-cube. For one-to-one communication, each node can reach any other node in the network in at most $\lceil n/2 \rceil$ hops (each hop corresponds to the traversal of a single link), as opposed to n hops in the standard hypercube. One-to-all communication (broadcasting) can also be performed in only $\lceil n/2 \rceil$ steps, yielding a 50% improvement in broadcasting time over that of the standard hypercube. All routing algorithms are simple and easy to implement. Correctness proofs for the algorithms are given. For the proposed architecture, communication parameters such as average distance, message traffic density, and communication time delay are derived. In addition, some fault tolerance capabilities of this architecture are quantified and compared to those of the standard cube. It is shown that this structure offers substantial improvement over existing hypercube-type networks in terms of the above-mentioned network parameters.
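
The hop bound quoted above can be checked numerically on small cases. The sketch below assumes the usual folded hypercube construction, in which the extra links join each node to its bitwise complement; the abstract itself only says "some extra links", so that detail is an assumption, and the function names are illustrative.

```python
from collections import deque


def fhc_neighbors(node, n):
    """Neighbors of `node` in an n-dimensional folded hypercube (assumed construction)."""
    mask = (1 << n) - 1
    neighbors = {node ^ (1 << i) for i in range(n)}  # ordinary hypercube links
    neighbors.add(node ^ mask)                       # assumed complement link
    return neighbors


def fhc_diameter(n):
    """Largest hop count from node 0, found by breadth-first search.

    XOR-ing all labels with a fixed value maps the graph onto itself, so the
    eccentricity of node 0 equals the diameter.
    """
    dist = {0: 0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in fhc_neighbors(u, n):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def fhc_distance(u, v, n):
    """Closed form: go straight (h hops) or take the complement link first (n - h + 1 hops)."""
    h = bin(u ^ v).count("1")
    return min(h, n - h + 1)
```

For small n, `fhc_diameter(n)` can be compared against `(n + 1) // 2`, the integer form of $\lceil n/2 \rceil$.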

  • Performance of synchronous parallel algorithms with regular structures

    Publication Year: 1991, Page(s): 105-116
    Cited by: Papers (31)

    New methods are presented for bounding and approximating the mean execution time of partitioning algorithms, and these methods are compared to previous approaches. Distribution-driven and program-driven simulations show that two of the methods are usually accurate to within 10% and give good estimates even when certain independence assumptions are violated. Asymptotic approximations and upper bounds are derived for the average execution time of multiphase algorithms when there is no contention for processes in the parallel phase. In addition, the authors bound the average execution time under static and dynamic scheduling policies and determine the optimum number of parallel tasks to be created to minimize the execution time bounds with constant scheduling overhead.

  • Block, multistride vector, and FFT accesses in parallel memory systems

    Publication Year: 1991, Page(s): 43-51
    Cited by: Papers (28) | Patents (5)

    A discussion is presented of the use of dynamic storage schemes to improve parallel memory performance during three important classes of data accesses: vector accesses in which multiple strides are used to access a single vector, block accesses, and constant-geometry FFT accesses. The schemes investigated are based on linear address transformations, also known as XOR schemes. It has been shown that this class of schemes can be implemented more efficiently in hardware and has more flexibility than schemes based on row rotations or other techniques. Several analytical results are shown. These include: quantitative analysis of buffering effects in pipelined memory systems; design rules for storage schemes that provide conflict-free access using multiple strides, blocks, and FFT access patterns; and an analysis of the effects of memory bank cycle time on storage scheme capabilities.
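
To make the idea of an XOR scheme concrete, the sketch below uses one simple, fixed linear address transformation: the bank number is the XOR of the two lowest `bank_bits`-wide fields of the address. This particular mapping and the helper names are illustrative assumptions; the paper's dynamic schemes and design rules are not reproduced here.

```python
def xor_bank(addr, bank_bits):
    """Bank index under an illustrative XOR mapping of two address bit fields."""
    mask = (1 << bank_bits) - 1
    return (addr & mask) ^ ((addr >> bank_bits) & mask)


def interleaved_bank(addr, bank_bits):
    """Bank index under plain low-order interleaving, for comparison."""
    return addr & ((1 << bank_bits) - 1)


def is_conflict_free(bank_of, start, stride, bank_bits):
    """True if one access of 2**bank_bits elements touches every bank exactly once."""
    n_banks = 1 << bank_bits
    banks = {bank_of(start + i * stride, bank_bits) for i in range(n_banks)}
    return len(banks) == n_banks
```

With 8 banks and accesses starting at address 0, this mapping spreads both stride-1 and stride-8 accesses over all banks, whereas plain interleaving sends every stride-8 access to the same bank. Other start addresses or strides may still conflict under this simple mapping.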

  • A parallel execution model of logic programs

    Publication Year: 1991, Page(s): 79-92
    Cited by: Papers (3)

    A parallel-execution model that can concurrently exploit AND and OR parallelism in logic programs is presented. This model employs a combination of techniques in an approach to executing logic problems in parallel, making tradeoffs among number of processes, degree of parallelism, and combination bandwidth. For interpreting a nondeterministic logic program, this model (1) performs frame inheritance for newly created goals, (2) creates data-dependency graphs (DDGs) that represent relationships among the goals, and (3) constructs appropriate process structures based on the DDGs. (1) The use of frame inheritance serves to increase modularity. In contrast to most previous parallel models that have a large single process structure, frame inheritance facilitates the dynamic construction of multiple independent process structures, and thus permits further manipulation of each process structure. (2) The dynamic determination of data dependency serves to reduce computational complexity. In comparison to models that exploit brute-force parallelism and models that have fixed execution sequences, this model can reduce the number of unification and/or merging steps substantially. In comparison to models that exploit only AND parallelism, this model can selectively exploit demand-driven computation, according to the binding of the query and optional annotations. (3) The construction of appropriate process structures serves to reduce communication complexity. Unlike other methods that map DDGs directly onto process structures, this model can significantly reduce the amount of data sent to a process and/or the number of communication channels connected to a process.

  • Automatic generation of self-scheduling programs

    Publication Year: 1991, Page(s): 68-78
    Cited by: Papers (3) | Patents (2)

    Techniques are described for the automatic generation of self-scheduling parallel programs. Both scheduling algorithms and the concurrent components of applications are expressed in a high-level concurrent language. Partitioning and data dependency information are expressed by simple control statements, which may be generated either automatically or manually. A self-scheduling compiler, implemented as a source-to-source transformation, takes application code, control statements, and scheduling routines and generates a new program that can schedule its own execution on a parallel computer. The approach has several advantages compared to previous proposals. It generates programs that are portable over a wide range of parallel computers. There is no need to embed special control structures in application programs. The use of a high-level language to express applications and scheduling algorithms facilitates the development, modification, and reuse of parallel programs.


Aims & Scope

IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers.

Meet Our Editors

Editor-in-Chief
David Bader
College of Computing
Georgia Institute of Technology