Parallel and Distributed Systems, 1996. Proceedings., 1996 International Conference on

Date 3-6 June 1996

Displaying Results 1 - 25 of 71
  • Proceedings of 1996 International Conference on Parallel and Distributed Systems

    Freely Available from IEEE
  • Author index

    Freely Available from IEEE
  • Support of cooperating and distributed business processes

    Page(s): 22 - 31

    Workflow management systems, or business process management systems (BPMSs), provide integral support for computer-based information processing, personal activities, business procedures, and their relationships to organizational structures. They support the modeling and analysis of so-called business processes and offer means for the application-near design and implementation of computer-based business process assistance. BPMSs concentrate mainly on the support of enterprise-internal processes. Our approach extends the scope of business process management: enterprise-internal processes are viewed as sub-processes of global inter-enterprise processes. Additional global process assistance is based on the definition of global activity models and global information models. Dynamic naming and binding can be provided by business process brokers, which extend the concepts of object trading to the trading of opportunities to participate in global processes.

  • Fast parallel chessboard distance transform algorithms

    Page(s): 488 - 493

    In this paper, based on the diagonal propagation approach, we first provide an O(N^2) time sequential algorithm to compute the chessboard distance transform (CDT) of an N×N image, which is a distance transform (DT) using the chessboard distance metric. Based on the proposed sequential algorithm, the CDT of a 2-D binary image array of size N×N can be computed in O(log N) time on the EREW PRAM model using O(N^2/log N) processors, O(log N/log log N) time on the CRCW PRAM model using O(N^2 log log N/log N) processors, and O(log N) time on the hypercube computer using O(N^2) processors. Following the mapping proposed by Y.H. Lee and S.J. Horng (1995), the algorithm for the medial axis transform (MAT) is also efficiently derived. The MAT of a 2-D binary image array of size N×N can be computed in O(log N) time on the EREW PRAM model using O(N^2/log N) processors, O(log N/log log N) time on the CRCW PRAM model using O(N^2 log log N/log N) processors, and O(log N) time on the hypercube computer using O(N^2) processors. Our algorithms are faster than the best previous results, proposed by J.F. Jenq and S. Sahni (1992).
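
    For orientation, a sequential O(N^2) chessboard distance transform can be sketched as follows. This is the classic two-pass raster scan, not the diagonal propagation algorithm of the paper, and it assumes the transform assigns each pixel its chessboard distance to the nearest 0-pixel. The function name is hypothetical.

```python
def chessboard_distance_transform(image):
    """Return, for every pixel, the chessboard distance to the nearest 0-pixel.

    `image` is a list of equal-length rows holding 0s and 1s. If the image
    contains no 0-pixel, every entry keeps the sentinel value rows + cols.
    """
    n_rows, n_cols = len(image), len(image[0])
    inf = n_rows + n_cols  # larger than any real chessboard distance
    dist = [[0 if image[r][c] == 0 else inf for c in range(n_cols)]
            for r in range(n_rows)]

    def relax(r, c, offsets):
        # Pull the smallest neighbour value plus one step into (r, c).
        for dr, dc in offsets:
            rr, cc = r + dr, c + dc
            if 0 <= rr < n_rows and 0 <= cc < n_cols:
                dist[r][c] = min(dist[r][c], dist[rr][cc] + 1)

    # Forward pass: neighbours above and to the left (already final for this pass).
    for r in range(n_rows):
        for c in range(n_cols):
            relax(r, c, ((-1, -1), (-1, 0), (-1, 1), (0, -1)))

    # Backward pass: neighbours below and to the right.
    for r in reversed(range(n_rows)):
        for c in reversed(range(n_cols)):
            relax(r, c, ((1, 1), (1, 0), (1, -1), (0, 1)))

    return dist
```

    Each pixel is visited twice and inspects four neighbours per visit, which gives the O(N^2) sequential bound for an N×N image.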

  • PPD: A practical parallel loop detector for parallelizing compilers

    Page(s): 274 - 281

    It is well known that extracting parallel loops plays a significant role in designing parallelizing compilers. The execution efficiency of a loop is enhanced when the loop can be executed in parallel or partially in parallel, like a DOALL or DOACROSS loop. This paper reports on the practical parallelism detector (PPD), which is implemented in PFPC (a portable FORTRAN parallelizing compiler running on OSF/1) at NCTU and concentrates on finding the parallelism available in loops. The PPD can extract the potential DOALL and DOACROSS loops in a program by invoking a combination of the ZIV test and the I test to verify array subscripts. Furthermore, if DOACROSS loops are available, their synchronization statements are optimized. Experimental results show that PPD is more reliable and accurate than previous approaches.

  • Distributed arithmetic-based architectures for high speed IIR filter design

    Page(s): 156 - 161

    Dedicated VLSI has been considered an effective method to realize DSP algorithms that require massive amounts of computation. To speed up the computing, parallel processing and pipelining techniques are often employed. However, for recursive algorithms, where the available computing concurrency is very limited, these design tactics do not help. Previous results suggest first applying a look-ahead transform to the algorithm to create parallelism, at the cost of drastically increased hardware complexity. In this paper, we present a Distributed Arithmetic (DA) based scheme that solves the problem without resorting to the expensive look-ahead methods. In contrast to the conventional “bit-parallel word-serial” computing paradigm, the new scheme features a “bit-serial word-parallel” approach. In this scheme, instead of waiting for the entire data word from the previous recursion to become available, the current recursion's computation can start as soon as the LSB from the previous recursion is obtained. This means the initiation interval between two successive input data is reduced from the delay of computing one word to the delay of computing one bit. To illustrate the merits of this new scheme, we present a DA based VLSI design for an IIR filter. The design is implemented using a 0.8 μm SPDM technology and can achieve high throughput rate operation for real-time applications.
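
    The bit-serial flavour of distributed arithmetic can be illustrated in software. The sketch below is a toy under stated assumptions, not the paper's VLSI design: it computes a non-recursive inner product over unsigned integer samples, and the names `build_lut` and `da_inner_product` are hypothetical. It consumes one bit position of every sample per step, LSB first, and replaces multiplications with a lookup table of coefficient sums.

```python
def build_lut(coeffs):
    """LUT[addr] is the sum of coeffs[k] over every bit k that is set in addr."""
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << len(coeffs))
    ]


def da_inner_product(coeffs, samples, word_bits):
    """Compute sum(coeffs[k] * samples[k]) one bit position at a time.

    Assumes every sample is an unsigned integer that fits in `word_bits` bits.
    """
    lut = build_lut(coeffs)
    acc = 0
    for j in range(word_bits):  # LSB first
        # Gather bit j of every sample into a LUT address (word-parallel).
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> j) & 1) << k
        # One table lookup and one shifted add per bit position.
        acc += lut[addr] << j
    return acc
```

    Because the partial result for bit j depends only on bit j of each input, work on the next stage can begin as soon as the low-order bits are known, which is the property the abstract relies on.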

  • Expressing concurrency in Griffin

    Page(s): 292 - 299

    Griffin is a statically typed language designed specifically for the rapid prototyping of Ada software. In this paper we describe the Griffin language constructs for expressing concurrency. There are two salient innovations: (1) an extended select statement that provides greater flexibility in managing non-deterministic behavior and (2) the capability of a receiving thread to concurrently rendezvous with multiple sending threads. Thus by increasing concurrency, we alleviate the chief drawback to synchronous communication. We then apply the Griffin constructs in implementing solutions to the readers-writers problem, group lock mechanism, and scheduling groups of concurrent operations. The Griffin constructs facilitate the expression of high-level algorithms that maximize concurrency.

  • On parallelism of hyper-linking theorem proving: a preliminary report

    Page(s): 494 - 499

    This paper exploits the parallelism of a hyper-linking based theorem prover. We analyze the unique properties of the hyper-linking proof procedure and present preliminary results. With respect to these properties, four parallel strategies (phase-level, clause-level, literal-level, and search-level parallelism) are designed for different implementation schemes of the prover. Results and analysis of the experiments on these parallel strategies are presented.

  • Program dependence analysis of concurrent logic programs and its applications

    Page(s): 282 - 291

    In this paper a formal model for program dependence analysis of concurrent logic programs is proposed with the following contributions. First, two language-independent program representations are presented for explicitly representing control flows and/or data flows in a concurrent logic program. Then based on these representations, program dependences between literals in concurrent logic programs are defined formally, and a dependence-based program representation named the Literal Dependence Net (LDN) is presented for explicitly representing primary program dependences in a concurrent logic program. Finally, as applications of the LDNs, some important software engineering activities including program slicing, debugging, testing, complexity measurement, and maintenance are discussed in a programming environment for concurrent logic programs.

  • All-fault-tolerant embedding of a complete binary tree in a group of Cayley graphs

    Page(s): 34 - 40

    This paper proposes an approach for embedding a complete binary tree with height k×(n-2k+1)+(k-2)×2k+1, where k=[log n], into an n-dimensional complete transposition graph (CTn), star graph (STn), and bubblesort graph (BSn) with dilations 1, 3, and 2n-3 respectively. Furthermore, a fault-tolerant scheme is developed to recover from multiple faults, and the dilations after recovery become at most 3, 5, and 2n-1 for the CTn, STn, and BSn respectively.

  • Distributed checkpointing based on influential messages

    Page(s): 440 - 447

    In distributed applications, a group of multiple objects cooperate to achieve some objectives. The computation on the objects is based on message passing, i.e. remote procedure calls. The objects may suffer from different kinds of faults. In the presence of object faults, the states of the objects in the system have to be kept consistent. If some object o is faulty, o is rolled back to its checkpoint, and objects which have received messages from o are also required to be rolled back. In this paper, we define influential messages: messages whose receivers are required, from the application point of view and on the basis of the message semantics, to be rolled back if the senders are rolled back. Using the influential messages, we define a significant checkpoint, which denotes a consistent global state of the system but might be inconsistent under the traditional definition. We present protocols for taking the significant checkpoint and for rolling back the objects by using the influential messages.
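
    The effect of restricting rollback to influential messages can be sketched as a reachability computation. This is a simplified illustration, not the paper's protocol: it assumes each message sent since the last checkpoint is recorded as a hypothetical `(sender, receiver, influential)` triple, that rollback propagates transitively through receivers, and it ignores message timing.

```python
from collections import deque


def rollback_set(faulty, messages, influential_only=True):
    """Return the set of objects that must roll back when `faulty` rolls back.

    `messages` is an iterable of (sender, receiver, influential) triples.
    With influential_only=False every message propagates the rollback,
    which models the traditional rule.
    """
    receivers = {}
    for sender, receiver, influential in messages:
        if influential or not influential_only:
            receivers.setdefault(sender, set()).add(receiver)

    to_roll_back = {faulty}
    queue = deque([faulty])
    while queue:
        obj = queue.popleft()
        for receiver in receivers.get(obj, ()):
            if receiver not in to_roll_back:
                to_roll_back.add(receiver)
                queue.append(receiver)
    return to_roll_back
```

    For the message log `[("a", "b", True), ("a", "c", False), ("b", "d", True), ("c", "e", True)]`, a fault in `a` rolls back `a`, `b`, and `d` when only influential messages count, and all five objects under the traditional rule.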

  • Deadlock-free routing in an optical interconnect for high-speed wormhole routing networks

    Page(s): 256 - 264

    The Supercomputer SuperNet (SSN) is a two-level hierarchical high-speed network. The lower level is a high-speed electronic mesh fabric; the higher level is a WDM optical backbone network interconnecting the high-speed fabrics distributed across a campus or metropolitan area. The salient characteristics of this architecture are the use of wormhole routing and a backpressure hop-by-hop flow control mechanism. Because of these features, deadlocks are possible in SSN. In this paper, we address the issue of deadlock-free routing, which is an essential prerequisite for the proper operation of SSN. To this end, we first present a deadlock-free routing scheme for the WDM backbone, which is implemented with a shufflenet multihop virtual topology. We use the notion of virtual channels to obtain mappings of virtual channels to physical channels such that deadlock-free routing is achieved for any (p,k) shufflenet (unidirectional and bidirectional). Then, we compare the virtual channels scheme with the more conventional up/down deadlock-free routing scheme for the bidirectional shufflenet and show that the former yields much better performance. Finally, we address the problem of deadlock prevention across the entire network (i.e., the lower-level fabric as well as the optical backbone) and develop an integrated solution combining different schemes best suited for the different levels.

  • Fault-tolerant causal delivery in group communication

    Page(s): 302 - 309

    In distributed systems, a group of processes cooperate to execute an application program. A group is established among multiple processes, and only processes in the group communicate with each other. This type of group communication is named intra-group communication. The communication system has to support reliable intra-group communication in the presence of process faults. In order to tolerate process faults, each process in the group is replicated into a collection of multiple replicas named a cluster. In this paper, we propose a new intra-group communication protocol which supports the causally ordered delivery of messages for the processes within the group. In addition, the protocol supports the reliable delivery of messages in the presence of Byzantine faults of the processes.
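
    Causally ordered delivery is commonly explained with vector clocks. The sketch below shows that textbook delivery condition only; it is not the protocol proposed in the paper and includes no replication or Byzantine fault handling. It assumes each process increments its own clock entry when it sends, and the names `can_deliver` and `deliver_ready` are hypothetical.

```python
def can_deliver(local_clock, msg_clock, sender):
    """True if the message is the next one from `sender` and every message
    it causally depends on has already been delivered locally."""
    for k, (seen, stamped) in enumerate(zip(local_clock, msg_clock)):
        if k == sender:
            if stamped != seen + 1:
                return False
        elif stamped > seen:
            return False
    return True


def deliver_ready(local_clock, pending):
    """Deliver every buffered message whose causal predecessors have arrived.

    `pending` is a list of (sender, msg_clock, payload) tuples. Both
    `local_clock` and `pending` are updated in place; the payloads are
    returned in delivery order.
    """
    delivered = []
    progress = True
    while progress:
        progress = False
        for msg in list(pending):
            sender, msg_clock, payload = msg
            if can_deliver(local_clock, msg_clock, sender):
                local_clock[sender] += 1
                pending.remove(msg)
                delivered.append(payload)
                progress = True
    return delivered
```

    If a reply stamped `[1, 1, 0]` arrives before the request stamped `[1, 0, 0]` that caused it, the reply stays buffered until the request has been delivered.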

  • Double parity sparing for performance improvement in disk arrays

    Page(s): 169 - 174

    RAID level 5 disk arrays provide highly reliable, cost-effective secondary storage with high performance for reads and large writes. Small writes, however, degrade their performance, since small writes need additional disk accesses to update parity data. There have been many studies to overcome this problem. In this paper, we propose a new technique, double parity sparing, which is a variation of a disk array with a spare. To improve the parity update process, the proposed scheme uses a spare block and a parity block as two available parity blocks. We present results from simulations to show that the proposed scheme offers better performance when there is no failure in the disk array.
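
    The small-write penalty comes from the standard RAID 5 read-modify-write parity update, which the following sketch illustrates. It shows the baseline behaviour only, not the double parity sparing scheme; the helper names are hypothetical and all blocks are assumed to have equal length.

```python
def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def small_write_parity(old_data, new_data, old_parity):
    """New parity after overwriting one data block.

    Needs two reads (old data, old parity) before the two writes
    (new data, new parity), hence four disk accesses per small write.
    """
    return xor_blocks(old_parity, old_data, new_data)
```

    The result equals the parity recomputed from the whole stripe, so the other data blocks never have to be read.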

  • Bubblesort star graphs: a new interconnection network

    Page(s): 41 - 48

    In this paper, we propose and analyze a new interconnection network called the bubblesort star graph, which is the merger of the bubblesort graph and the star graph. We present a deadlock-free wormhole routing algorithm for the proposed network. We also develop a method to embed a mesh into a bubblesort star graph with dilation two and expansion one. In addition, we use a recursive scheme to embed multiple disjoint copies of the hypercube into a bubblesort star graph with the capacity to recover from all faults, as well as constant expansion and dilation one or two. This reflects the fact that the embeddability of the bubblesort star graph is much better than that of the star graph.
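
    A small sketch can make the "merger" concrete. It assumes, as an interpretation rather than something the abstract states, that nodes are permutations of n symbols and that the bubblesort star graph takes the union of the bubblesort edges (swap two adjacent positions) and the star edges (swap the first position with any other). The function names are hypothetical.

```python
def star_neighbors(perm):
    """Neighbours in the star graph: swap position 0 with position i."""
    result = []
    for i in range(1, len(perm)):
        q = list(perm)
        q[0], q[i] = q[i], q[0]
        result.append(tuple(q))
    return result


def bubblesort_neighbors(perm):
    """Neighbours in the bubblesort graph: swap positions i and i + 1."""
    result = []
    for i in range(len(perm) - 1):
        q = list(perm)
        q[i], q[i + 1] = q[i + 1], q[i]
        result.append(tuple(q))
    return result


def bubblesort_star_neighbors(perm):
    """Union of both neighbour sets; the swap of positions 0 and 1 is shared."""
    return sorted(set(star_neighbors(perm)) | set(bubblesort_neighbors(perm)))
```

    Under this reading each node has (n-1) + (n-1) - 1 = 2n-3 neighbours, since one generator belongs to both graphs.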

  • Performance of scheduling strategies for client-server systems

    Page(s): 448 - 455

    Scheduling on client-server systems has not received much attention from researchers. Based on simulation, this research presents a number of insights into system behavior and scheduling. Two phenomena, CPU monopolization by large service requests and software bottlenecking, are observed to have a strong influence on system performance. Software bottlenecking is a new phenomenon, observed on distributed client-server systems with multiple levels of servers; it occurs when a higher-level server is blocked waiting for a response to a service request from a lower-level server. Policies based on request characteristics such as service times and path lengths are found to effectively control these effects and improve system performance.

  • Roll-forward error recovery in embedded real-time systems

    Page(s): 414 - 421

    Roll-forward checkpointing schemes are developed in order to avoid rollback in the presence of independent faults and to increase the possibility that a task completes within a tight deadline. However, despite the adoption of roll-forward recovery, these schemes are not necessarily appropriate for time-critical applications, because interactions with the external environment and communications between processes must be deferred during checkpoint validation steps (typically, two checkpoint intervals) until the fault-free processors are identified. The deadlines on providing services may thus be violated. In this paper we present and discuss two alternative roll-forward recovery schemes, especially for time-critical and interaction-intensive applications, that deliver correct, timely results even when checkpoint validation is required.

  • Multimedia task reliability analysis based on token ring network

    Page(s): 265 - 272

    In this paper, we develop reliability models for reliability analysis in distributed multimedia systems. We propose two polynomial-time algorithms to compute the multimedia task reliability (MTR) and the time-constraint multimedia task reliability (TCMTR) for distributed multimedia systems based on token ring networks. Two main algorithms, multimedia task reliability for ring (MTRR) and multimedia task reliability for path (MTRP), are used to compute the MTR. For time-constrained media access, we extend our proposed algorithms to compute the TCMTR.

  • Programming concurrency and synchronisation in Actel

    Page(s): 189 - 196

    This paper introduces mechanisms for exploiting concurrency and synchronisation in Actel, a concurrent object-based language. It focuses on issues of combining parallelism with object orientation, performance, and synchronisation. Actel offers several mechanisms, such as a new mode of message passing called 'semi-reference', to achieve efficient inter-object concurrency. Parallel functions and parallel compound statements are provided to exploit concurrency inside an object at several levels without recourse to explicit synchronisation. Implicit synchronisation is obtained through future variables.

  • Distributed fault detection in communication protocols using extended finite state machines

    Page(s): 310 - 318

    Run-time fault detection in communication protocols is essential because of faults that occur in the form of coding defects, memory problems, and external disturbances. Finite State Machine models have been used in the past to detect and diagnose protocol faults. However, the fault coverage of these models is limited to vocabulary faults and sequencing faults. We present an Extended Finite State Machine Model (EFSM) to augment the fault coverage of the FSM model. We extend the parallel decomposition method to EFSMs in order to reduce the size of the observer used to detect faults. The decomposition of the EFSM into several independent EFSMs results in multiple observers. The distributed fault detection mechanism increases the reliability of the fault detection and the EFSM model improves the fault coverage.

  • An Optical Bus Computer Cluster with a deferred cache coherence protocol

    Page(s): 175 - 182

    In this paper, we first propose a class of workstation cluster which utilizes optical wavelength-division multiplexing (WDM) technology to connect the nodes (workstations) of the cluster. The Optical Bus Computer Cluster (OBCC) falls in the class of cache coherent non-uniform memory access (CC-NUMA) multiprocessors. The basic topology of the cluster is star-shaped, with an optical star-coupler in the center to enable one-hop simultaneous broadcasting of information packets from one node to all other cluster nodes. WDM technology not only multiplies the network bandwidth of a single optical fiber by N, where N is the degree of wavelength multiplexing, but also provides independent communication paths between pairs of cluster nodes by properly assigning wavelengths to inter-node communication. Then we identify the cache subsystem requirements for the OBCC and propose a deferred cache coherence protocol suitable for the OBCC. The basic coherence maintenance scheme is to lazy-evaluate the cache coherence transactions among cluster nodes, utilizing the weak consistency memory model. By deferring the transactions, it is possible to combine multiple transaction issues into one transaction by accumulating modified status bits in the enhanced cache status fields. Since not only remote memory accesses but also coherence transactions are costly operations in CC-NUMA systems, the deferred invocation of coherence transactions is particularly useful in CC-NUMA systems such as the OBCC. We then give a performance evaluation by simulation, showing that the coherence protocol effectively reduces coherence transactions, particularly in situations where false sharing of longer cache lines becomes noticeable.

  • Scalable routing schemes for massively parallel processing using reconfigurable optical interconnect

    Page(s): 500 - 507

    We consider the message routing/broadcasting problem in an optically interconnected massively parallel processing system, where each node in the system sends/broadcasts randomly generated packets to others. The network model considered is the reconfigurable optical interconnect (ROI). It is based on the new device capabilities enabled by recent advances in optical technology. An ROI node can use light beams to transmit messages to any other node in the network, provided that no other node transmits to the same destination concurrently. We present communication schemes that can achieve near-optimal throughput with significantly lower delay. The difference in performance for routing is on the order of Ω(n2/3 log log n/log n) when the number of nodes n in the network is large.

  • A flexible protocol synthesis method for adopting requirement changes

    Page(s): 319 - 326

    Communicating entities in a protocol specification communicate with each other and provide services to their users. Once the behaviours of the entities (the specification of the protocol) are specified, they are not changed. The entities provide a fixed set of services to their users. However, different users have different requirements, and their requirements change often. The requirement changes are in general small in terms of the size of the behaviour expressions of the entities. Traditional protocol synthesis techniques have given considerable attention to the construction of new protocols for a fixed set of services, but less attention to the maintenance of protocols to adopt new requirement changes. What is desirable is a protocol synthesis method which adopts new protocol requirement changes into the behaviours of entities or protocol specifications. In this paper, we propose such a method. In this way, we can use existing protocol specifications and maintain them to adopt requirement changes. We use the formal specification language LOTOS to specify requirement changes and the behaviours of entities.

  • Performance evaluation of a WDMA OIDSM multiprocessors

    Page(s): 162 - 168

    Optically interconnected distributed shared memory (OIDSM) systems offer significant performance advantages due to the fast interconnection network. The photonic network of the proposed approach is based on a wavelength-division multiplexed (WDMA) passive star-coupled configuration. Optical self-routing, which partitions the traffic, is achievable; this relaxes the design constraints on the receiver subsystem, since a node now only receives and processes a fraction of the network traffic. A major concern with the multi-access approach is that a media access control protocol and a cache coherence protocol are required to provide access to a distributed arbitration of the WDMA photonic network. In particular, one class of media access protocol (TDMAC) requires a control channel to broadcast reservation requests, and the broadcast capability is also able to support coherence-level control signals such as invalidations, which enable a snooping-based coherence protocol. This paper evaluates how OIDSM can ease the traffic in large-scale snooping-based shared memory multiprocessors.

  • Efficiency of the domain decomposition method for the parallelization of implicit finite element code

    Page(s): 49 - 56

    An analytical model is presented for estimating the parallel efficiency of the domain decomposition method, which is used for the parallelization of implicit finite element code. Serial and parallel finite element codes with domain decomposition and direct LDU solution of equation systems are developed. Dependencies of parallel efficiency on problem size are obtained for an IBM SP2 with 4, 6, and 8 processor nodes. It is shown that interprocessor load balancing during the assembly-decomposition phase, which can be achieved by partitioning into unequal subdomains, increases parallel efficiency considerably. Predicted and measured values of parallel efficiency are in reasonable agreement.
