Advances in Parallel and Distributed Computing, 1997. Proceedings

Date 19-21 March 1997

Displaying Results 1 - 25 of 60
  • Proceedings. Advances in Parallel and Distributed Computing

    Publication Year: 1997
    Freely Available from IEEE
  • Author index

    Publication Year: 1997, Page(s): 425-426
    Freely Available from IEEE
  • Control mechanism for software pipelining on nested loop

    Publication Year: 1997, Page(s): 345-350
    Cited by: Papers (2) | Patents (1)

    ILSP (Interlaced inner and outer Loop Software Pipelining) is an efficient algorithm for optimizing operations in nested loops. To ensure that ILSP has good time and space efficiency, an efficient nested control mechanism must support the algorithm. Our control mechanism is realized in hardware; it avoids adding many extra instructions and minimises the II (Initia...
  • A lifetime-sensitive scheduling method

    Publication Year: 1997, Page(s): 351-354

    This paper presents a lifetime-sensitive scheduling method. By shortening the lifetimes of variables in the scheduling phase, it reduces register pressure in the register allocation phase, lessens spill code, and yields more efficient object code. Preliminary experimental results show that the method is effective.
  • Interaction nets revisited

    Publication Year: 1997, Page(s): 108-115

    Past attempts to apply Girard's linear logic to Lafont's interaction nets by treating “symbols” as logical rules have failed to produce a significant explanation. In this paper, we model “symbols” as external axioms and use “tensor” to describe the partition of auxiliary ports. We show that our solution leads to a very natural logical interpretation of t...
  • Adaptive hybrid scheduling of nonuniform loops on UMA models

    Publication Year: 1997, Page(s): 383-387

    Balancing load among processors for a nonuniform loop is very difficult at compile time, and dynamic methods come at the price of extra overhead. This paper proposes an adaptive hybrid scheduling method in which the distribution of the loop is divided into a few rounds, and the block size in each round is determined adaptively according to the average overhead due t...
  • Precise dependence test for scalars within nested loops

    Publication Year: 1997, Page(s): 356-361

    Exact direction and distance vectors are essential for detecting hierarchical parallelism and examining legality of loop transformation for a multiple level loop nest. Much of this work has been concentrated on array references. Little has been done to address the problems of finding precise dependences between scalar references, except to use extended SSA form with factored use-def links. In this...
  • Coherent parallel programming in C∥

    Publication Year: 1997, Page(s): 116-122
    Cited by: Papers (1)

    This paper presents the coherent parallel programming concept using a new parallel language called C|| (pronounced C Parallel). The C|| language is based on the standard C language with a small set of extended constructs for parallelism and process interaction. At the core of C|| is a structured construct called a coherent region, which facilitates the development of coherent programs, i.e., paralle...
  • A versatile directory scheme (Dir2NB+L) and its implementation on BY91-1 multiprocessors system

    Publication Year: 1997, Page(s): 180-185

    Cache coherence and synchronization between processors are two critical issues in designing a shared-memory multiprocessor system. From the perspective of hardware design, a directory-based cache coherence protocol and a lock mechanism are employed to prevent cache inconsistency and guarantee atomic memory accesses. The BY91-1 multiprocessors efficiently integrate support for cache coheren...
  • An efficient parallel texture classification for image retrieval

    Publication Year: 1997, Page(s): 18-25
    Cited by: Papers (2)

    This paper proposes an efficient parallel approach to texture classification for image retrieval. The idea behind this method is to pre-extract texture features in terms of texture energy measurement associated with a 'tuned' mask and store them in a multi-scale and multi-orientation texture class database via a two-dimensional linked list for query. Thus each texture class sample in the database ...
  • A new architecture for branch-intensive loops

    Publication Year: 1997, Page(s): 241-246
    Cited by: Papers (1)

    A new VLIW architecture, called GPMB (Global Pipelining of Multi-Branch), is discussed in this paper. The GPMB architecture can handle branch-intensive programs efficiently. With the concept of a next address function, GPMB regards branching as correctly calculating the next address. The next address function is implemented by hardware and software in GPMB. A brief description of GPMB and a detailed...
  • Parallel recursive algorithm for tridiagonal systems

    Publication Year: 1997, Page(s): 124-130

    In this paper, a parallel algorithm for solving tridiagonal equations based on recurrence is presented. Compared with the parallel prefix method (PP) which is also based on the recursive method, the computation cost is reduced by a factor of two while maintaining the same communication cost. The method can be viewed as a modified prefix method or prefix with substructuring. The complexity of the a...
  • Efficiency issues of a parallel FEM implementation on shared memory computers

    Publication Year: 1997, Page(s): 156-161

    In the field of parallel FEM methods, a number of highly efficient solutions for distributed memory systems exist, but the passage to adaptive parallel FEM simulations leads, in all probability, to more dynamic behaviour with respect to data placement and load balancing. A shared-memory architecture therefore seems to be a more appropriate choice for obtaining efficient implementations. This paper ...
  • Utilization of disk drives for RAID

    Publication Year: 1997, Page(s): 186-189

    A stochastic Petri net (SPN) model of RAID-5 is constructed. With the model and its isomorphic Markov chain, the average utilization of disk drives in RAID for small-write and large I/O requests can be calculated. This provides a good method for evaluating the performance of RAID.
  • Parallel design and implementation of SOM neural computing model in PVM environment of a distributed system

    Publication Year: 1997, Page(s): 26-31
    Cited by: Papers (2)

    A parallel design and implementation of the Self-Organizing Map (SOM) neural computing model is proposed. The parallel design of SOM is implemented in a parallel virtual machine (PVM) environment of a distributed system. A practical realization of the SOM algorithm is investigated, the construction of the computing module in the parallel virtual machine is discussed, the communication methods and an optimizat...
  • Small, scalable, and efficient, microkernels for highly parallel computers are possible: Cosy as an example

    Publication Year: 1997, Page(s): 196-203

    Although highly parallel distributed memory computers have existed for several years, the operating systems used on them do not fit their requirements very well. Most of them are designed for sequential, shared memory parallel or distributed computers. Examples are Unix on the IBM SP/2 and Mach on the Intel Paragon. This results in poor scalability caused by inefficient communication primitives designed f...
  • An environment for the parallel execution of multigrain clustered tasks

    Publication Year: 1997, Page(s): 320-327

    In this paper, we present an original approach for the design and execution of distributed applications that require numerous tasks of variable grain. The approach is based on the concept of a task cluster, an entity that groups tasks with strong logical interaction and guarantees efficient communication between them. We describe the implementation of the model, which mainly relies on t...
  • An improved parallel algorithm for Delaunay triangulation on distributed memory parallel computers

    Publication Year: 1997, Page(s): 131-138
    Cited by: Papers (1)

    Delaunay triangulation is widely used in applications such as volume rendering, shape representation, and terrain modeling. The main disadvantage of Delaunay triangulation is the large computation time required to obtain the triangulation of an input point set. This time can be reduced by using more than one processor, and several parallel algorithms for Delaunay triangulation have been pro...
  • Automatic generation of parallel compiler-partial evaluation of parallel lambda language

    Publication Year: 1997, Page(s): 390-397

    We describe in this paper a partial evaluator for a parallel programming language. The parallel language we present combines lambda calculus with a message-passing communication mechanism. By improving some techniques originally used for partial evaluation of sequential languages and introducing some new methods, we successfully solve the problems caused by some internal semantic difference...
  • Efficient implementation of portable C*-like data-parallel library in C++

    Publication Year: 1997, Page(s): 398-405

    The C* language is a data-parallel extension of the C language which incorporates parallel data types. Since the C++ language provides operator overloading, a C++ library can implement the C* parallel extensions with a similar syntax. Although library implementations are highly portable, some overheads make them impractical. The two major overheads incurred are temporaries in each operator applica...
  • The χ-calculus

    Publication Year: 1997, Page(s): 74-81
    Cited by: Papers (1)

    The paper proposes a new process algebra, called the χ-calculus. The language differs from the π-calculus in several aspects. First, it takes a more uniform view of input and output. Second, the closed names of the language are homogeneous in the sense that there is only one kind of bound name. Third, the effects of communications in the χ-calculus are delimited by localization operators, not by s...
  • An effective parallelizing scheme of MPEG-1 video encoding on Ethernet-connected workstations

    Publication Year: 1997, Page(s): 4-11
    Cited by: Papers (4)

    Although MPEG-1 Video is a promising and the most widely used moving picture compression standard, it requires a lot of computational resources to encode moving pictures with a reasonable frame size and quality. In this paper, we propose and implement an efficient parallelizing scheme for an MPEG-1 Video encoding algorithm on Ethernet-connected workstations which is the most widely available com...
  • Eliminating two kinds of data flow inaccuracy in the presence of pointer aliasing

    Publication Year: 1997, Page(s): 410-415

    Programming languages with sophisticated use of pointers, such as C, are hard to analyze. Recent research on pointer analysis focuses on tracking the possible values of pointers when a program point is reached, and great progress has been achieved. However, how to apply the results of pointer analysis to dataflow analysis and other program optimization/parallelization is not well studied. This paper presen...
  • GPR-Tree: a global parallel index structure for multiattribute declustering on cluster of workstations

    Publication Year: 1997, Page(s): 300-306
    Cited by: Papers (4)

    The R-tree is a very popular dynamic access structure capable of storing multidimensional and spatial data. Considering its merits of efficient global balance and dynamic reorganization, we try to use the R-tree to decluster multiattribute data in a database system or file system. As many previous multiattribute declustering mechanisms do not take into account the properties of the Cluster of Workstat...
  • Construction of multimedia server in a distributed multimedia system

    Publication Year: 1997, Page(s): 248-252

    The framework for constructing a distributed multimedia system based on the server/client architecture is described in this paper. We focus on realizing the synchronized presentation of different media in a multimedia application, and a set of QoS (quality of service) parameters is given as a criterion for trading off the overall performance of the system against the synch...