High Performance Computing in the Asia-Pacific Region, 2000. Proceedings. The Fourth International Conference/Exhibition on

Date 14-17 May 2000

Displaying Results 1 - 25 of 123
  • The Fourth International Conference/Exhibition on High Performance Computing in the Asia-Pacific Region [front matter]

    Page(s): i - xxv
    Freely Available from IEEE
  • Parallelization of decision tree algorithm and its performance evaluation

    Page(s): 574 - 579 vol.2

    Data mining is a typical application of high performance computing in the business field. An efficient data mining system that can handle huge amounts of data is desired. This paper describes the parallel processing of decision trees, a typical algorithm for classifying large databases. The free software C4.5 is parallelized for SMP machines using a thread library. Parallelism in generating a decision tree can be classified into intra-node parallelism and inter-node parallelism. Intra-node parallelism can be further classified into record parallelism, attribute parallelism, and their combination. We have implemented these four kinds of parallelizing methods and evaluated their effects with four kinds of test data. The results show that there is a relation between the characteristics of the data and the parallelizing methods, and that a combination of multiple parallelizing methods is the most effective.
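
    To make "attribute parallelism" concrete, here is a minimal Python sketch in which each candidate attribute is scored in its own task. It is an illustration under stated assumptions, not the paper's implementation: the function names and the toy records are hypothetical, the score is plain information gain (C4.5 itself uses gain ratio), and Python threads only illustrate the task decomposition rather than a real SMP speedup.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(records, labels, attribute):
    """Entropy reduction obtained by splitting on one categorical attribute."""
    groups = {}
    for record, label in zip(records, labels):
        groups.setdefault(record[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(records, labels, attributes, workers=4):
    """Attribute parallelism: score every candidate attribute as a separate task."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        gains = list(pool.map(lambda a: information_gain(records, labels, a), attributes))
    best = max(range(len(attributes)), key=lambda i: gains[i])
    return attributes[best], gains[best]


# Hypothetical toy data: "outlook" separates the classes perfectly, "windy" does not.
records = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
    {"outlook": "rain", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]
choice, gain = best_attribute(records, labels, ["outlook", "windy"])
print(choice, gain)
```

    Record parallelism would instead split the records among workers and merge their per-class counts before computing the score for each attribute.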
  • Construction of a parallel and shortest routing algorithm on recursive circulant networks

    Page(s): 580 - 585 vol.2

    In this paper, we investigate the routing of a message in recursive circulant networks, which is key to the performance of such networks. On a recursive circulant network, we would like to transmit m packets from a source node to a destination node simultaneously along m paths, where the ith packet traverses the ith path (0 ≤ i ≤ m-1). For all packets to arrive at the destination node quickly and securely, these m paths must be node-disjoint and the sum of their lengths must be the smallest. Employing the Hamiltonian circuit Latin square (HCLS), we present an O(m^2) parallel routing algorithm for constructing a set of m shortest and node-disjoint paths.
  • Efficient algorithms for protein solvent accessible surface area

    Page(s): 586 - 592 vol.2

    We present faster sequential and parallel algorithms for computing the solvent accessible surface area (ASA) of protein molecules. The ASA can be obtained by calculating the exposed surface area of the spheres obtained by increasing the van der Waals radii of the atoms by the van der Waals radius of the solvent. Using domain-specific knowledge, we show that the number of sphere intersections is O(n) and present algorithms to compute them in O(n log n) sequential time and O(n log n / p) parallel time, where n is the number of atoms and p is the number of processors. We also present a heuristic based on space-filling curves to improve performance in practice. These are significant improvements over previously known algorithms, which take Ω(n^2) time sequentially and Ω(n^2/p) time in parallel. While existing parallel algorithms achieve their run-time by dynamic load balancing, our algorithms are faster and do not need load balancing.
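
    The sphere-intersection step can be sketched as follows. This is a hedged illustration, not the paper's algorithm: it substitutes a plain uniform grid (cell list) for the authors' method and their space-filling-curve heuristic, and the function name, coordinates, and radii are hypothetical. It only shows why testing all n^2 pairs is unnecessary when each expanded sphere can touch only nearby spheres.

```python
import math
from collections import defaultdict
from itertools import product


def intersecting_pairs(centers, radii, probe_radius):
    """Return index pairs (i, j), i < j, whose probe-expanded spheres overlap."""
    expanded = [r + probe_radius for r in radii]
    # Two overlapping spheres are closer than 2 * max radius, so with this
    # cell size they always fall in the same cell or in adjacent cells.
    cell = 2.0 * max(expanded)
    grid = defaultdict(list)
    for i, (x, y, z) in enumerate(centers):
        key = (math.floor(x / cell), math.floor(y / cell), math.floor(z / cell))
        grid[key].append(i)

    pairs = []
    for (cx, cy, cz), members in grid.items():
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
                for i in members:
                    if i < j and math.dist(centers[i], centers[j]) < expanded[i] + expanded[j]:
                        pairs.append((i, j))
    return sorted(pairs)


# Hypothetical atoms: only the first two are close enough to intersect.
centers = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
radii = [1.7, 1.7, 1.7]
print(intersecting_pairs(centers, radii, probe_radius=1.4))  # [(0, 1)]
```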
  • An efficient parallel algorithm for Lagrange interpolation and its performance

    Page(s): 593 - 598 vol.2

    This paper introduces a parallel algorithm for computing an N = n·2^n point Lagrange interpolation on n-dimensional cube-connected cycles (CCC_n). The algorithm exploits several communication techniques in a novel way that can be adapted for computing similar functions. The performance of the algorithm is also evaluated by means of a speedup measure. It shows near-optimal speedup for a state-of-the-art implementation technology.
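
    For reference, the quantity being computed is the Lagrange interpolating polynomial evaluated at a point. The sequential Python sketch below is an illustration only (the function name and sample points are hypothetical, and nothing here reflects the paper's CCC mapping); it shows that each of the N terms is an independent product, which is the work a parallel algorithm has to distribute and then sum.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        # Basis polynomial L_i(x): product over all other nodes.
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Samples of x^2 + x + 1 at x = 0, 1, 2; the interpolant reproduces it exactly.
print(lagrange_interpolate([0, 1, 2], [1, 3, 7], 3))  # 13.0
```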
  • Programming FFT on DSM multiprocessors

    Page(s): 599 - 606 vol.2

    It is not yet clear how well the shared address space programming model performs for the coarse-grained communicating programs that have traditionally been common in scientific computing. We use the challenging 1-dimensional FFT, a regular coarse-grained program, as our driving application to study how to get high performance for such applications under the shared address space programming model on a hardware-supported cache-coherent distributed memory machine. We find that performance is highly affected by data placement, so proper data placement is critical to the success of this kind of application. Prefetching can further improve performance by 10 to 50 percent for the data sets we studied. Naive programming easily creates a performance bottleneck by introducing much more contention, leading to a large performance loss. If shared address space programs are properly written, they deliver much better performance than other popular programming models, such as MPI and SHMEM.
  • Parallel complexity for solving tridiagonal linear systems with multiple right-hand sides on 2-D torus interconnection networks

    Page(s): 607 - 612 vol.2

    Precise upper and lower bounds on running time are derived for the problem of solving tridiagonal linear systems with multiple right-hand-side (RHS) vectors on 2-dimensional torus interconnection networks. We present various important lower bounds on execution time for solving these systems using odd-even cyclic reduction. Furthermore, algorithms are designed to achieve running times that are within a small constant factor of the lower bounds provided.
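
    As background, odd-even cyclic reduction can be sketched sequentially as below. This is a minimal illustration under assumptions, not the paper's torus algorithm: it handles a single right-hand side, the function name is hypothetical, and no pivoting is done (so it assumes, for example, a diagonally dominant matrix). Within one reduction level every loop iteration is independent, which is what a parallel version exploits.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by odd-even cyclic reduction.

    a, b, c are the sub-, main and super-diagonals (a[0] and c[-1] must be 0);
    d is the right-hand side. Returns the solution as a list.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction: eliminate the even-indexed neighbours from each odd-indexed equation.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        has_next = i + 1 < n
        alpha = a[i] / b[i - 1]
        gamma = c[i] / b[i + 1] if has_next else 0.0
        ra.append(-alpha * a[i - 1])
        rb.append(b[i] - alpha * c[i - 1] - (gamma * a[i + 1] if has_next else 0.0))
        rc.append(-gamma * c[i + 1] if has_next else 0.0)
        rd.append(d[i] - alpha * d[i - 1] - (gamma * d[i + 1] if has_next else 0.0))

    # The odd-indexed unknowns satisfy a tridiagonal system of half the size.
    odd = cyclic_reduction(ra, rb, rc, rd)

    # Back-substitution: recover each even-indexed unknown from its own equation.
    x = [0.0] * n
    x[1::2] = odd
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x


# 5x5 system with diagonals (1, 4, 1) whose exact solution is [1, 2, 3, 4, 5].
a = [0.0, 1.0, 1.0, 1.0, 1.0]
b = [4.0] * 5
c = [1.0, 1.0, 1.0, 1.0, 0.0]
d = [6.0, 12.0, 18.0, 24.0, 24.0]
print(cyclic_reduction(a, b, c, d))
```

    With multiple right-hand sides, the coefficients alpha and gamma depend only on the matrix, so they could be computed once and reused for every RHS vector.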
  • A fast digital terrain simplification algorithm with a partitioning method

    Page(s): 613 - 618 vol.2

    We introduce a fast simplification algorithm for terrain height fields to produce a triangulated irregular network, based on the greedy insertion algorithm in (Anjyo et al., 1992; Floriani et al., 1984; 1985). Our algorithm partitions terrain height data into rectangular blocks of the same size and simplifies the blocks one by one with the greedy insertion algorithm. To add a point to the triangulation, our algorithm references only the points and triangles within the current block. Therefore, our algorithm runs faster than the greedy insertion algorithm, which references all input points and triangles in the terrain. Our experiment shows that the partitioning method runs from 4 to more than 20 times faster and approximates test height fields as accurately as the greedy insertion algorithms. Most greedy insertion algorithms suffer from elongated triangles that usually appear near the boundaries. In contrast, we insert the four corner points into each block to produce the base triangulation of the block before the point addition step begins, so that elongated triangles cannot appear in the simplified terrain.
  • Average optimal branch-and-bound algorithm on distributed memory systems

    Page(s): 619 - 623 vol.2

    In this paper, a new data structure called string-queue is proposed to implement the selection rule and the elimination rule of the general branch-and-bound algorithm more efficiently. A new general parallel branch-and-bound algorithm on a distributed memory multiprocessor system is also presented. Its communication complexity in a single iteration reaches its lower bound O(p) on a 2D mesh network. Both theoretical analysis and experimental results show that its average computational complexity is close to its lower bound.
  • An identification scheme based on the elliptic curve discrete logarithm problem

    Page(s): 624 - 625 vol.2

    We present a three-move interactive identification scheme based on the elliptic curve discrete logarithm problem. Our identification scheme is as efficient as the elliptic curve version of the Schnorr identification scheme. The scheme inherits almost all the merits of the Schnorr identification scheme and of the elliptic curve version of the Schnorr identification scheme.
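
    For readers unfamiliar with the three-move pattern (commit, challenge, response), the sketch below shows the classic Schnorr identification scheme that the abstract compares against. It is not the paper's scheme: it works in a multiplicative group modulo a prime rather than on an elliptic curve, the parameters are toy values far too small for real use, and all function names are hypothetical.

```python
import secrets

# Toy parameters: G = 2 has prime order Q = 11 modulo P = 23 (2^11 mod 23 == 1).
P, Q, G = 23, 11, 2


def keygen():
    """Return (secret x, public y = G^x mod P)."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def commit():
    """Move 1 (prover): pick a random nonce r and send t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)


def challenge():
    """Move 2 (verifier): send a random challenge c."""
    return secrets.randbelow(Q)


def respond(x, r, c):
    """Move 3 (prover): send s = r + c*x mod Q."""
    return (r + c * x) % Q


def verify(y, t, c, s):
    """Verifier accepts if G^s == t * y^c (mod P)."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P


x, y = keygen()
r, t = commit()
c = challenge()
s = respond(x, r, c)
print(verify(y, t, c, s))  # True
```

    The check works because G^s = G^(r + c*x) = G^r * (G^x)^c = t * y^c modulo P.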
  • A quick adjustable large page B-tree on multiprocessors-QALPB-tree

    Page(s): 626 - 628 vol.2

    We consider how to exploit multiprocessors to improve the performance of a B-tree structured index. We present a new large page B-tree, called the QALPB-tree. The B-tree can adjust itself quickly, and it has perfect load balance. Preliminary performance results confirm that the QALPB-tree improves query response time, especially for the indexing of massive data.
  • Three term weighting and classification algorithms in text automatic classification

    Page(s): 629 - 630 vol.2

    Three automatic text classification algorithms are provided: a Bayes method based on Bayes' theorem and IDF (Inverse Document Frequency), a VSM (vector space model) method based on Shannon entropy, and a fuzzy method based on fuzzy theory. Furthermore, a method of combining term weighting methods with the three classification algorithms is also provided in the paper.
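
    As a reminder of what IDF-based term weighting looks like, here is a minimal Python sketch of the standard TF-IDF weight, assuming the common definition idf(t) = log(N / df(t)). The abstract does not specify the paper's exact weighting formulas, so this is only the textbook form; the function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def idf_weights(documents):
    """idf(t) = log(N / df(t)), where df(t) counts the documents containing t."""
    n_docs = len(documents)
    df = Counter(term for doc in documents for term in set(doc))
    return {term: math.log(n_docs / count) for term, count in df.items()}


def tf_idf(document, idf):
    """Weight each term of one document by (term frequency) * idf."""
    tf = Counter(document)
    return {term: count * idf.get(term, 0.0) for term, count in tf.items()}


# Hypothetical tokenized corpus.
docs = [
    ["parallel", "fft"],
    ["parallel", "tree"],
    ["parallel", "fft", "fft"],
]
idf = idf_weights(docs)
print(tf_idf(docs[2], idf))
```

    A term that occurs in every document, such as "parallel" here, gets weight 0 and so contributes nothing to distinguishing the classes.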
  • A kind of four-term recurrence method of minimal residual

    Page(s): 631 - 632 vol.2

    We put forward a short-recurrence minimal residual method for large-scale nonsymmetric linear systems. Both theoretical analysis and numerical experiments show that it is better than well-known methods such as BCG and CGS and the more recent GPBCG.
  • A new approach for implementing the arithmetic Fourier transform (AFT)

    Page(s): 633 - 634 vol.2

    The arithmetic Fourier transform (AFT) is an important Fourier analysis technique. Since AFT algorithms require many non-uniform samples, zero-order interpolation is used for implementing the AFT, but this method can produce significant errors. To reduce the errors, over-sampling is needed, meaning the sampling rate should be several times the Nyquist rate. The over-sampling problem is the main drawback of the AFT. In this paper a new method for implementing the AFT with no need for over-sampling is presented. This method achieves nearly the same effect as the method using over-sampling, so it makes it possible to implement the AFT with sampling at the Nyquist rate.
  • BSP-based parallel simplex method

    Page(s): 635 - 639 vol.2

    We introduce a BSP-based parallel simplex method algorithm and analyze its computational cost. Then we give some experimental results on a PC cluster and draw some conclusions.
  • Two-stage parallel partial retraining scheme for defective multi-layer neural networks

    Page(s): 642 - 647 vol.2

    We address a high-speed defect compensation method for multi-layer neural networks implemented in hardware devices. To compensate for stuck defects of the neurons and weights, we have proposed a partial retraining scheme that uses a backpropagation (BP) algorithm to adjust the weights of a neuron affected by stuck defects between two layers. Since the functions of defect compensation can be achieved by using learning circuits, we can save chip area. Because the scheme reduces the number of weights to adjust, it also leads to high-speed defect compensation. We propose a two-stage partial retraining scheme to compensate for input unit stuck defects. Our simulation results show that the two-stage partial retraining scheme can be about 100 times faster than whole network retraining by the BP algorithm.
  • Queuing analysis of polling system with mixed serve

    Page(s): 648 - 653 vol.2

    In this paper, the mean cyclic period of polling systems with mixed serve, in which the gated serve and limited serve are applied to different classes of messages respectively, is given. The mixed serve is introduced first, and then the mean queue length and mean cyclic period are obtained through the embedded Markov chain, generating function and Laplace-Stieltjes transform. Further, station stability and system stability are distinguished, and the mean cyclic period of the polling system when some or all queues are unstable, in which messages are subject to the limited serve, is given. Lastly, the performance of the polling system under mixed serve is compared with that under gated serve based on the mean cyclic period.
  • Performance potentials based stochastic optimization and parallel algorithm for a class of CQN

    Page(s): 654 - 658 vol.2

    We provide new derivative formulas of the steady-state performance cost for a class of CQN (Closed Queuing Network) defined on an admissible policy set. Three fundamental quantities involved in the derivative formulas are given: performance potentials, realization factors and the group inverse of the infinitesimal generator. Some simulation-based algorithms are used to estimate these performance potentials by analyzing a single sample path of the CQN, and two main methods, parallel matrix computation and CRN, are introduced to calculate these quantities. The algorithm for the optimal service policy that minimizes the performance cost is obtained by using a parallel stochastic optimization method driven by a performance potential-based gradient estimate.
  • A GA-based systematic reasoning approach for solving traveling salesman problems using an orthogonal array crossover

    Page(s): 659 - 663 vol.2

    This paper proposes a novel genetic algorithm-based systematic reasoning approach using an orthogonal array crossover (OAX) for solving the traveling salesman problem (TSP). OAX makes use of the systematic reasoning ability of orthogonal arrays, which can effectively preserve superior sub-paths from parents and guide the solution towards better quality. OAX combines the advantages of two traditional approaches: the canonical approach and the heuristic approach. It is shown empirically that OAX outperforms various superior crossovers in both accuracy and speed. An improved OAX with a well-known heuristic method is also presented.
  • Parameter-free genetic algorithm in distributed manner

    Page(s): 668 - 669 vol.2

    The genetic algorithm has many parameters to set and adjust. The paper proposes a distributed, parameter-free, crossover-only genetic algorithm. With an adaptive crossover probability and operator, the algorithm is independent of the initial choice of crossover-related parameters. To obtain an appropriate population size, multiple trials are executed in a mobile agent based distributed virtual machine, doubling the population size whenever the current one has converged. The validity and efficiency of this algorithm are shown by an example involving heterogeneous scheduling in a unified resource framework.
  • Logical properties of rough sets

    Page(s): 670 - 671 vol.2

    The authors present a representation theorem for topological Boolean algebras. The result is similar to Stone's representation theorem for Boolean algebras, and it establishes the relationship between topological Boolean algebras and rough sets over general sets. The article is motivated by the problem of the logic of rough sets proposed by Z. Pawlak (1995).
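
    For context, a rough set describes a target set by its lower and upper approximations with respect to a partition of the universe into indiscernibility classes. The Python sketch below shows only these standard definitions, not the paper's representation theorem; the function name and the example partition are hypothetical.

```python
def approximations(partition, target):
    """Return (lower, upper) approximations of target w.r.t. a partition.

    lower: union of the blocks entirely contained in target.
    upper: union of the blocks that share at least one element with target.
    """
    target = set(target)
    lower, upper = set(), set()
    for block in partition:
        block = set(block)
        if block and block <= target:
            lower |= block
        if block & target:
            upper |= block
    return lower, upper


# Hypothetical universe {1..5} split into three indiscernibility classes.
partition = [{1, 2}, {3, 4}, {5}]
lower, upper = approximations(partition, {1, 2, 3})
print(lower, upper)  # {1, 2} {1, 2, 3, 4}
```

    The lower approximation behaves like an interior operator and the upper approximation like a closure operator, which is the usual bridge between rough sets and topological Boolean algebras.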
  • A method to build ontology

    Page(s): 672 - 673 vol.2

    After many years of research, many knowledge-based intelligent systems have been created. However, differences in construction methods and application contexts make it difficult to share and reuse knowledge, which in turn makes knowledge systems difficult to build. To solve this problem, we use ontology as a foundation for knowledge sharing and reuse. Although ontology building is an important research area in AI, there is not yet a consensus on how to do it. The authors mainly discuss a method for building ontology, its principles and its implementation.
  • A general evolutionary algorithm and its property analysis

    Page(s): 674 - 675 vol.2

    EP (evolutionary programming), ES (evolutionary strategy) and GA (genetic algorithm) are three optimization approaches inspired by the natural evolution process; they have much in common in terms of computing models and algorithms. A general evolutionary algorithm is proposed and its convergence properties are analyzed. It is claimed that if some quasi-stable states exist under a design strategy, the algorithm will definitely converge to one of those states.
  • An MIS security strategy based on client/server architecture

    Page(s): 676 - 677 vol.2

    Information security is always a major problem for administrators and computer professionals in MIS (management information systems). It is becoming more severe, especially in today's networked environments. Both managers and computer professionals have tried many methods for solving this problem. Generally speaking, information security is a composite and systematic problem, which should be considered from every aspect of MIS. Besides controlling data access on the database, we need to plan comprehensively, from the system network level to the MIS application level, and to formulate a complete MIS security control strategy. The article presents a study of MIS security based on the client/server architecture. It gives data and system security strategies and methods at the database, computer network, and MIS application levels.
  • An integrated development environment for concurrent software developing based on object oriented Petri nets

    Page(s): 678 - 680 vol.2

    An approach to concurrent software development is proposed. The article introduces the implementation principles of an integrated development environment for concurrent software based on object oriented Petri nets (OOPN) with multi-computer coordinated working. Under this method, two programming levels of concurrent software design are adopted. The top level is for modeling the concurrent part by drawing object oriented Petri net pictures. The bottom level is for programming the sequential part in traditional high level languages. OOPN-IDE (Object Oriented Petri Net based Integrated Development Environment) is the research result for the methodology (J. Niu, 1999). It automatically integrates the concurrent part (Petri net pictures) and the sequential part (high level language programs) into a whole concurrent program.