Parallel Processing Workshops, 2003. Proceedings. 2003 International Conference on

Date 6-9 Oct. 2003

  • A parallel tabu search heuristic for clustering data sets

    Page(s): 230 - 235

    Clustering methods partition a set of objects into clusters such that objects in the same cluster are more similar to each other than to objects in different clusters, according to some defined criteria. In this paper, a parallel tabu search heuristic for solving this problem is developed and implemented on a cluster of PCs. We observe that parallelization does not affect the quality of the clustering results, but provides a large saving in computation time in practice.

  • Using a distributed active tree in Java for the parallel and distributed implementation of a nested optimization algorithm

    Page(s): 244 - 251

    Large tree-structured optimization problems can be solved efficiently with decomposition methods. We present a parallel and distributed Java implementation and compare a synchronous version with an asynchronous one. We describe a Java distributed active tree middleware on top of which the algorithm has been implemented. In the algorithmic layer, active tree node objects perform loop iterations and interact via coordination objects provided by the coordination layer, which has been implemented on top of Java remote method invocation. The programming model allows for an object-oriented, data-parallel, shared-memory formulation of parallel and distributed algorithms operating on tree structures, and is extensible to other structures as well. We discuss our implementation strategy and experimental results obtained on a Beowulf SMP cluster and on a network of workstations. The optimization algorithm is part of a high-performance decision support tool for asset and liability management.

  • Practical considerations of using tunable lasers for packet routing in multiwavelength optical networks

    Page(s): 325 - 331

    Using tunable lasers for burst or packet routing in practical IP-centric networks faces many challenges. The wavelength accuracy and stability will affect the ways of encoding payload and headers. In particular, using subcarrier multiplexing to carry the header will make the wavelength stability of a tunable laser a very challenging problem. We investigate the effects of tuning hysteresis on the switching speed and channel control. Optimizing the device structure and lowering the bias current can reduce the hysteresis effect. Integrating a tunable laser with semiconductor optical amplifiers can provide output gating during tuning and can allow the hysteresis to be reduced. We demonstrate tunable lasers that can switch to accurate and stable DWDM wavelengths in 10 nanoseconds. We conclude that fast tunable lasers can be applied to optical burst or packet switching by improving the tuning control as well as optimizing the device structure.

  • From distributed sequential computing to distributed parallel computing

    Page(s): 255 - 262

    One approach to distributed parallel programming is to utilize self-migrating threads. Computations can be distributed first, and parallelized second. The first step produces a distributed sequential thread, which can be incrementally parallelized by the second step. This paper prescribes three transformations that turn distributed sequential programs into distributed parallel programs. Real-life examples and performance data are presented, and the advantages of our approach are discussed.

  • A novel mobile IP registration scheme for hierarchical mobility management

    Page(s): 367 - 374

    The use of portable computing devices has been booming in recent years. Ensuring security in mobile systems has become a very important issue. Secure Mobile IP is designed to solve this problem, but it suffers from a long delay when the mobile host frequently roams between different agents in the same visited network. The new foreign agent must authenticate the mobile host via the mobile host's home agent. To reduce the overhead of authentication and home registration, we propose a secure Mobile IP registration scheme with hierarchical mobility management. We employ a one-way hash function and a symmetric cryptosystem to reduce the computation cost of authentication. Furthermore, we deploy a group key for each foreign agent to simplify the authentication procedure between the mobile host and the visited foreign agent.

  • An energy efficient data reaccess scheme for data broadcast in mobile computing environments

    Page(s): 5 - 12

    The data broadcast approach is an efficient technique for disseminating data in mobile computing environments. To reduce the response time and the power consumption of the data broadcast approach, a mobile client may store frequently accessed data items in its cache. When a cached data item becomes out of date, the mobile client has to reaccess the new content of the data item from the broadcast channel. Reaccessing a cached data item may incur significant power consumption and suffer from a long delay. In this paper, we propose a data reaccess scheme which enables a mobile client to efficiently reaccess a cached data item. The strength of the proposed scheme lies in its capability to allow a mobile client to correctly reaccess its cached data items while the server inserts data items into or deletes data items from the broadcast structure in the course of data broadcasting. Our experiment shows that the proposed scheme significantly reduces the tuning time required to reaccess a cached data item.

  • Towards a single system image for high-performance Java

    Page(s): 143 - 145

    Summary form only given. We present our attempts to provide a partial SSI (single system image) in a cluster for concurrent Java programmers, and discuss how the design of the Java Virtual Machine (JVM) has made it possible. At the core of our present design are a thread migration mechanism that works for Java threads compiled in just-in-time (JIT) mode, and an efficient global object space that enables cross-machine access to Java objects. We close with some thoughts on what can be done next to popularize our approach or similar ones.

  • A hierarchical proxy architecture with load-based scheduling scheme to support network mobility

    Page(s): 13 - 20

    A mobile router and all its attached nodes constitute a mobile network. A mobile network may be deployed in a mass transportation system to enable people to access the Internet when traveling. However, people in the mobile network have to share scarce wireless bandwidth, and thus tolerate unreasonably long delays if their requests are not scheduled appropriately. In this paper, we present a hierarchical proxy architecture and a load-based scheduling scheme to enhance the performance of mobile networks with two-tier wireless interfaces. The load-based scheduling scheme classifies Internet services into lightweight and heavyweight services according to their response data sizes and allows heavyweight services to be transferred only in the low-tier network. With the load-based scheduling scheme, we can prevent heavyweight services from occupying the scarce high-tier bandwidth and hampering the transmission of other lightweight services. Simulation results show that the proposed architecture and scheduling scheme can significantly increase the completion rate and decrease the average waiting time of both lightweight and heavyweight services, and thus serve more concurrent requests for multiple services in the mobile network.
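The classification rule in this abstract can be illustrated with a minimal sketch. The abstract does not give the size boundary or any interface names, so `HEAVY_THRESHOLD_BYTES`, `Request`, `classify`, and `allowed_tiers` are all hypothetical, and the sketch leaves out the proxy hierarchy and the actual queueing.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the abstract does not state the size boundary.
HEAVY_THRESHOLD_BYTES = 100_000


@dataclass
class Request:
    service: str
    response_bytes: int  # expected size of the response data


def classify(req: Request) -> str:
    """Label a request as lightweight or heavyweight by its response size."""
    if req.response_bytes > HEAVY_THRESHOLD_BYTES:
        return "heavyweight"
    return "lightweight"


def allowed_tiers(req: Request) -> set:
    """Heavyweight traffic is confined to the low tier, keeping the
    scarce high-tier bandwidth free for lightweight requests."""
    if classify(req) == "heavyweight":
        return {"low-tier"}
    return {"low-tier", "high-tier"}


if __name__ == "__main__":
    for req in (Request("dns-lookup", 200), Request("video-clip", 5_000_000)):
        print(req.service, classify(req), sorted(allowed_tiers(req)))
```

A real scheduler would also have to decide when a lightweight request uses which tier; the abstract does not say, so the sketch only returns the set of tiers a request is permitted to use.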

  • A self-determinant scatternet formation algorithm for multi-hop Bluetooth networks

    Page(s): 289 - 296

    In this paper we propose a distributed algorithm to construct a scatternet for multi-hop ad hoc networks of Bluetooth devices. This algorithm is fully distributed and does not require all nodes in the network to be in range of one another (i.e., a given pair of nodes in the network may be unable to communicate with each other directly). The role-selection process in existing scatternet formation schemes mostly relies on exchanging messages and comparing node weights such as IDs or power strength. This results in a large number of control messages being sent and a longer scatternet formation time. In our algorithm, the role-selection procedure is simple. Nodes decide their roles by a randomly generated counter rather than by their 'weights'. With the proposed approach, a node can determine its role as either a master or a slave of the piconet without knowing its neighbors' 'weights'. The algorithm achieves a shorter formation time and markedly reduces the number of control messages during the role-selection process. In this paper, we also define 2-hop and 3-hop gateways for evaluating the distance between two piconets.

  • An indoor geolocation system for wireless LANs

    Page(s): 29 - 34

    With the development of wireless local area networks (WLAN), people are interested in developing location-based services for WLAN users, especially in indoor environments. The core technology of location-based services is location sensing. In this paper, we present a location-sensing method based on the location fingerprinting approach for WLAN in an indoor environment. This paper focuses on the implementation issues of constructing an indoor location-sensing system. These issues include how to increase positioning accuracy, how to reduce the effort of constructing a fingerprint database, and which factors influence the characteristics of the location fingerprint. The experimental results show that our method can achieve better accuracy.
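For readers unfamiliar with location fingerprinting, the sketch below shows one common form of it: nearest-neighbor matching in signal space. The abstract does not say which matching rule the paper uses, so this is an illustration of the general idea only; the database contents, the location labels, and the function name `locate` are all hypothetical.

```python
import math

# Hypothetical fingerprint database: location label -> mean received signal
# strength (dBm) from three access points, recorded during a site survey.
FINGERPRINTS = {
    "room_101": (-40.0, -70.0, -85.0),
    "room_102": (-65.0, -45.0, -80.0),
    "corridor": (-60.0, -60.0, -60.0),
}


def locate(observed, database=FINGERPRINTS):
    """Return the surveyed location whose stored fingerprint has the
    smallest Euclidean distance to the observed signal strengths."""
    return min(database, key=lambda loc: math.dist(database[loc], observed))


if __name__ == "__main__":
    print(locate((-42.0, -68.0, -83.0)))
```

The cost of building `FINGERPRINTS` by hand for every location is the database construction effort the abstract refers to.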

  • A fast branch-and-bound scheme for the multiprocessor scheduling problem with communication time

    Page(s): 104 - 111

    In this paper, we propose a fast branch-and-bound (B&B) algorithm for solving the multiprocessor scheduling problem with non-negligible communication time. The basic idea of our proposed method is to focus on an "inevitable" communication delay that cannot be avoided in any assignment of tasks to the processors. The proposed method is implemented as part of a B&B scheme, and the performance of the scheme is evaluated experimentally. The experimental results imply that, for randomly generated instances consisting of at most 300 tasks: 1) we can solve more than 90% of those instances within one minute if communication takes zero time units; 2) the percentage of hard instances increases with the number of processors and the time required for each communication; and 3) the proposed method achieves a significant improvement in the lower bound of partial solutions, especially for those hard instances. These results suggest that the proposed method could output an optimum solution for many instances within a short computing time by combining it with a good heuristic that gives a better upper bound.

  • Parallel differential evolutionary algorithms for physique states characterization in bioengineering

    Page(s): 213 - 219

    This paper presents a methodology to detect the physiological state of strains in a bioreactor online. A biological reaction means that one of the metabolic pathways is activated, in which case the microorganisms will produce or consume [...]. The discrete-event system (DES) is synthesized by applying the maximum of the modulus of the wavelet transform to the measured signals, subject to validation by a biotechnology expert. Determining the Hölder coefficient with differential evolutionary algorithms makes it possible to distinguish between different discontinuities and to segment the signals. Together, these evaluations associate the signal variations over time with physiological states.

  • Automatic detection of multi-level deadlocks in distributed transaction management systems

    Page(s): 297 - 304

    This study proposes a model of asynchronous transaction management. The model demonstrates a procedure for eliminating delays caused by the occurrence of distributed deadlocks. The possibility of deadlocks occurring is eliminated by using multiple asynchronous operations. With the proposed model of activity, many conventional delays associated with transaction processing are eliminated prior to the occurrence of a wait state.

  • A scalable task duplication based algorithm for improving the schedulability of real-time heterogeneous multiprocessor systems

    Page(s): 89 - 96

    In this paper, we propose an O(v^2) scalable duplication-based algorithm (RT-SDA) for scheduling precedence-constrained real-time tasks on heterogeneous multiprocessors. This models a network of workstations with processors of varying computing power. The algorithm takes the heterogeneity of both computation and communication in the multiprocessor system into account. RT-SDA employs selective task duplication to reduce the start time of the real-time tasks in the job, thereby increasing the guarantee ratio of the real-time application. Moreover, our scheme is scalable in that the application can be scheduled even if the available number of processors is less than the required number of processors. Compared to the existing scheduling algorithms in the literature, RT-SDA offers better schedulability in terms of a higher guarantee ratio.

  • Parallel multi-scale computation using the message passing interface

    Page(s): 199 - 204

    A sequential three-dimensional hybrid molecular-dynamics (MD)/finite-element (FE) code for continuum and atomistic multi-scale modeling and computation has been parallelized using the message passing interface (MPI) library. A master-slave divide-and-conquer approach, emphasizing the functionality and robustness of the code, is implemented through loop parallelism. It has reduced execution time, yielding a speedup greater than 3, and has shown potential for further speedup. The smoothness of the stress distribution across the overlapping region between the continuum domain and the atomistic domain demonstrates the suitability of this method for nanostructure modeling.

  • LKHW: a directed diffusion-based secure multicast scheme for wireless sensor networks

    Page(s): 397 - 406

    In this paper, we present a mechanism for securing group communications in Wireless Sensor Networks (WSN). First, we derive an extension of logical key hierarchy (LKH). Then we merge the extension with directed diffusion. The resulting protocol, LKHW, combines the advantages of both LKH and directed diffusion: robustness in routing, and security from the tried and tested concepts of secure multicast. In particular, LKHW enforces both backward and forward secrecy, while incurring an energy cost that scales roughly logarithmically with the group size. This is the first security protocol that leverages directed diffusion, and we show how directed diffusion can be extended to incorporate security in an efficient manner.
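The logarithmic scaling mentioned in this abstract comes from the key tree that plain LKH maintains. The sketch below illustrates that idea for a full binary tree; it is not the LKHW protocol itself, which also involves directed diffusion. The heap-style node numbering and the function names `keys_to_refresh` and `rekey_cost` are assumptions made for the example.

```python
def keys_to_refresh(leaf_index: int) -> list:
    """Heap-style indices of the keys on the path from a leaf's parent up to
    the root (index 1). A departing member held exactly these keys, so each
    of them has to be replaced."""
    path = []
    node = leaf_index // 2
    while node >= 1:
        path.append(node)
        node //= 2
    return path


def rekey_cost(group_size: int) -> int:
    """Number of keys replaced when one member leaves a full binary key tree.
    Assumes group_size is a power of two, with the members' leaves stored at
    indices group_size .. 2 * group_size - 1."""
    first_leaf = group_size
    return len(keys_to_refresh(first_leaf))


if __name__ == "__main__":
    for n in (8, 64, 1024):
        print(n, "members:", rekey_cost(n), "keys replaced")
```

Under these assumptions the number of replaced keys is log2 of the group size, compared with one fresh message per remaining member if the group shared a single flat key.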

  • Software queue-based algorithms for pipelined synchronization on multiprocessors

    Page(s): 115 - 122

    Synchronization either ensures mutual exclusion on shared data or forces a processor to wait until a set of variables reaches a specific state; the latter is called conditional synchronization. We have improved the performance of mutual exclusion on multiprocessors by allowing processors to concurrently access different parts of shared data in a pipelined manner (Takesue, 2002). A special software tree of queue-tail pointers is the key scheme for the pipelining, but it requires other hardware schemes such as a queue distributed in the caches. This paper proposes software queue-based algorithms for pipelined synchronization with only Fetch&Inc as hardware support. We pipeline mutual exclusion by exploiting the software tree. Conditional synchronization is pipelined by declaring the semaphore as a data structure, and by simulating the P and V operations so that the V can be eagerly performed before accessing shared data. Evaluation results with an RTL (register transfer level) simulator show that, compared with hardware queue-based non-pipelined synchronization, the speedup of our pipelining reaches over 2.0 for large data in heavily contentious cases.

  • Neural network training algorithms on parallel architectures for finance applications

    Page(s): 236 - 243

    We focus on the neural network training problem, which can be used for price forecasting or other purposes in finance. We design and develop four different parallel and multithreaded backpropagation neural network algorithms: neuron and training set parallelism on a distributed memory architecture using MPI; and loop-level (fine-grained) and coarse-grained parallelism on a shared memory architecture using OpenMP. We have conducted various experiments to study the performance of these algorithms and compared our results with a traditional autoregression model to establish the accuracy of our results. The comparison between our MPI and OpenMP results suggests that training set parallelism performs better than all the other types of parallelism considered in the study.
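Training set parallelism, as named in this abstract, splits the training samples among workers and combines their partial gradients. The sketch below illustrates the principle on a single linear neuron with a squared-error loss; it runs the "workers" one after another in a single process rather than over MPI, and the function names `gradient`, `split`, and `parallel_gradient` are hypothetical, not taken from the paper.

```python
def gradient(weights, samples):
    """Gradient of half the summed squared error of one linear neuron."""
    grad = [0.0] * len(weights)
    for inputs, target in samples:
        error = sum(w * x for w, x in zip(weights, inputs)) - target
        for i, x in enumerate(inputs):
            grad[i] += error * x
    return grad


def split(samples, workers):
    """Deal the samples round-robin into one shard per worker."""
    return [samples[w::workers] for w in range(workers)]


def parallel_gradient(weights, samples, workers):
    """Each shard's gradient would be computed by a separate worker; here the
    shards are processed in turn and the partial gradients are summed."""
    partials = [gradient(weights, shard) for shard in split(samples, workers)]
    return [sum(column) for column in zip(*partials)]


if __name__ == "__main__":
    data = [((1.0, 2.0), 3.0), ((0.5, -1.0), 0.0),
            ((2.0, 0.0), 1.0), ((1.0, 1.0), 2.0)]
    w = [0.1, -0.2]
    print(gradient(w, data))
    print(parallel_gradient(w, data, workers=2))
```

Because the loss is a sum over samples, the summed shard gradients match the full-batch gradient up to floating-point rounding, so only one gradient vector per worker has to be communicated per step.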

  • Loop-synthesizing transformation for maintaining parallelism and enhancing locality

    Page(s): 156 - 163

    On parallel computers, parallelism and locality are critical to program performance. It is known that the locality and parallelism of loop nests can be improved by loop transformations. However, many useful loop transformations are restricted to perfectly nested loop nests. We present a loop-synthesizing transformation that maintains parallelism and improves locality for a sequence of parallel loop nests. Since the result of our synthesizing transformation is a perfectly nested loop nest, we can then directly apply the loop transformations that are restricted to perfectly nested loop nests, to further parallelize the code and further enhance locality.

  • Performance analysis of approximate string searching implementations for heterogeneous computing platform

    Page(s): 173 - 180

    This paper presents an analytical performance prediction model that can be used to predict the execution time, speedup, and similar performance metrics of four approximate string searching implementations running on an MPI cluster of heterogeneous workstations. The four implementations are based on the master-worker model using static and dynamic allocation of the text collection. The developed performance model has been validated on a cluster of 8 heterogeneous workstations, and it has been shown that the model is able to accurately predict the execution time and other performance metrics of the four parallel implementations.

  • Secure bootstrapping and routing in an IPv6-based ad hoc network

    Page(s): 375 - 382

    The mobile ad hoc network (MANET), which is characterized by an infrastructureless architecture and multi-hop communication, has attracted a lot of attention recently. In the evolution of IP networks to version 6, adopting the same protocol would guarantee the success and portability of MANETs. In this paper, we propose a secure bootstrapping and routing protocol for MANETs. Mobile hosts can autoconfigure and even change their IP addresses based on the concept of CGA (cryptographically generated address), but they cannot hide their identities easily. The protocol is modified from DSR (dynamic source routing) to support secure routing. The neighbor discovery and domain name registration in IPv6 are incorporated and enhanced with security functions. The protocol is characterized by the following features: (i) it is designed based on IPv6; (ii) relying on a DNS server, it allows bootstrapping a MANET with little pre-configuration overhead, so network formation is lightweight; and (iii) it is able to resist a variety of security attacks.

  • A visual approach to specifying message-passing operations

    Page(s): 263 - 270

    Visual programming is a very promising approach for parallel programming because of the complexity of writing parallel programs. There have been several attempts to provide a visual environment for writing parallel programs, but they have achieved only limited success. The commonly used technique is to draw graphs whose nodes represent modules and whose arcs represent communication paths. The graphs are then annotated by attaching conventional program code. In practice, this approach is useful only in a limited number of cases. To improve the situation, a new visual programming environment is being developed that allows the creation of programs from algorithmic "film" specifications with minimal use of text. In this environment, there are six different groups of frames for the programmer to watch, edit, and specify operations. One of them is for specifying I/O operations and communication between software components in a complex program. Specifying communications among processes in a parallel program is a special case in this subsystem. This paper presents a visual environment for specifying communication among processes in a parallel program using a language of micro-icons. As an example, the scatter and gather types of collective communication are presented based on the master/slave scheme of computation. These examples show how to define message-passing communication without using a text-based programming style.

  • The parallel computation of time-dependent Monte Carlo transport

    Page(s): 223 - 229

    Parallel Monte Carlo methods are successful because particles are typically independent and easily distributed to multiple processors. For time-dependent Monte Carlo particle transport problems, however, the communication of scattering source attributes and meshes at each time step reduces parallel efficiency and limits the achievable parallel scale. We study the parallel computation of two types of time-dependent particle transport problems. Adaptive processor assignment in parallel computation and three parallel I/O models with low-cost communication are presented, and the optimized processor choice is obtained. We propose a scheme based on a Monte Carlo layered sampling technique to handle the communication of the scattering source. Parallel scalability is greatly improved, and large speedups over the basic methods are obtained.

  • Secure dynamic distributed routing algorithm for ad hoc wireless networks

    Page(s): 359 - 366

    An ad hoc wireless network permits wireless mobile nodes to communicate without prior infrastructure. Due to the limited range of each wireless node, communication sessions between two nodes are usually established through a number of intermediate nodes. Unfortunately, some of these intermediate nodes might be malicious, forming a threat to the security or confidentiality of exchanged data. While data encryption can protect the content exchanged between nodes, analysis of communication patterns may reveal valuable information about end users and their relationships. Using anonymous paths for communication provides security and privacy against traffic analysis. To establish these anonymous paths, all nodes build a global view of the network by exchanging routing information. In dynamic ad hoc networks, building this global view is not an option. In this paper, we propose and analyze a distributed route construction algorithm for use in the establishment of anonymous routing paths in ad hoc wireless networks.
