
IEEE Transactions on Parallel and Distributed Systems

Issue 3 • March 1998


  • Parallel computation in biological sequence analysis

    Page(s): 283 - 294

    A massive volume of biological sequence data is available in over 36 different databases worldwide, including the sequence data generated by the Human Genome project. These databases, which also contain biological and bibliographical information, are growing at an exponential rate. Consequently, the computational demands needed to explore and analyze the data contained in these databases are quickly becoming a great concern. To meet these demands, we must use high-performance computing systems, such as parallel computers and distributed networks of workstations. We present two parallel computational methods for analyzing these biological sequences. The first method is used to retrieve sequences that are homologous to a query sequence. The biological information associated with the homologous sequences found in the database may provide important clues to the structure and function of the query sequence. The second method, which helps in the prediction of the function, structure, and evolutionary history of biological sequences, is used to align a number of homologous sequences with each other. These two parallel computational methods were implemented and evaluated on an Intel iPSC/860 parallel computer. The resulting performance demonstrates that parallel computational methods can significantly reduce the computational time needed to analyze the sequences contained in large databases.
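
    As a rough illustration of the database-search idea (not the authors' iPSC/860 implementation), the sketch below partitions a sequence database across worker processes and scores each entry against a query; the toy scoring function and all names are made up for the example.

        # Illustrative sketch only: partition a sequence database across worker
        # processes and score each entry against a query, keeping the best hits.
        # The scoring function is a toy stand-in for a real alignment score.
        from multiprocessing import Pool

        def score(query, subject):
            """Toy similarity: count matching positions over the shorter length."""
            n = min(len(query), len(subject))
            return sum(1 for i in range(n) if query[i] == subject[i])

        def score_chunk(args):
            query, chunk = args
            return [(score(query, seq), name) for name, seq in chunk]

        def parallel_search(query, database, workers=4, top=5):
            items = list(database.items())
            chunks = [items[i::workers] for i in range(workers)]
            with Pool(workers) as pool:
                results = pool.map(score_chunk, [(query, c) for c in chunks])
            hits = [hit for part in results for hit in part]
            return sorted(hits, reverse=True)[:top]

        if __name__ == "__main__":
            db = {"seq%d" % i: "ACGT" * (i + 1) for i in range(100)}
            print(parallel_search("ACGTACGTAA", db))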

  • A framework for reinforcement-based scheduling in parallel processor systems

    Page(s): 249 - 260

    Task scheduling is important for the proper functioning of parallel processor systems. The static scheduling of tasks onto networks of parallel processors is well-defined and documented in the literature. However, in many practical situations a priori information about the tasks that need to be scheduled is not available. In such situations, tasks usually arrive dynamically and the scheduling should be performed on-line, or “on the fly”. In this paper, we present a framework based on stochastic reinforcement learning, which is usually used to solve optimization problems in a simple and efficient way. The use of reinforcement learning reduces the dynamic scheduling problem to that of learning a stochastic approximation of an unknown average error surface. The main advantage of the proposed approach is that no prior information is required about the parallel processor system under consideration. The learning system develops an association between the best action (schedule) and the current state of the environment (parallel system). The performance of reinforcement learning is demonstrated by solving several dynamic scheduling problems. The conditions under which reinforcement learning can be used to efficiently solve the dynamic scheduling problem are highlighted.
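
    A minimal sketch of the general idea, assuming an epsilon-greedy learner rewarded with the negative queueing delay of each assignment; this is not the paper's framework, and all parameters are illustrative.

        # Illustrative sketch only (not the paper's algorithm): an epsilon-greedy
        # reinforcement learner that picks a processor for each arriving task and
        # is rewarded with the negative queueing delay it observes.
        import random

        def simulate(num_procs=4, num_tasks=2000, epsilon=0.1, alpha=0.1):
            value = [0.0] * num_procs          # estimated reward per action
            busy_until = [0.0] * num_procs     # when each processor frees up
            clock, total_delay = 0.0, 0.0
            for _ in range(num_tasks):
                clock += random.expovariate(1.0)        # task arrival
                duration = random.uniform(0.5, 1.5)     # task service time
                if random.random() < epsilon:
                    a = random.randrange(num_procs)     # explore
                else:
                    a = max(range(num_procs), key=lambda i: value[i])  # exploit
                start = max(clock, busy_until[a])
                busy_until[a] = start + duration
                delay = start - clock
                total_delay += delay
                value[a] += alpha * (-delay - value[a])  # reward = -delay
            return total_delay / num_tasks

        if __name__ == "__main__":
            print("mean queueing delay:", round(simulate(), 3))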

  • Optimizing computing costs using divisible load analysis

    Page(s): 225 - 234

    A bus-oriented network in which there is a charge for the amount of divisible load processed on each processor is investigated. A cost-optimal processor sequencing result is found which involves assigning load to processors in nondecreasing order of the cost-per-load characteristic of each processor. More generally, one can trade cost against solution time. Algorithms are presented to minimize computing cost with an upper bound on solution time and to minimize solution time with an upper bound on cost. As an example of the use of this type of analysis, the effect of replacing one fast but expensive processor with a number of cheap but slow processors is also discussed. The types of questions investigated here are important for future computer utilities that perform distributed computation for a charge.
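
    The cost-optimal sequencing rule can be illustrated with a small greedy sketch that assigns load in nondecreasing order of cost per unit load under a deadline; the processor figures below are made up, and the bus communication delays that divisible load analysis normally models are ignored here.

        # Illustrative sketch only: assign a divisible load to processors in
        # nondecreasing order of cost per unit load, subject to a deadline T.
        # Processor speeds/costs are invented, and communication delays ignored.

        def assign_load(total_load, processors, deadline):
            """processors: list of (name, cost_per_unit_load, speed)."""
            plan, remaining, total_cost = [], total_load, 0.0
            for name, cost, speed in sorted(processors, key=lambda p: p[1]):
                if remaining <= 0:
                    break
                share = min(remaining, speed * deadline)   # what fits before T
                if share > 0:
                    plan.append((name, share))
                    total_cost += cost * share
                    remaining -= share
            return plan, total_cost, remaining

        if __name__ == "__main__":
            procs = [("fast_expensive", 5.0, 10.0), ("cheap_A", 1.0, 2.0),
                     ("cheap_B", 1.5, 3.0)]
            plan, cost, left = assign_load(20.0, procs, deadline=4.0)
            print(plan, "cost =", cost, "unassigned =", left)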

  • Low expansion packings and embeddings of hypercubes into star graphs: a performance-oriented approach

    Page(s): 261 - 274

    We discuss the problem of packing hypercubes into an n-dimensional star graph S(n), which consists of embedding a disjoint union of hypercubes U into S(n) with load one. Hypercubes in U have from ⌊n/2⌋ to (n+1)⌊log2 n⌋ - 2^(⌊log2 n⌋+1) + 2 dimensions, i.e., they can be as large as any hypercube that can be embedded with dilation at most four into S(n). We show that U can be embedded into S(n) with optimal expansion, which contrasts with the growing expansion ratios of previously known techniques. We employ several performance metrics to show that, with our techniques, a star graph can efficiently execute heterogeneous workloads containing hypercube, mesh, and star graph algorithms. The characterization of our packings includes some important metrics which have not been addressed by previous research (namely, average dilation, average congestion, and congestion). Our packings consistently produce small average congestion and average dilation, which indicates that the induced communication slowdown is also small. We consider several combinations of node mapping functions and routing algorithms in S(n), and obtain their corresponding performance metrics using either mathematical analysis or computer simulation.
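
    The metrics reported above (dilation, average dilation, expansion) can be computed for any guest-to-host mapping; the sketch below does so with breadth-first-search distances on a toy host graph, and is not the paper's star-graph construction.

        # Illustrative sketch only: compute dilation and expansion of an
        # arbitrary graph embedding. The tiny example graphs are placeholders,
        # not the paper's hypercube-into-star-graph packing.
        from collections import deque

        def bfs_distances(adj, source):
            dist = {source: 0}
            queue = deque([source])
            while queue:
                u = queue.popleft()
                for v in adj[u]:
                    if v not in dist:
                        dist[v] = dist[u] + 1
                        queue.append(v)
            return dist

        def embedding_metrics(guest_edges, mapping, host_adj):
            dilations = []
            for u, v in guest_edges:
                dilations.append(bfs_distances(host_adj, mapping[u])[mapping[v]])
            expansion = len(host_adj) / len(mapping)
            return max(dilations), sum(dilations) / len(dilations), expansion

        if __name__ == "__main__":
            # Guest: a 2-cube (4-cycle); host: a 6-node cycle (placeholder host).
            guest_edges = [(0, 1), (1, 3), (3, 2), (2, 0)]
            host_adj = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
            mapping = {0: 0, 1: 1, 3: 2, 2: 3}
            print(embedding_metrics(guest_edges, mapping, host_adj))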

  • A practical approach to dynamic load balancing

    Page(s): 235 - 248

    This paper presents a cohesive, practical load balancing framework that improves upon existing strategies. These techniques are portable to a broad range of prevalent architectures, including massively parallel machines, such as the Cray T3D/E and Intel Paragon, shared memory systems, such as the Silicon Graphics PowerChallenge, and networks of workstations. As part of the work, an adaptive heat diffusion scheme is presented, as well as a task selection mechanism that can preserve or improve communication locality. Unlike many previous efforts in this arena, the techniques have been applied to two large-scale industrial applications on a variety of multicomputers. In the process, this work exposes a serious deficiency in current load balancing strategies, motivating further work in this area.
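
    For context, a classic first-order diffusion iteration looks like the sketch below, where each processor moves a fixed fraction of its load imbalance toward every neighbor; the paper's adaptive heat diffusion scheme refines this basic idea, which is all that is shown here.

        # Illustrative sketch only: generic first-order diffusion balancing,
        # not the paper's adaptive heat-diffusion scheme. Each node repeatedly
        # adds alpha * (neighbor_load - own_load) summed over its neighbors.

        def diffuse(load, neighbors, alpha=0.4, steps=200):
            """load: dict node -> work units; neighbors: dict node -> list of nodes."""
            for _ in range(steps):
                load = {u: load[u] + alpha * sum(load[v] - load[u]
                                                 for v in neighbors[u])
                        for u in load}
            return load

        if __name__ == "__main__":
            # A 1-D chain of 8 processors with all the work initially on node 0.
            nodes = range(8)
            neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in nodes}
            load = {i: (80.0 if i == 0 else 0.0) for i in nodes}
            balanced = diffuse(load, neighbors)
            print({i: round(balanced[i], 2) for i in nodes})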

  • Computing performance bounds of fork-join parallel programs under a multiprocessing environment

    Page(s): 295 - 311

    We study a multiprocessing computer system which accepts parallel programs that have a fork-join computational paradigm. The multiprocessing computer system under study is modeled as K homogeneous servers, each with an infinite capacity queue. Parallel programs arrive at the multiprocessing system according to a series-parallel phase-type interarrival process with mean arrival rate h. Upon arrival, a program forks into K independent tasks, and each task is assigned to a unique server. Each task's service time has a k-stage Erlang distribution with mean service time λ. A parallel program is completed upon the completion of its last task. This kind of queuing model has no known closed-form solution in the general (K⩾2) case. In this paper, we show that by carefully modifying the arrival and service distributions at some imbedded points in time, we can obtain tight performance bounds. We also provide a computationally efficient algorithm for obtaining upper and lower bounds on the expected response time. The methodology is flexible and allows one to trade off the tightness of the bounds against computational cost.
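
    A Monte Carlo simulation of the fork-join model gives a feel for the quantity being bounded (Poisson arrivals are assumed here purely for simplicity, with k-stage Erlang service and one task per server); the paper's contribution is the analytic bounds, which this sketch does not compute, and all parameter values are made up.

        # Illustrative sketch only: simulate the fork-join model to estimate
        # the mean response time. This is not the paper's bounding algorithm,
        # and every numeric parameter below is invented.
        import random

        def simulate(K=4, k_stages=2, mean_service=1.0, arrival_rate=0.5,
                     num_programs=50000, seed=1):
            random.seed(seed)
            free_at = [0.0] * K
            clock, total_response = 0.0, 0.0
            for _ in range(num_programs):
                clock += random.expovariate(arrival_rate)       # program arrival
                finish = 0.0
                for s in range(K):                              # fork: one task per server
                    service = random.gammavariate(k_stages, mean_service / k_stages)
                    start = max(clock, free_at[s])
                    free_at[s] = start + service
                    finish = max(finish, free_at[s])            # join: last task ends
                total_response += finish - clock
            return total_response / num_programs

        if __name__ == "__main__":
            print("estimated mean response time:", round(simulate(), 3))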

  • Work-time optimal k-merge algorithms on the PRAM

    Page(s): 275 - 282

    For 2⩽k⩽n, the k-merge problem is to merge a collection of k sorted sequences of total length n into a new sorted sequence. The k-merge problem is fundamental, as it provides a common generalization of both merging and sorting. The main contribution of this work is to give simple and intuitive work-time optimal algorithms for the k-merge problem on three PRAM models, thus settling the status of the k-merge problem. We first prove that Ω(n log k) work is required to solve the k-merge problem on the PRAM models. We then show that the EREW-PRAM requires Ω(log n) time and that both the CREW-PRAM and the CRCW-PRAM require Ω(log log n + log k) time, provided that the amount of work is bounded by O(n log k). Our first k-merge algorithm runs in Θ(log n) time and performs Θ(n log k) work on the EREW-PRAM. Finally, we design a work-time optimal CREW-PRAM k-merge algorithm that runs in Θ(log log n + log k) time and performs Θ(n log k) work. This latter algorithm is also work-time optimal on the CRCW-PRAM model. Our algorithms completely settle the status of the k-merge problem on the three main PRAM models.
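
    To make the Θ(n log k) work bound concrete, a sequential heap-based k-way merge performs exactly that much work; the paper's parallel PRAM algorithms are, of course, quite different from this sketch.

        # Illustrative sketch only: a sequential heap-based k-way merge doing
        # O(n log k) work, included to make the work bound concrete. It does
        # not reproduce the paper's parallel PRAM algorithms.
        import heapq

        def k_merge(sequences):
            """Merge k sorted sequences into one sorted list with O(n log k) work."""
            heap = [(seq[0], i, 0) for i, seq in enumerate(sequences) if seq]
            heapq.heapify(heap)                      # O(k)
            merged = []
            while heap:
                value, i, j = heapq.heappop(heap)    # O(log k) per output element
                merged.append(value)
                if j + 1 < len(sequences[i]):
                    heapq.heappush(heap, (sequences[i][j + 1], i, j + 1))
            return merged

        if __name__ == "__main__":
            print(k_merge([[1, 4, 9], [2, 3, 10], [5, 6, 7, 8]]))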

  • An efficient dynamic scheduling algorithm for multiprocessor real-time systems

    Page(s): 312 - 319

    Many time-critical applications require predictable performance, and tasks in these applications have deadlines to be met. In this paper, we propose an efficient algorithm for nonpreemptive scheduling of dynamically arriving real-time tasks (aperiodic tasks) in multiprocessor systems. A real-time task is characterized by its deadline, resource requirements, and worst-case computation time on p processors, where p is the degree of parallelization of the task. We use this parallelism in tasks to meet their deadlines and, thus, obtain better schedulability compared to nonparallelizable task scheduling algorithms. To study the effectiveness of the proposed scheduling algorithm, we have conducted extensive simulation studies and compared its performance with the myopic scheduling algorithm. The simulation studies show that the schedulability of the proposed algorithm is always higher than that of the myopic algorithm for a wide variety of task parameters.
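
    A toy version of the underlying idea, assuming linear speedup when a task is split across p processors (a simplification made only for this example): split an arriving task across just enough processors to meet its deadline. This is neither the proposed algorithm nor the myopic baseline it is compared with.

        # Illustrative sketch only: a toy nonpreemptive scheduler that splits
        # an arriving task across just enough processors to meet its deadline,
        # assuming linear speedup. Not the paper's algorithm.

        def try_schedule(task, free_at):
            """task: (arrival, deadline, work, max_parallelism). Returns chosen processors."""
            arrival, deadline, work, max_p = task
            order = sorted(range(len(free_at)), key=lambda i: free_at[i])
            for p in range(1, min(max_p, len(free_at)) + 1):
                chosen = order[:p]
                start = max(arrival, max(free_at[i] for i in chosen))
                finish = start + work / p            # linear-speedup assumption
                if finish <= deadline:
                    for i in chosen:
                        free_at[i] = finish          # all p processors are held
                    return chosen
            return None                              # deadline cannot be met

        if __name__ == "__main__":
            free_at = [0.0, 0.0, 0.0, 0.0]
            tasks = [(0.0, 5.0, 8.0, 2), (1.0, 4.0, 6.0, 4), (2.0, 9.0, 4.0, 1)]
            for t in tasks:
                print(t, "->", try_schedule(t, free_at))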


Aims & Scope

IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers.

Full Aims & Scope

Meet Our Editors

Editor-in-Chief
David Bader
College of Computing
Georgia Institute of Technology