IEEE Transactions on Parallel and Distributed Systems

Issue 1 • Jan. 2005

  • [Front cover]

    Publication Year: 2005, Page(s): c1
    PDF (143 KB)
    Freely Available from IEEE
  • [Inside front cover]

    Publication Year: 2005, Page(s): c2
    PDF (74 KB)
    Freely Available from IEEE
  • Editor's note

    Publication Year: 2005, Page(s): 1 - 3
    PDF (130 KB) | HTML
    Freely Available from IEEE
  • A decentralized convergence detection algorithm for asynchronous parallel iterative algorithms

    Publication Year: 2005, Page(s): 4 - 13
    Cited by: Papers (8)
    PDF (336 KB) | HTML

    We introduce a theoretical algorithm, and a practical version of it, for decentralized detection of the global convergence of parallel asynchronous iterative algorithms. We prove that, even though the algorithm is completely decentralized, detection of global convergence is achieved on one processor under the classical conditions. The proposed algorithm is particularly useful in the context of grid computing, in which the processors are distributed and in which detecting convergence on a master processor may be costly or even impossible, as in peer-to-peer computation frameworks. Finally, the efficiency of the practical algorithm is illustrated in a typical experiment.
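
    Illustrative sketch (not the paper's algorithm): one common decentralized pattern circulates a verdict over a logical ring that accumulates the logical AND of every node's local convergence flag; whichever node completes a full pass with the flag still true declares global convergence. The toy Python code below follows this pattern; all names and parameters are invented, and the simulation ignores the asynchronous-iteration pitfalls (in-flight updates, false local convergence) that the paper's algorithm is designed to handle.

    import random

    NUM_NODES = 8
    TOLERANCE = 1e-3

    class Node:
        def __init__(self, node_id):
            self.node_id = node_id
            self.residual = 1.0                  # residual norm of the local iterate

        def iterate(self):
            # One local relaxation step; the residual shrinks at a random rate.
            self.residual *= random.uniform(0.5, 0.9)

        def locally_converged(self):
            return self.residual < TOLERANCE

    def detect_global_convergence(nodes, max_rounds=1000):
        # A token travels around the logical ring carrying the AND of the local
        # convergence flags; the node that closes a full pass with the flag still
        # true announces global convergence, with no master processor involved.
        start = 0
        for _ in range(max_rounds):
            for node in nodes:
                node.iterate()
            token_all_converged = True
            for offset in range(len(nodes)):
                node = nodes[(start + offset) % len(nodes)]
                token_all_converged &= node.locally_converged()
            if token_all_converged:
                return nodes[start].node_id      # detection happens on one processor
            start = (start + 1) % len(nodes)     # the next pass starts elsewhere
        return None

    nodes = [Node(i) for i in range(NUM_NODES)]
    print("global convergence detected on node", detect_global_convergence(nodes))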

  • A process distribution approach for multisensor data fusion systems based on geographical dataspace partitioning

    Publication Year: 2005, Page(s): 14 - 23
    Cited by: Papers (7)
    PDF (270 KB) | HTML

    We present a new approach to distributed sensor data fusion (SDF) systems for multitarget tracking, called TSDF (Tessellated SDF), centered on a geographical partitioning (tessellation) of the data. A functional decomposition divides SDF into components that can be assigned to processing units, parallelizing the processing. The tessellation implicitly defines the set of tracks that can potentially correlate with the sensor plots (observations) in a tile. Some tracks may occur as correlation candidates for multiple tiles; conflicts caused by correlations of such tracks with plots in different tiles are resolved by combining all involved tracks and plots into independent data association problems. The benefit of the TSDF approach over a clustering-based process distribution is its independence of the problem space, which yields better scalability and manageability. The TSDF approach allows scaling in more than one way: it supports SDF for a single sensor, for multiple sensors on a single platform, and even for multiple sensors on multiple platforms, and it provides the flexibility to scale the processing to the size of the problem. This enables better control of the throughput, to meet various timing constraints.
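
    Illustrative sketch (not the paper's TSDF implementation): the toy Python code below assigns plots to square tiles, gates each track into every tile its correlation gate overlaps, and merges tiles that share a candidate track into one independent data-association problem. Tile size, gate radius, and all names are assumptions made for illustration.

    from collections import defaultdict

    TILE = 10.0      # side length of a square tile (arbitrary value)
    GATE = 2.0       # a track may correlate with plots within this distance

    def tile_of(x, y):
        return (int(x // TILE), int(y // TILE))

    def candidate_tiles(x, y):
        # All tiles the track's correlation gate overlaps; the track is a
        # correlation candidate in each of them.
        return {tile_of(x + dx, y + dy)
                for dx in (-GATE, 0.0, GATE) for dy in (-GATE, 0.0, GATE)}

    def independent_problems(plots, tracks):
        # Partition plots by tile, then merge every group of tiles that share a
        # candidate track into one independent data-association problem.
        plots_by_tile = defaultdict(list)
        for p in plots:
            plots_by_tile[tile_of(*p)].append(p)

        parent = {t: t for t in plots_by_tile}        # union-find over tiles
        def find(t):
            while parent[t] != t:
                parent[t] = parent[parent[t]]
                t = parent[t]
            return t

        gated = []                                    # (track index, tiles it reaches)
        for k, tr in enumerate(tracks):
            touched = [t for t in candidate_tiles(*tr) if t in parent]
            gated.append((k, touched))
            for t in touched[1:]:                     # same track in several tiles
                parent[find(t)] = find(touched[0])    # -> combine those tiles

        problems = defaultdict(lambda: {"plots": [], "tracks": set()})
        for tile, ps in plots_by_tile.items():
            problems[find(tile)]["plots"].extend(ps)
        for k, touched in gated:
            for t in touched:
                problems[find(t)]["tracks"].add(k)
        return list(problems.values())

    # e.g. two plots in adjacent tiles and one track whose gate spans both
    print(independent_problems(plots=[(9.5, 5.0), (10.5, 5.0)],
                               tracks=[(10.0, 5.0)]))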

  • Parallel implementation of back-propagation algorithm in networks of workstations

    Publication Year: 2005, Page(s): 24 - 34
    Cited by: Papers (20)
    PDF (800 KB) | HTML

    This work presents an efficient mapping scheme for the multilayer perceptron (MLP) network trained using the back-propagation (BP) algorithm on networks of workstations (NOWs). A hybrid partitioning (HP) scheme is used to partition the network, and each partition is mapped onto processors in the NOW. We derive the processing time and memory space required to implement the parallel BP algorithm on NOWs. Performance parameters such as speed-up and space reduction factor are evaluated for the HP scheme and compared with earlier work based on a vertical partitioning (VP) scheme for mapping the MLP onto NOWs. The performance of the HP scheme is evaluated by solving an optical character recognition (OCR) problem on a network of ALPHA machines. The analytical and experimental results show that the proposed parallel algorithm has better speed-up, less communication time, and a better space reduction factor than the earlier algorithm. This work also presents a simple and efficient static mapping scheme for heterogeneous systems. Using divisible load scheduling theory, a closed-form expression for the number of neurons assigned to each processor in the NOW is obtained. Analytical and experimental results for the static mapping problem on NOWs are also presented.
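
    Illustrative sketch (not the paper's closed-form divisible-load expression): the hypothetical Python snippet below assigns the neurons of one layer to heterogeneous workstations roughly in proportion to their relative speeds, which conveys the flavor of a static mapping; the speeds and layer size are made up.

    def assign_neurons(total_neurons, speeds):
        # speeds[i] ~ relative processing rate of workstation i.
        total_speed = sum(speeds)
        shares = [int(total_neurons * s / total_speed) for s in speeds]
        # Hand out the few neurons lost to rounding, fastest machines first.
        leftover = total_neurons - sum(shares)
        for i in sorted(range(len(speeds)), key=lambda i: -speeds[i])[:leftover]:
            shares[i] += 1
        return shares

    # e.g. a 512-neuron layer spread over four workstations of unequal speed
    print(assign_neurons(512, [1.0, 1.0, 2.0, 0.5]))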

  • VI-attached database storage

    Publication Year: 2005, Page(s): 35 - 50
    Cited by: Papers (3)
    PDF (1144 KB) | HTML

    This work presents a VI-attached database storage architecture to improve database transaction rates. More specifically, we examine how VI-based interconnects can be used to improve I/O path performance between a database server and a storage subsystem. To facilitate the interaction between client applications and a VI-aware storage system, we design and implement a software layer, called DSA, that is layered between applications and VI. DSA takes advantage of specific VI features and deals with many of its shortcomings. We provide and evaluate one kernel-level and two user-level implementations of DSA. These implementations trade transparency and generality for performance to different degrees and, unlike research prototypes, are designed to be suitable for real-world deployment. We have also investigated many design trade-offs in the storage cluster. We present detailed measurements using a commercial database management system with both microbenchmarks and industrial database workloads on a mid-size (4-CPU) and a large (32-CPU) database server. We also compare the effectiveness of VI-attached storage with an iSCSI configuration, and conclude that storage protocols implemented using DSA over VI have significant performance advantages. More generally, our results show that VI-based interconnects and user-level communication can improve all aspects of the I/O path between the database system and the storage back-end. We also find that to make effective use of VI in I/O-intensive environments, substantially more functionality is needed than what VI currently provides. Finally, new storage APIs that help minimize kernel involvement in the I/O path are needed to fully exploit the benefits of VI-based communication.
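
    Purely hypothetical sketch: the Python stub below only illustrates the layering described in the abstract, with a DSA-like user-level storage layer translating block I/O into transfers over a VI-style endpoint that uses pre-registered buffers and polled completions. None of the class or method names come from the paper or from the VI specification.

    class ToyEndpoint:
        # Stand-in for a VI-style endpoint: data moves through memory registered
        # up front, and completions are detected by polling instead of interrupts.
        def __init__(self):
            self._mem = {}
            self._pending = None
        def register(self, name, nbytes):
            self._mem[name] = bytearray(nbytes)
            return self._mem[name]
        def post_request(self, descriptor):
            self._pending = descriptor       # a real endpoint queues a work descriptor
        def poll(self):
            return self._pending             # completion observed without a system call

    class ToyDSA:
        # Hypothetical user-level storage layer between the database and the
        # endpoint; one registered buffer is reused for every block transfer.
        BLOCK = 8192
        def __init__(self, endpoint):
            self.ep = endpoint
            self.buf = endpoint.register("io", self.BLOCK)
        def read_block(self, block_no):
            self.ep.post_request(("READ", block_no))
            self.ep.poll()                   # busy-wait keeps the kernel off the I/O path
            return bytes(self.buf)           # the storage server would have filled this buffer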

  • Cost-effective designs of WDM optical interconnects

    Publication Year: 2005, Page(s): 51 - 66
    Cited by: Papers (21)
    PDF (1120 KB) | HTML

    Optical communication, in particular the wavelength-division multiplexing (WDM) technique, has become a promising networking choice to meet ever-increasing demands on bandwidth from emerging bandwidth-intensive computing and communication applications, such as data browsing in the World Wide Web, multimedia conferencing, e-commerce, and video-on-demand services. As optics becomes a major networking medium for all communication needs, optical interconnects will inevitably play an important role in interconnecting processors in parallel and distributed computing systems. We consider cost-effective designs of WDM optical interconnects for current and future generations of parallel and distributed computing and communication systems. We first categorize WDM optical interconnects into two different connection models based on their target applications: the wavelength-based model and the fiber-link-based model. Most existing WDM optical interconnects belong to the first category. We then present a minimum-cost design for WDM optical interconnects under the wavelength-based model, using sparse crossbar switches instead of full crossbar switches in combination with wavelength converters. For applications that use the fiber-link-based model, we show that network cost can be reduced significantly, and we present a minimum-cost design for WDM optical interconnects under this model as well. Finally, we generalize the idea used in the fiber-link-based design to WDM optical interconnects under the wavelength-based model, and obtain another new design that can trade off switch cost against wavelength converter cost in this type of WDM optical interconnect. The results in this paper are applicable to emerging optical switching technologies, such as SOA-based and MEMS-based technologies.
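
    Illustrative sketch (not the paper's cost model or designs): the Python snippet below counts crosspoints and wavelength converters for two naive full-crossbar organizations, simply to show the kind of switch-cost versus converter-cost trade-off the paper optimizes; all cost weights are arbitrary assumptions.

    def cost_per_wavelength_crossbars(n_fibers, n_wavelengths, crosspoint_cost=1.0):
        # One full n_fibers x n_fibers crossbar per wavelength plane, no converters.
        return n_wavelengths * n_fibers * n_fibers * crosspoint_cost

    def cost_single_big_crossbar(n_fibers, n_wavelengths,
                                 crosspoint_cost=1.0, converter_cost=10.0):
        # All n_fibers * n_wavelengths channels switched by one big crossbar, with
        # a wavelength converter on every output channel so a channel can leave on
        # any output wavelength.
        channels = n_fibers * n_wavelengths
        return channels * channels * crosspoint_cost + channels * converter_cost

    for w in (4, 8, 16):
        print(w,
              cost_per_wavelength_crossbars(32, w),
              cost_single_big_crossbar(32, w))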

  • A two-level directory architecture for highly scalable cc-NUMA multiprocessors

    Publication Year: 2005, Page(s): 67 - 79
    Cited by: Papers (8)
    PDF (1304 KB) | HTML

    One important issue the designer of a scalable shared-memory multiprocessor must deal with is the amount of extra memory required to store the directory information. It is desirable that the directory memory overhead be kept as low as possible and that it scale very slowly with the size of the machine. Unfortunately, current directory architectures provide scalability at the expense of performance. This work presents a scalable directory architecture that significantly reduces the size of the directory for large-scale configurations of a multiprocessor without degrading performance. First, we propose multilayer clustering as an effective approach to reducing the width of directory entries. Based on this concept, we derive three new compressed sharing codes, some of them with a space complexity of O(log2(log2(N))) for an N-node system. Then, we present a novel two-level directory architecture to eliminate the penalty caused by compressed directories in general. The proposed organization consists of a small full-map first-level directory (which provides precise information for the most recently referenced lines) and a compressed second-level directory (which provides in-excess information for all the lines). The proposals are evaluated through extensive execution-driven simulations (using RSIM) of a 64-node cc-NUMA multiprocessor. The results demonstrate that a system with the two-level directory architecture achieves the same performance as a multiprocessor with a large, nonscalable full-map directory, with a very significant reduction in memory overhead.
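
    Illustrative sketch (not the paper's sharing codes): the Python class below shows the two-level lookup idea, a small full-map first level backed by a compressed second level that always returns a superset of the sharers. The coarse bit vector is a simple stand-in for the paper's three compressed codes, and the eviction policy is simplified.

    NODES = 64                         # assumed machine size, matching the 64-node evaluation

    class TwoLevelDirectory:
        def __init__(self, first_level_entries=1024, group_size=8):
            self.capacity = first_level_entries
            self.group = group_size    # second level: one presence bit per group of nodes
            self.first = {}            # small full map: line -> exact set of sharers
            self.second = {}           # compressed: line -> bit vector over node groups

        def add_sharer(self, line, node):
            # Precise entry for recently referenced lines (FIFO eviction stands in
            # for a real replacement policy; reconciling a re-allocated entry with
            # the compressed state is omitted here).
            self.first.setdefault(line, set()).add(node)
            if len(self.first) > self.capacity:
                self.first.pop(next(iter(self.first)))
            # In-excess (superset) entry kept for every line.
            self.second[line] = self.second.get(line, 0) | (1 << (node // self.group))

        def sharers(self, line):
            # Exact sharer set on a first-level hit; otherwise a superset decoded
            # from the compressed code (extra invalidations, never a missed sharer).
            if line in self.first:
                return set(self.first[line])
            bits = self.second.get(line, 0)
            return {n for n in range(NODES) if (bits >> (n // self.group)) & 1}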

  • The UCSC Kestrel parallel processor

    Publication Year: 2005, Page(s): 80 - 92
    Cited by: Papers (30) | Patents (1)
    PDF (1168 KB) | HTML

    The architectural landscape of high-performance computing stretches from superscalar uniprocessors, to explicitly parallel systems, to dedicated hardware implementations of algorithms. Single-purpose hardware can achieve the highest performance, and uniprocessors can be the most programmable. Between these extremes, programmable and reconfigurable architectures provide a wide range of choices in flexibility, programmability, computational density, and performance. The UCSC Kestrel parallel processor strives to attain single-purpose performance while maintaining user programmability. Kestrel is a single-instruction-stream, multiple-data-stream (SIMD) parallel processor with a 512-element linear array of 8-bit processing elements. The system design focuses on efficient high-throughput DNA and protein sequence analysis, but its programmability enables high performance on computational chemistry, image processing, machine learning, and other applications. The Kestrel system has had unexpected longevity and utility owing to a careful design and analysis process. Experience with the system leads to the conclusion that programmable SIMD architectures can excel in both programmability and performance. This work presents the architecture, implementation, applications, and observations of the Kestrel project at the University of California at Santa Cruz.
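
    Illustrative sketch (not Kestrel code): sequence comparison is the classic fit for a linear SIMD array, with one processing element per query character and the database sequence streamed through the array. The Python function below is a sequential restatement of that systolic computation; prev_col[j] plays the role of the value PE j holds.

    def systolic_edit_distance(query, subject):
        # Edit distance computed column by column; prev_col[j] is the value that
        # PE j (holding query character j) would have after the whole subject
        # sequence has streamed through the array.
        n = len(query)
        prev_col = list(range(n + 1))              # distances against the empty subject
        for i, s_char in enumerate(subject, start=1):
            col = [i]                              # boundary cell entering the array
            for j, q_char in enumerate(query, start=1):
                col.append(min(prev_col[j] + 1,                        # deletion
                               col[j - 1] + 1,                         # insertion
                               prev_col[j - 1] + (q_char != s_char)))  # substitution
            prev_col = col
        return prev_col[n]

    print(systolic_edit_distance("GATTACA", "GCATGCU"))   # -> 4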

  • 2004 Reviewers list

    Publication Year: 2005, Page(s): 93 - 96
    PDF (69 KB)
    Freely Available from IEEE
  • TPDS Information for authors

    Publication Year: 2005, Page(s): c3
    PDF (74 KB)
    Freely Available from IEEE
  • [Back cover]

    Publication Year: 2005, Page(s): c4
    PDF (143 KB)
    Freely Available from IEEE

Aims & Scope

IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers.

Full Aims & Scope

Meet Our Editors

Editor-in-Chief
David Bader
College of Computing
Georgia Institute of Technology