Parallel Computing in Electrical Engineering, 2004. PARELEC 2004. International Conference on

Date 7-10 Sept. 2004

  • [Cover page]

    Publication Year: 2004, Page(s): c1
    Freely Available from IEEE
  • [Title page]

    Publication Year: 2004, Page(s):i - iv
    Freely Available from IEEE
  • Table of contents

    Publication Year: 2004, Page(s):v - xi
    Freely Available from IEEE
  • Welcome from the Chairs

    Publication Year: 2004, Page(s): xii
    Freely Available from IEEE
  • Conference Committee

    Publication Year: 2004, Page(s): xiii
    Freely Available from IEEE
  • Program Committee

    Publication Year: 2004, Page(s): xiv
    Freely Available from IEEE
  • Organizing Committee

    Publication Year: 2004, Page(s): xv
    Freely Available from IEEE
  • Reviewers

    Publication Year: 2004, Page(s): xvi
    Freely Available from IEEE
  • Organic computing - Vision and challenge for system design [breaker page]

    Publication Year: 2004, Page(s): 3
    Cited by:  Papers (2)
    Freely Available from IEEE
  • The Role of Parallel Computing at ABB Corporate Research Switzerland

    Publication Year: 2004, Page(s): 4

    This paper presents a short summary of the invited talk given at the PARELEC 2004 conference in Dresden.
  • Optimising MPI Applications for Heterogeneous Coupled Clusters with MetaMPICH

    Publication Year: 2004, Page(s):7 - 12
    Cited by:  Papers (3)

    Cluster systems built mainly from commodity hardware components have become increasingly usable for high performance computing tasks in the past few years. To increase the parallelism available to applications, it is often desirable to combine those clusters at a higher level, commonly called a metacomputer. This class of high performance computing platforms can be understood as a cluster of clusters, where...
  • Matrix Multiplication Performance on Commodity Shared-Memory Multiprocessors

    Publication Year: 2004, Page(s):13 - 18
    Cited by:  Papers (1)

    Cache-oblivious algorithms for matrix multiplication are confirmed as an effective way of exploiting Intel architecture shared-memory multiprocessors. The performance also remains consistent across a wide range of matrix sizes. The Cilk programming environment remains an effective way of implementing this type of algorithm, but the need for portability and a compiler upgrade route mean that a porta...
  • Parallel Implementation of FDTD Computations Based on Macro Data Flow Paradigm

    Publication Year: 2004, Page(s):19 - 24
    Cited by:  Papers (3)

    In this paper, we present a methodology that enables the design of optimal macro data flow graphs representing computation and communication patterns for the FDTD problem in irregular computational areas. The macro data flow graphs are executed in a MIMD system. Communication is implemented with a Remote Direct Memory Access facility. To obtain minimal communication overheads, the rotating buffers...
  • Optimal Programming of Critical Sections in Modern Network Processors under Performance Requirements

    Publication Year: 2004, Page(s):25 - 30

    Modern network processors deliver a set of methods for implementing critical sections. A number of them rely on specific hardware support and capabilities, while software techniques are still available when hardware support is not flexible enough. Network processors are dedicated to packet processing and their main goal is to achieve the best possible packet processing performance. Therefore, when...
  • Large-Scale Tolerance Analysis

    Publication Year: 2004, Page(s):33 - 38
    Cited by:  Papers (1)

    In this paper, we present an approach to the tolerance analysis of non-linear, time-invariant systems depending on statistically distributed parameters. The approach is based on an approximation of the performance function over the parameter domain, which allows the determination of statistical properties of this function and provides appropriate information for the design adjustment. We show that pa...
  • Employing Compilers for Determining Architectural Features of Application-Specific DSPs

    Publication Year: 2004, Page(s):39 - 44

    To achieve high performance and low hardware overhead compared with application-specific integrated circuits (ASICs), application-specific DSPs (AS-DSPs) are increasingly widely used. However, designing them is still a tedious, time-consuming and error-prone task since each application has to be analyzed thoroughly, which is usually done by hand. Recently, we proposed a platform approach to desig...
  • Compiler Scheduling for STA-Processors

    Publication Year: 2004, Page(s):45 - 50
    Cited by:  Papers (4)

    This paper presents an adaptation of the list scheduling algorithm to generate code for processors of the Synchronous Transfer Architecture (STA) by applying techniques known from RISC and TTA. The proposed scheduling approach is based on informed, deterministic algorithms that can be implemented in a run-time-efficient manner. Although the presented compiler prototype does not generate optimized code, it p...
  • An Analog VLSI Pulsed Neural Network Implementation for Image Segmentation

    Publication Year: 2004, Page(s):51 - 55
    Cited by:  Papers (5)

    We present a massively parallel VLSI realisation of a pulse-coupled neural network for image segmentation. The network consists of simple integrate-and-fire (IAF) neurons with self-organising local connections. The prototype implementation comprises 64 x 64 neurons with coupling of four nearest neighbours, digital to analog converters, analog memories and a digital readout circuit. The chip has be...
  • Moldable Task Scheduling in Dynamic SMP Clusters with Communication on the Fly

    Publication Year: 2004, Page(s):59 - 64

    The paper concerns task graph scheduling in parallel programs using the concept of moldable computational tasks for a parallel architecture based on dynamic SMP processor clusters with data transmissions on the fly. The presented algorithm for scheduling parallel program graphs decomposes an initial program graph into sub-graphs, which fulfill the definition of a moldable task. So identified moldabl...
  • Program Graph Scheduling for SMP Clusters with Communication on-the-Fly Based on Extended DS Approach

    Publication Year: 2004, Page(s):65 - 70

    This paper deals with the problem of program graph scheduling for a parallel architecture with dynamic processor switching and data transfers on the fly. The architecture of such a system is based on the concept of SMP clusters implemented in network-on-chip (NoC) modules, which connect processors to shared memory modules. Processors can be dynamically switched between clusters at runtime, allowing dynamic distribut...
  • An Optimal Abstraction Model for Hardware Multithreading in Modern Processor Architectures

    Publication Year: 2004, Page(s):71 - 76

    This document presents a theoretical analysis of state-of-the-art hardware threading approaches such as Switch on Event Multi Threading (SoEMT) and Simultaneous Multi Threading (SMT). It proposes that the On-Demand Virtual Single-Instruction-Multiple-Data (ODVSIMD) abstraction model is a very efficient method of hardware threading in certain scenarios. The principles of ODVSIMD abstraction model a...
  • Dynamic Piecewise Linear/Regular Algorithms

    Publication Year: 2004, Page(s):79 - 84
    Cited by:  Papers (1)

    In this paper we present an extension of the class of piecewise linear algorithms (PLAs) in order to model one type of dynamic data dependencies. This extension significantly increases the range of applications which can be parallelized and mapped to massively parallel processor arrays. For instance, many computationally intensive applications for video and image processing consist of nested loo...
  • Algorithm Partitioning including Optimized Data-Reuse for Processor Arrays

    Publication Year: 2004, Page(s):85 - 90
    Cited by:  Papers (5)

    This paper describes a method for algorithm partitioning through which affine indexed algorithms are transformed to processor arrays. Former design flows start with a space-time transformation, which we omit completely. Therefore, we are able to consider the constraints of a target architecture at the beginning of our design flow. We show our method for three different partitioning schemes and empha...
  • A Modified Vertex Method for Parallelization of Arbitrary Nested Loops

    Publication Year: 2004, Page(s):91 - 96

    A technique is presented that linearizes the constraints formed to find affine schedules for arbitrary nested loops. The main advantage of this technique is that it does not require finding the polytope vertices and results in fewer inequalities and equalities than the vertex technique yields. Affine schedules found are valid for the arbitrary positive lower and u...
  • Introducing Variable Sharing to Process Calculi

    Publication Year: 2004, Page(s):99 - 104

    The π calculus models communication using synchronous channels, which have proved unrealistic to implement. The join calculus eliminates the synchronous channels. It, however, still considers message passing as the only form of communication in distributed systems. We show that processes can share variables in local memories of system nodes, if they can travel among the nodes. We present a ...