Parallel Processing Workshops (ICPPW), 2010 39th International Conference on

Date 13-16 Sept. 2010

  • [Front cover]

    Page(s): C1
  • [Title page i]

    Page(s): i
  • [Title page iii]

    Page(s): iii
  • [Copyright notice]

    Page(s): iv
  • Table of contents

    Page(s): v - xii
  • Message from the Workshops Co-chairs

    Page(s): xiii
  • CLAWS 2010: International Workshop on Compilers, Languages and Architectures for Web Services

    Page(s): xiv
  • DIAMOND 2010: International Workshop on Data Intensive Applications in Mobile and Distributed Environments

    Page(s): xv - xvi
  • MCSoC 2010: Fifth International Symposium on Embedded Multicore SoCs

    Page(s): xvii - xix
  • PSTI 2010: First International Workshop on Parallel Software Tools and Tool Infrastructures

    Page(s): xx - xxi
  • SCC 2010: The Second International Workshop on Security in Cloud Computing

    Page(s): xxii - xxiv
  • SRMPDS 2010: Sixth International Workshop on Scheduling and Resource Management for Parallel and Distributed Systems

    Page(s): xxv - xxvii
  • P2S2 2010: Third International Workshop on Parallel Programming Models and Systems Software for High-End Computing

    Page(s): xxviii - xxix
  • GreenCom 2010: Second International Workshop on Green Computing

    Page(s): xxx
  • AWASN 2010: International Workshop on Applications of Wireless Ad Hoc and Sensor Networks

    Page(s): xxxi - xxxii
  • WS4D: Toolkits for Networked Embedded Systems Based on the Devices Profile for Web Services

    Page(s): 1 - 8

    As the application of the Internet Protocol (IP) is no longer restricted to the internet and computer networks, future IP-based application scenarios require an enormous diversity of heterogeneous platforms and systems. Emerging communication architectures, concepts, technologies and protocols must therefore be capable of handling thousands of devices and communication endpoints on the one hand, and be flexible and extensible enough on the other hand to provide cross-domain interoperability independent of platform-specific constraints. The Devices Profile for Web Services (DPWS) is such a cross-domain technology. This paper provides an overview of DPWS and existing DPWS implementations and toolkits, with a special focus on the Web Service for Devices (WS4D) initiative. To this end, the features and capabilities of DPWS are described in detail with reference to the open source WS4D implementations. The target platforms range from resource-rich server platforms down to highly resource-constrained embedded devices.

  • A Load Balancing Scheme for ebXML Registries

    Page(s): 9 - 16

    Large-scale Service Oriented Architecture (SOA) developments are becoming increasingly reliant on registry services that manage Web Services using taxonomic attributes. At present, a registry stores a Web Service's interface definition and protocol bindings in WSDL, along with one or more XML schema files that define the structure of the SOAP messages exchanged between Web Service operations and client processes, and other static metadata. During Web Service discovery, an ebXML registry returns the access URI associated with the service binding to allow dynamic discovery and invocation. This usually restricts a calling process to a Web Service invocation on one host. This work explores a mechanism to manage service bindings for a Web Service that has been deployed across multiple hosts, such that a URI returned by a registry can resolve to a host that satisfies different system constraints such as current CPU load, physical memory, swap memory, and time of day. This paper discusses the design and development of a new scheme for ebXML registries that facilitates periodic collection and management of dynamic system properties for registry clients and enforces constraints during service discovery and query operations.

  • A Sampling-Based Algorithm for Approximating Maximum Average Value Region in Wireless Sensor Network

    Page(s): 17 - 23

    In wireless sensor networks, sensor readings are often noisy due to the imprecision of the measuring hardware and disturbances in the deployment environment, so answering queries from individual sensor readings is often inaccurate. In this paper, we consider a useful application of sensor networks: the maximum average value region query. This query returns the region with the maximum average value among all possible regions in the network, where a region is a fixed-size circle pre-defined by users. When the average value of a region is used to answer the query, noise across sensors tends to cancel out, which makes the results more reliable. However, because of the huge number of possible regions in the network, it is costly to process the query exactly. Therefore, we propose a sampling-based algorithm, AMAVR, to solve the problem approximately. AMAVR uses a background value to prune regions that cannot be the result. A further optimization strategy is given to handle the situation in which the background-value-based filter does not work because some individual sensor nodes have higher values than their neighbors. By using both techniques, the size of the sampling population can be effectively reduced, so that less energy is spent to obtain a satisfactory result. Finally, simulations demonstrate the energy efficiency of the proposed methods.

  • Collaborative Spatial Object Recommendation in Location Based Services

    Page(s): 24 - 33

    Recommendation systems have found their way into many online web applications, e.g., product recommendation on Amazon and movie recommendation on Netflix. In particular, collaborative filtering techniques have been widely used in these systems to personalize recommendations according to the needs and tastes of users. In this paper, we apply collaborative filtering to spatial object recommendation, which is essential in many location based services. Due to the large number of spatial objects and participating users, using collaborative filtering to obtain recommendations for a particular user can be very expensive. However, we observe that users tend to have an affinity for some regions, and we argue that using users with a similar regional bias in recommendation may help reduce the search space of similar users. Thus, we propose two techniques, namely Access Minimum Bounding Rectangle Overlapped Area (AMBROA) and Grid Division Cosine Similarity (GDCS), to form regions of interest that represent user location interests and activities, and to find users with local access similarity to facilitate effective spatial object recommendation. We conduct an extensive performance evaluation to validate our ideas. The evaluation results demonstrate the superiority of our proposal over the conventional approach.

  • Application Specific Instruction Accelerator for Multistandard Viterbi and Turbo Decoding

    Page(s): 34 - 43

    There is an increasing demand for a converged solution for multi-standard radio processors to support existing and future standards. In this work, a heterogeneous multi-processor platform is proposed for multi-standard wireless communication systems, which is programmable and scalable in adapting to future standards. Channel decoding algorithms form an important constituent of wireless communication systems because of their computational complexity. A programmable radio processor with application-specific instruction accelerators is proposed for channel decoding. Viterbi and Turbo channel decoding algorithms are analyzed for computational parallelism within the algorithms and for hardware reusability across them. An application-specific instruction accelerator is designed by exploiting the similar characteristics and computational parallelism across the algorithms. The analysis shows that a throughput of 54 Mbps for the UWB Viterbi decoder and 12 Mbps for the UMTS Turbo decoder can be achieved at 91.7 MHz using the proposed design.

  • Multi-layer Prefetching for Hybrid Storage Systems: Algorithms, Models, and Evaluations

    Page(s): 44 - 49

    Parallel storage systems are highly scalable and widely used in support of data-intensive applications. In future systems characterized by massive data processing and storage, hybrid storage systems are a candidate solution for fulfilling a variety of demands such as large storage capacity, high I/O performance and low cost. Hybrid storage systems (HSS) contain both high-end storage components (e.g. solid-state disks and hard disk drives) to guarantee performance, and low-end storage components (e.g. tapes) to reduce cost. In HSS, transferring data back and forth among solid-state disks (SSDs), hard disk drives (HDDs), and tapes plays a critical role in achieving high I/O performance. Prefetching is a promising way to reduce the latency of data transfer in HSS. However, prefetching in the context of HSS is technically challenging due to a dilemma: aggressive prefetching is required to efficiently reduce I/O latency, whereas overaggressive prefetching may waste I/O bandwidth by transferring useless data from HDDs to SSDs or from tapes to HDDs. To address this problem, we propose a multi-layer prefetching algorithm that can judiciously prefetch data from tapes to HDDs and from HDDs to SSDs. To evaluate our algorithm, we develop an analytical model; the experimental results show that our prefetching algorithm improves performance in hybrid storage systems.

  • A Split Driver Approach to SoC Virtualization - Challenges and Opportunities

    Page(s): 50 - 57

    Embedded platforms are becoming increasingly resource-rich (e.g. in processing speed, number of cores, memory, and communication rates). As a result, they are being transformed from "closed", fixed-function devices into programmable and flexible platforms capable of supporting diverse types of services. One approach to enabling service diversity together with proper isolation of critical functionality is to leverage platform virtualization technology. Toward this end, this paper first describes an approach to virtualizing System-on-a-Chip (SoC) platforms, and then explores the opportunities for shared use of such virtualized SoC devices by multiple concurrently executing services. The research is conducted on the Intel Tolapai SoC, which integrates an x86 core with a crypto accelerator, using the Xen hypervisor.

  • FFT Algorithms Evaluation on a Homogeneous Multi-processor System-on-Chip

    Page(s): 58 - 64

    This paper presents an evaluation of radix-2, radix-4 and radix-8 algorithms for N-point FFTs on a homogeneous Multi-Processor System-on-Chip prototyped on an FPGA device. The algorithms were evaluated by profiling them and comparing the results against a single-processor architecture. Performance was evaluated in terms of required clock cycles, achieved speed-up and parallelization efficiency. The analysis showed, for each algorithm, how the parallelization efficiency grows when moving from small to larger FFTs. Moreover, the comparison between the different implementations showed the parallelization properties of each algorithm. The radix-2 algorithm shows the best speed-up and parallelization efficiency, while radix-4 gives the best performance in terms of required clock cycles.

  • A Parallel Skeleton Library for Embedded Multicores

    Page(s): 65 - 73

    Many SoCs adopt multicore architectures. As a result, embedded programmers also face the challenge of parallel programming. We propose a parallel skeleton library that can be used on embedded multicores. Our library is implemented in standard C++ using template features. We propose two parallel skeletons to support common program patterns on multicores. With our skeleton library, programmers can easily choose the underlying parallel implementation with no code changes. Experimental results show that many applications can take advantage of these two skeletons for performance improvement, sometimes outperforming hand-parallelized code.

  • Power and Performance Tabu Search Based Multicore Network-on-Chip Design

    Page(s): 74 - 81

    This paper presents a Tabu-search-based approach for the topology synthesis of application-specific multicore architectures using an automated design technique. The Tabu search method incorporates multiple objectives in order to generate an optimal NoC topology that accounts for both power and performance factors. The method generates a system-level floorplan in each major stage of the topology synthesis. By incorporating the floorplan information, it is possible to obtain accurate values for the power consumption of the routers and physical links, as well as to manage the interconnections within the system. The technique also includes a contention analyzer that assesses performance and omits any potential bottlenecks. The contention analyzer uses a Layered Queuing Network approach to model the rendezvous interactions amongst system components. Several experiments are conducted using various SoC benchmark applications to compare the power and performance outcomes of the proposed technique.
