2010 11th Symposium on Computing Systems (WSCAD-SCC)

Date 27-30 Oct. 2010

Displaying Results 1 - 25 of 34
  • [Front cover]

    Page(s): C1
    Freely Available from IEEE
  • [Title page i]

    Page(s): i
    Freely Available from IEEE
  • [Title page iii]

    Page(s): iii
    Freely Available from IEEE
  • [Copyright notice]

    Page(s): iv
    Freely Available from IEEE
  • Table of contents

    Page(s): v - vii
    Freely Available from IEEE
  • Conference Organization

    Page(s): vii
    Freely Available from IEEE
  • Message from General Chairs

    Page(s): viii
    Freely Available from IEEE
  • Mensagem dos Coordenadores Gerais

    Page(s): ix
    Freely Available from IEEE
  • Message from Program Chairs

    Page(s): x
    Freely Available from IEEE
  • Mensagem dos Coordenadores do Comitê de Programa

    Page(s): xi
    Freely Available from IEEE
  • Program Committee and Reviewers

    Page(s): xii - xiii
    Freely Available from IEEE
  • Brazilian Computer Society

    Page(s): xiv - xvi
    Freely Available from IEEE
  • Parallel Routing Algorithm for Extra Level Omega Networks on Reconfigurable Systems

    Page(s): 1 - 8

    Several parallel routing algorithms have been proposed during the last three decades. However, most of these algorithms have not been implemented, so their execution time and memory requirements have been neither measured nor reported. This work presents two parallel routing algorithms for Omega multistage networks using a hardware-assisted approach. Both algorithms have been mapped onto an FPGA. The first algorithm minimizes execution time and is based on a priority encoder. The second optimizes hardware resources by using embedded FPGA memories. Omega networks are blocking, so some permutations cannot be completely routed. Extra levels increase the routing capability by doubling the number of paths. This work evaluates the routing capacity as a function of network workload, parallel networks and extra levels. Network switches with 2 and 4 inputs/outputs have been taken into account. With the priority encoder, the first algorithm spends only two clock cycles per connection. For the second, memory-based algorithm, the number of cycles per connection ranges from 2 to 10, with an average of around 5.

  • A Proposal of Parallelization of Local Search Procedures Applied to the Minimum Cost Hop-and-Root Constrained Forest Problem

    Page(s): 9 - 16

    In this work, we present two parallel heuristics that coordinate the application of many local search procedures to the Minimum Cost Hop-and-root Constrained Forest Problem (MCHCFP), a combinatorial optimization problem that seeks to minimize the communication costs in a wireless sensor network with constrained delay and number of hops. The coordinated application of local search procedures in parallel must take into account that these procedures may have different goals, since the MCHCFP requires the minimization of both trees and routes. The ideas presented can be generalized to other problems in which a similar situation occurs.

  • Simulation of the Foreground-Background Queue in Parallel Systems

    Page(s): 17 - 24

    This paper presents a simulation study of the FB scheduling policy under system loads of varying intensity and variability. The FB policy is compared with the FCFS, PS and SRPT policies, under the same conditions, in single-processor and multiprocessor environments. The results show that the FB policy is well suited to extreme situations, in which there is large variability in service time and high load intensity.

  • Influence of Communication Models on the Scalability of Master-Slave Platforms Running Bag-of-Tasks Applications

    Page(s): 25 - 32

    Bag-of-Tasks (BoT) applications are parallel applications composed of independent (i.e., embarrassingly parallel) tasks that do not communicate with each other, may depend upon one or more input files, and can be executed in any order. Each file may be input for more than one task. A common framework for executing BoT applications is the master-slave topology. In this paper we study the scalability of BoT applications running on multi-node systems (e.g., clusters and grids) organized as master-slave platforms, considering two communication paradigms: multiplexed connections and efficient broadcast. We prove that the lower bound on the isoefficiency function for master-slave platforms is achievable by platforms that have an efficient broadcast primitive available. Our study employs a set of simulation experiments that confirm the theoretical results.

  • Optimizing a Retargetable Compiled Simulator to Achieve Near-Native Performance

    Page(s): 33 - 39

    The design of new architectures can be simplified with the use of retargetable instruction set simulation tools, which can validate the decisions in the design exploration cycle with high flexibility and reduced cost. Increasing system complexity makes the traditional approach to simulation inefficient for today's architectures. The compiled simulation technique uses a priori knowledge about the application to accelerate the simulation with high efficiency. This paper presents a retargetable compiled simulator that applies three optimization techniques and takes advantage of new GCC optimizations to improve performance. Three architectures were modeled and tested: MIPS, SPARC and PowerPC. Our MIPS model achieved the best results, with an average of 651 million instructions per second, only 2.8 times slower than native execution.

  • Advanced SS: A Superscalar Simulator with Support for Operating System

    Page(s): 40 - 47

    Computer simulation has allowed the analysis of the behavior and performance of systems still in their design phase. The Advanced Superscalar Simulator project is a tool for simulating a complete computer system, comprising a superscalar processor and an input and output system, with infrastructure for symmetric multiprocessing. It can also run an operating system on the simulated hardware, bringing the environment closer to reality. The processor simulator is designed so that it can run multiple instruction sets, and it currently supports x86-64. The input and output system contains hypothetical keyboard, video, hard drive and timer interrupt controller devices. An operating system was designed to manage the simulated hardware. It provides interrupt handling, multiprogramming, virtual memory and a subset of the Linux system calls, and is therefore partially compatible with binaries generated for Linux. The simulator was validated through experiments with the Spec 2000 Benchmark and proved its applicability as a tool for performance analysis.

  • Hardware Accelerator for Dictionary-Based Compression of MMP Algorithm

    Page(s): 48 - 55

    MMP is a dictionary-based image compression algorithm that uses the multiscale recurrent patterns method. MMP achieves compression ratios comparable to those of transform-based compression algorithms and stands out for images with high-frequency content; however, its execution time is high because of repeated searches for these patterns in the dictionaries. In this paper we propose parallel, dedicated hardware to accelerate the execution of MMP. Implemented on an FPGA, it performs the critical function in 340 ns, achieving a speedup of 300 over the software version.

  • D-Power: A VHDL Dynamic Power Estimation Tool for Superscalar Architectures - Cache Hierarchy and Fetch Stage

    Page(s): 56 - 63

    Technological innovations constantly emerge in computer systems. Their primary focus is on improving performance. One way to improve performance is to add hardware. However, this has implications, such as a larger area required on the chip and a consequent increase in power consumption. Higher power consumption increases heat dissipation, complicates cooling and causes circuit expansion, among other effects. Because of these problems, power consumption is the target of several studies that try to estimate it and find alternatives to reduce it before the design of the chip. In this context, this paper presents D-Power, a tool described in VHDL and designed to estimate dynamic power consumption in components of the fetch stage and cache hierarchy of a superscalar architecture. Based on the input parameters, the tool can determine which of several component models are more advantageous in terms of power consumption and performance.

  • NSGAII Applied to Unified Second Level Cache Memory Hierarchy Tuning Aiming Energy and Performance Optimization

    Page(s): 64 - 71

    The evolutionary algorithm NSGAII was applied to the problem of cache memory hierarchy optimization, considering a unified second level. The proposed multi-objective approach considers two main objectives: energy consumption and performance, measured as the number of cycles necessary to run an application. Experiments with 18 applications from two benchmarks (Power Stone and Mibench) showed that the solutions found by NSGAII are close to optimal. The results were also compared with an existing heuristic (TECH-CYCLES), and the quality of the results obtained was superior in all analyzed cases, being on average 187 times lower in terms of the cost function (FC = Energy x Cycles), which combines the two components: energy and cycles of the application. Regarding the number of simulations required, NSGAII needs to explore only 1% of the search space, making it competitive for architecture exploration with a unified second level.

  • Process Mapping Based on Memory Access Traces

    Page(s): 72 - 79

    Process mapping is a technique widely used in parallel machines to provide performance gains by improving the use of resources such as interconnections and the cache memory hierarchy. The problem of finding the best mapping is considered NP-hard and, in shared memory environments, there is the additional difficulty of finding the communication pattern, which is implicit and occurs through memory accesses. In this context, this work aims to improve the performance of parallel applications that use shared memory. To that end, a method for analyzing shared memory was developed that identifies the mapping without requiring any previous knowledge of the application behavior. Applications from the NAS Parallel Benchmarks (NPB) were used in the experiments, showing performance gains of up to 42% compared to the native scheduler of the operating system.

  • Implementation of Techniques for Fault Tolerance in a Network-on-Chip

    Page(s): 80 - 87

    Networks-on-Chip (NoCs) have emerged as the best alternative for providing high-performance communication in future Systems-on-Chip (SoCs) with dozens of cores integrated on a single silicon die. However, the components of a NoC are susceptible to faults resulting from heating, power surges, external radiation and other causes. Faults in a router or a network link can lead to the transfer of erroneous data or, depending on the nature of the fault, cause problems in routing packets, such as forwarding a packet to an incorrect destination or even preventing a particular network path from being used, resulting in system failures. A fault-tolerant NoC should be able to detect a fault and prevent it from leading to a system failure, ensuring the correct operation of the application. This paper presents the implementation of techniques for the detection and recovery of faults in a NoC, which were modeled in SystemC and validated by simulation. The results compare the effectiveness of two techniques for providing fault tolerance in a NoC.

  • Scheduling Strategies Evaluation for Opportunistic Grids

    Page(s): 88 - 95

    In this work we evaluate scheduling strategies usually applied in heterogeneous distributed environments, taking into consideration the characteristics of opportunistic grids. In the explored scenarios, several parameters are considered and correlated, such as application arrival rate, the occurrence of failures, application task size and heterogeneity, and the time to recover from application failures.

  • A Model to Explore Business Opportunities in Ubiquitous Environments

    Page(s): 96 - 103

    Mobile and ubiquitous computing has been stimulated by the widespread diffusion of mobile devices, wireless networks, and, more recently, location systems. In this context, applications are emerging in different areas, such as education, entertainment, and commerce. The application of ubiquitous computing to the exploration of business opportunities is called Ubiquitous Commerce. This paper proposes the UbiTrade model, which aims at supporting ubiquitous commerce. Among the related work, the proposed model is the only one that allows the user to act in either the supplier or the consumer role. In addition, most current proposals are restricted to a specific area of business; UbiTrade does not have this restriction. The model was implemented and its validation was conducted by a group of users. The results indicated the model's feasibility, particularly with regard to its most significant contribution, that is, the integrated support for the roles of supplier and consumer.
