Computer Architecture and High Performance Computing (SBAC-PAD), 2010 22nd International Symposium on

Date: 27-30 Oct. 2010

  • [Front cover]

    Publication Year: 2010, Page(s): C1
    Freely Available from IEEE
  • [Title page i]

    Publication Year: 2010, Page(s): i
    Freely Available from IEEE
  • [Title page iii]

    Publication Year: 2010, Page(s): iii
    Freely Available from IEEE
  • [Copyright notice]

    Publication Year: 2010, Page(s): iv
    Freely Available from IEEE
  • Table of contents

    Publication Year: 2010, Page(s):v - vii
    Freely Available from IEEE
  • Message from the General Chairs

    Publication Year: 2010, Page(s): viii
    Freely Available from IEEE
  • Message from the Program Committee Co-chairs

    Publication Year: 2010, Page(s):ix - x
    Freely Available from IEEE
  • Conference organizers

    Publication Year: 2010, Page(s):xi - xii
    Freely Available from IEEE
  • Program Committee

    Publication Year: 2010, Page(s):xiii - xiv
    Freely Available from IEEE
  • List of reviewers

    Publication Year: 2010, Page(s): xv
    Freely Available from IEEE
  • Brazilian Computer Society (SBC)

    Publication Year: 2010, Page(s):xvi - xviii
    Freely Available from IEEE
  • Flexible Error Protection for Energy Efficient Reliable Architectures

    Publication Year: 2010, Page(s):1 - 8
    Cited by:  Papers (4)  |  Patents (1)

    Technology scaling is having an increasingly detrimental effect on microprocessor reliability, with increased variability and higher susceptibility to errors. At the same time, as integration of chip multiprocessors increases, power consumption is becoming a significant bottleneck that could threaten their growth. To deal with these competing trends, energy-efficient solutions are needed to deal w...
  • Characterizing Energy Consumption in Hardware Transactional Memory Systems

    Publication Year: 2010, Page(s):9 - 16
    Cited by:  Papers (7)

    Transactional Memory is currently being advocated as a promising alternative to lock-based synchronization because it simplifies multithreaded programming. Future many-core CMP architectures may therefore need to provide hardware support for transactional memory. On the other hand, power dissipation constitutes a first-class consideration in multicore processor design. In this work, we characte...
  • Control Scheme for a CGRA

    Publication Year: 2010, Page(s):17 - 24
    Cited by:  Papers (6)

    The ability to instantiate low-cost, agile FSMs that can implement arbitrary parallelism, and to combine such FSMs in a chain and in a hierarchy, is one of the key differentiating factors between ASICs and MPSOCs. CGRAs reported in the literature, like MPSOCs, also lack this ASIC-like ability. The downside of ASICs is their lack of reuse and high engineering cost. We present a CGRA arc...
  • High Level Power and Energy Exploration Using ArchC

    Publication Year: 2010, Page(s):25 - 32
    Cited by:  Papers (3)

    With the increase in the design complexity of MPSoC architectures, estimating power consumption is very complex and time-consuming at lower levels of abstraction. We propose a methodology using ArchC, named Power-ArchC, for fast high-level estimation of processor power consumption. Power values are obtained by an instruction-level power characterization at gate level. The requirements for power eva...
  • Performance Debugging of GPGPU Applications with the Divergence Map

    Publication Year: 2010, Page(s):33 - 40
    Cited by:  Papers (1)

    The increasing programmability and high computational power of Graphics Processing Units (GPUs) make them attractive for general-purpose programming. However, taking full benefit of this execution environment is a challenging task. One of these challenges stems from divergences, a phenomenon that occurs when threads that execute in lock-step are forced to take different program paths due to branc...
  • Mixed-Precision Parallel Linear Programming Solver

    Publication Year: 2010, Page(s):41 - 46
    Cited by:  Papers (1)

    We use the mixed-precision technique, which exploits the high single-precision performance of modern processors, to build the first sparse mixed-precision linear programming solver on the Cell BE processor. The technique is used to enhance the performance of an LP IPM-based solver by implementing mixed-precision sparse Cholesky factorization, the most time-consuming part of LP solvers. Moreo...
  • Tree Projection-Based Frequent Itemset Mining on Multicore CPUs and GPUs

    Publication Year: 2010, Page(s):47 - 54
    Cited by:  Papers (1)

    Frequent itemset mining (FIM) is a core operation for several data mining applications, such as association rule computation, correlations, document classification, and many others, and it has been extensively studied over the last decades. Moreover, databases are becoming increasingly large, thus requiring more computing power to mine them in reasonable time. At the same time, the advances in high...
  • Mapping Pipelined Applications with Replication to Increase Throughput and Reliability

    Publication Year: 2010, Page(s):55 - 62
    Cited by:  Papers (2)

    Mapping and scheduling an application onto the processors of a parallel system is a difficult problem. This is true when performance is the only objective, but becomes worse when a second optimization criterion like reliability is involved. In this paper we investigate the problem of mapping an application consisting of several consecutive stages, i.e., a pipeline, onto heterogeneous processors, w...
  • Improving In-memory Column-Store Database Predicate Evaluation Performance on Multi-core Systems

    Publication Year: 2010, Page(s):63 - 70
    Cited by:  Papers (1)

    The ability to analyze a large volume of data for the purpose of business intelligence has led to various innovations in database technology. One example is the increased interest in using column-oriented data layout to address query performance in analytical and warehousing workloads. As system architectures move towards multi-core designs, it is important to address optimizing performance for th...
  • A Comparative Analysis of Load Balancing Algorithms Applied to a Weather Forecast Model

    Publication Year: 2010, Page(s):71 - 78
    Cited by:  Papers (9)

    Among the many reasons for load imbalance in weather forecasting models, the dynamic imbalance caused by localized variations in the state of the atmosphere is the hardest one to handle. As an example, active thunderstorms may substantially increase load at a certain time step with respect to previous time steps in an unpredictable manner - after all, tracking storms is one of the reasons for runn...
  • Sharing Resources for Performance and Energy Optimization of Concurrent Streaming Applications

    Publication Year: 2010, Page(s):79 - 86
    Cited by:  Papers (1)

    We aim at finding optimal mappings for concurrent streaming applications. Each application consists of a linear chain with several stages, and processes successive data sets in pipeline mode. The objective is to minimize the energy consumption of the whole platform, while satisfying given performance-related bounds on the period and latency of each application. The problem is to decide which proce...
  • Feedback-Driven Restructuring of Multi-threaded Applications for NUCA Cache Performance in CMPs

    Publication Year: 2010, Page(s):87 - 94
    Cited by:  Papers (6)

    This paper addresses feedback-directed restructuring techniques tuned to Non Uniform Cache Architectures (NUCA) in CMPs running multi-threaded applications. Access time to NUCA caches depends on the location of the referred block, so the locality and cache mapping of the application influence the overall performance. We show techniques for altering the distribution of applications into the cache s...
  • A Cache Replacement Policy Using Adaptive Insertion and Re-reference Prediction

    Publication Year: 2010, Page(s):95 - 102
    Cited by:  Papers (1)

    Previous research shows that the LRU replacement policy is not efficient when applications exhibit a distant re-reference interval. The recently proposed RRIP policy improves performance for such workloads. However, RRIP lacks access recency information, which may prevent the replacement policy from making accurate predictions. Consequently, RRIP is not robust for recency-friendly workloads. This paper prop...
  • MOPSO Applied to Architecture Tuning with Unified Second-Level Cache for Energy and Performance Optimization

    Publication Year: 2010, Page(s):103 - 110
    Cited by:  Papers (1)

    Design Space Exploration (DSE) has been a suitable strategy to configure a parameterized SoC platform in terms of system requirements such as energy and performance. In this work, a multi-objective approach (MOPSO) based on Particle Swarm Optimization was applied to DSE problems to support architecture tuning in a memory hierarchy with a unified second-level cache. The proposed approach conside...