Computer Architecture and High Performance Computing, 2004. SBAC-PAD 2004. 16th Symposium on

Date 27-29 Oct. 2004

  • [Cover page]

    Publication Year: 2004, Page(s): c1
  • Proceedings. 16th Symposium on Computer Architecture and High Performance Computing

    Publication Year: 2004
  • Table of contents

    Publication Year: 2004, Page(s): v - vii
  • Message from the General Chairs

    Publication Year: 2004, Page(s): viii
  • Message from the Program Chairs

    Publication Year: 2004, Page(s): ix
  • Conference organizers

    Publication Year: 2004, Page(s): x
  • Program Committee

    Publication Year: 2004, Page(s): xi - xii
  • List of reviewers

    Publication Year: 2004, Page(s): xiii
  • Brazilian Computer Society (SBC)

    Publication Year: 2004, Page(s): xiv - xv
  • Cache filtering techniques to reduce the negative impact of useless speculative memory references on processor performance

    Publication Year: 2004, Page(s): 2 - 9
    Cited by: Papers (3) | Patents (1)

    High-performance processors employ aggressive speculation and prefetching techniques to increase performance. Speculative memory references caused by these techniques sometimes bring data into the caches that are not needed by correct execution. This paper proposes the use of the first-level caches as filters that predict the usefulness of speculative memory references. With the proposed technique...
  • Self-monitored adaptive cache warm-up for microprocessor simulation

    Publication Year: 2004, Page(s): 10 - 17
    Cited by: Papers (6)

    Simulation is the most important tool for computer architects to evaluate the performance of new computer designs. However, detailed simulation is extremely time consuming. Sampling is one of the techniques that effectively reduce simulation time. In order to achieve accurate sampling results, microarchitectural structure must be adequately warmed up before each measurement. In this paper, a new t...
  • The eDRAM based L3-cache of the BlueGene/L supercomputer processor node

    Publication Year: 2004, Page(s): 18 - 22
    Cited by: Papers (1)

    BlueGene/L is a supercomputer consisting of 64K dual-processor system-on-a-chip compute nodes, capable of delivering an arithmetic peak performance of 5.6Gflops per node. To match the memory speed to the high compute performance, the system implements an aggressive three-level on-chip cache hierarchy for each node. The implemented hierarchy offers high bandwidth and integrated prefetching on cache...
  • Multi-profile instruction based compression

    Publication Year: 2004, Page(s): 23 - 29

    Code compression has been used to minimize the memory area requirement of embedded systems. Recently, performance improvement and energy consumption reduction are observed as a by-product of compression. In this paper we propose a novel technique for efficiently exploring the trade-offs involved in code compression. Our multiprofile approach to build dictionaries combines the best features of both...
  • A study of errant pipeline flushes caused by value misspeculation

    Publication Year: 2004, Page(s): 32 - 39

    Value speculation has been proposed as a technique that can overcome true data dependencies, hide memory latencies, and expose higher degrees of instruction level parallelism (ILP). Branch direction prediction and target address prediction are two widely used control speculation techniques aimed at providing a steady stream of instructions to the instruction window. In this paper we consider a loa...
  • Design space exploration using T&D-Bench

    Publication Year: 2004, Page(s): 40 - 47
    Cited by: Papers (2)

    This paper presents T&D-Bench - teaching and design workbench, a software infrastructure for modeling and simulation of state-of-the-art processors. It combines features that simplify and accelerate the processor design process without restricting the designer possibilities, thus representing a good tradeoff for educational and research purposes that is not found in other environments. In T&D-Benc...
  • Value predictors for reuse through speculation on traces

    Publication Year: 2004, Page(s): 48 - 55
    Cited by: Papers (2) | Patents (1)

    Reusing dynamic sequences of instructions - i.e., traces - improves performance for many benchmarks. However, many traces are not reused because of unavailable inputs in the reuse test. Reuse through speculation on traces (RST) aims to increase the number of reused traces by predicting those inputs when necessary, with minimal additional hardware when compared to nonspeculative trace reuse. In thi...
  • IATO: a flexible EPIC simulation environment

    Publication Year: 2004, Page(s): 58 - 65

    High-performance superscalar processors are designed with the help of complex simulation environment. The simulation infrastructure permits to validate the processor instruction set and contributes as well to the performance evaluation of the selected microarchitecture. Unfortunately, new architectures like the EPIC are not properly supported in the research community. Due to its specificity, the ...
  • ArchC: a SystemC-based architecture description language

    Publication Year: 2004, Page(s): 66 - 73
    Cited by: Papers (37)

    This paper presents an architecture description language (ADL) called ArchC, which is an open-source SystemC-based language that is specialized for processor architecture description. Its main goal is to provide enough information, at the right level of abstraction, in order to allow users to explore and verify new architectures, by automatically generating software tools like simulators and co-ve...
  • Optimizations for compiled simulation using instruction type information

    Publication Year: 2004, Page(s): 74 - 81
    Cited by: Papers (6)

    The design of new architectures can be simplified with the use of retargetable instruction set simulation tools, which can validate the design decisions in the design exploration cycle with high flexibility and reduced cost. The growing system complexity makes the traditional approach inefficient for today's architectures. Compiled simulation techniques make use of a priori knowledge to accelerate...
  • Improving server performance on transaction processing workloads by enhanced data placement

    Publication Year: 2004, Page(s): 84 - 91

    Modern servers access large volumes of data while running commercial workloads. The data is typically spread among several storage devices (e.g. disks). Carefully placing the data across the storage devices can minimize costly remote accesses and improve performance. We propose the use of simulated annealing to arrive at an effective layout of data on disk. The proposed technique considers the con...
  • High performance communication system based on generic programming

    Publication Year: 2004, Page(s): 92 - 99

    This paper presents a high performance communication system based on generic programming. The system adapts itself according to the protocol being used on communication, simplifying the development of libraries. In order to validate the concepts, a MPI implementation has been developed and it is compared to a traditional implementation - MPICH-GM. It is demonstrated that the same functionality and...
  • Performance evaluation of a prototype distributed NFS server

    Publication Year: 2004, Page(s): 100 - 105
    Cited by: Papers (3) | Patents (1)

    A high-performance file system is normally a key point for large cluster installations, where hundreds or even thousands of nodes frequently need to manage large volumes of data. While most solutions usually make use of dedicated hardware and/or specific distribution and replication protocols, the NFSP (NFS Parallel) project aims at improving performance within a standard NFS client/server system....
  • FlowCert: probabilistic certification for peer-to-peer computations

    Publication Year: 2004, Page(s): 108 - 115
    Cited by: Papers (2)

    Large scale cluster, peer-to-peer computing systems and grid computer systems gather thousands of nodes for computing parallel applications. At this scale, it raises the problem of the result checking of the parallel execution of a program on an unsecured grid. This domain is the object of numerous works, either at the hardware or at the software level. We propose here an original software method ...
  • A performance evaluation of a quorum-based state-machine replication algorithm for computing grids

    Publication Year: 2004, Page(s): 116 - 123
    Cited by: Papers (1)

    Quorum systems are well-known tools that improve the performance and the availability of distributed systems. In this paper we explore their use as a means to achieve low response time for network services that are replicated and accessed over computing grids. To that end, we propose both a quorum construction and a quorum-based state-machine replication algorithm that tolerates crash failures in ...
  • Scheduling in Bag-of-Task grids: the PAUA case

    Publication Year: 2004, Page(s): 124 - 131
    Cited by: Papers (7)

    In this paper we discuss the difficulties involved in the scheduling of applications on computational grids. We highlight two main sources of difficulties: 1) the size of the grid rules out the possibility of using a centralized scheduler; 2) since resources are managed by different parties, the scheduler must consider several different policies. Thus, we argue that scheduling applications on a gr...