IEEE Computer Architecture Letters

Issue 1 • Date Jan.-June 2006

  • Balanced instruction cache: reducing conflict misses of direct-mapped caches through balanced subarray accesses

    Publication Year: 2006, Page(s):2 - 5
    Cited by:  Papers (1)

    It is observed that the limited memory space of direct-mapped caches is not used in a balanced way and therefore incurs extra conflict misses. We propose a novel cache organization of a balanced cache, which balances accesses to cache sets at the granularity of cache subarrays. The key technique of the balanced cache is a programmable subarray decoder through which the mapping of memory reference addresses t...
  • From sequential programs to concurrent threads

    Publication Year: 2006, Page(s):6 - 9
    Cited by:  Papers (10)  |  Patents (1)

    Chip multiprocessors are of increasing importance due to difficulties in achieving higher clock frequencies in uniprocessors, but their success depends on finding useful work for the processor cores. This paper addresses this challenge by presenting a simple compiler approach that extracts non-speculative thread-level parallelism from sequential codes. We present initial results from this techniqu...
  • Topology optimization of interconnection networks

    Publication Year: 2006, Page(s):10 - 13
    Cited by:  Papers (9)

    This paper describes an automatic optimization tool that searches a family of network topologies to select the topology that best achieves a specified set of design goals while satisfying specified packaging constraints. Our tool uses a model of signaling technology that relates bandwidth, cost and distance of links. This model captures the distance-dependent bandwidth of modern high-speed electri...
  • Performance, power efficiency and scalability of asymmetric cluster chip multiprocessors

    Publication Year: 2006, Page(s):14 - 17
    Cited by:  Papers (42)  |  Patents (26)

    This paper evaluates asymmetric cluster chip multiprocessor (ACCMP) architectures as a mechanism to achieve the highest performance for a given power budget. ACCMPs execute serial phases of multithreaded programs on large high-performance cores whereas parallel phases are executed on a mix of large and many small simple cores. Theoretical analysis reveals a performance upper bound for symmetric mu...
  • Probabilistic counter updates for predictor hysteresis and bias

    Publication Year: 2006, Page(s):18 - 21
    Cited by:  Papers (3)  |  Patents (1)

    Hardware predictor designers have incorporated hysteresis and/or bias to achieve desired behavior by increasing the number of bits per counter. Some resulting proposed predictor designs are currently impractical because their counter tables are too large. We describe a method for dramatically reducing the amount of storage required for a predictor's counter table with minimal impact on prediction ...
  • A case for fault tolerance and performance enhancement using chip multi-processors

    Publication Year: 2006, Page(s):22 - 25
    Cited by:  Papers (5)

    This paper makes a case for using multi-core processors to simultaneously achieve transient-fault tolerance and performance enhancement. Our approach is extended from a recent latency-tolerance proposal, dual-core execution (DCE). In DCE, a program is executed twice in two processors, named the front and back processors. The front processor pre-processes instructions in a very fast yet highly accu...
  • Adopting system call based address translation into user-level communication

    Publication Year: 2006, Page(s):26 - 29
    Cited by:  Papers (2)

    User-level communication alleviates the software overhead of the communication subsystem by allowing applications to access the network interface directly. For that purpose, efficient address translation of virtual address to physical address is critical. In this study, we propose a system call based address translation scheme where every translation is done by the kernel instead of a translation ...
  • Data parallel address architecture

    Publication Year: 2006, Page(s):30 - 33

    Data parallel memory systems must maintain a large number of outstanding memory references to fully use increasing DRAM bandwidth in the presence of increasing latency. At the same time, the throughput of modern DRAMs is very sensitive to access patterns due to the time required to precharge and activate banks and to switch between read and write access. To achieve memory reference parallelism a ...
  • In-network cache coherence

    Publication Year: 2006, Page(s):34 - 37
    Cited by:  Papers (1)

    We propose implementing cache coherence protocols within the network, demonstrating how an in-network implementation of the MSI directory-based protocol allows for in-transit optimizations of read and write delay. Our results show 15% and 24% savings on average in memory access latency for SPLASH-2 parallel benchmarks running on a 4×4 and a 16×16 multiprocessor, respectively.
  • Performance modeling using Monte Carlo simulation

    Publication Year: 2006, Page(s):38 - 41
    Cited by:  Papers (6)

    Cycle accurate simulation has long been the primary tool for micro-architecture design and evaluation. Though accurate, the slow speed often imposes constraints on the extent of design exploration. In this work, we propose a fast, accurate Monte-Carlo based model for predicting processor performance. We apply this technique to predict the CPI of in-order architectures and validate it against the I...
  • Foreword

    Publication Year: 2006, Page(s): 11
    Freely Available from IEEE

Aims & Scope

IEEE Computer Architecture Letters is a rigorously peer-reviewed forum for publishing early, high-impact results in the areas of uni- and multiprocessor computer systems, computer architecture, microarchitecture, workload characterization, performance evaluation and simulation techniques, and power-aware computing. 

Meet Our Editors

Editor-in-Chief
José Martinez
Cornell University
336 Frank H.T. Rhodes Hall
Ithaca, NY 14853 USA
e-mail: martinez@cornell.edu