International Symposium on Code Generation and Optimization

20-23 March 2005

  • [Cover]

    Publication Year: 2005, Page(s): c1
  • International Symposium on Code Generation and Optimization

    Publication Year: 2005
  • Table of contents

    Publication Year: 2005, Page(s):v - viii
  • Message from the General Co-Chairs

    Publication Year: 2005, Page(s): ix
  • Message from the Program Chair

    Publication Year: 2005, Page(s): x
  • Committees

    Publication Year: 2005, Page(s):xi - xiii
  • Reviewers

    Publication Year: 2005, Page(s): xiv
  • Virtual machine learning: thinking like a computer architect

    Publication Year: 2005

    Summary form only given. Modern commercial software is written in languages that execute on a virtual machine. Such languages often have dynamic features that require rich runtime support and preclude traditional static optimization. Implementations of these languages have employed dynamic optimization strategies to achieve significant performance improvements. In this paper the author describes s...
  • Context threading: a flexible and efficient dispatch technique for virtual machine interpreters

    Publication Year: 2005, Page(s):15 - 26
    Cited by:  Papers (3)  |  Patents (6)

    Direct-threaded interpreters use indirect branches to dispatch bytecodes, but deeply-pipelined architectures rely on branch prediction for performance. Due to the poor correlation between the virtual program's control flow and the hardware program counter, which we call the context problem, direct threading's indirect branches are poorly predicted by the hardware, limiting performance. Our dispatc...
  • Automatically reducing repetitive synchronization with a just-in-time compiler for Java

    Publication Year: 2005, Page(s):27 - 36
    Cited by:  Papers (3)  |  Patents (1)

    We describe an automatic technique to remove repetitive synchronization in Java™ programs by removing selected MONITORENTER/EXIT operations. Once these operations are removed, parts of a method that were not originally locked become protected by a lock. If it is unsafe to synchronize the code between the original locked regions, however, the code is not transformed. Scalability is also prote...
  • Compile-time concurrent marking write barrier removal

    Publication Year: 2005, Page(s):37 - 48
    Cited by:  Papers (1)  |  Patents (1)

    Garbage collectors incorporating concurrent marking to cope with large live data sets and stringent pause time constraints have become common in recent years. The snapshot-at-the-beginning style of concurrent marking has several advantages over the incremental update alternative, but one main disadvantage: it requires the mutator to execute a significantly more expensive write barrier. This paper ...
  • Collecting and exploiting high-accuracy call graph profiles in virtual machines

    Publication Year: 2005, Page(s):51 - 62
    Cited by:  Papers (8)  |  Patents (11)

    Due to the high dynamic frequency of virtual method calls in typical object-oriented programs, feedback-directed devirtualization and inlining is one of the most important optimizations performed by high-performance virtual machines. A critical input to effective feedback-directed inlining is an accurate dynamic call graph. In a virtual machine, the dynamic call graph is computed online during pro...
  • Effective adaptive computing environment management via dynamic optimization

    Publication Year: 2005, Page(s):63 - 73

    To minimize the surging power consumption of microprocessors, adaptive computing environments (ACEs) where microarchitectural resources can be dynamically tuned to match a program's runtime requirement and characteristics are becoming increasingly common. Adaptive computing environments usually have multiple configurable hardware units, necessitating exploration of a large number of combinatorial ...
  • Maintaining consistency and bounding capacity of software code caches

    Publication Year: 2005, Page(s):74 - 85
    Cited by:  Papers (9)  |  Patents (5)

    Software code caches are becoming ubiquitous, in dynamic optimizers, runtime tool platforms, dynamic translators, fast simulators and emulators, and dynamic compilers. Caching frequently executed fragments of code provides significant performance boosts, reducing the overhead of translation and emulation and meeting or exceeding native performance in dynamic optimizers. One disadvantage of caching,...
  • Performance of runtime optimization on BLAST

    Publication Year: 2005, Page(s):86 - 96
    Cited by:  Papers (3)

    Optimization of a real-world application, BLAST, is used to demonstrate the limitations of static and profile-guided optimizations and to highlight the potential of runtime optimization systems. We analyze the performance profile of this application to determine performance bottlenecks and evaluate the effect of aggressive compiler optimizations on BLAST. We find that applying common optimizations (...
  • Optimizing sorting with genetic algorithms

    Publication Year: 2005, Page(s):99 - 110
    Cited by:  Papers (7)

    The growing complexity of modern processors has made the generation of highly efficient code increasingly difficult. Manual code generation is very time consuming, but it is often the only choice since the code generated by today's compiler technology often has much lower performance than the best hand-tuned codes. A promising code generation strategy, implemented by systems like ATLAS, FFTW, and ...
  • Combining models and guided empirical search to optimize for multiple levels of the memory hierarchy

    Publication Year: 2005, Page(s):111 - 122
    Cited by:  Papers (28)  |  Patents (1)

    This paper describes an algorithm for simultaneously optimizing across multiple levels of the memory hierarchy for dense-matrix computations. Our approach combines compiler models and heuristics with guided empirical search to take advantage of their complementary strengths. The models and heuristics limit the search to a small number of candidate implementations, and the empirical results provide...
  • Predicting unroll factors using supervised classification

    Publication Year: 2005, Page(s):123 - 134
    Cited by:  Papers (25)

    Compilers base many critical decisions on abstracted architectural models. While recent research has shown that modeling is effective for some compiler problems, building accurate models requires a great deal of human time and effort. This paper describes how machine learning techniques can be leveraged to help compiler writers model complex systems. Because learning techniques can effectively mak...
  • Multicores from the compiler's perspective: a blessing or a curse?

    Publication Year: 2005
    Cited by:  Papers (2)

    With all major processor vendors building multiple processor cores on a single chip, multicores are the next tour de force in computer architecture. The ability to parallelize and execute applications on multiple cores provides an opportunity to get processor performance back on track with Moore's law. In this article the author analyzes the seismic changes in computer architecture that lea...
  • Optimizing address code generation for array-intensive DSP applications

    Publication Year: 2005, Page(s):141 - 152
    Cited by:  Papers (5)

    Application code size is a critical design factor for many embedded systems. Unfortunately, most available compilers optimize primarily for speed of execution rather than code density. As a result, the compiler-generated code can be much larger than necessary. In particular, in the DSP domain, past research found that optimizing address code generation can be very important since address c...
  • Efficient SIMD code generation for runtime alignment and length conversion

    Publication Year: 2005, Page(s):153 - 164
    Cited by:  Papers (6)  |  Patents (6)

    When generating codes for today's multimedia extensions, one of the major challenges is to deal with memory alignment issues. While hand programming still yields best performing SIMD codes, it is both time consuming and error prone. Compiler technology has greatly improved, including techniques that simdize loops with misaligned accesses by automatically rearranging misaligned memory streams in re...
  • Superword-level parallelism in the presence of control flow

    Publication Year: 2005, Page(s):165 - 175
    Cited by:  Papers (24)  |  Patents (3)

    In this paper, we describe how to extend the concept of superword-level parallelization (SLP), used for multimedia extension architectures, so that it can be applied in the presence of control flow constructs. Superword-level parallelization involves identifying scalar instructions in a large basic block that perform the same operation, and, if dependences do not prevent it, combining them into a ...
  • Compiler managed dynamic instruction placement in a low-power code cache

    Publication Year: 2005, Page(s):179 - 190
    Cited by:  Papers (19)

    Modern embedded microprocessors use low power on-chip memories called scratch-pad memories to store frequently executed instructions and data. Unlike traditional caches, scratch-pad memories lack the complex tag checking and comparison logic, thereby proving to be efficient in area and power. In this work, we focus on exploiting scratch-pad memories for storing hot code segments within an applicat...
  • Phase-aware remote profiling

    Publication Year: 2005, Page(s):191 - 202
    Cited by:  Papers (9)

    Recent advances in networking and embedded device technology have made the vision of ubiquitous computing a reality; users can access the Internet's vast offerings anytime and anywhere. Moreover, battery-powered devices such as personal digital assistants and Web-enabled mobile phones have successfully emerged as new access points to the world's digital infrastructure. This ubiquity offers a new ...
  • Practical path profiling for dynamic optimizers

    Publication Year: 2005, Page(s):205 - 216
    Cited by:  Papers (10)

    Modern processors are hungry for instructions. To satisfy them, compilers need to find and optimize execution paths across multiple basic blocks. Path profiles provide this context, but their high overhead has so far limited their use by dynamic compilers. We present new techniques for low overhead online practical path profiling (PPP). Following targeted path profiling (TPP), PPP uses an edge pro...