2000 IEEE International Symposium on Performance Analysis of Systems and Software. ISPASS (Cat. No.00EX422)

24-25 April 2000

  • 2000 IEEE International Symposium on Performance Analysis of Systems and Software. ISPASS (Cat. No.00EX422)

    Publication Year: 2000
  • Author index

    Publication Year: 2000, Page(s): 207
  • Some observations based on simple models of MP scaling

    Publication Year: 2000, Page(s): 123 - 128

    The emergence of large shared memory multiprocessor systems offers the potential of accelerating the pace of ever-increasing system performance. On the one hand, it seems simple: add more processors, get more performance. On the other hand, it is quite difficult, as efficient scaling of workloads to large numbers of processors is a nontrivial challenge. Nevertheless, the way we use these very large...
  • Simplified workload characterization using unified prediction

    Publication Year: 2000, Page(s): 163 - 171

    Quantitative workload characterization is essential to high performance computer architecture design. Unfortunately, quantitative results are typically hard to interpret, reproduce and compare, due to the staggering amount of detail inherent in modern architecture. Source language, compiler technology, target ISA, and micro-architecture, intertwined with system aspects such as memory hierarchy and ...
  • Invocation profile characterization of Java applications

    Publication Year: 2000, Page(s): 116 - 122

    The low performance of Java code execution (J. Gosling et al., 1996) has raised awareness in the computer science community of the need for reengineering. This is mainly due to the software layer called the Java Virtual Machine (T. Lindholm and F. Yellin, 1997), which allows Java applications to be multiplatform, but also to object-oriented language features that impose a higher performance cost than...
  • CommBench - a telecommunications benchmark for network processors

    Publication Year: 2000, Page(s): 154 - 162
    Cited by: Papers (77) | Patents (2)

    The paper presents a benchmark, CommBench, for use in evaluating and designing telecommunications network processors. The benchmark applications focus on small, computationally intense program kernels typical of the network processor environment. The benchmark is composed of eight programs, four of them oriented towards packet header processing and four oriented towards data stream processing. The...
  • Methodology to optimize the cost/performance of disk subsystems

    Publication Year: 2000, Page(s): 109 - 115
    Cited by: Papers (2)

    The storage hierarchy plays a major role in the price and performance of high-end enterprise servers. In fact, the total price of a high-end server's hardware is dominated by its memory and disk configuration. Similarly, the performance is significantly influenced by the design and configuration of the server's memory and disk subsystems. Given these realities, the design and development of future...
  • An analytical model for loop tiling and its solution

    Publication Year: 2000, Page(s): 146 - 153
    Cited by: Papers (11)

    The authors address the problem of estimating the performance of loop tiling, an important program transformation for improved memory hierarchy utilization. We introduce an analytical model for estimating the memory cost of a loop nest as a rational polynomial in tile size variables. We also present a constant-time algorithm for finding an optimal solution to the model (i.e., for selecting optimal...
  • Extracting fine-grain profiles of in-order executions of instruction level parallel programs

    Publication Year: 2000, Page(s): 7 - 12
    Cited by: Papers (2) | Patents (2)

    Optimizing compilers targeted to instruction level parallel (ILP) architectures schedule program instructions so as to minimize the number of execution stalls, called bubbles, that occur during program execution because of hazards. These bubbles are estimated by compilers on the basis of the target processor functional model. Unfortunately, these functional models are often inaccurat...
  • Real-time image on QoS Web

    Publication Year: 2000, Page(s): 70 - 75

    Digital images have become a dominant information source on the Web. Emerging Web-enabled applications, such as global collaboration in environmental studies, point and click manufacturing, and electronic commerce, to name a few, push for timely processing and transmission of images. This real-time imaging is much more than variations on image processing without regard to time. Its performance req...
  • Modeling load address behaviour through recurrences

    Publication Year: 2000, Page(s): 101 - 108
    Cited by: Papers (1)

    Addresses of load instructions exhibit regularity in their behaviour, which is modelled through several models (locality, repetitive patterns, etc.) and exploited in processor and memory hierarchy design. Nevertheless, sparse and symbolic applications are intensive in addressing patterns not entirely covered by current models. In this work we introduce a new recurrence among load pairs called "...
  • An efficient solver for Cache Miss Equations

    Publication Year: 2000, Page(s): 139 - 145
    Cited by: Papers (7)

    Cache Miss Equations (CME) (S. Ghosh et al., 1997) is a method that accurately describes cache behavior by means of polyhedra. Even though the computation cost of generating CME is a linear function of the number of references, solving them is a very time-consuming task, and thus trying to study a whole program may be infeasible. The paper presents effective techniques that exploit some propert...
  • Quantifying instruction-level parallelism limits on an EPIC architecture

    Publication Year: 2000, Page(s): 21 - 27
    Cited by: Papers (1)

    EPIC architectures rely heavily on state-of-the-art compiler technology to deliver optimal performance while keeping hardware design simple. It is generally believed that an optimizing compiler has an enormous scheduling window to exploit instruction-level parallelism (ILP) since the compiler orchestrates the entire program. Many state-of-the-art compilers typically confine optimizations to loop b...
  • Performance analysis through synthetic trace generation

    Publication Year: 2000, Page(s): 1 - 6
    Cited by: Papers (25)

    Most research in the area of microarchitectural performance analysis is done using trace-driven simulations. Although trace-driven simulations are fairly accurate, they are both time- and space-consuming, which sometimes makes them impractical. Modeling the execution of a computer program by a statistical profile and generating a synthetic benchmark trace from this statistical profile can be used t...
  • A quantitative simulator for dynamic memory managers

    Publication Year: 2000, Page(s): 64 - 69

    In the last thirty years, several dynamic memory management schemes have been proposed. Such schemes include first fit, best fit, segregated fit, and buddy systems. Because the performance (speed and memory utilization) of each scheme differs, software engineers often face difficult choices in selecting the most suitable approach for their applications. In this paper, a quantitative simulator for ...
  • Instruction overhead and data locality effects in superscalar processors

    Publication Year: 2000, Page(s): 95 - 100
    Cited by: Papers (1)

    To reduce software development and maintenance costs, programmers are increasingly using object oriented programming languages, such as C++, and relying on highly flexible data structures, such as linked lists. Object oriented programming languages provide features that help manage complex software systems, but object oriented programs tend to suffer increased instruction counts, e.g. due to gener...
  • Web latency reduction via client-side prefetching

    Publication Year: 2000, Page(s): 193 - 200
    Cited by: Papers (9) | Patents (3)

    The rapid growth of the WWW has inspired numerous techniques to reduce Web latency. While some of these techniques have not been implemented because they either increase network traffic or require cooperation between tiers, recent studies cast a shadow on techniques already in use (e.g. proxy caching) as a result of the increasingly dynamic aspects of the WWW. In particular, the proliferation of d...
  • Checking order-insensitivity using ternary simulation in synchronous programs

    Publication Year: 2000, Page(s): 52 - 57
    Cited by: Papers (1)

    In synchronous systems, new asynchronous distribution schemes are introduced. Properties can be established in order to distribute a program with weak synchronization. This paper deals with automatic verification of the order-insensitive property. Order-insensitivity is an important property introduced in synchronous systems by analogy with delay-insensitivity in asynchronous hardware. An algorith...
  • Performance scalability in multiprocessor systems with resource contention

    Publication Year: 2000, Page(s): 129 - 138

    Multiple processes may contend for shared resources such as variables stored in the shared memory of a multiprocessor system. Mechanisms required to preserve data consistency on such systems often lead to a decrease in system performance. This research focuses on controlling shared resource contention for achieving high capacity and scalability in multiprocessor-based applications that include tel...
  • A new approach in the analysis and modeling of disk access patterns

    Publication Year: 2000, Page(s): 172 - 177
    Cited by: Papers (5)

    While in previous work we have demonstrated that disk arrival patterns are consistent with self-similarity and have provided a physical explanation for the self-similar phenomenon in disk arrival patterns, the authors now deal with the analysis and modeling of disk access patterns. We provide visual and mathematical evidence showing that the same bursty behavior observed in the time series can als...
  • Accurate simulation and evaluation of code reordering

    Publication Year: 2000, Page(s): 13 - 20
    Cited by: Patents (1)

    The need for bridging the ever growing gap between memory and processor performance has motivated research for exploiting the memory hierarchy effectively. An important software solution called code reordering produces a new program layout to better utilize the available memory hierarchy. Many algorithms have been proposed. They differ based on: 1) the code granularity assumed by the reordering al...
  • Issues in the design of store buffers in dynamically scheduled processors

    Publication Year: 2000, Page(s): 76 - 87
    Cited by: Papers (2) | Patents (3)

    Processor performance can be sensitive to load-store ordering, memory bandwidth, and memory access latency. A store buffer is a mechanism that exists in many current processors to accomplish one or more of the following: store access ordering, latency hiding, and data forwarding. Different policies that govern store buffer behavior can affect overall processor performance. However, the performance...
  • Performance evaluation of real-time scheduling on a multicomputer architecture

    Publication Year: 2000, Page(s): 28 - 33

    The complexity of some real-time applications demands high performance computer architectures. Multicomputer architectures have a potential for high performance and reliability because of their large number of processors and communication channels. Therefore, they are natural candidates for supporting complex real-time computing. This paper presents a performance evaluation of real-time sched...
  • A server performance model for static Web workloads

    Publication Year: 2000, Page(s): 201 - 206
    Cited by: Papers (7)

    The paper describes a queuing network model for a multiprocessor system running a static Web workload such as SPECweb96. The model includes architectural details of the Web server in terms of multilevel cache hierarchy, processor bus, memory pipeline, PCI bus based I/O subsystem, and bypass I/O-memory path for DMA transfers. The model is based on detailed measurements from a baseline system and a ...
  • Do generational schemes improve the garbage collection efficiency?

    Publication Year: 2000, Page(s): 58 - 63
    Cited by: Papers (3) | Patents (4)

    Recently, most research efforts on garbage collection have concentrated on reducing pause times. However, very little effort has been spent on the study of garbage collection efficiency, especially generational garbage collection which was introduced as a way to reduce garbage collection pause times. In this paper a detailed study of garbage collection efficiency in generational schemes is present...