
Proceedings of the Fourth Annual IEEE International Workshop on Workload Characterization. WWC-4 (Cat. No.01EX538)

2 Dec. 2001

  • Proceedings of the Fourth Annual IEEE International Workshop on Workload Characterization. WWC-4 (Cat. No.01EX538)

    Publication Year: 2001
    Freely Available from IEEE
  • Author index

    Publication Year: 2001, Page(s):202 - 203
    Freely Available from IEEE
  • Synthetic trace generation for the Internet

    Publication Year: 2001, Page(s):169 - 174
    Cited by:  Papers (2)

    We consider the distribution of destination addresses in IP packets arriving at an Internet router and show that the spatial locality of those addresses is well characterized by an empirical power law function. We demonstrate how the LRU stack model implied by this function can be used to generate synthetic IP traffic, e.g., for experimental studies of routing and caching protocols. We also show h...
  • DNS-based Internet client clustering and characterization

    Publication Year: 2001, Page(s):159 - 168
    Cited by:  Papers (6)

    This paper proposes a novel protocol which uses the Internet domain name system (DNS) to partition Web clients into disjoint sets, each of which is associated with a single DNS server. We define an L-DNS cluster to be a grouping of Web clients that use the same Local DNS server to resolve Internet host names. We identify such clusters in real-time using data obtained from a Web Server in conjuncti...
  • Modeling application performance by convolving machine signatures with application profiles

    Publication Year: 2001, Page(s):149 - 156
    Cited by:  Papers (28)  |  Patents (1)

    This paper presents a performance modeling methodology that is faster than traditional cycle-accurate simulation, more sophisticated than performance estimation based on system peak-performance metrics, and is shown to be effective on a class of High Performance Computing benchmarks. The method yields insight into the factors that affect performance on single-processor and parallel computers.
  • A comprehensive model of the supercomputer workload

    Publication Year: 2001, Page(s):140 - 148
    Cited by:  Papers (45)

    As with any computer system, the performance of supercomputers depends upon the workloads that serve as their input. Unfortunately, however, there are many important aspects of the supercomputer workloads that have not been modeled, or that have been modeled only incipiently. This paper attacks this problem by considering requested time (and its relation with execution time) and the possibility of...
  • Cache characterization surfaces and predicting workload miss rates

    Publication Year: 2001, Page(s):129 - 139
    Cited by:  Papers (3)

    In this paper, we use locality surfaces to predict cache miss rates. To do this, we introduce two new surfaces. The miss surface characterizes how a trace is filtered by a particular cache in terms of locality. A cache characterization surface helps us examine caches in terms of what stride/delay relationships are likely to cause misses in the cache. The cache characterization surface is independe...
  • Compressing address traces with RECET

    Publication Year: 2001, Page(s):120 - 126
    Cited by:  Papers (5)

    Storing very long address traces for the simulation of cache configurations requires vast amounts of storage space. This storage space requirement can be lowered considerably using lossless compression of the original address trace. In this paper a new real-time address trace compression method is presented based on the RECET (Real-time Cache Evaluation Tool) platform. With the RECET address trace...
  • Palmist: a tool to log Palm system activity

    Publication Year: 2001, Page(s):111 - 119
    Cited by:  Papers (1)

    In this paper we describe a Palm system call logging tool called Palmist. Palmist allows the practitioner to selectively collect statistics such as the system call invoked, the application that invoked the system call, the time of the call, and the call arguments. The logging mechanism adds a latency of about 10 msec per call to collect the log. On average, the system uses about 20 bytes of memory o...
  • Examining performance differences in workload execution phases

    Publication Year: 2001, Page(s):82 - 90
    Cited by:  Papers (11)

    Workload characterization is vital to the design and performance analysis of new generation computer architectures. In many simulation-based performance analysis studies, only a small "representative" portion of the total workload execution is used for analysis. This is due to the prohibitive amount of time it takes to simulate or execute a workload to completion. Methods of choosing the portion o...
  • Performance characteristics of Itanium™ processor on data encryption algorithms

    Publication Year: 2001, Page(s):54 - 62

    In this paper, performance characteristics of the Itanium™ processor, the first implementation of the Itanium™ Processor Family (IPF), are examined for the task of data encryption using public key and symmetric key encryption algorithms. The Itanium™ processor key performance characteristics: wide issue width, large number of functional units, and large number of...
  • Characterization of TPC-H queries on AMD Athlon™ microprocessors

    Publication Year: 2001, Page(s):26 - 35

    Studies on TPC-H based workloads have been few. Ever-increasing Internet-based transactions necessitate better design of servers running database applications. Business managers run DSS (Decision Support System) applications to analyze various business scenarios. This paper characterizes a small subset of a DSS workload obtained from the TPC-H benchmark from TPC (Transaction Processing Counci...
  • An application-centric ccNUMA memory profiler

    Publication Year: 2001, Page(s):101 - 110
    Cited by:  Papers (2)

    Cache coherent shared memory multiprocessors are an attractive and available target for parallel multi-threaded applications. However, achieving the expected levels of performance has proven difficult. ccNUMA performance depends critically on memory and task allocation, and on the amount and type of the coherency transactions. No performance analysis tool to date has done an adequate job of providing...
  • An analysis of the amount of global level redundant computation in the SPEC 95 and SPEC 2000 benchmarks

    Publication Year: 2001, Page(s):74 - 81
    Cited by:  Papers (2)

    This paper analyzes the amount of global level redundant computation within selected benchmarks of the SPEC 95 and SPEC 2000 benchmark suites. Local level redundant computations are redundant computations that are the result of a single static instruction (i.e. PC dependent) while global level redundant computations are redundant computations that are the result of multiple static instructions (i....
  • A characterization of speech recognition on modern computer systems

    Publication Year: 2001, Page(s):45 - 53
    Cited by:  Papers (8)

    In this paper we describe and characterize the speech recognition process, and assess the suitability of current microprocessors and memory systems for running speech recognition applications. We use representative benchmark applications: RASTA to characterize the signal processing on the front end, and SPHINX for the graph search on the back end. Recognition time is dominated by the back end, which...
  • SPEClite: using representative samples to reduce SPEC CPU2000 workload

    Publication Year: 2001, Page(s):15 - 23
    Cited by:  Papers (13)

    An execution-driven microarchitecture-accurate microprocessor simulator requires a complex software program. The simulator must be highly detailed and accurate if it is used for microarchitecture design evaluation. The detail and accuracy comes at the high cost of enormous simulation time. A simulator that models a modern super-scalar processor is 10^5 to 10^6 times slower than...
  • Characterization of Java™ application server workloads

    Publication Year: 2001, Page(s):175 - 181
    Cited by:  Papers (1)

    This paper examines the workload characterization of a Java 2 Enterprise Edition (J2EE™) application server workload. The application provides services for a mixture of e-commerce transactions continuously. This paper examines the variation of system observable behavior, program behavior and performance characteristics at the CPU level and their impact as a result of the variation of t...
  • Runtime predictability of loops

    Publication Year: 2001, Page(s):91 - 98
    Cited by:  Papers (4)

    To obtain the benefits of aggressive, wide-issue architectures, a large window of valid instructions must be available. While researchers have been successful in obtaining high accuracies with a range of dynamic branch predictors, there still remains the need for more aggressive instruction delivery. Loop bodies possess a large amount of spatial and temporal locality. A large percentage of a prog...
  • Characterization of data value unpredictability to improve predictability

    Publication Year: 2001, Page(s):65 - 73
    Cited by:  Papers (1)

    Recent research has shown that it is possible to overcome the parallelism limits imposed by dataflow by predicting instruction results based on previously produced values or a sequence thereof. Unlike branch prediction schemes where prediction accuracies of 90% and above are the norms, data value prediction schemes have been able to correctly predict only about 40-70% of the result-producing instr...
  • Workload characterization of multithreaded Java servers on two PowerPC processors

    Publication Year: 2001, Page(s):36 - 44
    Cited by:  Papers (2)

    Java has, in recent years, become fairly popular as a platform for commercial servers. However, the behavior of Java server applications has not been studied extensively. We characterize two multithreaded Java server benchmarks, SPECjbb2000 and VolanoMark 2.1.2, on two IBM PowerPC architectures, the RS64-III and the POWER3-II, and compare them to more traditional workloads as represented by select...
  • Memory performance analysis of SPEC2000C for the Intel® Itanium™ processor

    Publication Year: 2001, Page(s):184 - 192
    Cited by:  Papers (1)

    We describe our memory performance analysis of SPEC2000C using the newly released Intel® Itanium™ processor (IPF). Memory overhead is very significant for SPEC2000C; on average, 39% of cycles are spent in data stalls. Cache misses are significant, but data translation (DTLB) performance also affects many benchmarks. We present a study based on collecting measurements from the hardwar...
  • Memory energy characterization and optimization for the SPEC2000 benchmarks

    Publication Year: 2001, Page(s):193 - 201
    Cited by:  Papers (1)

    With an increasing focus on low power computing devices such as PDAs, energy has become an important criterion for optimization. Power mode control of memory devices has the potential to significantly reduce energy consumption, for a moderate increase in execution time. Prior studies have explored the benefits of many schemes for DRAM power mode control. In this work, we present a framework for e...
  • MiBench: A free, commercially representative embedded benchmark suite

    Publication Year: 2001, Page(s):3 - 14
    Cited by:  Papers (1159)  |  Patents (6)

    This paper examines a set of commercially representative embedded programs and compares them to an existing benchmark suite, SPEC2000. A new version of SimpleScalar that has been adapted to the ARM instruction set is used to characterize the performance of the benchmarks using configurations similar to current and next generation embedded processors. Several characteristics distinguish the represe...