Low Power Electronics and Design, 2002. ISLPED '02. Proceedings of the 2002 International Symposium on

Date 2002

Displaying Results 1 - 25 of 67
  • Embedded tutorial 2: compilers for power and energy management

    Summary form only given. In this tutorial, I will give an overview of current approaches to compiler-directed power and energy management. I will discuss several promising compiler optimization techniques in detail, together with an assessment of their potential benefits. These optimizations include remote task mapping, resource hibernation, dynamic voltage and frequency scaling, and quality of result tradeoffs. Based on preliminary experiences with these optimizations, I will present a compiler writer's wish list for hardware architects and OS designers in order to support application specific power and energy management. An overview of future challenges will conclude the tutorial.

  • A power and resolution adaptive flash analog-to-digital converter

    Page(s): 233 - 236

    A new power and resolution adaptive flash ADC, named PRA-ADC, is proposed. The PRA-ADC enables exponential power reduction with linear resolution reduction. Unused parallel voltage comparators are switched to standby mode. The voltage comparators consume only the leakage power during the standby mode. The PRA-ADC, capable of operating at 5-bit, 6-bit, 7-bit, and 8-bit precision, dissipates 69 mW at 5-bit and 435 mW at 8-bit. The PRA-ADC was designed and simulated with 0.18 μm CMOS technology. The PRA-ADC design is applicable to RF portable communication devices, allowing tighter management of power and efficiency.

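The link between linear resolution reduction and exponential power reduction can be illustrated with comparator counts. The sketch below assumes the conventional flash architecture, in which an N-bit converter uses 2^N - 1 parallel comparators. The abstract does not give the PRA-ADC's internal breakdown, so the function names and the 8-bit maximum are illustrative assumptions, and the sketch counts comparators rather than modelling power.

```python
MAX_BITS = 8  # assumed full resolution, taken from the 8-bit mode in the abstract


def active_comparators(bits: int) -> int:
    """Comparators a conventional flash ADC needs at the given resolution."""
    if bits < 1:
        raise ValueError("resolution must be at least 1 bit")
    return 2 ** bits - 1


def standby_comparators(bits: int, max_bits: int = MAX_BITS) -> int:
    """Comparators that could be parked in standby at a reduced resolution."""
    if bits > max_bits:
        raise ValueError("resolution exceeds the assumed maximum")
    return active_comparators(max_bits) - active_comparators(bits)


if __name__ == "__main__":
    for bits in (5, 6, 7, 8):
        print(bits, active_comparators(bits), standby_comparators(bits))
```

Each bit of resolution removed roughly halves the number of active comparators, which is the exponential trend the abstract refers to.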
  • Low-voltage memories for power-aware systems

    Page(s): 1 - 6

    This paper describes low-voltage RAM designs for stand-alone and embedded memories in terms of signal-to-noise-ratio designs of RAM cells and subthreshold-current reduction. First, structures and areas of current DRAM and SRAM cells are discussed. Next, low-voltage peripheral circuits that have been proposed so far are reviewed with focus on subthreshold-current reduction, speed variation, on-chip voltage conversion, and testing. Finally, based on the above discussion, a perspective is given with emphasis on needs for high-speed simple non-volatile RAMs, new devices/circuits for reducing active-mode leakage currents, and memory-rich SoC architectures.

  • Low power integrated scan-retention mechanism

    Page(s): 98 - 102

    This paper presents a methodology for unifying the scan mechanism and data retention in latches which leads to scannable latches with the data retention capability achieved at a very low power overhead during the active mode. A detailed analysis of power and area overhead is presented, with layout examples for various common latch styles. Implications of using different power gating techniques for reducing leakage during sleep mode on the design of retention latches are considered, including well biasing for leakage control and sharing wells between gated logic and retention latch devices.

  • Tradeoffs in power-efficient issue queue design

    Page(s): 184 - 189

    A major consumer of microprocessor power is the issue queue. Several microprocessors, including the Alpha 21264 and POWER4™, use a compacting latch-based issue queue design which has the advantage of simplicity of design and verification. The disadvantage of this structure, however, is its high power dissipation. In this paper, we explore different issue queue power optimization techniques that vary not only in their performance and power characteristics, but in how much they deviate from the baseline implementation. By developing and comparing techniques that build incrementally on the baseline design, as well as those that achieve higher power savings through a more significant redesign effort, we quantify the extra benefit the higher design cost techniques provide over their more straightforward counterparts.

  • Power analysis techniques for SoC with improved wiring models

    Page(s): 259 - 262

    This paper proposes two techniques for improving the accuracy of gate-level power analysis for system-on-a-chip (SoC): (1) the creation of custom wire load models for clock nets, and (2) the use of layout information (actual net capacitance and input signal transition time). The analysis time is reduced to less than one three-hundredth of the transistor-level power analysis time. The error is within 5% of that of a real chip (the same level as in transistor-level power analysis) if technique (2) is used. The analytical error between techniques (1) and (2) is within 1%.

  • Full-chip sub-threshold leakage power prediction model for sub-0.18 μm CMOS

    Page(s): 19 - 23

    The driving force for the semiconductor industry growth has been the elegant scaling nature of CMOS technology. In future CMOS technology generations, supply and threshold voltages will have to continually scale to sustain performance increase, control switching power dissipation, and maintain reliability. These continual scaling requirements on supply and threshold voltages pose several technology and circuit design challenges. With threshold voltage scaling, sub-threshold leakage power is expected to become a significant portion of the total power in future CMOS systems. Therefore, it becomes crucial to predict sub-threshold leakage power of such systems. In this paper, we present a sub-threshold leakage power prediction model that takes into account within-die threshold voltage variation. Statistical measurements of 32-bit microprocessors in 0.18 μm CMOS confirm the mean error of the model to be 4%. Comparisons of this model to two other existing models that do not take within-die threshold voltage variation into account are also presented.

  • High performance and low power FIR filter design based on sharing multiplication

    Page(s): 295 - 300

    We present a high performance and low power FIR filter design, which is based on computation sharing multiplier (CSHM). CSHM specifically targets computation re-use in vector-scalar products and is effectively used in our FIR filter design. Efficient circuit-level techniques, namely a new carry-select adder and conditional capture flip-flop (CCFF), are also used to further improve power and performance. The proposed FIR filter architecture was implemented in 0.25 μm technology. Experimental results on a 10-tap low-pass CSHM FIR filter show speed and power improvement of 19% and 17%, respectively, with respect to an FIR filter based on the Wallace tree multiplier.

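Computation re-use in a vector-scalar product can be sketched as follows. The sketch assumes non-negative integer coefficients split into 4-bit groups and a shared bank of precomputed odd multiples of the input sample (1x, 3x, ..., 15x), so each coefficient product needs only lookups, shifts, and additions. The group width, the set of multiples, and all names are illustrative assumptions, not details taken from the abstract.

```python
def precompute_multiples(x: int) -> dict:
    """Odd multiples of x, computed once and shared by every coefficient."""
    return {odd: odd * x for odd in range(1, 16, 2)}


def shared_multiply(coeff: int, multiples: dict) -> int:
    """Compute coeff * x using only the shared multiples, shifts, and adds."""
    if coeff < 0:
        raise ValueError("this sketch handles non-negative coefficients only")
    result = 0
    shift = 0
    while coeff:
        group = coeff & 0xF  # next 4-bit group of the coefficient
        if group:
            trailing_zeros = (group & -group).bit_length() - 1
            odd = group >> trailing_zeros  # odd factor: one of 1, 3, ..., 15
            result += multiples[odd] << (shift + trailing_zeros)
        coeff >>= 4
        shift += 4
    return result


def scale_vector(coeffs, x: int):
    """Vector-scalar product: every coefficient times the same input x."""
    multiples = precompute_multiples(x)
    return [shared_multiply(c, multiples) for c in coeffs]
```

In an FIR filter the same input sample multiplies every tap coefficient, which is why the precomputed multiples can be shared across taps.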
  • Contents provider-assisted dynamic voltage scaling for low energy multimedia applications

    Page(s): 42 - 47

    This paper presents a new concept of DVS (dynamic voltage scaling) for multimedia applications. Many multimedia applications have a periodic property, but each period shows a large variation in terms of its execution time. Exact estimation of such variation is a crucial factor for low energy software execution with the DVS technique. Previous DVS techniques focused only on end users (client sites) and their quality heavily depends on the accuracy of the worst case execution time estimation. This paper proposes that contents providers (server sites) supply information on the execution time variations in addition to the content itself. This makes it possible to perform DVS independent of worst case execution time estimation. The extra work required by the contents provider for this purpose is fully compensated by the benefits for the end users because a single piece of content is often provided to many users. Experimental results show that our method greatly reduces the energy consumption of client systems compared to previous DVS techniques.

  • TLB and snoop energy-reduction using virtual caches in low-power chip-multiprocessors

    Page(s): 243 - 246

    In our quest to bring down the power consumption in low-power chip-multiprocessors, we have found that TLB and snoop accesses account for about 40% of the energy wasted by all L1 data-cache accesses. We have investigated the prospects of using virtual caches to bring down the number of TLB accesses. A key observation is that while the energy wasted in the TLBs is cut, the energy associated with snoop accesses becomes higher. We then contribute two techniques to reduce the number of snoop accesses and their energy cost. Virtual caches together with the proposed techniques are shown to reduce the energy wasted in the L1 caches and the TLBs by about 30%.

  • Activity-sensitive clock tree construction for low power

    Page(s): 279 - 282

    This paper presents an activity-sensitive clock tree construction technique for low power design of VLSI clock networks. We introduce the notion of node difference based on module activity information, and show its relationship with the power consumption. A binary clock tree is built using the node difference between different modules to optimize the power consumption due to the interconnections (i.e., clock gating signals and clock edges). We also develop a method to determine gating signals with a minimum number of transitions. After the clock tree is constructed, the gating signals are optimized for further power savings.

  • An intra-task dynamic voltage scaling method for SoC design with hierarchical FSM and synchronous dataflow model

    Page(s): 84 - 87

    This paper presents a method of intra-task dynamic voltage scaling (DVS) for SoC design with hierarchical FSM and synchronous dataflow model (in short, HFSM-SDF model). To have an optimal intra-task DVS, exact execution paths need to be determined at compile time or at runtime. In general programs, since determining exact execution paths at compile time or at runtime is not possible, existing methods assume worst/average-case execution paths and take static voltage scaling approaches. In our work, we exploit a property of the HFSM-SDF model to calculate exact execution paths at runtime. With the information of exact execution paths, our DVS method can calculate the exact remaining workload. The exact workload makes it possible to calculate the optimal voltage level, which gives optimal energy consumption while satisfying the given timing constraint. Experiments show the effectiveness of the presented method in low-power design of an MPEG4 decoder system.

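The step from a known remaining workload to a voltage level can be sketched as picking the slowest operating point that still meets the deadline. This is a generic illustration of the idea, not the paper's algorithm: the function name, the cycle-count workload model, and the (frequency, voltage) table are all hypothetical.

```python
# Hypothetical operating points as (frequency in Hz, supply voltage in V).
OPERATING_POINTS = [(100e6, 0.9), (200e6, 1.2), (400e6, 1.8)]


def pick_operating_point(remaining_cycles, time_left_s, points=OPERATING_POINTS):
    """Return the lowest-frequency point that finishes the remaining work in time."""
    if time_left_s <= 0:
        raise ValueError("no time left before the deadline")
    required_hz = remaining_cycles / time_left_s
    feasible = [p for p in points if p[0] >= required_hz]
    if not feasible:
        raise ValueError("deadline cannot be met at any available frequency")
    # Tuples compare by frequency first, so min() yields the slowest feasible point.
    return min(feasible)
```

With an exact remaining cycle count the selection can be repeated whenever the execution path becomes known, instead of being fixed once from a worst-case estimate.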
  • Future directions in clocking multi-GHz systems

    Summary form only given. This tutorial addresses the problems and possible solutions of clocking digital systems operating at multi-GHz frequencies. We address techniques for managing clock uncertainties and clock power in synchronous circuits. There are two trends that are disturbing: (a) the power taken by the clock distribution network and clocked storage elements (flip-flops and latches) is increasing relative to the rest of the logic, (b) clock uncertainties are taking a significant portion of the cycle away from useful logic operations. We present ways of designing clocked storage elements that are capable of absorbing a significant portion of clock uncertainties and passing delay from one logic stage to the other. At multi-GHz frequencies of operation it will be difficult to precisely control the timing boundaries between the logic stages. Thus the ability to extend the operation into the time period allocated for the next pipeline stage is important. This is known as time borrowing. Also, the ability to incorporate logic into the clocked storage elements is of critical importance given that the number of logic stages in a pipeline running at multi-GHz frequencies is decreasing to less than ten.

  • Energy-efficient hybrid wakeup logic

    Page(s): 196 - 201

    The instruction window is a critical component and a major energy consumer in out-of-order superscalar processors. An important source of energy consumption in the instruction window is the instruction wakeup: a completing instruction broadcasts its result register tag and an associative comparison is performed with all the entries in the window. This paper shows that a very large fraction of the completing instructions have to wake up no more than a single instruction currently in the window. Consequently, we propose to save energy by using indexing to only enable the comparator at the single instruction to wake up. Only in the rare case when more than one instruction needs to wake up does our scheme revert to enabling all the comparators or a subset of them. For this reason, we call our scheme Hybrid. Overall, our scheme is very effective: for a processor with a 96-entry window, the number of comparisons performed by the average completing instruction with a destination register is reduced to 0.8. The exact magnitude of the energy savings will depend on the specific instruction window implementation. Furthermore, the application suffers no performance penalty.

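A simplified counting model shows why indexing cuts the number of comparisons. It assumes that a completing instruction with no waiting consumer triggers no comparison, that exactly one consumer costs one comparison, and that more than one consumer falls back to the full window. The abstract also allows a fallback to a subset of comparators, which is not modelled here, and the function names and consumer counts are made up for illustration.

```python
WINDOW_SIZE = 96  # window size quoted in the abstract


def broadcast_comparisons(consumers: int, window_size: int = WINDOW_SIZE) -> int:
    """Conventional wakeup: the tag is compared against every window entry."""
    return window_size


def hybrid_comparisons(consumers: int, window_size: int = WINDOW_SIZE) -> int:
    """Simplified hybrid wakeup: index directly when at most one consumer waits."""
    if consumers == 0:
        return 0  # assumption: nothing to wake up, no comparator enabled
    if consumers == 1:
        return 1  # only the indexed entry's comparator is enabled
    return window_size  # rare case: revert to enabling all comparators


def total_comparisons(consumer_counts, scheme) -> int:
    """Sum comparisons over completing instructions under the given scheme."""
    return sum(scheme(n) for n in consumer_counts)
```

For example, `total_comparisons([1, 0, 1, 1, 2, 0, 1], hybrid_comparisons)` can be compared against the same call with `broadcast_comparisons` to see the gap on a made-up trace.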
  • Analyzing energy friendly steady state phases of dynamic application execution in terms of sparse data structures

    Page(s): 76 - 79

    In the past decades, data structure analysis was mainly done at a high level of abstraction in the computer science community. For instance, choosing a linked list as a data structure as opposed to an array for a specific situation was mainly motivated from a performance point of view under the implicit assumption that the computer platform (that had to run the software) consisted of one monolithic physical memory. In the context of mobile, embedded devices, energy consumption is as important as performance. In addition to this, the assumption of one monolithic memory is outdated for many (if not all) current-day platforms! Clearly, there is a need to improve the choices that are made during data structure analysis given specific knowledge of the memory hierarchy of the platform under investigation. We show how memory related energy consumption can be heavily reduced by taking into account the access behaviour of the application on the one hand and the available on-chip and off-chip memory space on the other hand. We do this by exploiting the sparseness that is present in one steady state of the data structure under investigation. Analytical results show that energy reductions of a factor of 8.7 are feasible in comparison to common data structure implementations. We trade these gains off with on-chip memory space consumption of a custom memory architecture.

  • Unified methodology for resolving power-performance tradeoffs at the microarchitectural and circuit levels

    Page(s): 166 - 171

    Evaluation of architectural tradeoffs is complicated by implications in the circuit domain which are typically not captured in the analysis but substantially affect the results. We propose a metric of hardware intensity (η), which is useful for evaluating issues that affect both circuits and architecture. Analyzing data for actual designs we show how to measure the introduced parameters and discuss variations between observed results and common theoretical assumptions. For a power-efficient design we derive relations for η and supply voltage V under progressively more general situations, and incorporate η into a prior art architectural energy-efficiency criterion. Then, a more general relation is derived for the optimal balance between the architectural complexity, hardware intensity and power supply. Modified forms for these relations are obtained in special cases where the supply voltage is constrained or when clock gating is disallowed.

  • Discharge current steering for battery lifetime optimization

    Page(s): 118 - 123

    Recent work on battery-driven power management has demonstrated that sequential discharge is suboptimal in multi-battery systems, and lifetime can be maximized by distributing (steering) the current load on the available batteries, thereby discharging them in a partially concurrent fashion. Based on these observations, we formulate multi-battery lifetime maximization as a continuous, constrained optimization problem, which can be efficiently solved by nonlinear optimizers. We show that great lifetime extensions can be obtained with respect to standard sequential discharge, as well as to previously proposed battery allocation schemes.

  • Parametric timing and power macromodels for high level simulation of low-swing interconnects

    Page(s): 307 - 312

    The impact of global on-chip interconnections on power consumption and speed of integrated circuits is becoming a serious concern. Designers therefore need to quickly estimate how performance and power are affected by a given choice of the interconnection parameters (length, voltage swing, driver and receiver schematics and sizing). This work focuses on the entire communication channel (driver, interconnect, receiver), and provides high level parametric VHDL simulation models for low-swing signaling schemes. These SPICE-derived power and timing macromodels transfer electrical-level information to the RTL simulation in an event-driven fashion, as transitions occur at the input of the interconnect driver. The accuracy reached by this back-annotation technique is within 5% with respect to SPICE results, with only a 4% simulation speed penalty in the worst case.

  • ±0.5 V∼±1.5 V VHF CMOS LV/LP four-quadrant analog multiplier in modified bridged-triode scheme

    Page(s): 227 - 232

    A new LV/LP CMOS four-quadrant analog multiplier designed in a modified bridged-triode scheme (MBTS) is presented. It provides benefits in terms of linearity, power consumption, frequency response and total harmonic distortion (THD). The chip, fabricated in TSMC 0.35 μm n-well SPQM CMOS technology, has a nonlinearity error of less than 0.8% over a ±0.5 V input range under a nominal supply voltage of ±1.5 V, and consumes a total power of only 2.7 mW.

  • E2WFQ: an energy efficient fair scheduling policy for wireless systems

    Page(s): 30 - 35

    As embedded systems are being networked, often wirelessly, an increasingly larger share of their total energy budget is due to the communication. This necessitates the development of power management techniques that address communication subsystems, such as radios, as opposed to computation subsystems, such as embedded processors, to which most of the research effort thus far has been devoted. In this paper, we present E2WFQ, an energy efficient version of the weighted fair queuing (WFQ) algorithm for packet scheduling in communication systems. We employ a recently proposed radio power management technique, dynamic modulation scaling (DMS), as a control knob to enable energy-latency tradeoffs during wireless packet scheduling. The use of E2WFQ results in an energy aware packet scheduler, which exploits the statistics of the input arrival pattern as well as the variability in packet lengths. Simulation results show that large savings in energy consumption can be obtained through the use of our scheduling scheme, compared to conventional WFQ, with only a small, bounded increase in worst case packet latency.

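For readers unfamiliar with the baseline, conventional WFQ can be sketched with per-flow virtual finish tags. The sketch covers only the simplified case in which every packet is already queued at time zero, so a packet's tag is its flow's previous tag plus length divided by weight. It does not model dynamic modulation scaling or the energy-aware part of E2WFQ, and the flow names, weights, and function name are hypothetical.

```python
from collections import defaultdict


def wfq_order(packets, weights):
    """Return packets in WFQ service order for a fully backlogged queue.

    packets: list of (flow, length_bits) in arrival order, all queued at time 0.
    weights: mapping from flow to its positive scheduling weight.
    """
    last_finish = defaultdict(float)  # most recent finish tag per flow
    tagged = []
    for seq, (flow, length) in enumerate(packets):
        finish = last_finish[flow] + length / weights[flow]
        last_finish[flow] = finish
        # seq breaks ties between equal tags in arrival order
        tagged.append((finish, seq, flow, length))
    tagged.sort()
    return [(flow, length) for _, _, flow, length in tagged]
```

Packets with smaller finish tags are served first, so a flow with a larger weight drains its queue proportionally faster.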
  • Towards energy-aware software-based fault tolerance in real-time systems

    Page(s): 124 - 129

    Many real-time systems employed in defense, space, and consumer applications have power constraints and high reliability requirements. In this paper, we focus on the relationship between fault tolerance techniques and energy consumption. In particular, we establish the energy efficiency of Application Level Fault Tolerance (ALFT) over other software-based fault tolerance methods. We then develop sensible energy-aware heuristics for ALFT schemes. The heuristics yield up to 40% energy savings.

  • Design techniques for low power high bandwidth upconversion in CMOS

    Page(s): 237 - 242

    An upconvertor topology for low power, high bandwidth applications is presented. Using specific circuit techniques and local circuit-level optimization, the power consumption of the total system comprising an on-chip LC-type VCO, a polyphase network quadrature generator, a linear mixer block and an RF-current buffer, has been minimized. A chip has been designed and manufactured in a 0.25 μm CMOS technology. The VCO oscillates between 1.68 GHz and 2 GHz. Driven by an external LO, the transmitter operates from 900 MHz up to 2 GHz. At 2 GHz, the upconvertor transmits -12 dBm into 50 Ω with a linearity of more than -35 dBc for base band signals up to 33 MHz.

  • Design of a branch-based 64-bit carry-select adder in 0.18 μm partially depleted SOI CMOS

    Page(s): 108 - 111

    The paper presents the design of a 64-bit carry-select adder in Branch-Based Logic, a static design style that minimizes the internal node capacitances. This feature is used to lower the dynamic power dissipation, while maintaining good speed performance. The experimental realization of the adder demonstrates an overall delay of 720 ps while only dissipating 96 mW at 1 GHz. The fabrication is based on the 0.18 μm IBM CMOS8S2 SOI technology, which uses partially depleted transistors and copper metallization.

  • Is nanoelectronics the future of microelectronics?

    Page(s): 172 - 177

    We examine current research in nanoelectronics and discuss the role it may play in future electronic systems.

  • Conditional pre-charge techniques for power-efficient dual-edge clocking

    Page(s): 56 - 59

    A new dual edge-triggered flip-flop that saves power by inhibiting transitions of the nodes that are not used to change the state is presented. The proposed flip-flop is 12% faster with 10% lower energy-delay product for 50% data activity, as compared to the previously published dual edge-triggered storage elements. This was confirmed by simulation using a 0.18 μm process, a 1.8 V power supply, and a clock frequency of 250 MHz. This flip-flop is particularly suitable for low-power applications.
