ASIC/SOC Conference, 2002. 15th Annual IEEE International

Date 25-28 Sept. 2002

Displaying Results 1 - 25 of 89
  • Proceedings 15th Annual IEEE International ASIC/SOC Conference (Cat. No.02TH8626)
  • Author index

    Page(s): 0_17 - 0_18
  • System-level power evaluation of an embedded software data block processing algorithm

    Page(s): 451 - 455

    Data block processing algorithms have demonstrated significant efficiency in terms of low power consumption when applied mainly to hardware implementations of digital signal processing algorithms. In this paper, a generic data block processing algorithm is applied to the implementation of an FIR filter on a system-on-chip platform incorporating a microcontroller and a programmable 32-bit DSP processor. The block processing algorithm is evaluated at the system level using the performance metrics of speed, energy, power, and area. The data block processing technique achieves a reduction in energy consumption of 18%, and memory accesses are reduced by 44%, for an 8-tap FIR filter. Our algorithm is targeted as a macro block, which can be re-used in the design of more complex DSP systems on the SoC platform.
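A minimal sketch of the block-processing idea for an FIR filter, in Python. The abstract does not give the authors' algorithm, so this is a generic illustration rather than their method; the function names `fir_direct` and `fir_block`, the `block` parameter, and the `coeff_fetches` counter are hypothetical. It shows that computing several outputs per coefficient fetch gives the same result as the sample-by-sample form while touching coefficient memory less often.

```python
def fir_direct(x, h):
    """Reference FIR: y[n] = sum_k h[k] * x[n-k], with x[n] = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0
        for k, coeff in enumerate(h):
            if n - k >= 0:
                acc += coeff * x[n - k]
        y.append(acc)
    return y


def fir_block(x, h, block=4):
    """Same filter, computed `block` outputs at a time (hypothetical sketch).

    Each coefficient is fetched once per block and reused for every
    output in that block, so coefficient memory is read less often.
    Returns the outputs and the number of coefficient fetches counted.
    """
    y = [0] * len(x)
    coeff_fetches = 0
    for start in range(0, len(x), block):
        stop = min(start + block, len(x))
        for k, coeff in enumerate(h):
            coeff_fetches += 1  # one fetch serves the whole block
            for n in range(start, stop):
                if n - k >= 0:
                    y[n] += coeff * x[n - k]
    return y, coeff_fetches
```

With an 8-sample input, a 2-tap filter, and `block=4`, the block form counts 4 coefficient fetches, whereas the direct form reads a coefficient once per tap per output (16 reads). The savings reported in the abstract come from the authors' own implementation and are not reproduced by this toy count.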
  • An interface protocol component modeling language

    Page(s): 456 - 460

    Reusing IPs requires designers to perform interface protocol related tasks such as writing test benches and designing interface protocol conversion circuits, e.g., wrappers for IPs. The results of those tasks usually include the interface protocol components for the corresponding IPs, similar to bus protocol components of the bus functional models. Interface protocols of most IPs can be abstracted in transactions. This paper presents a transaction-oriented interface protocol description language which models interface protocol components recognizing or executing transactions over the given interface ports. In addition, we describe a target structure of the synthesizable interface protocol component together with its application to an IP wrapper design. The proposed approach not only reduces rework on the interface protocol components but also enables a methodology that can be called "transaction-based interface design or synthesis".
  • Variable threshold voltage keeper for contention reduction in dynamic circuits

    Page(s): 314 - 318

    A variable threshold voltage keeper circuit technique is proposed for simultaneous power reduction and speed enhancement of domino logic circuits. The threshold voltage of the keeper transistor is dynamically modified during circuit operation to reduce the contention current without sacrificing noise immunity. A four-bit multiple-output domino carry generator for a carry lookahead adder is designed with the proposed circuit technique. It is shown that the variable threshold voltage keeper circuit technique enhances the circuit evaluation speed by up to 60% while reducing power dissipation by 37% as compared to a standard domino logic circuit. It is also shown that the proposed domino logic circuit technique offers 20% higher noise immunity as compared to a standard domino circuit with the same evaluation delay characteristics.
  • Clock network analysis at the pre-layout stage for efficient clock tree synthesis [SOC design]

    Page(s): 363 - 367

    In synchronous circuits, the design of clock distribution networks can affect system performance and reliability dramatically. Clock tree synthesis (CTS) requires a technique to distribute clock signals effectively in a system-on-a-chip (SOC) design. This paper presents techniques to analyze clock networks that include gated clocks and multiple clock roots, and to provide the information required for successful CTS. We also propose a novel method to increase the accuracy of delay and power estimation at the pre-layout stage. Consequently, the proposed techniques constitute a new CTS design flow that enables a designer to reduce the design cycle by fixing the critical problems before getting into the layout phase. In order to demonstrate the effectiveness of the proposed techniques, an experiment on a real ASIC design has been carried out.
  • Three phase domino logic circuit

    Page(s): 319 - 322

    The speed and area advantage of domino logic circuits compared to static logic circuits makes them a favorite choice for the critical path of high performance processors. However, they suffer from low noise margin. Noise is not scaling at the same rate as the supply voltage, and therefore new domino logic circuits are required to increase the noise margin. In this paper a new domino circuit is introduced. Simulations for a 3-input 180 nm AND gate show that the noise margin can be increased by 62% with only a 3% reduction in speed.
  • A low-voltage low-noise CMOS digital family

    Page(s): 198 - 202

    A CMOS logic family is proposed. The primary characteristic of this logic family is the low voltage operation (VDD between one and two transistor threshold voltages (VTs), with a typical VDD = 1.5 VT). While operating at this reduced power supply, low noise and high performance (such as high speed, low power, and high noise margins) are achieved.
  • SoC gate level design migration

    Page(s): 155 - 159

    A successful industrial product has a lifespan of 20 or more years. Off-the-shelf ICs and ASICs both rely on fabrication processes which are obsolete far sooner. The modern ASIC design process offers excellent portability along with HDL (hardware description language) device descriptions and test benches. Older designs, typically captured as gates in schematics and validation files in proprietary simulation environments, make porting a challenge. The validity of, and issues with, converting a legacy gate-level design are presented. There is a trend in certain SoC designs to initiate gate level migrations in order to meet time-to-market pressures. We have developed a design flow for gate-level design migrations and experiences using this flow are discussed.
  • Highly efficient digital CMOS accelerator for image and graphics processing

    Page(s): 127 - 132

    This paper presents a novel high-bandwidth digital accelerator for image and graphics processing applications. The proposed architecture outperforms previously proposed processing-in-memory architectures in speed, area and power by up to several orders of magnitude. Several variations of the design have been implemented in 2.5 V 0.25 μm and 1.8 V 0.18 μm CMOS technology.
  • VLSI implementation of Ogg Vorbis decoder for embedded applications

    Page(s): 20 - 24

    In this paper, a novel VLSI architecture of an Ogg Vorbis decoder is proposed, dedicated for embedded applications. Aimed at the use of the decoder in portable audio appliances, first, the computational cost in a series of decoding processes is analyzed. As a result, the LSP (line spectrum pair) process is identified as a bottleneck to achieving real-time decoding by an embedded processor. Thus, the proposed architecture devises a specific hardware LSP module so as to be integrated into a single chip together with an ARM7TDMI processor. Moreover, our decoder employs fixed point arithmetic, rather than floating point arithmetic, by optimizing the calculation accuracy according to audio quality distortion analysis. The proposed LSP module has been implemented with 9,740 gates, and operates at 58.8 MHz, with the total CPU load reduced by 57%. Audio quality assessment indicated that the use of the fixed point arithmetic does not incur any significant sound distortion.
  • Long-term power minimization of dual-VT CMOS circuits

    Page(s): 323 - 327

    In this paper, we define long-term power dissipation, in which the effect of the system-level power management on the total power dissipation of a given circuit is considered. Then, we present a novel design methodology to minimize the long-term power dissipation of a circuit used along with dual-threshold voltage selection and voltage scaling. In simulations of 16-bit carry lookahead adders (CLAs), the proposed approach reduces the total power dissipation by up to 80% and 25% when used along with clock gating and power gating, respectively.
  • HyPipe: a new approach for high speed circuit design

    Page(s): 203 - 207

    Wave pipelining improves the throughput of a circuit by exploiting the delays of combinational elements, rather than register clocks, for synchronization. We propose a new design approach for high speed circuits which combines conventional register-based pipelining with wave pipelining. Our approach, called HyPipe, aims to combine the advantages of both pipelining methods. We applied our method to 1-bit and 2-bit adder cells, which can be used as building blocks for larger adders and multipliers. Our experimental results show that the 1-bit adder achieves a throughput of 2.4 billion additions/second and the 2-bit adder achieves 2.2 billion additions/second for TSMC 0.25 μm technology. Furthermore, they have potential for even higher throughputs provided registers are able to operate faster.
  • A CMOS Miller hold capacitance sample-and-hold circuit to reduce charge sharing effect and clock feedthrough

    Page(s): 92 - 96

    A technique using Miller capacitance in the sample-and-hold (S/H) circuit is introduced in this paper to reduce the charge sharing effect (CSE) due to the parasitic capacitance and clock feedthrough from a sampling switch. A compact cascode amplifier is used in the Miller feedback circuit and a tenfold reduction in CSE and clock feedthrough is achieved. The S/H capacitor is split into two parts, Csh1 and Csh2. One of these S/H capacitors effectively reduces the CSE while the other capacitor reduces clock feedthrough.
  • Low-power, low-latency global interconnect

    Page(s): 394 - 398

    Global interconnects have been identified as a serious limitation to chip scaling, due to their latency and power consumption. We demonstrate a simple scheme to overcome these limitations, based on the utilization of upper-level metals and reduced voltage swing. The upper-level metal allows velocity-of-light delay if properly dimensioned, and power is optimized by an appropriate choice of voltage swing and receiver amplifier.
  • Design of an integrated silicon connectionist retina

    Page(s): 133 - 136

    We present here a new paradigm based on a centralized architecture to realize an electronic artificial retina. This original architecture, named connectionist retina, can execute RBF (radial basis function) and MLP (multilayer perceptron) neural network applications in real time. We demonstrate that this intelligent embedded system could be used for vision applications. We describe the realized prototype system.
  • DAC nonlinearity effects in a wide-band sigma-delta modulator architecture

    Page(s): 75 - 79

    This paper presents an alternative approach for the robust implementation of wideband sigma-delta modulators. The proposed topology consists of a pipelined cascaded sigma-delta modulator. To make quantization noise sources negligible at an OSR of 8, both stages utilize multibit quantizers. To reduce sensitivity to opamp nonlinearities, both stages are implemented using the sigma-delta modulator with feedforward path. The nonlinearity effects on the modulator performance are analyzed. The simulation results show that the multibit DAC nonlinearity dominates the harmonic distortion and intermodulation distortion. The proposed sigma-delta modulator achieves an 85 dB SNR and 95 dB SFDR in the 6 MHz bandwidth, for 96 MHz sampling frequency.
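The OSR of 8 quoted in the abstract is consistent with the stated sampling frequency and bandwidth, assuming the usual definition of oversampling ratio as the sampling frequency divided by twice the signal bandwidth. The symbols f_s and f_B are notation introduced here, not taken from the paper:

```latex
\mathrm{OSR} = \frac{f_s}{2 f_B} = \frac{96\ \text{MHz}}{2 \times 6\ \text{MHz}} = 8
```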
  • Experimental evaluation of a compiler-based cache energy optimization strategy

    Page(s): 296 - 300

    In this paper, we present experimental results from an optimization strategy that aims at reducing the per-access energy cost for direct-mapped data caches. We have developed a compiler algorithm that uses access pattern analysis to determine those references that are certain to result in cache hits (called 'certain hits') in a virtually-addressed, direct-mapped data cache. After detecting such references, the compiler substitutes the corresponding load operations with 'energy-efficient loads' that access only the data array of the cache instead of both tag and data arrays. This tag access elimination, in turn, reduces the per-access energy consumption for data accesses. Our experimental results indicate that certain hits constitute a large percentage of total hits. They also show that even our most conservative strategy improves the data cache energy consumption by 11% on average.
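A minimal Python sketch of the 'certain hit' idea, under a strong simplifying assumption: a load is treated as a certain hit only when it falls in the same cache line as the immediately preceding access, so nothing can have evicted that line in between. The paper performs this analysis statically in the compiler; the trace-based function below, its name `classify_loads`, and the `line_size` parameter are hypothetical and serve only as an illustration.

```python
def classify_loads(addresses, line_size=32):
    """Label each load in an address trace (hypothetical simplification).

    A load whose address maps to the same cache line as the previous
    access is labeled "energy_efficient" (the tag check could be skipped);
    every other load is labeled "normal" (tag and data arrays are read).
    """
    labels = []
    prev_line = None
    for addr in addresses:
        line = addr // line_size
        labels.append("energy_efficient" if line == prev_line else "normal")
        prev_line = line
    return labels
```

For the trace `[0, 4, 8, 32, 36]` with 32-byte lines, the second, third, and fifth loads reuse the line touched by the access just before them, so only the first and fourth need a tag lookup in this simplified model.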
  • Characterizing dynamic and leakage power behavior in flip-flops

    Page(s): 433 - 437

    This paper presents a detailed analysis of power consumption in a variety of flip-flop designs including scannable latches. The analysis was performed by implementing and simulating the different designs using 70 nm, 1 V CMOS technology. First, we perform a detailed characterization of the dynamic power consumption due to output transitions, and that due to clock and data transitions when there is no output transition. Further, we characterize the leakage behavior of each of the flip-flop designs and, specifically, the input dependence of leakage.
  • Design and synthesis of direct connected network devices controller

    Page(s): 265 - 269

    In this paper we present the design and synthesis of a direct connected network devices controller (DCNDC) to provide substantial speed-up for networking applications that eliminate the need for a computer to connect digital devices to a local area network (LAN). Specifically, the first goal is to design DCNDC. The main function of DCNDC is to eliminate the operating system processing of the network protocol stack in order to simplify the connection and improve network performance. It utilizes the concept of network channels in Ethernet. Upon completion of the DCNDC and validating its simulation outputs, we synthesize the design in an FPGA chip. Performance measures like power consumption are computed using synthesis tools. If DCNDC reduces the power consumption, network devices can be powered through a category-5 network cable and eliminate the process of regular electrical power outlet installations and maintenance. So, DCNDC will also reduce the connection complexity in terms of device installation.
  • Realization of compact low-power ripple-flash A/D converter architectures using conventional digital CMOS technology

    Page(s): 71 - 74

    In this paper, we present a generalized approach for the construction of ripple-flash ADC architectures that consist of cascade-connected capacitive threshold gates, realized using conventional CMOS technology. The main advantages of the proposed ADC architecture are the very small layout area, simple operation, high input-to-output response speed, and very low power dissipation. A new differential output voltage comparator is presented to ensure high precision and low propagation delay times. Several different ADC implementations are explored, including 4-bit, 5-bit and 6-bit ripple-flash circuits that demonstrate highly accurate DC transfer characteristics with INL errors smaller than 0.1 LSB, and near-ideal SNR levels for sampling frequencies of up to 50 MHz. Test circuits manufactured with 0.8 μm CMOS technology have shown that sampling rates in excess of 50 MHz are possible with this approach, while the silicon area and the power dissipation of the tested ADC circuits remain at least one order of magnitude smaller than those of similar flash ADCs built with the conventional approach.
  • Design sensitivities to variability: extrapolations and assessments in nanometer VLSI

    Page(s): 411 - 415

    We propose a new framework for assessing (1) the impact of process variation on circuit performance and product value, and (2) the respective returns on investment for alternative process improvements. Elements of our framework include accurate device models and circuit simulation, along with Monte-Carlo analyses, to estimate parametric yields. We evaluate the merits of taking into account such previously unconsidered phenomena as correlations among process parameters. We also evaluate the impact of process variation with respect to such relevant metrics as parametric yield at selling point, and amount of required design guardbanding. Our experimental results yield insights into the scaling of process variation impacts through the next two ITRS technology nodes.
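A minimal Python sketch of a Monte-Carlo parametric yield estimate, for orientation only. The paper uses accurate device models and circuit simulation; the toy model here (circuit delay scaled by a single Gaussian variation term) and every name and default value in it (`parametric_yield`, `spec_delay`, `nominal_delay`, `sigma`, `samples`, `seed`) are assumptions made for illustration.

```python
import random


def parametric_yield(spec_delay, nominal_delay=1.0, sigma=0.05,
                     samples=10_000, seed=0):
    """Estimate the fraction of sampled circuits meeting a delay spec.

    Toy model (an assumption, not the paper's): each sample's delay is
    nominal_delay * (1 + v), where v is drawn from a zero-mean Gaussian
    with standard deviation `sigma`.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate repeatable
    passed = 0
    for _ in range(samples):
        variation = rng.gauss(0.0, sigma)
        delay = nominal_delay * (1.0 + variation)
        if delay <= spec_delay:
            passed += 1
    return passed / samples
```

Relaxing `spec_delay` relative to `nominal_delay` plays the role of design guardbanding in this toy model: a looser specification can only raise the estimated yield, at the cost of a slower rated part.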