Asynchronous Circuits and Systems, 2005. ASYNC 2005. Proceedings. 11th IEEE International Symposium on

Date 14-16 March 2005

  • Proceedings. 11th IEEE International Symposium on Asynchronous Circuits and Systems

    Publication Year: 2005
    Freely Available from IEEE
  • 11th IEEE International Symposium on Asynchronous Circuits and Systems - Title Page

    Publication Year: 2005 , Page(s): i - ii
    Freely Available from IEEE
  • 11th IEEE International Symposium on Asynchronous Circuits and Systems - Copyright Page

    Publication Year: 2005 , Page(s): iv
    Freely Available from IEEE
  • 11th IEEE International Symposium on Asynchronous Circuits and Systems - Table of contents

    Publication Year: 2005 , Page(s): v - vi
    Freely Available from IEEE
  • Message from the Chairs

    Publication Year: 2005 , Page(s): vii - viii
    Freely Available from IEEE
  • Symposium Committee

    Publication Year: 2005 , Page(s): ix
    Freely Available from IEEE
  • Technical Program Committee

    Publication Year: 2005 , Page(s): ix
    Freely Available from IEEE
  • Additional reviewers

    Publication Year: 2005 , Page(s): x
    Freely Available from IEEE
  • Steering Committee

    Publication Year: 2005 , Page(s): x
    Freely Available from IEEE
  • Deep pipelines vs. risk and power walls [microprocessors]

    Publication Year: 2005

    Summary form only given. Intel's x86 processors pushed pipelining and clock rates until physics stopped us. Less obviously, we were also pushing complexity, and therefore risk. We now know where the limits to these trends lie: with the Prescott processor. This talk explores the nature of risk in chip developments, how the ever-deepening pipelines in the Pentium series affected, and were affected by, perceived risk and thermals, and where the future will take us.
  • Energy efficient surfing [latchless pipelining technique]

    Publication Year: 2005 , Page(s): 2 - 11
    Cited by:  Papers (6)

    Surfing is a latchless pipelining technique where the propagation delays of gates and other logic functions are modulated to produce event attractors. We describe a test chip that demonstrates a surfing pipeline ring and then introduce new circuits that dramatically reduce the energy overhead for surfing. Our test chip implements a twelve-stage surfing ring that supports two independent waves of computation without latches or other storage elements. We have operated the chip for over 48 hours and more than 2.6×10¹⁵ surfing events without an error. However, the energy consumption of the ring is unacceptable for scaling to larger applications. Thus, we introduce a new family of surfing circuits that use less energy than their domino counterparts and provide a factor of up to 1.75 improvement by the Et² metric. We demonstrate this new family with the design of a carry lookahead adder.
  • GasP control for domino circuits

    Publication Year: 2005 , Page(s): 12 - 22
    Cited by:  Papers (4)  |  Patents (1)

    We present two novel asynchronous control circuits for domino pipelines. The control circuits are based on GasP circuits, have a minimum cycle time of six gate delays, and compare favorably with previously published control circuits. We present some results from a chip implementation of several 64-bit domino adders in a TSMC CMOS 180 nm process technology.
  • Design of high-performance power-aware asynchronous pipelined circuits in MOS current-mode logic

    Publication Year: 2005 , Page(s): 23 - 32
    Cited by:  Papers (3)  |  Patents (1)

    This paper introduces the implementation of multi-GHz power-aware asynchronous pipelined circuits in MOS current-mode logic (MCML). The C-element and double-edge-triggered flip-flop are implemented in MCML and used in the so-called micropipeline circuits. An input data detector is proposed to put the inactive combinational logic into sleep mode. The effects of different layout techniques on the performance and power dissipation of an MCML FIFO are also investigated. Based on post-layout simulation results in a standard 0.18 μm CMOS technology, an asynchronous MCML four-stage FIFO demonstrates a throughput of 4 GHz while dissipating 3.7 mW. The MCML C-element dissipates up to 4× less power compared to its conventional static counterpart at the same throughput of 1.9 GHz. The asynchronous MCML pipelined four-bit carry-lookahead adder with power-saving mechanism reduces the power dissipation by 32% compared to the one without the power-saving mechanism. The power overhead of the input data detector is only 0.23 mW. The input data detector shuts off the stage power in 2 ns and restores the stage in 150 ps after the presence of the new input.
  • Scheduling discipline for latency and bandwidth guarantees in asynchronous network-on-chip

    Publication Year: 2005 , Page(s): 34 - 43
    Cited by:  Papers (29)  |  Patents (1)

    Guaranteed services (GS) are important in that they provide predictability in the complex dynamics of shared communication structures. This paper discusses the implementation of GS in an asynchronous network-on-chip. We present a novel scheduling discipline called asynchronous latency guarantee (ALG) scheduling, which provides latency and bandwidth guarantees in accessing a shared medium, e.g. a physical link shared between a number of virtual channels. ALG overcomes the drawbacks of existing scheduling disciplines, in particular, the coupling between latency and bandwidth guarantees. A 0.12 μm CMOS standard cell implementation of an ALG link has been simulated. The operation speed of the design was 702 MDI/s.
  • An asynchronous router for multiple service levels networks on chip

    Publication Year: 2005 , Page(s): 44 - 53
    Cited by:  Papers (15)  |  Patents (6)

    Networks on chip that can guarantee quality of service (QNoC) are based on special routers that can support multiple service levels. GALS SoCs call for asynchronous NoC implementations, to eliminate the need for synchronization when crossing clock domains. An asynchronous multi-service level QNoC router is investigated. It comprises multiple interconnected input and output ports, and arbitration mechanisms that resolve any output port and service level conflicts. Buffering and credit based transport are enabled, enhancing throughput. A synchronous and an asynchronous router have been designed, and their performance is compared. The asynchronous router requires less area and enables a higher data rate.
  • An asynchronous NOC architecture providing low latency service and its multi-level design framework

    Publication Year: 2005 , Page(s): 54 - 63
    Cited by:  Papers (63)  |  Patents (2)

    The demands of scalable, low latency and power efficient system-on-chip interconnect cannot be satisfied by point-to-point or shared-bus interconnects alone. In this paper, we propose a new asynchronous network-on-chip (NOC) architecture which provides low latency transfers. This architecture is implemented as a GALS system, where chip units are built as synchronous islands, connected together using a delay insensitive asynchronous network-on-chip topology. The proposed NOC protocol and its asynchronous implementation are presented as well as the multi-level modeling approach using the SystemC language and transaction-level modeling. Preliminary simulation results show that the asynchronous NOC can offer 5 Gbytes/s throughput in a 0.13 μm CMOS technology.
  • Register-communication between mutually asynchronous domains

    Publication Year: 2005 , Page(s): 66 - 75
    Cited by:  Papers (4)  |  Patents (2)

    We present the design of several so-called communication registers, which are modules that support non-blocking communication between two mutually asynchronous domains. For that purpose, a communication register offers two mutually asynchronous access ports: a write and a read port. Communication registers differ from buffers in that read and write accesses are never held up. Consequently, data may get duplicated or lost. A read access, however, always delivers a value written into the register, although not necessarily the latest one. Each of the two access ports is either clocked or self-timed, where the accesses through a self-timed port are controlled by handshakes. Therefore, one can distinguish four different kinds of modules: one for each possible access port combination. For all four cases, we give simple designs, which in several cases are subsequently refined to meet additional requirements, such as setting an upper bound on the mutual timing interference, keeping the power consumption low, or reducing the latency.
  • Request-driven GALS technique for wireless communication system

    Publication Year: 2005 , Page(s): 76 - 85
    Cited by:  Papers (4)

    A globally asynchronous - locally synchronous (GALS) technique for application in wireless communication systems is proposed and evaluated. The GALS wrappers are based on a request-driven operation with an embedded time-out function. A formally verified GALS wrapper is deployed for the 'GALSification' of a baseband processor for WLAN. Details of the GALS partitioning, implementation and the design-flow are discussed. Furthermore, a test strategy based on built-in self-test (BIST) is suggested. The described baseband processor was fabricated and successfully tested. The GALS design is compared with a clock-gated, synchronous version. Advantages for system integration are achieved along with a 1% reduction in dynamic power consumption, a 30% reduction in peak power supply current, and 5 dB reduction in spectral noise.
  • Self-timed circuitry for global clocking

    Publication Year: 2005 , Page(s): 86 - 96
    Cited by:  Papers (4)

    We present an apparatus used to distribute a timing reference or clock across the extent of a digital system. Self-timed circuitry both generates and distributes a clock signal, with less power and less skew than a clock tree. HSpice simulations in a 180 nm CMOS process, comparing the distributed clock generator presented in this paper with an H-tree clock distribution system, each clocking a 16 mm × 16 mm area, suggest a 30% power saving. Worst-case skew was also reduced from 27 ps to 2 ps while using a clock period equivalent to 9 FO4 gates.
  • Proximity communication and time [capacitively coupled IC communication]

    Publication Year: 2005
    Cited by:  Papers (1)

    Summary form only given. Two IC chips placed face-to-face can communicate without direct electrical contact. The capacitive coupling between their top-level metal layers can carry data. We have demonstrated such "proximity communication" on 50 μm centers and data rates similar to on-chip wires. Such communication offers attractive speed, density, and energy economy, but requires accurate mechanical alignment. Proximity communication requires sensitive amplifiers to compensate for the attenuation suffered as signals pass from one chip to the other. Signals with uncertain arrival times require an amplifier that can distinguish between signal and no signal. The difference between receiving attenuated data signals and receiving attenuated control signals focuses attention on the fundamental problem of time in asynchronous systems. This talk addresses the above issues.
  • Modeling and verifying circuits using generalized relative timing

    Publication Year: 2005 , Page(s): 98 - 108
    Cited by:  Papers (2)

    We propose a novel technique for modeling and verifying timed circuits based on the notion of generalized relative timing. Generalized relative timing constraints can express not just a relative ordering between events, but also some forms of metric timing constraints. Circuits modeled using generalized relative timing constraints are formally encoded as timed automata. Novel fully symbolic verification algorithms for timed automata are then used to either verify a temporal logic property or to check conformance against an untimed specification. The combination of our new modeling technique with fully symbolic verification methods enables us to verify larger circuits than has been possible with other approaches. We present case studies to demonstrate our approach, including a self-timed circuit used in the integer unit of the Intel® Pentium® 4 processor.
  • Controlling event spacing in self-timed rings

    Publication Year: 2005 , Page(s): 109 - 115
    Cited by:  Papers (5)

    Prior research in event spacing has identified two effects which contribute to the phenomenon of bursting events in self-timed systems, namely the Charlie and the Drafting effects. In this paper, we attempt to further the understanding of these effects by presenting an analysis of their magnitude for a range of asynchronous handshaking controller implementations. The main contribution of this work is to demonstrate that event spacing irregularities are not an inherent property of self-timed circuits, but can be controlled by careful circuit design. We demonstrate that bursting effects are indeed dependent on the specific implementation of the handshaking circuits used in an asynchronous system, by showing that the magnitude of the Charlie and Drafting effects is implementation-dependent. We also explain how both of these effects can be mitigated by altering the electrical characteristics of the circuit implementation.
  • Delay insensitive encoding and power analysis: a balancing act [cryptographic hardware protection]

    Publication Year: 2005 , Page(s): 116 - 125
    Cited by:  Papers (6)

    Unprotected cryptographic hardware is vulnerable to a side-channel attack known as differential power analysis (DPA). This attack exploits data-dependent power consumption of a computation to determine the secret key. Dual-rail asynchronous circuits have been regarded as a potential countermeasure to this attack. In this paper, we evaluate the security of asynchronous dual-rail circuits against DPA. Our results show that, unless special precautions are taken, asynchronous circuits are not inherently more DPA resistant than their synchronous dual-rail counterparts. We show that the use of -spaced or return-to-zero (RTZ) protocols, used to provide delay-insensitive encoding for asynchronous circuits, can make a DPA attack easier. We present an overview of balancing dynamic implementations of dual-rail fine-grained asynchronous gates that offer a solution for the DPA weakness. We demonstrate the use of asynchronous balanced cells that use RTZ which are not only secure against DPA but also deliver high performance with low design effort through automated pipelining.
  • A scalable counterflow-pipelined asynchronous radix-4 Booth multiplier

    Publication Year: 2005 , Page(s): 128 - 137
    Cited by:  Papers (1)

    This paper introduces an asynchronous radix-4 Booth multiplier architecture, which is scalable to arbitrary operand lengths while maintaining a constant cycle time per Booth iteration. It has several novel features, including: (i) a novel counterflow organization, in which the data bits flow in one direction and the Booth commands piggyback on the acknowledgments flowing in the opposite direction; (ii) overlapped execution of multiple iterations of the Booth algorithm; and (iii) design modularity and bit-level pipelining, which enable the multiplier to be scaled to arbitrary operand widths without requiring gate resizing or cycle time overheads. Spice simulations in a 0.18 μm TSMC CMOS process at 1.8 V indicate promising performance: the multiplier takes 640-650 ps per Booth iteration, regardless of the operand widths, thereby demonstrating the scalability of our approach. For 16-bit operands, this performance corresponds to nearly 200 Mops/s throughput. Furthermore, the multiplier is fully functional at reduced supply voltages (e.g., 1.5 V and 1.0 V), and thus capable of dynamically trading off performance for energy efficiency.
  • Continuous-time digital signal processors

    Publication Year: 2005 , Page(s): 138 - 143
    Cited by:  Papers (15)  |  Patents (1)

    In this paper, we discuss how asynchronous design techniques can be used in the implementation of continuous-time signal processors. Such processors are presented by signals developed by continuous-time analog-to-digital converters which involve no sampling, and thus do not exhibit aliasing; in addition, the resulting in-band quantization error is lower than in conventional techniques. Several design considerations are given, and preliminary experimental results are presented.