2012 15th Euromicro Conference on Digital System Design (DSD)

Date: 5-8 Sept. 2012

  • [Cover art]

    Page(s): C4
  • [Title page i]

    Page(s): i
  • [Title page iii]

    Page(s): iii
  • [Copyright notice]

    Page(s): iv
  • Table of contents

    Page(s): v - xvii
  • Message from General Chair

    Page(s): xviii
  • Message from Program Chair

    Page(s): xix - xx
  • Organizing Committee

    Page(s): xxi - xxii
  • Program Committee

    Page(s): xxiii - xxvi
  • Reviewers

    Page(s): xxvii
  • Keynotes [3 abstracts]

    Page(s): xxviii - xxxiii

    Summary form only given. Provides an abstract for each of the three keynote presentations and a brief professional biography of each presenter.
  • Impact of Duty Factor, Stress Stimuli, and Gate Drive Strength on Gate Delay Degradation with an Atomistic Trap-Based BTI Model

    Page(s): 1 - 7

    With deeply scaled CMOS technology, Bias Temperature Instability (BTI) has become one of the most critical degradation mechanisms impacting device reliability. In this paper, we present the BTI evaluation of a single inverter gate covering both PMOS and NMOS degradation in a workload-dependent, atomistic trap-based, stochastic BTI model. The gate propagation delay depends on the gate intrinsic delay, the input signal characteristics, and the output load. Thus, BTI degradation is investigated with respect to 1) duty factor, 2) periodic clock-based and non-periodic random input sequences, and 3) gate drive strength. The inverter is chosen because it is representative of other CMOS logic gates. The applied BTI model is stochastic, and the device parameters are orthogonally generated by distributions. Results show 3% and 27% degradation shifts on the distribution mean and worst case. In addition, it is shown that near-critical paths with lower drive strength cells are more susceptible to BTI degradation than critical paths with higher drive strength cells.
  • Architecture and Design Analysis of a Digital Single-Event Transient/Upset Measurement Chip

    Page(s): 8 - 17

    This paper presents the architecture and a detailed design analysis of a digital measurement chip which facilitates long-term irradiation experiments of basic asynchronous circuits. It combines radiation targets like Muller C-elements and elastic pipelines as well as standard combinational gates and flip-flops with an elaborate on-chip measurement infrastructure. Major architectural challenges result from the fact that the latter must operate reliably under the same radiation conditions the target circuits are exposed to, without wasting precious die area for a rad-hard design. A measurement architecture based on multiple non-rad-hard counters is used, which we show to be resilient against double faults, as well as many triple and even higher-multiplicity faults. The analysis is done by means of comprehensive fault injection experiments, which are based on detailed Spice models of the circuits in conjunction with a standard double-exponential current injection model for single-event transients. We also provide probabilistic calculations of the sustainable particle flow rates, based on the results of a detailed area analysis in conjunction with experimentally determined cross section data for the ASIC implementation technology used. The results confirm that the overall architecture indeed supports significant target hit rates, without exceeding the resilience bound of the measurement infrastructure.
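
    The double-exponential current injection model mentioned in this abstract is commonly written as I(t) = Q / (tau_fall - tau_rise) * (exp(-t / tau_fall) - exp(-t / tau_rise)). The sketch below evaluates that textbook form and checks numerically that the pulse delivers the collected charge Q. The function names and all parameter values (100 fC, 200 ps, 50 ps) are illustrative assumptions, not figures from the paper.

```python
import math


def set_current(t, q_coll, tau_fall, tau_rise):
    """Double-exponential single-event-transient current at time t (seconds)."""
    if t < 0.0:
        return 0.0
    scale = q_coll / (tau_fall - tau_rise)
    return scale * (math.exp(-t / tau_fall) - math.exp(-t / tau_rise))


def peak_time(tau_fall, tau_rise):
    """Time at which the pulse reaches its maximum (derivative equals zero)."""
    return tau_fall * tau_rise / (tau_fall - tau_rise) * math.log(tau_fall / tau_rise)


def injected_charge(q_coll, tau_fall, tau_rise, t_end, steps=20000):
    """Trapezoidal integral of the current from 0 to t_end."""
    dt = t_end / steps
    total = 0.5 * (set_current(0.0, q_coll, tau_fall, tau_rise)
                   + set_current(t_end, q_coll, tau_fall, tau_rise))
    for k in range(1, steps):
        total += set_current(k * dt, q_coll, tau_fall, tau_rise)
    return total * dt


# Hypothetical parameter values, chosen only for illustration.
Q_COLL = 100e-15     # collected charge in coulombs
TAU_FALL = 200e-12   # slow (collection) time constant in seconds
TAU_RISE = 50e-12    # fast (track establishment) time constant in seconds

if __name__ == "__main__":
    t_pk = peak_time(TAU_FALL, TAU_RISE)
    print("peak time [s]:", t_pk)
    print("peak current [A]:", set_current(t_pk, Q_COLL, TAU_FALL, TAU_RISE))
    print("charge [C]:", injected_charge(Q_COLL, TAU_FALL, TAU_RISE, 5e-9))
```

    In a Spice-level fault injection flow, a waveform of this shape would typically be attached as a current source to the struck node; that wiring is outside the scope of this sketch.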
  • Accurate Estimation of Leakage Power Variability in Sub-micrometer CMOS Circuits

    Page(s): 18 - 25

    Leakage power has already become the major contributor to the total on-chip power consumption, rendering its estimation a necessary step in the IC design flow. The problem is further exacerbated by the increasing uncertainty in the manufacturing process known as process variability. We develop a method to estimate the variation of leakage power in the presence of both intra-die and inter-die process variability. Various complicating issues of leakage prediction, such as spatial correlation of process parameters, the effect of different input states of gates on the leakage, and DIBL and stack effects, are taken into account while we model the simultaneous variability of the two most critical process parameters, threshold voltage and effective channel length. Our subthreshold leakage current model is shown to fit the HSPICE Monte Carlo simulation data closely, with an average coefficient of determination (R²) value of 0.9984 for all the cells of a standard library. We also demonstrate the adjustability of this model to wider ranges of variation and its extensibility to future technology scaling. We show that our framework imposes little timing penalty on the system design flow and is applicable to real design cases. The procedures explained in this paper are part of VAREX, an academic variability modeling framework for estimating the effect of process variation on the power consumption and performance of Multiprocessor SoCs.
  • reMORPH: A Runtime Reconfigurable Architecture

    Page(s): 26 - 33

    Programmable hardware built on a regular architecture can partially alleviate the problem of increased defect densities associated with transistor scaling by dynamically wiring around the defects [1]. The fine granularity of FPGAs is, however, unsuitable for effectively exploiting runtime reconfiguration because of the high overheads involved. A coarse-grain reconfigurable array with malleable communication links, reMORPH, is proposed in this paper. The compute tile uses DSP48E and BRAM embedded blocks in a Xilinx FPGA and has a very low footprint of about 200 slice LUTs. The semi-systolic near-neighbour communication interconnect can be dynamically reconfigured for each “epoch” of computation. The “epochs”, or phases, of the application are obtained via profiling or static data flow analysis. Only some of the links between the compute tiles are changed during the reconfiguration phase, which drastically reduces the context switch overhead and enables high performance/area applications to be built on this fabric.
  • Designing a High Performance and Reliable Networks-on-Chip Using Network Interface Assisted Routing Strategy

    Page(s): 34 - 41

    Partial Virtual channel Sharing (PVS) architecture has been proposed to enhance the performance of Networks-on-Chip (NoC) based systems. In this paper, we present an efficient and reliable Network Interface (NI) assisted routing strategy for NoC using the PVS architecture. For this purpose, the NoC system is divided into clusters. Each cluster is a group of two nodes comprising Processing Elements (PE), switches, links, etc. Each PE in a cluster can inject data into the network through the router that is closer to the destination. This helps to reduce the network load by reducing the average hop count of the network. The proposed architecture can recover a PE disconnected from the network due to network-level faults by allowing the PE to transmit and receive packets through the other router in the cluster. A 5×6 crossbar is used for the proposed architecture, which requires one more 5×1 multiplexer than the 5×5 crossbar without increasing the critical path delay of the router. The proposed router has been simulated for uniform and negative exponential distribution (NED) traffic patterns. The simulation results show a significant reduction in average packet latency at the expense of negligible area overhead.
  • A Scalable Monitoring Infrastructure for Self-Organizing Many-Core Architectures

    Page(s): 42 - 49

    Self-organizing principles can address the growing complexity and the huge challenge of management and efficient utilization of adaptive many-core architectures. Fundamental for realizing self-organizing behavior within such architectures is a dedicated monitoring infrastructure that provides the essential information about the system status and system behavior needed for the basic property of self-awareness. This paper therefore proposes a flexible, hierarchical and scalable monitoring infrastructure for self-organizing, adaptive many-core architectures. The basic monitoring unit employed in the bottom monitoring layer performs data aggregation and filtering and reduces the amount of data that must be processed in higher monitoring layers. The middle layer performs a first data analysis and is further responsible for hiding the heterogeneity of the underlying hardware configuration from the topmost monitoring layer. The latter is finally responsible for detecting changes in the system behavior and realizing self-awareness. The proposed monitoring infrastructure was evaluated entirely using a simulation framework. Results show that the infrastructure is capable of detecting changes in the system behavior of an entire many-core system while causing only a minor system disturbance. Further, the prototypical implementation of the basic monitoring unit proved that it can be realized very efficiently in hardware.
  • On the Design of Configurable Modulo 2^n±1 Residue Generators

    Page(s): 50 - 56

    In this work, new efficient modulo 2^n+1 residue generators are proposed. The input operands are divided into n-bit vectors which are added by an inverted end-around-carry carry-save adder tree and a final-stage diminished-1 modulo 2^n+1 adder. The conversion of the proposed residue generators to configurable modulo 2^n±1 ones is also discussed. Modulo 2^n±1 residue generators find applicability as forward converters from the binary to the residue number system, and in the design of self-checking digital systems.
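
    The chunking idea in this abstract rests on two arithmetic identities: 2^n is congruent to 1 modulo 2^n-1 and to -1 modulo 2^n+1. A minimal software sketch of those identities follows. It is not the paper's adder-tree hardware (no carry-save tree or diminished-1 representation is modelled), and the function names are assumptions made for illustration.

```python
def split_chunks(x, n):
    """Split a non-negative integer into n-bit chunks, least significant first."""
    mask = (1 << n) - 1
    chunks = []
    while x:
        chunks.append(x & mask)
        x >>= n
    return chunks or [0]


def residue_mod_2n_minus_1(x, n):
    """x mod (2^n - 1): add the chunks, folding carries back in (end-around carry)."""
    m = (1 << n) - 1
    total = sum(split_chunks(x, n))
    while total > m:
        # A carry out of bit n-1 has weight 2^n, which is congruent to 1.
        total = (total & m) + (total >> n)
    return 0 if total == m else total


def residue_mod_2n_plus_1(x, n):
    """x mod (2^n + 1): chunks alternate in sign because 2^n is congruent to -1."""
    m = (1 << n) + 1
    total = 0
    for index, chunk in enumerate(split_chunks(x, n)):
        total += chunk if index % 2 == 0 else -chunk
    return total % m


if __name__ == "__main__":
    value, width = 0xBEEF, 4
    print(residue_mod_2n_minus_1(value, width), value % 15)
    print(residue_mod_2n_plus_1(value, width), value % 17)
```

    The sign alternation in the second function is what a hardware design has to absorb with inverted chunks and inverted end-around carries; the sketch leaves that to ordinary signed integer arithmetic.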
  • Projected Don't Cares

    Page(s): 57 - 64

    In this paper we define and study the properties of projected don't cares, a category of don't cares dynamically built by the minimization algorithm during the synthesis phase. Our target is to exploit the properties of projected don't cares in order to obtain more compact networks. In particular, we show the use of projected don't care conditions in two synthesis techniques, i.e., using a Boolean and an algebraic algorithm. Experimental results show that in the Boolean case 65% of the considered benchmarks achieve a more compact area when implemented using projected don't cares. The benefit in the algebraic approach is smaller (35% of instances benefit from the proposed technique), although there are examples with a notable decrease in area.
  • SUT-RNS Residue-to-Binary Converters Design

    Page(s): 65 - 72

    The Stored Unibit Transfer (SUT) encoding has been recently proposed as a redundant high-radix encoding for each of the channels of a Residue Number System (RNS) that can improve the efficiency of Binary Signed Digit (BSD)-encoded RNS. However, a residue-to-binary (reverse) converter for it has not yet been reported in the open literature. In this paper we introduce SUT-RNS reverse converters for two different moduli sets, that is, for the 3-moduli {2^n-1, 2^n, 2^n+1} and for the 4-moduli {2^n-1, 2^n, 2^n+1, 2^(2n)+1} sets. The area and delay costs of the proposed converters are shown to be less than those required by the corresponding RNS converters for the BSD encoding. In the 4-moduli set case, the converters' costs are shown to be close to those of the corresponding converters for the binary encoding.
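
    To illustrate what a residue-to-binary (reverse) converter computes, here is a minimal sketch that uses the generic Chinese Remainder Theorem on plain binary residues for the moduli sets {2^n-1, 2^n, 2^n+1} and {2^n-1, 2^n, 2^n+1, 2^(2n)+1}. It does not model the SUT or BSD encodings, nor the converter architectures proposed in the paper, and the function names are assumptions. It relies on `pow(a, -1, m)` for modular inverses, which requires Python 3.8 or later.

```python
from math import prod


def moduli_3(n):
    """The 3-moduli set {2^n - 1, 2^n, 2^n + 1}."""
    return ((1 << n) - 1, 1 << n, (1 << n) + 1)


def moduli_4(n):
    """The 4-moduli set {2^n - 1, 2^n, 2^n + 1, 2^(2n) + 1}."""
    return moduli_3(n) + ((1 << (2 * n)) + 1,)


def to_rns(x, moduli):
    """Forward conversion: one residue per modulus."""
    return tuple(x % m for m in moduli)


def from_rns(residues, moduli):
    """Reverse conversion via the Chinese Remainder Theorem."""
    dynamic_range = prod(moduli)
    total = 0
    for residue, modulus in zip(residues, moduli):
        partial = dynamic_range // modulus
        # The inverse exists because the moduli are pairwise coprime.
        total += residue * partial * pow(partial, -1, modulus)
    return total % dynamic_range


if __name__ == "__main__":
    mods = moduli_3(4)              # (15, 16, 17)
    residues = to_rns(1234, mods)
    print(mods, residues, from_rns(residues, mods))
```

    Hardware converters for these moduli sets usually avoid the general multiplications shown here by exploiting the power-of-two structure of the moduli; the sketch only fixes the input/output behaviour such a converter has to reproduce.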
  • Automated Generation of Built-In Self-Repair Architectures for Random Logic SoC Cores

    Page(s): 73 - 78

    Built-in self-repair (BISR) architectures and methods are widely used for memory cores of system-on-chips (SoCs), where area-efficient fault detection and repair are crucial in order to meet the high quality requirements. Research on BISR architectures for logic cores has begun as well. However, the irregular structure of logic cores represents a serious limitation and therefore, currently only ad hoc methods exist. Automated generation of BISR architectures for random logic SoC cores is proposed in this paper. The generation is guided by the characteristics of the architecture: mean time to failure (MTTF) and area overhead. The main contribution is the fully automated handling of arbitrary random logic cores and the possibility to generate architectures based on various BISR principles. The proposed method was implemented and evaluated over benchmark circuits, and the experiments confirmed that BISR architectures can be successfully generated for random logic cores. The MTTFs of the generated architectures have been improved at the cost of relatively low area overhead.
  • Miscellaneous Types of Partial Duplication Modifications for Availability Improvements

    Page(s): 79 - 83

    This paper compares four different redundancy methods, which include parity code, partial duplication and their combinations, with two standard methods (Duplex and Triple Modular Redundancy). Two main attributes are observed: the total size of the system, including the overhead caused by adding redundancy, and steady-state availability, a dependability parameter defining the readiness of a system for correct service.
  • Reliability of Task Execution During Safe Software Processing

    Page(s): 84 - 89

    This paper presents the reliability evaluation of task execution during safe software processing. The standard method of duplication in a safety-critical application can also be applied to tasks in a software system. In addition, coded task processing can be used to increase the reliability and availability of software. The presented analysis covers the reliability of a single, a duplicated and a coded task using continuous-time Markov processes. Markov processes are often used for the reliability evaluation of safety-critical systems. We introduce a method to describe the execution time of tasks by means of enhanced Markov models and their solution by numerical methods.
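
    As a rough illustration of solving a continuous-time Markov model numerically, the sketch below computes transient state probabilities by uniformization for a hypothetical three-state model of a single task (running, completed, failed). The state layout, the rates and the function names are assumptions made for this example; the paper's enhanced Markov models for duplicated and coded tasks are not reproduced here.

```python
import math


def transient_distribution(Q, p0, t, tol=1e-10, max_terms=10000):
    """State probabilities p(t) of a CTMC with generator Q, via uniformization."""
    n = len(Q)
    rate = max(-Q[i][i] for i in range(n))  # uniformization rate
    if rate == 0.0 or t == 0.0:
        return list(p0)
    # Embedded discrete-time chain: P = I + Q / rate.
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
         for i in range(n)]
    weight = math.exp(-rate * t)  # Poisson probability of zero jumps
    v = list(p0)
    result = [weight * x for x in v]
    covered = weight
    k = 0
    while 1.0 - covered > tol and k < max_terms:
        k += 1
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= rate * t / k  # Poisson probability of k jumps
        result = [r + weight * x for r, x in zip(result, v)]
        covered += weight
    return result


RUNNING, DONE, FAILED = 0, 1, 2


def single_task_generator(completion_rate, failure_rate):
    """Hypothetical model: a running task either completes or fails."""
    return [
        [-(completion_rate + failure_rate), completion_rate, failure_rate],
        [0.0, 0.0, 0.0],  # completed is absorbing
        [0.0, 0.0, 0.0],  # failed is absorbing
    ]


if __name__ == "__main__":
    Q = single_task_generator(completion_rate=1.0, failure_rate=0.01)
    print(transient_distribution(Q, [1.0, 0.0, 0.0], t=10.0))
```

    This simple loop is only suitable for moderate values of rate times t, because the starting Poisson weight exp(-rate * t) underflows for large products. A duplicated or coded task would add states (for example, one per replica outcome) to the generator matrix, while the solver stays the same.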
  • Power Optimization Opportunities for a Reconfigurable Arithmetic Component in the Deep Submicron Domain

    Page(s): 90 - 97

    In the era of deep submicron integration, digital design complexity is increasing at rates that are hard to follow. On one hand, market demand for newer, faster and more reliable applications never stops. On the other hand, fabrication technology cannot cover this demand with frequency increase and dimension shrinking only, as it has done in the past. New architectural-level innovations are needed, like reconfigurable computing. Reconfigurable computing takes advantage of idle components or shared functionality between different algorithms to maximize utilization and improve performance, based on efficient circuit-switching interconnections. However, dense and switching interconnections bring power dissipation problems, which are more pronounced in the deep submicron domain. This paper presents opportunities for both dynamic and static power reduction for a reconfigurable arithmetic component, which can be used as an IP in RTL and above-RTL synthesis methodologies (ESL, HLS, IP based). Both bitwidth and technology scaling are explored, showing that the overall proposed architecture offers clear advantages as device dimensions shrink.
  • OWQS: One-Way Quantum Computation Simulator

    Page(s): 98 - 104

    In the one-way quantum computation (1WQC) model, universal quantum computations are performed using measurements on designated qubits in a highly entangled state. The choices of basis for these measurements as well as the structure of the entanglements specify a quantum algorithm. Although a number of methods have been proposed to simulate the quantum circuit model on classical computers, no efficient tool has been developed to simulate the 1WQC model directly. In this paper, techniques such as qubit elimination, implicit and in-place matrix-vector multiplication, and pattern reordering are utilized to considerably reduce the time and memory needed for the simulations. These techniques were implemented in a tool called One-Way Quantum computation Simulator (OWQS). Experimental results validate the efficiency of the proposed approach.
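
    One common reading of "implicit and in-place matrix-vector multiplication" is that a gate is applied directly to the state vector without ever building the full 2^n × 2^n operator. The sketch below shows that general technique for a single-qubit gate and for the controlled-Z entangling operation used to build graph states. It is not taken from OWQS; the function names and the convention that qubit 0 is the least significant index bit are assumptions.

```python
import math


def apply_single_qubit_gate(state, gate, target):
    """Apply a 2x2 gate to qubit `target` of `state`, updating it in place."""
    stride = 1 << target
    for base in range(0, len(state), 2 * stride):
        for offset in range(stride):
            i0 = base + offset   # amplitude where the target bit is 0
            i1 = i0 + stride     # amplitude where the target bit is 1
            a0, a1 = state[i0], state[i1]
            state[i0] = gate[0][0] * a0 + gate[0][1] * a1
            state[i1] = gate[1][0] * a0 + gate[1][1] * a1


def apply_cz(state, q1, q2):
    """Controlled-Z: negate amplitudes whose index has both qubit bits set."""
    mask = (1 << q1) | (1 << q2)
    for index in range(len(state)):
        if (index & mask) == mask:
            state[index] = -state[index]


S = 1.0 / math.sqrt(2.0)
HADAMARD = [[S, S], [S, -S]]

if __name__ == "__main__":
    # Two qubits starting in |00>, turned into a two-qubit graph state.
    psi = [1.0, 0.0, 0.0, 0.0]
    apply_single_qubit_gate(psi, HADAMARD, 0)
    apply_single_qubit_gate(psi, HADAMARD, 1)
    apply_cz(psi, 0, 1)
    print(psi)
```

    Each call touches the 2^n amplitudes once and needs only constant extra memory, compared with the 4^n entries of an explicit operator matrix.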