Computers & Digital Techniques, IET

Issue 4 • Date July 2012

  • Arithmetic module-based built-in self test architecture for two-pattern testing

    Page(s): 195 - 204

    Built-in self test (BIST) techniques use test pattern-generation and response-verification operations, reducing the need for external testing. BIST techniques that use arithmetic modules existing in the circuit (accumulators, counters etc.) to perform the test-generation and response-verification operations have been proposed in the open literature. Two-pattern tests are exercised to detect complementary metal oxide semiconductor (CMOS) stuck-open faults and to assure correct temporal circuit operation at clock speed (delay fault testing). In this study, a novel arithmetic module-based BIST architecture for two-pattern testing (ABAS) is presented that exercises arithmetic modules to generate two-pattern tests; the hardware overhead required by the presented scheme, provided such modules are available, is by far the lowest of all schemes that have been presented for the same purpose in the open literature.

  • Multi-objective optimisations for a superscalar architecture with selective value prediction

    Page(s): 205 - 213

    This work extends an earlier manual design space exploration (DSE) of the authors' selective load value prediction-based superscalar architecture to the L2 unified cache. The authors then perform an automatic DSE, using a specially developed software tool, by varying several architectural parameters. The goal is to find optimal configurations in terms of cycles per instruction and energy consumption. Varying the 19 architectural parameters that the authors propose yields a design space of over 2.5 million billion configurations, so only a heuristic search can be considered. The authors therefore propose different methods of automatic DSE, based on their framework for automatic design space exploration, which allow them to evaluate only 2500 configurations of this huge design space. The experimental results show that their automatic DSE provides significantly better configurations than the previous manual DSE approach under the proposed multi-objective approach.

  • Efficient post-configuration testing of an asynchronous nanowire crossbar system for reliability

    Page(s): 214 - 222

    The recently proposed asynchronous nanowire crossbar architecture is envisioned to enhance the manufacturability and robustness of nanowire crossbar-based configurable digital circuits by removing various timing-related failure modes. Even though the proposed clock-free nanowire crossbar architecture has numerous technical merits over its clocked counterparts, it is still subject to high defect rates inherently induced by the non-deterministic nanoscale assembly of nanowire crossbars. To address this issue, a novel functional testing scheme has been proposed to validate threshold gates configured on programmable gate macro blocks (PGMBs). The proposed approach selectively tests the crosspoints programmed as ON-state using test vectors tailored to the given threshold gate macro and its functionality. High fault coverage can therefore be achieved at significantly reduced test overhead. In addition, numerous replacement and reconfiguration schemes based on the functional testing scheme have been proposed to repair partially faulty configured PGMBs by locating incorrectly programmed crosspoints and replacing them with defect-free spares. Specific figures of merit have also been coined to quantify the performance of the proposed testing and reconfiguration algorithms. These findings have been extensively validated by a series of parametric simulations.

  • Functional broadside tests for embedded logic blocks

    Page(s): 223 - 231

    When a logic block is embedded in a larger design, the input sequences applicable to it may be constrained by other logic blocks in the design. This has an impact on what would constitute overtesting of the logic block by scan-based tests. This study defines functional broadside tests that avoid overtesting for an embedded block based on functional broadside tests for the larger design. The definition is constructive and results in a procedure for generating the tests. This study compares these tests with ones generated for the logic block as a stand-alone circuit. The results demonstrate that it is important to consider in the discussion of overtesting the extent to which the functionality of an embedded logic block is utilised as a part of the design. Under certain conditions it is possible to apply to the logic block functional broadside tests that were generated for it as a stand-alone circuit in order to maximise the fault coverage without overtesting, and reduce the computational complexity of test generation.

  • Reset and partial-reset-based functional broadside tests

    Page(s): 232 - 239

    Functional broadside tests were defined to avoid overtesting that may occur under scan-based tests because of non-functional operation conditions created by unreachable scan-in states. Functional broadside tests were computed assuming that functional operation starts after the circuit is initialised by applying a synchronising sequence. This study discusses the definition of functional broadside tests for the case where hardware reset is used for bringing the circuit into a known state before functional operation starts. This study shows that the set of reachable states for a circuit with hardware reset contains the set of reachable states based on a synchronising sequence. Consequently, the set of functional broadside tests and the set of detectable faults for a circuit with hardware reset contain those obtained based on a synchronising sequence. In addition, there are differences between different reset states in the sets of reachable states and the sets of detectable faults. This study also discusses the case where hardware reset is provided only for a subset of the state variables (referred to as partial reset).

  • Design of experiments and integer linear programming-assisted conjugate-gradient optimisation of high-κ/metal-gate nano-complementary metal-oxide semiconductor static random access memory

    Page(s): 240 - 248

    Low power consumption and stability in static random access memories (SRAMs) are essential for embedded applications. This study presents a novel design flow for power minimisation of nano-complementary metal-oxide semiconductor SRAMs, while maintaining stability. A 32 nm high-k/metal-gate SRAM has been used as an example circuit. The baseline circuit is subjected to power minimisation using a dual-threshold voltage assignment based on a novel combined design of experiments and integer linear programming (DOE-ILP) approach. However, this leads to a 15% reduction in the static noise margin (SNM) of the cell. The conjugate gradient optimisation overcomes this SNM degradation, while reducing the power consumption. The final SRAM design shows an 86% reduction in power consumption (including leakage) and an 8% increase in the SNM compared with the baseline design. The variability analysis of the optimised cell is performed by considering the effect of 12 parameters. SRAM arrays of different sizes are constructed to demonstrate the feasibility of the proposed SRAM cell. To the best of the authors' knowledge, this is the first study which makes use of DOE-ILP and the conjugate gradient method for simultaneous stability and power optimisation in high-k/metal-gate SRAM circuits.

  • FPGA accelerator for floating-point matrix multiplication

    Page(s): 249 - 256

    This study treats the architecture and implementation of a field-programmable gate array (FPGA) accelerator for double-precision floating-point matrix multiplication. The architecture is oriented towards minimising resource utilisation and maximising clock frequency. It employs the block matrix multiplication algorithm, which returns the result blocks to the host processor as soon as they are computed. This avoids output buffering and simplifies placement and routing on the chip. The authors show that such an architecture is especially well suited for full-duplex communication links between the accelerator and the host processor. The architecture requires the result blocks to be accumulated by the host processor; however, the authors show that typically more than 99% of all arithmetic operations are performed by the accelerator. The implementation focuses on efficient use of embedded FPGA resources, in order to allow for a large number of processing elements (PEs). Each PE uses eight Virtex-6 DSP blocks. Both adders and multipliers are deeply pipelined and use several FPGA-specific techniques to achieve small area and high clock frequency. Finally, the authors quantify the performance of the accelerator implemented in a Xilinx Virtex-6 FPGA, with 252 PEs running at 403 MHz (achieving 203.1 Giga FLOPS (GFLOPS)), by comparing it to the double-precision matrix multiplication function from the MKL, ACML, GotoBLAS and ATLAS libraries executing on Intel Core2Quad and AMD Phenom X4 microprocessors running at 2.8 GHz. The accelerator performs 4.5 times faster than the fastest processor/library pair.

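
The division of work described in the last abstract above (an accelerator that returns each result block as soon as it is computed, and a host that accumulates those blocks) can be illustrated with a minimal software sketch. This is not the authors' implementation: the function names, the square-block layout and the operation-counting model in accelerator_share are all assumptions made here for illustration, and block_product is an ordinary Python function standing in for the hardware.

```python
def block_product(a_blk, b_blk):
    """Hypothetical stand-in for the accelerator: multiply two b x b blocks."""
    n = len(a_blk)
    return [[sum(a_blk[i][k] * b_blk[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)]


def get_block(m, bi, bj, b):
    """Extract the b x b block at block-row bi, block-column bj of matrix m."""
    return [row[bj * b:(bj + 1) * b] for row in m[bi * b:(bi + 1) * b]]


def blocked_matmul(a, bm, b):
    """Multiply square matrices a and bm using b x b blocks.

    Each partial block product is handed back immediately and summed
    into the result by the host loop, so the "accelerator" never has
    to buffer output blocks.
    """
    n = len(a)
    assert n % b == 0, "matrix size must be a multiple of the block size"
    nb = n // b
    c = [[0.0] * n for _ in range(n)]
    for bi in range(nb):
        for bj in range(nb):
            for bk in range(nb):
                partial = block_product(get_block(a, bi, bk, b),
                                        get_block(bm, bk, bj, b))
                # Host-side accumulation: b*b additions per returned block.
                for i in range(b):
                    for j in range(b):
                        c[bi * b + i][bj * b + j] += partial[i][j]
    return c


def accelerator_share(b):
    """Fraction of arithmetic done by the accelerator per block product.

    Assumed counting model: the accelerator performs b^3 multiplications
    and b^2 * (b - 1) additions; the host performs b^2 additions.
    """
    accel_ops = 2 * b**3 - b**2
    host_ops = b**2
    return accel_ops / (accel_ops + host_ops)
```

Under this counting model the accelerator's share simplifies to 1 - 1/(2b), so it exceeds 99% once the block size b is larger than 50; for example, accelerator_share(64) is 1 - 1/128, roughly 99.2%. The block size used in the paper is not stated in the abstract, so these values are illustrative only.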

Aims & Scope

IET Computers & Digital Techniques publishes technical papers describing recent research and development work in all aspects of digital system-on-chip design and test of electronic and embedded systems.


IET Research Journals
iet_cdt@theiet.org