• ### A Multi-Resolution FPGA-Based Architecture for Real-Time Edge and Corner Detection

Publication Year: 2014, Page(s):2376 - 2388
This work presents a new flexible parameterizable architecture for image and video processing with reduced latency and memory requirements, supporting a variable input resolution. The proposed architecture is optimized for feature detection, more specifically, the Canny edge detector and the Harris corner detector. The architecture contains neighborhood extractors and threshold operators that can ...

• ### Copula Models of Correlation: A DRAM Case Study

Publication Year: 2014, Page(s):2389 - 2401
Variable bit retention time observed in a 65-nm dynamic random access memory (DRAM) case study will cause miscorrelation between retention times occurring in Test and Use. Conventional multivariate normal statistics cannot adequately model this miscorrelation. A more general copula-based modeling approach, widely used in financial and actuarial modeling, solves this problem. The DRAM case study sh...

• ### Efficient Software Partial Packet Recovery in 802.11 Wireless LANs

Publication Year: 2014, Page(s):2402 - 2415
In 802.11 wireless LANs, partial packets are often received which usually contain only a few errors. According to the current 802.11 standard, such packets have to be retransmitted. Much effort has been invested recently in repairing such packets without retransmitting the entire packet, e.g., by using error correction (EC) code or retransmitting only the corrupted blocks. In this paper, we study ...

• ### Endurance-Aware Flash-Cache Management for Storage Servers

Publication Year: 2014, Page(s):2416 - 2430
As flash memory emerges as a high-performance and energy-efficient alternative for storage devices, how to accommodate disk-based storage servers with a flash-memory cache might provide a promising solution to resolve the energy-efficiency concerns of storage servers and their data centers. In this work, we propose a cache design method over flash memory without any additional hardware support for...

• ### Extremely Low Cost Error Protection with Correctable Parity Protected Cache

Publication Year: 2014, Page(s):2431 - 2444
Due to shrinking feature sizes, processors are becoming more vulnerable to soft errors. One of the most vulnerable components of a processor is its write-back cache. This paper proposes a new reliable write-back cache called Correctable Parity Protected Cache (CPPC), which adds correction capability to parity protection. In CPPC, parity bits detect faults and the XOR of all data written into the c...

• ### Improving Performance and Capacity of Flash Storage Devices by Exploiting Heterogeneity of MLC Flash Memory

Publication Year: 2014, Page(s):2445 - 2458
The multi-level cell (MLC) NAND flash memory technology enables multiple bits of information to be stored in a memory cell, thus making it possible to increase the density of flash memory without increasing the die size. In MLC NAND flash memory, each memory cell can be programmed as a single-level cell or a multi-level cell at runtime because of its performance/capacity asymmetric programming pro...

• ### Improving Space Efficiency With Path Length Prediction for Finding $k$ Shortest Simple Paths

Publication Year: 2014, Page(s):2459 - 2472
Finding $k$ shortest simple paths in a directed graph is a fundamental problem in many engineering applications. Most existing algorithms such as Yen's algorithm and its variants have polynomial worst-case time complexity, but their average-case running time is very high. The heuristic algorithm MPS can run significantly faster in practice. However, it requires an excessive amount of memory spac...
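
To make the problem statement concrete, the following is a minimal sketch of what "the $k$ shortest simple paths" means. It is neither Yen's algorithm nor MPS, and it is not the method of the paper: it is a plain best-first enumeration that assumes non-negative edge weights and can take exponential time and memory on large graphs. The function name `k_shortest_simple_paths` and the adjacency-list layout are illustrative assumptions.

```python
import heapq


def k_shortest_simple_paths(graph, source, target, k):
    """Enumerate up to k loop-free paths from source to target by increasing cost.

    graph maps each node to a list of (neighbor, weight) pairs; weights are
    assumed to be non-negative. Illustrative baseline only.
    """
    heap = [(0, [source])]  # (cost so far, partial path)
    found = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            # With non-negative weights, complete paths are popped in
            # non-decreasing order of cost.
            found.append((cost, path))
            continue
        for neighbor, weight in graph.get(node, []):
            if neighbor not in path:  # keep the path simple (no repeated node)
                heapq.heappush(heap, (cost + weight, path + [neighbor]))
    return found


# Hypothetical example graph.
example_graph = {
    "A": [("B", 1), ("C", 3)],
    "B": [("C", 1), ("D", 4)],
    "C": [("D", 1)],
}
paths = k_shortest_simple_paths(example_graph, "A", "D", 2)
```

Because the heap can hold a very large number of partial paths, this baseline also illustrates why memory consumption is a practical concern for path-enumeration methods.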

• ### Interconnection Networks of Degree Three Obtained by Pruning Two-Dimensional Tori

Publication Year: 2014, Page(s):2473 - 2486
We study an interconnection network that we call 3Torus(m,n) obtained by pruning the 4m × 4n torus (of links) so that the resulting network is regular of degree 3. We show that 3Torus(m,n) retains many of the useful properties of tori (although, of course, there is a price to be paid due to the reduction in links). In particular, we show that 3Torus(m,n) is node-symmetric; we establish close...

• ### NO2: Speeding up Parallel Processing of Massive Compute-Intensive Tasks

Publication Year: 2014, Page(s):2487 - 2499
Large-scale computing frameworks, either tenanted on the cloud or deployed in the high-end local cluster, have become an indispensable software infrastructure to support numerous enterprise and scientific applications. Tasks executed on these frameworks are generally classified into data-intensive and compute-intensive ones. However, most existing frameworks, led by MapReduce, are mainly suitable ...

• ### OFWAR: Reducing SSD Response Time Using On-Demand Fast-Write-and-Rewrite

Publication Year: 2014, Page(s):2500 - 2512
This paper presents a cross-layer design strategy to reduce SSD response time and its variation. The key is to cohesively exploit system-level run-time data access workload variation and temporal locality and device-level NAND flash memory write latency versus data retention time trade-off. The basic idea is simple: once write intensity of the workload increases and begins to degrade SSD response ...

• ### On the Systematic Creation of Faithfully Rounded Truncated Multipliers and Arrays

Publication Year: 2014, Page(s):2513 - 2525
Often, when performing fixed-point multiplication, it is sufficient to return a faithfully rounded result, i.e., the machine representable number either immediately above or below the arbitrary precision result, if the latter is not exactly representable. Compared to correctly rounded multipliers, i.e., those returning the nearest machine representable number, faithfully rounded multipliers use co...
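
The distinction between faithful and correct rounding can be checked with exact rational arithmetic. The sketch below is illustrative only and does not model the truncated multiplier arrays the paper constructs; the helper names and the choice of an unsigned format with `frac_bits` fractional bits are assumptions.

```python
import math
from fractions import Fraction


def neighbours(a, b, frac_bits):
    """Return the representable values immediately below and above a*b."""
    scale = 1 << frac_bits
    exact = a * b * scale  # exact product measured in units of the last place
    return Fraction(math.floor(exact), scale), Fraction(math.ceil(exact), scale)


def is_faithful(result, a, b, frac_bits):
    """A result is faithful if it is one of the two neighbours of a*b."""
    return result in neighbours(a, b, frac_bits)


def correctly_rounded(a, b, frac_bits):
    """Nearest representable value to a*b (ties rounded up, by assumption)."""
    scale = 1 << frac_bits
    exact = a * b * scale
    return Fraction(math.floor(exact + Fraction(1, 2)), scale)


# 3/8 * 5/8 = 15/64, which lies between 1/8 and 2/8 when 3 fractional bits are kept.
a, b = Fraction(3, 8), Fraction(5, 8)
below, above = neighbours(a, b, 3)
```

Here both `below` and `above` are faithful results, but only `above` is the correctly rounded one. When the product is exactly representable, the two neighbours coincide and the faithful result is unique.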

• ### Path-Dividing Based Scheduling Algorithm for Reducing Energy Consumption of Clustered VLIW Architectures

Publication Year: 2014, Page(s):2526 - 2539
This paper presents an instruction scheduling algorithm for clustered very long instruction word (VLIW) architectures. It exploits a path-dividing-based technique to decide a more appropriate processing order of instructions, and utilizes a more global view to generate the scheduling result by simultaneously considering the influence of both data dependence relations between instructions and dist...

• ### Reliability Evaluation of BC Networks in Terms of the Extra Vertex- and Edge-Connectivity

Publication Year: 2014, Page(s):2540 - 2548
Reliability evaluation of interconnection networks is important to the design and maintenance of multiprocessor systems. The extra connectivity and the extra edge-connectivity are two important parameters for the reliability evaluation of interconnection networks. The n-dimensional bijective connection network (in brief, BC network) includes several well-known network models, such as hypercubes, M...

• ### Self-Adaptive Context Data Management in Large-Scale Mobile Systems

Publication Year: 2014, Page(s):2549 - 2562
Context awareness, intended as providing the current execution environment at the service level, is a fundamental capability in future mobile systems. Unfortunately, the real-world realization of such scenarios is currently undermined by inefficient context data delivery mechanisms, which introduce excessive overhead over bandwidth-constrained wireless fixed infrastructures. To efficiently offload...

• ### Symbolic Analysis of Programmable Logic Controllers

Publication Year: 2014, Page(s):2563 - 2575
Programmable Logic Controllers (PLCs) are widely used in industry. The reliability of the PLC is vital to many critical applications. This paper presents a novel approach to the symbolic analysis of PLC systems. The approach includes (1) calculating the uncertainty characterization of the PLC system, (2) abstracting the PLC system as a Hidden Markov Model, (3) solving the Hidden Markov Model with ...

• ### 3D-ICE: A Compact Thermal Model for Early-Stage Design of Liquid-Cooled ICs

Publication Year: 2014, Page(s):2576 - 2589
Liquid-cooling using microchannel heat sinks etched on silicon dies is seen as a promising solution to the rising heat fluxes in two-dimensional and stacked three-dimensional integrated circuits. Development of such devices requires accurate and fast thermal simulators suitable for early-stage design. To this end, we present 3D-ICE, a compact transient thermal model (CTTM), for liquid-cooled ICs. ...

• ### Task Scheduling on Adaptive Multi-Core

Publication Year: 2014, Page(s):2590 - 2603
Multi-cores have become ubiquitous in both the general-purpose computing and the embedded domains. Current technology trends show that the number of on-chip cores is rapidly increasing, while their complexity is decreasing due to power and thermal constraints. An increasing number of simple cores enables parallel applications to benefit from abundant thread-level parallelism (TLP), while sequential fr...

• ### Toward Formal Design of Practical Cryptographic Hardware Based on Galois Field Arithmetic

Publication Year: 2014, Page(s):2604 - 2613
This paper presents a formal method for designing cryptographic processor datapaths on the basis of arithmetic circuits over Galois fields (GFs). The proposed method describes GF arithmetic circuits in the form of hierarchical graph structures, where nodes represent sub-circuits whose functions are defined by arithmetic formulae over GFs, and edges represent data dependency between nodes. In this ...

• ### A New Double Point Multiplication Algorithm and Its Application to Binary Elliptic Curves with Endomorphisms

Publication Year: 2014, Page(s):2614 - 2619
We present a new double point multiplication algorithm based on differential addition chains. Our proposed scheme has a uniform structure and has some degree of built-in resistance against side channel analysis attacks. We discuss deploying our scheme in a hardware implementation of single point multiplication on binary elliptic curves with efficiently computable endomorphisms. Based on operation ...

• ### Design of Goldschmidt Dividers with Quantum-Dot Cellular Automata

Publication Year: 2014, Page(s):2620 - 2625
A Goldschmidt divider implemented with semiconductor quantum-dot cellular automata (QCA) is described. Most Goldschmidt dividers use a state machine for control, but state machines are difficult to implement in QCAs due to the long delays between the state machines and the computational circuits to be controlled. To resolve this problem, a data tag method is used. The data tags travel with the dat...
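
For readers unfamiliar with the underlying arithmetic, the Goldschmidt iteration multiplies numerator and denominator by the same factor until the denominator converges to 1, at which point the numerator holds the quotient. The sketch below shows only this textbook iteration in floating point; it does not model the QCA implementation or the data tag control scheme described in the paper. The function name and the default iteration count are assumptions.

```python
def goldschmidt_divide(numerator, denominator, iterations=6):
    """Approximate numerator / denominator with the Goldschmidt iteration.

    Requires 0 < denominator < 2 so that the denominator converges to 1;
    hardware dividers typically normalize the divisor into [0.5, 1) first.
    """
    if not 0 < denominator < 2:
        raise ValueError("denominator must be pre-scaled into (0, 2)")
    n, d = float(numerator), float(denominator)
    for _ in range(iterations):
        factor = 2.0 - d  # with d = 1 - e, the factor is 1 + e
        n *= factor       # the quotient n / d is unchanged by the scaling
        d *= factor       # d becomes 1 - e**2, so the error squares each step
    return n


quotient = goldschmidt_divide(0.6, 0.75)  # approximates 0.6 / 0.75
```

Each step uses two independent multiplications, which is why the method suits pipelined multiplier hardware.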

• ### Improved Miller’s Algorithm for Computing Pairings on Edwards Curves

Publication Year: 2014, Page(s):2626 - 2632
Since Edwards curves were introduced to elliptic curve cryptography by Bernstein and Lange in 2007, they have received a lot of attention due to their very fast group law operation. Pairing computation on such curves is slightly slower than on Weierstrass curves. However, in some pairing-based cryptosystems, they might require a number of scalar multiplications which is a time-consuming operation an...

