# IEEE Transactions on Computers

• ### Accurate Parallel Floating-Point Accumulation

Publication Year: 2016, Page(s):3224 - 3238
Cited by:  Papers (1)

Using parallel associative reduction, iterative refinement, and conservative early termination detection, we show how to use tree-reduce parallelism to compute correctly rounded floating-point sums in $O(\log N)$ depth. Our ...
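As a rough illustration of the tree-reduce idea only (not the paper's algorithm, whose details are cut off in the abstract above), the following sketch sums values pairwise and carries the rounding error of each addition alongside the partial sum, using Knuth's error-free `two_sum` transformation. The function names are hypothetical. The error terms are themselves added in ordinary floating point, so the result is a compensated sum and is not guaranteed to be correctly rounded in general.

```python
import math


def two_sum(a, b):
    """Error-free transformation: returns (s, e) where s is the rounded
    sum of a and b, and a + b == s + e holds exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def tree_reduce_sum(values):
    """Pairwise (tree-shaped) reduction. Each node carries a partial sum
    and an accumulated error term. Illustrative sketch only: the error
    terms are combined with plain floating-point addition."""
    nodes = [(float(v), 0.0) for v in values]
    if not nodes:
        return 0.0
    while len(nodes) > 1:
        next_level = []
        # Combine neighbours; every pair at one level is independent,
        # which is what makes the reduction parallelisable.
        for i in range(0, len(nodes) - 1, 2):
            (s1, e1), (s2, e2) = nodes[i], nodes[i + 1]
            s, e = two_sum(s1, s2)
            next_level.append((s, e + e1 + e2))
        if len(nodes) % 2:
            next_level.append(nodes[-1])  # odd element moves up unchanged
        nodes = next_level
    s, e = nodes[0]
    return s + e


values = [1e16, 1.0, -1e16, 1.0]
naive = sum(values)                 # left-to-right summation loses one of the 1.0 terms
compensated = tree_reduce_sum(values)
reference = math.fsum(values)       # standard-library exactly rounded sum
```

With N inputs the loop runs about $\log_2 N$ levels, which corresponds to the $O(\log N)$ depth mentioned in the abstract when each level is evaluated in parallel.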

• ### A Matrix Decomposition Method for Optimal Normal Basis Multiplication

Publication Year: 2016, Page(s):3239 - 3250

We introduce a matrix decomposition method and prove that multiplication in GF$(2^k)$ with a Type 1 optimal normal basis for can be performed using $k^2-1$

• ### A Partial Carry-Save On-the-Fly Correction Multispeculative Multiplier

Publication Year: 2016, Page(s):3251 - 3264
Cited by:  Papers (1)

Functional Units that are designed to receive inputs and produce outputs using a non-redundant format typically exhibit an inferior performance. In order to overcome this limitation, the carry-save and partial carry-save formats have been proposed. Both approaches are very suitable when implementing addition trees. Nevertheless, if there are multiplications in the datapath, the inputs to the multi...

• ### A Resilient Routing Algorithm with Formal Reliability Analysis for Partially Connected 3D-NoCs

Publication Year: 2016, Page(s):3265 - 3279
Cited by:  Papers (6)

3D ICs can take advantage of a scalable communication platform, commonly referred to as the Networks-on-Chip (NoC). In the basic form of 3D-NoC, all routers are vertically connected. Partially connected 3D-NoC has emerged because of physical limitations of using vertical links. Routing is of great importance in such partially connected architectures. A high-performance, fault-tolerant and adaptive...

• ### Bio-Inspired Load-Balancing Framework for Loosely Coupled Heterogeneous Server Systems

Publication Year: 2016, Page(s):3280 - 3292

Balancing load among servers is an important research challenge for a large-scale loosely coupled heterogeneous server system (LCHSS), to improve both the total throughput of the system and the quality of service experienced by clients. In practical terms, a load-balancing method for an LCHSS has to drive servers to underloaded states without unnecessary load migrations among servers. To tackle t...

• ### Compressed Signal Processing on Nyquist-Sampled Signals

Publication Year: 2016, Page(s):3293 - 3303
Cited by:  Papers (1)

Pattern-recognition algorithms from the domain of machine learning play a prominent role in embedded sensing systems, in order to derive inferences from sensor data. Very often, such systems face severe energy constraints. The focus of this work is to mitigate the computational energy by exploiting a form of compression which preserves a similarity metric widely used for pattern recognition. The f...

• ### Dynamic Resource Allocation for MapReduce with Partitioning Skew

Publication Year: 2016, Page(s):3304 - 3317
Cited by:  Papers (1)

MapReduce has become a prevalent programming model for building data processing applications in the cloud. While being widely used, existing MapReduce schedulers still suffer from an issue known as partitioning skew, where the output of map tasks is unevenly distributed among reduce tasks. Existing solutions follow a similar principle that repartitions workload among reduce tasks. However, those a...
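To make the notion of partitioning skew concrete, here is a minimal sketch that assigns map-output keys to reducers with a plain hash partitioner and counts the records each reducer receives. It illustrates the problem only, not the paper's solution (the abstract is truncated before that point). The function names, the key distribution, and the use of CRC32 as the hash are all assumptions made for the example.

```python
import zlib


def partition(key, num_reducers):
    """Hash partitioner: maps a key to a reducer index.
    CRC32 is used (an assumption) so results are stable across runs."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer_loads(keys, num_reducers):
    """Counts how many map-output records each reducer would receive."""
    loads = [0] * num_reducers
    for key in keys:
        loads[partition(key, num_reducers)] += 1
    return loads


# Hypothetical map output: one "hot" key dominates the key distribution.
keys = ["hot"] * 90 + [f"key{i}" for i in range(10)]
loads = reducer_loads(keys, num_reducers=4)
skew = max(loads) / (sum(loads) / len(loads))  # ratio of heaviest load to mean load
```

Because every record with the same key must go to the same reducer, the reducer that owns the hot key receives at least 90 of the 100 records regardless of the hash function, so it finishes long after the others.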

• ### ELmD: A Pipelineable Authenticated Encryption and Its Hardware Implementation

Publication Year: 2016, Page(s):3318 - 3331
Cited by:  Papers (1)

Authenticated encryption schemes which resist misuse of nonce at some desired level of privacy are two-pass or Mac-then-Encrypt constructions (inherently inefficient but provide full privacy) and online constructions like McOE, sponge-type authenticated encryptions (such as duplex) and COPA. Only the last one is almost parallelizable except that for associated data processing, the final block-ciph...

• ### Hardware-Based Malware Detection Using Low-Level Architectural Features

Publication Year: 2016, Page(s):3332 - 3344
Cited by:  Papers (3)

Security exploits and ensuant malware pose an increasing challenge to computing systems as the variety and complexity of attacks continue to increase. In response, software-based malware detection tools have grown in complexity, thus making it computationally difficult to use them to protect systems in real-time. Therefore, software detectors are applied selectively and at a low frequency, creatin...

• ### Improving Bit Flip Reduction for Biased and Random Data

Publication Year: 2016, Page(s):3345 - 3356
Cited by:  Papers (3)

Nonvolatile memory technologies such as Spin-Transfer Torque Random Access Memory (STT-RAM) and Phase Change Memory (PCM) are emerging as promising replacements for DRAM. Before deploying STT-RAM and PCM into functional systems, a number of remaining challenges must be addressed. Specifically, both require relatively high write energy, STT-RAM suffers from high bit error rates and PCM suffers fr...

• ### Multicore-Aware Virtual Machine Placement in Cloud Data Centers

Publication Year: 2016, Page(s):3357 - 3369
Cited by:  Papers (3)

Finding the best way to map virtual machines (VMs) to physical machines (PMs) in a cloud data center is an important optimization problem, with significant impact on costs, performance, and energy consumption. In most situations, the computational capacity of PMs and the computational load of VMs are a vital aspect to consider in the VM-to-PM mapping. Previous work modeled computational capacity a...

• ### Parallel Algorithms for Generating Harmonised State Identifiers and Characterising Sets

Publication Year: 2016, Page(s):3370 - 3383

Many automated finite state machine (FSM) based test generation algorithms require that a characterising set or a set of harmonised state identifiers is first produced. The only previously published algorithms for partial FSMs were brute-force algorithms with exponential worst case time complexity. This paper presents polynomial time algorithms and also massively parallel implementations of both t...

• ### Reducing the Memory Bandwidth Overheads of Hardware Security Support for Multi-Core Processors

Publication Year: 2016, Page(s):3384 - 3397

To prevent physical attacks on systems, secure processors have been proposed to reduce the trusted computing base to the processor itself. In a secure processor, all off-chip data are encrypted and their integrity is protected. This paper investigates how the limited memory bandwidth of multi-core processors affects the design of secure processors. Although the performance of a single-core secure proc...

• ### Scalable Power Management for On-Chip Systems with Malleable Applications

Publication Year: 2016, Page(s):3398 - 3412

We present a scalable Dynamic Power Management (DPM) scheme where malleable applications may change their degree of parallelism at run time depending upon the workload and performance constraints. We employ a per-application predictive power manager that autonomously controls the power states of the cores with the goal of energy efficiency. Furthermore, our DPM allows the applications to lend thei...

• ### Secure and Private RFID-Enabled Third-Party Supply Chain Systems

Publication Year: 2016, Page(s):3413 - 3426
Cited by:  Papers (1)

Radio Frequency Identification (RFID) is a key emerging technology for supply chain systems. By attaching RFID tags to various products, product-related data can be efficiently indexed, retrieved and shared among multiple participants involved in an RFID-enabled supply chain. The flexible data access property, however, raises security and privacy concerns. In this paper, we target at security and ...

• ### Statistical Cache Bypassing for Non-Volatile Memory

Publication Year: 2016, Page(s):3427 - 3440
Cited by:  Papers (3)

With the increasing data throughput requirement, non-volatile memories, such as STT-RAM, PCM and RRAM, have become very competitive designs as on-chip caches in chip-multi-processors (CMPs). Since the write operations are more expensive in an asymmetric-access cache, it is more valuable to justify the data allocation. However, the asymmetric-access property of non-volatile memory is not well addre...

• ### Task Mapping for Redundant Multithreading in Multi-Cores with Reliability and Performance Heterogeneity

Publication Year: 2016, Page(s):3441 - 3455
Cited by:  Papers (2)

Due to the architectural design, process variations and aging, individual cores in many-core systems exhibit heterogeneous performance. In many-core systems, a commonly adopted soft error mitigation technique is Redundant Multithreading (RMT) that achieves error detection and recovery through redundant thread execution on different cores for an application. However, task mapping a...

• ### TransMap: Transformation Based Remapping and Parallelism for High Utilization and Energy Efficiency in CGRAs

Publication Year: 2016, Page(s):3456 - 3469

In the era of platforms hosting multiple applications with arbitrary inter-application communication and computation patterns, compile-time mapping decisions are neither optimal nor desirable. As a solution to this problem, recently proposed architectures offer run-time remapping. The run-time remapping techniques displace or parallelize/serialize an application to optimize different parameters (e...

• ### Versatile Direct and Transpose Matrix Multiplication with Chained Operations: An Optimized Architecture Using Circulant Matrices

Publication Year: 2016, Page(s):3470 - 3479

With growing demands in real-time control, classification or prediction, algorithms become more complex while low power and small size devices are required. Matrix multiplication (direct or transpose) is common for such computation algorithms. In numerous algorithms, it is also required to perform matrix multiplication repeatedly, where the result of a multiplication is further multiplied again. T...

• ### Workload Adaptive Shared Memory Management for High Performance Network I/O in Virtualized Cloud

Publication Year: 2016, Page(s):3480 - 3494

This paper presents the design and implementation of MemPipe, a dynamic shared memory management system for high performance network I/O among virtual machines (VMs) located on the same host. MemPipe delivers efficient inter-VM communication with three unique features. First, MemPipe employs an inter-VM shared memory pipe to enable high throughput data delivery for both TCP and UDP workloads among...

• ### Binary-Ternary Plus-Minus Modular Inversion in RNS

Publication Year: 2016, Page(s):3495 - 3501
Cited by:  Papers (1)

A fast RNS modular inversion for finite field arithmetic was published at the CHES 2013 conference. It is based on the binary version of the plus-minus Euclidean algorithm. In the context of elliptic curve cryptography (i.e., 160-550-bit finite fields), it significantly speeds up modular inversions. In this paper, we propose an improved version based on both radix 2 and radix 3. This new algori...

• ### Health Status Assessment and Failure Prediction for Hard Drives with Recurrent Neural Networks

Publication Year: 2016, Page(s):3502 - 3508
Cited by:  Papers (5)

Recently, in order to improve reactive fault tolerance techniques in large scale storage systems, researchers have proposed various statistical and machine learning methods based on SMART attributes. Most of these studies have focused on predicting failures of hard drives, i.e., labeling the status of a hard drive as “good” or not. However, in real-world storage systems, hard drives ...

## Aims & Scope

The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field.

## Meet Our Editors

Editor-in-Chief
Paolo Montuschi
Politecnico di Torino
Dipartimento di Automatica e Informatica
Corso Duca degli Abruzzi 24
10129 Torino - Italy
e-mail: pmo@computer.org