2014 International Conference on Field-Programmable Technology (FPT)

10-12 Dec. 2014

  • [Front matter]

    Publication Year: 2014, Page(s):1 - 6
  • Contents

    Publication Year: 2014, Page(s):1 - 6
  • Keynote lectures

    Publication Year: 2014, Page(s): 1

    Provides an abstract for each of the keynote presentations and a brief professional biography of each presenter. The complete presentations were not made available for publication as part of the conference proceedings.
  • Logic emulation in the megaLUT era — Moore's Law beats Rent's Rule

    Publication Year: 2014, Page(s): 1
  • Automating customized computing

    Publication Year: 2014, Page(s): 2
  • Doing FPGA in a former software company

    Publication Year: 2014, Page(s): 3
  • 1.1 Tools & design productivity

    Publication Year: 2014, Page(s): 1
  • Design re-use for compile time reduction in FPGA high-level synthesis flows

    Publication Year: 2014, Page(s):4 - 11
    Cited by:  Papers (1)

    High-level synthesis (HLS) raises the level of abstraction for hardware design through the use of software methodologies. An impediment to productivity in HLS flows, however, is the run-time of the back-end toolflow - synthesis, packing, placement and routing - which can take hours or days for the largest designs. We propose a new back-end flow for HLS that makes use of pre-synthesized and placed ...
  • Is high level synthesis ready for business? A computational finance case study

    Publication Year: 2014, Page(s):12 - 19
    Cited by:  Papers (1)

    High Level Synthesis (HLS) tools for Field Programmable Gate Arrays (FPGAs) have made considerable progress, and are now sufficiently mature that a novice developer could create a functionally correct implementation with limited understanding of the target hardware. In this case study, a novice developer considers a benchmark of financial problems for implementation upon FPGA via HLS. This novice st...
  • Comparing performance, productivity and scalability of the TILT overlay processor to OpenCL HLS

    Publication Year: 2014, Page(s):20 - 27
    Cited by:  Papers (4)

    High-Level-Synthesis (HLS) tools translate a software description of an application into custom FPGA logic, increasing designer productivity vs. Hardware Description Language (HDL) design flows. Overlays seek to further improve productivity by reducing application compile times and raising abstraction by enabling the designer to target a software-programmable substrate instead of the underlying FP...
  • Size aware placement for island style FPGAs

    Publication Year: 2014, Page(s):28 - 35

    In this paper we first examine the impact of FPGA size on overall performance and run-time of placement and routing in the context of cluster-based island-style FPGAs. Based on the observations, an FPGA placement algorithm, Min-Size, is introduced to alleviate the deterioration of performance and run-time of placement and routing when using a large FPGA to implement a circuit. We achieve this by a...
  • Analyzing the impact of heterogeneous blocks on FPGA placement quality

    Publication Year: 2014, Page(s):36 - 43
    Cited by:  Papers (1)

    In this paper we propose a quantitative approach to analyze the impact of heterogeneous blocks (H-blocks) on FPGA placement quality. The basic idea is to construct synthetic heterogeneous placement benchmarks with known optimal wirelength to facilitate the quantitative analysis. To the best of our knowledge, this is the first work that enables the construction of wirelength-optimal heterogene...
  • 1.2 Financial applications

    Publication Year: 2014, Page(s): 1
  • Low-latency option pricing using systolic binomial trees

    Publication Year: 2014, Page(s):44 - 51

    This paper presents a novel reconfigurable hardware accelerator for the pricing of American options using the binomial-tree model. The proposed architecture exploits both pipeline and coarse-grain parallelism in a highly efficient and scalable systolic solution, designed to exploit the large numbers of DSP blocks in modern architectures. The architecture can be tuned at compile-time to match user ...
  • Collaborative processing of Least-Square Monte Carlo for American options

    Publication Year: 2014, Page(s):52 - 59

    American options are widely traded in the financial market, so pricing those options becomes crucial in practice. In reality, many popular pricing models do not have analytical solutions. Hence techniques such as Monte Carlo are often used in practice. This paper presents a CPU-FPGA collaborative accelerator using the state-of-the-art Least-Square Monte Carlo method for pricing American options. W...
  • Accelerating transfer entropy computation

    Publication Year: 2014, Page(s):60 - 67
    Cited by:  Papers (2)

    Transfer entropy is a measure of information transfer between two time series. It is an asymmetric measure based on entropy change which only takes into account the statistical dependency originating in the source series, but excludes dependency on a common external factor. Transfer entropy is able to capture system dynamics that traditional measures cannot, and has been successfully applied to va...
  • FPGA-accelerated Monte-Carlo integration using stratified sampling and Brownian bridges

    Publication Year: 2014, Page(s):68 - 75
    Cited by:  Papers (1)

    Monte-Carlo Integration (MCI) is a numerical technique for evaluating integrals which have no closed form solution. Naive MCI randomly samples the integrand at uniformly distributed points. This naive approach converges very slowly. Stratified sampling can be used to concentrate the samples on segments of the integration domain where the integrand has the highest variance. Even with stratified sam...
  • 1.3 Architecture & runtime systems

    Publication Year: 2014, Page(s): 1
  • Time sharing of Runtime Coarse-Grain Reconfigurable Architectures processing elements in multi-process systems

    Publication Year: 2014, Page(s):76 - 82

    This paper presents a method to time share the Processing Elements (PEs) of Runtime Coarse Grain Reconfigurable Architectures (CGRA) among multiple processes being executed concurrently on the same CGRA. Runtime CGRA architectures time-multiplex the data path, creating a set of contexts for each state. These contexts configure the PEs and the routing resources of the CGRA and are typically loade...
  • Architectural synthesis of computational pipelines with decoupled memory access

    Publication Year: 2014, Page(s):83 - 90
    Cited by:  Papers (2)

    As high level synthesis (HLS) moves towards mainstream adoption among FPGA designers, it has proven to be an effective method for rapid hardware generation. However, in the context of offloading compute intensive software kernels to FPGA accelerators, current HLS tools do not always take full advantage of the hardware platforms. In this paper, we present an automatic flow to refactor and restructu...
  • Improve memory access for achieving both performance and energy efficiencies on heterogeneous systems

    Publication Year: 2014, Page(s):91 - 98
    Cited by:  Papers (1)

    Hardware accelerators are capable of achieving significant performance improvement for many applications. In this work we demonstrate that it is critical to provide sufficient memory access bandwidth for accelerators to improve the performance and reduce energy consumption. We use the scale-invariant feature transform (SIFT) algorithm as a case study in which three bottleneck stages are accelerate...
  • Approaching overhead-free execution on FPGA soft-processors

    Publication Year: 2014, Page(s):99 - 106

    Implementing systems on FPGA soft-processors, rather than as custom hardware, eases and accelerates the development process, but at the cost of a great reduction in performance. Orthogonal to limitations in parallelism or clock frequency, this reduction in performance primarily originates in the intrinsic addressing and flow-control overheads of scalar microprocessors, which expend a considerable ...
  • 2.1 Mathematical circuits

    Publication Year: 2014, Page(s): 1
  • Low-latency double-precision floating-point division for FPGAs

    Publication Year: 2014, Page(s):107 - 114
    Cited by:  Papers (1)

    With growing FPGA capacities, applications requiring more intensive use of floating-point arithmetic become feasible candidates for acceleration using reconfigurable logic. Still among the more uncommon operations, however, are fast double-precision divider units. Since our application domain (acceleration of custom-compiled convex solvers) heavily relies on these blocks, we have implemented low-l...
  • Efficient FPGA implementation of digit parallel online arithmetic operators

    Publication Year: 2014, Page(s):115 - 122

    Online arithmetic has been widely studied for ASIC implementation. Online components were originally designed to perform computations in digit serial with most significant digit (MSD) first, resulting in the ability to chain arithmetic operators together for low latency. More recently, research has shown that digit parallel online operators can fail more gracefully when operating beyond the determ...