2010 VI Southern Programmable Logic Conference (SPL)

Date 24-26 March 2010

  • [Front and back cover]

    Publication Year: 2010 , Page(s): c1 - c4
    Freely Available from IEEE
  • [Title page]

    Publication Year: 2010 , Page(s): i - viii
    Freely Available from IEEE
  • Table of contents

    Publication Year: 2010 , Page(s): ix - xiii
    Freely Available from IEEE
  • Reconfigurable Computing: boosting software education for the multicore era: Why we need to reinvent computing

    Publication Year: 2010 , Page(s): 1
    Cited by:  Papers (2)

    Summary form only given. This talk discusses important technical-educational issues that, if addressed effectively and in time, could forestall a future crisis caused by the rapidly growing electricity consumption of all computers, whether directly visible or embedded in devices, appliances, machines, facilities, and other computer-based cyber infrastructures. For the internet alone, an increase by a factor of 30 by the year 2030 has been predicted [107] “if the trend continues”. This would mean a much higher electricity consumption than that of the entire world today. This trend is unaffordable. The climate protection community, looking only at climate issues, completely ignores these dramatic electricity consumption predictions. However, we need these cyber infrastructures to avoid a breakdown of the world economy. Only Reconfigurable Computing can prevent the running of these infrastructures from becoming unaffordable in the future. This is very urgent, and we have to complete our rescue actions much earlier than 2030. To solve this problem we need an extensive software-to-configware migration campaign. However, the programmer population qualified for such a migration does not exist. Since twin-paradigm hetero systems have to be programmed, we must reinvent computing. We need an initiative at least as far-reaching as the VLSI design revolution initiated in the early 1980s by Carver Mead and Lynn Conway. At that time, the needed designer population did not exist either. Under massive funding, that initiative implemented the microelectronics design revolution, solved the design crisis, and was the incubator of new industries. Professors back to school!

  • Session 1 - embedded processors and IP cores

    Publication Year: 2010 , Page(s): 1 - 2
    Freely Available from IEEE
  • The supersmall soft processor

    Publication Year: 2010 , Page(s): 3 - 8
    Cited by:  Papers (3)

    Soft processors have become an increasingly common component of systems that use Field-Programmable Gate Arrays (FPGAs), and are used to implement a wide variety of control and data processing functionality. Often, some additional functionality needs to be added to a system when there is very little space left on the physical device. This functionality may not be performance critical, and so could be implemented on a slow soft processor. For this reason it may be useful to have a processor that is as small as possible yet similar to other commonly used processors. This paper describes the design, implementation and release of a 32-bit soft processor based on the MIPS-I instruction set and optimized for minimal use of FPGA resources. The 'supersmall' soft processor is as much as 2.2 times smaller than Altera's Nios II/e (the smallest of their 3 processors) yet only a factor of 10 slower.

  • LIBOR market model simulation on an FPGA parallel machine

    Publication Year: 2010 , Page(s): 9 - 14

    In this paper, we present a high-performance, scalable FPGA design and implementation of an interest rate derivative pricing engine that targets cap pricing. The design consists of a Gaussian random number generator, based on the Mersenne Twister uniform random generator, and a Monte Carlo path generation engine which calculates the prices of an interest rate derivative based on the LIBOR market model. We implemented this design on the Maxwell FPGA supercomputer using up to 32 Xilinx XC4VFX100 FPGA nodes. We have also compared our FPGA hardware implementation with an equivalent optimized pure software implementation running on up to 32 Xeon processors (2.8 GHz, 1 GB RAM each). This showed our FPGA implementation to be 58× faster than the optimized software implementation, while being more than two orders of magnitude more energy efficient. These results scale linearly with the number of FPGA and Xeon processor nodes used.
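The pipeline named in this abstract (uniform Mersenne Twister samples, a Gaussian transform, then Monte Carlo pricing) can be illustrated in software. The sketch below is not the paper's engine: it assumes a Box-Muller transform (the abstract does not say which Gaussian method is used), prices a single caplet on one lognormal forward rate rather than a full multi-rate cap, and all function names and parameter values are hypothetical. Python's `random.Random` is itself a Mersenne Twister generator.

```python
import math
import random


def gaussian_pair(rng):
    """Box-Muller: map two uniform samples to two independent N(0, 1) samples."""
    u1 = 1.0 - rng.random()  # shift to (0, 1] so log(u1) is defined
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    return (radius * math.cos(2.0 * math.pi * u2),
            radius * math.sin(2.0 * math.pi * u2))


def caplet_price_mc(f0, strike, sigma, expiry, accrual, discount, n_paths, seed=1):
    """Monte Carlo caplet price for one lognormal forward rate (assumed model)."""
    rng = random.Random(seed)  # Mersenne Twister uniform generator
    pairs = n_paths // 2
    total = 0.0
    for _ in range(pairs):
        for z in gaussian_pair(rng):
            # Exact lognormal step of the forward rate to the caplet expiry.
            f_t = f0 * math.exp(-0.5 * sigma ** 2 * expiry
                                + sigma * math.sqrt(expiry) * z)
            total += max(f_t - strike, 0.0)
    return discount * accrual * total / (2 * pairs)


def caplet_price_black(f0, strike, sigma, expiry, accrual, discount):
    """Closed-form Black price, used only as a reference for the estimate."""
    std = sigma * math.sqrt(expiry)
    d1 = (math.log(f0 / strike) + 0.5 * std ** 2) / std
    d2 = d1 - std

    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return discount * accrual * (f0 * cdf(d1) - strike * cdf(d2))
```

Each path is independent of every other path, which is why this kind of workload spreads well across many FPGA or processor nodes.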

  • Protection of microprocessor-based cores for FPL devices

    Publication Year: 2010 , Page(s): 15 - 20

    Microprocessor cores are widely used in the development of complex digital systems. In this paper, a new scheme for the IP protection of microprocessor cores is presented. The proposed framework can perform this task in two ways: by hosting a digital signature, using watermarking techniques, that allows authorship rights to be claimed; and by introducing additional hardware that limits the functionality of the core if it is not activated. This last feature enables the distribution of cores in “demo” mode. The protection method, named μIPP@HDL, provides a robust protection system while maintaining low overhead and a reasonable area increase, as experimental results show.

  • FPGA-based smart sensor implementation with precise frequency to digital converter for flow measurement

    Publication Year: 2010 , Page(s): 21 - 26
    Cited by:  Papers (1)

    The realization of an integrated frequency-based smart sensor for flow measurement requires a precise frequency-to-digital converter. A VHDL-based implementation of such a converter, as a royalty-free solution with 1 ppm resolution, is reported. This work is part of a correlator under development to measure the total flow of multiphase fluids. This intellectual property block can also be used with other frequency-encoded transducers. The converter has been prototyped with a Xilinx™ XC3S500E Spartan-3E FPGA, and has been tested up to 10 MHz.
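One common way to reach a fixed relative resolution across a wide input range is reciprocal counting: count reference-clock ticks over a whole number of input periods, then divide. The abstract does not state which conversion method the paper uses, so the following is only a behavioural sketch under that assumption; the 100 MHz reference clock, the function name, and the parameters are all hypothetical.

```python
def measure_frequency(f_in_hz, f_ref_hz=100_000_000, min_ref_ticks=1_000_000):
    """Model a reciprocal frequency counter with integer counters.

    The gate spans a whole number of input periods, chosen so that at least
    min_ref_ticks reference ticks are counted. The reference counter
    truncates, so the relative error stays below 1 / min_ref_ticks
    (1 ppm for the default value).
    """
    # Ceiling division: smallest period count that fills the gate.
    periods = max(1, (min_ref_ticks * f_in_hz + f_ref_hz - 1) // f_ref_hz)
    # Reference ticks seen during those periods (truncated, as a counter would).
    ref_ticks = periods * f_ref_hz // f_in_hz
    return periods * f_ref_hz / ref_ticks
```

For example, `measure_frequency(1234)` counts 13 input periods and returns a value within 1 ppm of 1234 Hz. The trade-off is measurement time: low input frequencies need a longer gate.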

  • Session 2 - system-on-chip

    Publication Year: 2010 , Page(s): 27 - 28
    Freely Available from IEEE
  • A Genetic Programming based approach for efficiently exploring architectural communication design space of MPSoCs

    Publication Year: 2010 , Page(s): 29 - 34

    New integrated circuit technologies and the demand for more complex applications have led to the Multi-Processor System-on-Chip (MPSoC). An MPSoC is a complex integrated circuit, which can be composed of microprocessors, buses, memories and other computational system components. As the number and variety of components in today's MPSoCs increase, the communication architecture is becoming a limiting factor for application performance and power consumption. Thus, techniques have been created for exploring the design space in order to find the best communication architecture for a given application. Such techniques, however, are either inaccurate (when based on static analysis) or very time consuming, since each communication configuration of the design space must be simulated (using simulation models) or estimated (using mixed approaches). This paper presents a new approach to explore the design space of bus-based communication architectures of MPSoCs using Generalized Linear Models and Genetic Programming. Experiments show that, with the proposed approach, it was possible to explore a subset of the design space and identify the best communication configuration for a given application, reducing the exploration time by 90% with less than 3.8% mean global error.

  • An environment for energy consumption analysis of cache memories in SoC platforms

    Publication Year: 2010 , Page(s): 35 - 40

    The tuning of cache architectures in platforms for embedded systems applications can dramatically reduce energy consumption. Existing cache exploration environments constrain the designer to analyzing cache energy consumption on single-processor systems and, worse, on systems that are based on a single processor type. This paper presents the PCacheEnergyAnalyzer environment for energy consumption analysis of cache memories on SoC platforms. It combines efficient tools that provide static and dynamic energy consumption analysis, the flexibility to support architecture exploration of cache memories on platforms that are not bound to a specific processor, and fast simulation techniques. The proposed environment has been integrated into the SoC modeling framework PDesigner, providing a user-friendly graphical interface that allows integrated modeling and cache energy analysis of SoCs. The PCacheEnergyAnalyzer has been validated with four applications from the MediaBench benchmark suite.

  • The development of a hardware abstraction layer generator for system-on-chip functional verification

    Publication Year: 2010 , Page(s): 41 - 46

    Nowadays, functional verification of a large system-on-chip takes about 70% to 80% of the total design effort. The large number of IPs in current SoCs makes the work of verification engineers quite hard, due to the need to guarantee that the design is bug free before it is sent to tape out. In order to reduce the time spent on functional verification and to support the verification engineers, this work proposes a Hardware Abstraction Layer (HAL) generator. The HAL generator is part of a methodology for SoC functional verification, which is supported by IP-XACT and aims to automate the functional verification flow. The HAL generator creates C functions that allow the manipulation of registers and their fields at a very high abstraction level, allowing the verification engineers to write their test cases without needing to worry about masks, macros, defines, or pointer manipulation.
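To make the idea of generated field accessors concrete, here is a minimal sketch of a generator that emits C getter and setter functions from a register description. It is not the paper's tool: the input is a plain Python dictionary rather than an IP-XACT description, and the function name, the register name, the address, and the naming scheme of the emitted C functions are all hypothetical.

```python
def generate_field_accessors(reg_name, address, fields):
    """Return C source with one getter and one setter per register field.

    fields maps a field name to an (offset, width) pair, both in bits.
    The masks and shifts are computed here so test code never handles them.
    """
    ptr = f"(*(volatile uint32_t *)0x{address:08X}u)"
    lines = ["#include <stdint.h>", ""]
    for field_name, (offset, width) in fields.items():
        mask = ((1 << width) - 1) << offset
        lines += [
            f"static inline uint32_t {reg_name}_get_{field_name}(void) {{",
            f"    return ({ptr} & 0x{mask:08X}u) >> {offset};",
            "}",
            f"static inline void {reg_name}_set_{field_name}(uint32_t value) {{",
            f"    uint32_t reg = {ptr} & ~0x{mask:08X}u;  /* clear the field */",
            f"    {ptr} = reg | ((value << {offset}) & 0x{mask:08X}u);",
            "}",
            "",
        ]
    return "\n".join(lines)
```

Calling `generate_field_accessors("uart_ctrl", 0x40000000, {"baud_div": (8, 16)})` yields C functions named `uart_ctrl_get_baud_div` and `uart_ctrl_set_baud_div`, so a test case can set the field with a single call instead of hand-written masking.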

  • A placement tool for a NOC-based dynamically reconfigurable system

    Publication Year: 2010 , Page(s): 47 - 52
    Cited by:  Papers (3)

    In recent years, field-programmable gate arrays (FPGAs) with partial reconfiguration capabilities have raised interest in the implementation of dynamically reconfigurable systems. This has not become a mainstream activity, though, due to the lack of solid design methodologies and associated tools. One approach to freeing the designer from lower-level implementation details is to use structured communication resources for the interaction between reconfigurable partitions (modules). The architecture of a network-on-chip (NoC) based dynamically reconfigurable system and a placement tool, which automatically places all of its modules, are presented. The tool takes the partitioned design information and the restrictions imposed by the device family architecture into consideration. The basics of the placement algorithm and a case study are presented.

  • Session 3 - computer arithmetic

    Publication Year: 2010 , Page(s): 53 - 54
    Freely Available from IEEE
  • FPGA based floating-point library for CORDIC algorithms

    Publication Year: 2010 , Page(s): 55 - 60
    Cited by:  Papers (6)

    Computation of floating-point transcendental functions is important in a wide variety of scientific applications, where area cost, error and latency are important requirements to be met. This paper describes a flexible FPGA implementation of a parameterizable floating-point library for computing sine, cosine, arctangent and exponential functions using the CORDIC algorithm. The novelty of the proposed architecture is that, by sharing the same resources, the CORDIC algorithm can be used in two operation modes, allowing it to compute the sine, cosine or arctangent functions. Additionally, in the case of the exponential function, the architecture switches automatically between the CORDIC and a Taylor approach, which helps to improve the precision of the circuit, specifically for small input values after the argument reduction. Synthesis of the circuits and an experimental analysis of the errors have demonstrated the correctness and effectiveness of the implemented cores, and allow the designer to choose, for general-purpose applications, a suitable bit-width representation and number of iterations of the CORDIC algorithm.
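The two operation modes mentioned here correspond to the textbook rotation mode (sine and cosine) and vectoring mode (arctangent) of circular CORDIC, which share one shift-and-add iteration. The sketch below is a software model in Python floats, not the paper's floating-point hardware; it omits argument reduction and the Taylor fallback, and the function names and the default of 32 iterations are assumptions.

```python
import math


def cordic(x, y, z, iterations=32, vectoring=False):
    """Run the shared circular CORDIC iteration in one of two modes."""
    for i in range(iterations):
        if vectoring:
            d = -1.0 if y >= 0 else 1.0  # rotate so that y is driven to 0
        else:
            d = 1.0 if z >= 0 else -1.0  # rotate so that z is driven to 0
        shift = 2.0 ** -i
        x, y, z = x - d * y * shift, y + d * x * shift, z - d * math.atan(shift)
    return x, y, z


def cordic_gain(iterations=32):
    """Constant scale factor accumulated by the un-normalized rotations."""
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return gain


def sin_cos(angle, iterations=32):
    """Rotation mode; valid for |angle| up to roughly 1.74 rad."""
    x, y, _ = cordic(1.0 / cordic_gain(iterations), 0.0, angle, iterations)
    return y, x


def arctan(t, iterations=32):
    """Vectoring mode: the accumulated angle converges to atan(t)."""
    _, _, z = cordic(1.0, t, 0.0, iterations, vectoring=True)
    return z
```

In hardware the multiplications by `shift` become wire shifts and the `math.atan` values come from a small lookup table, which is why both modes can share the same adders.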

  • Montgomery modular multiplication on reconfigurable hardware: Fully systolic array vs parallel implementation

    Publication Year: 2010 , Page(s): 61 - 66
    Cited by:  Papers (1)

    This paper compares two FPGA Montgomery modular multiplication architectures: a fully systolic array and a parallel implementation. Modular multiplication is employed in modular exponentiation, which is the most important operation of some public-key cryptographic algorithms, the most popular of which is the RSA encryption scheme. The proposed fully systolic array architecture is a high-radix implementation with carry propagation between the Processing Elements. The parallel implementation is composed of multiplier blocks in parallel with the Processing Elements, and it provides a pipelined operation mode. We compared the time × area efficiency of both architectures, as well as their performance in an RSA application. The fully systolic array implementation can run the 1024-bit RSA decryption process in just 3.23 ms and the parallel architecture executes the same operation in 6 ms, which means competitive state-of-the-art performance for both architectures.
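For readers unfamiliar with the operation being accelerated, the following sketch shows Montgomery multiplication and its use inside square-and-multiply modular exponentiation. It is a plain big-integer model of the arithmetic, not either of the paper's architectures (no high radix, no Processing Elements); the function names are hypothetical, and `pow(n, -1, r)` requires Python 3.8 or later.

```python
def montgomery_params(n):
    """Return (bits, n_prime) for an odd modulus n, with R = 2**bits > n."""
    if n % 2 == 0:
        raise ValueError("Montgomery multiplication needs an odd modulus")
    bits = n.bit_length()
    r = 1 << bits
    n_prime = (-pow(n, -1, r)) % r  # n * n_prime == -1 (mod R)
    return bits, n_prime


def montgomery_multiply(a, b, n, bits, n_prime):
    """Return a * b * R^-1 mod n for 0 <= a, b < n, without dividing by n."""
    mask = (1 << bits) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask  # makes t + m * n divisible by R
    u = (t + m * n) >> bits            # exact division by R is a shift
    return u - n if u >= n else u


def mod_exp(base, exponent, n):
    """Left-to-right square-and-multiply in the Montgomery domain."""
    bits, n_prime = montgomery_params(n)
    r2 = (1 << (2 * bits)) % n  # R^2 mod n, used to enter the domain
    x = montgomery_multiply(base % n, r2, n, bits, n_prime)
    acc = montgomery_multiply(1, r2, n, bits, n_prime)  # 1 in the domain
    for bit in bin(exponent)[2:]:
        acc = montgomery_multiply(acc, acc, n, bits, n_prime)
        if bit == "1":
            acc = montgomery_multiply(acc, x, n, bits, n_prime)
    return montgomery_multiply(acc, 1, n, bits, n_prime)  # leave the domain
```

The appeal for hardware is that the reduction step uses only multiplications, a mask, and a shift, so the expensive trial division by the modulus disappears from the inner loop of RSA.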

  • Decimal division: Algorithms and FPGA implementations

    Publication Year: 2010 , Page(s): 67 - 72
    Cited by:  Papers (5)

    The work reported in this paper is devoted to the FPGA implementation of decimal dividers. Two types of dividers are described. The first one implements a decimal non-restoring-like algorithm and uses ripple-carry operators. For medium-size operands it gives a good compromise between cost and latency. The second one implements an SRT-like algorithm and uses carry-free operators. Their latencies are close to that of a binary radix-16 divider with the same range, implemented in the same FPGA.

  • Parallel decimal multipliers using binary multipliers

    Publication Year: 2010 , Page(s): 73 - 78
    Cited by:  Papers (3)

    Human-centric applications, such as financial and commercial ones, depend on decimal arithmetic, since the results must match exactly those obtained by human calculations. The IEEE 754-2008 standard for floating-point arithmetic has recognized the importance of decimal for computer arithmetic. A number of hardware approaches have already been proposed for decimal arithmetic operations, including addition, subtraction, multiplication and division. However, little effort has been made to develop decimal IP cores able to take advantage of the binary multipliers available in most reconfigurable computing architectures. In this paper, we analyze the tradeoffs involved in the design of a parallel decimal multiplier, for decimal operands with 8 and 16 digits, using existing coarse-grained embedded binary arithmetic blocks. The proposed circuits were implemented in a Xilinx Virtex 4 FPGA. The results indicate that the proposed parallel multipliers are very competitive when compared to decimal multipliers implemented with direct manipulation of BCD numbers.
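One straightforward way to reuse a binary multiplier for decimal operands is to convert both BCD operands to binary, multiply once, and convert the product back to BCD. The abstract does not say how the paper partitions the work across embedded multipliers, so the sketch below only illustrates this conversion-based idea; the shift-and-add-3 ("double dabble") conversion and all function names are assumptions.

```python
def bcd_to_binary(digits):
    """Horner evaluation of decimal digits, most significant digit first."""
    value = 0
    for d in digits:
        value = value * 10 + d
    return value


def binary_to_bcd(value, n_digits):
    """Shift-and-add-3 conversion; returns n_digits decimal digits, MSD first."""
    bcd = [0] * n_digits  # least significant digit first while shifting
    for bit_pos in range(value.bit_length() - 1, -1, -1):
        # A digit of 5 or more would exceed 9 after doubling, so pre-correct it.
        bcd = [d + 3 if d >= 5 else d for d in bcd]
        carry = (value >> bit_pos) & 1
        for i in range(n_digits):
            shifted = (bcd[i] << 1) | carry
            carry = shifted >> 4
            bcd[i] = shifted & 0xF
    return bcd[::-1]


def decimal_multiply(a_digits, b_digits):
    """Multiply two BCD operands using a single binary multiplication."""
    product = bcd_to_binary(a_digits) * bcd_to_binary(b_digits)
    return binary_to_bcd(product, len(a_digits) + len(b_digits))
```

With two 8-digit operands the result has 16 digits, matching the smaller of the two operand sizes the abstract mentions. The conversions cost extra logic and latency, which is one of the tradeoffs such a design has to weigh.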

  • Session 4 - image processing and vision

    Publication Year: 2010 , Page(s): 79 - 80
    Freely Available from IEEE
  • A high performance hardware architecture for the H.264/AVC half-pixel interpolation unit

    Publication Year: 2010 , Page(s): 81 - 86
    Cited by:  Papers (1)

    This work presents a high-performance half-pixel interpolation unit for the H.264/AVC standard. The presented architecture is able to process very high definition videos (3840 × 2048 pixels) in real time (30 frames per second), and can be integrated into a complete motion estimation architecture without limiting the performance of the other modules. It also presents a novel arrangement of interpolated samples which simplifies the search for the best fractional motion vector. The architecture was described in VHDL and synthesized for a Xilinx Virtex-4 FPGA, and it achieved the best results when compared to related works published in the literature.
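For context, H.264/AVC computes luma half-sample values with a 6-tap filter whose coefficients are (1, -5, 20, 20, -5, 1), followed by rounding, a shift by 5, and clipping to the 8-bit range. The sketch below applies that filter along one row only; it does not model the paper's architecture or its sample arrangement, and the function name and the edge handling by index clamping are assumptions made for this illustration.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # H.264/AVC luma half-sample filter


def half_pixel_row(row):
    """Return the half-sample value between row[i] and row[i + 1] for each i."""
    n = len(row)
    out = []
    for i in range(n - 1):
        acc = 0
        for k, tap in enumerate(TAPS):
            # The six taps cover row[i - 2] .. row[i + 3]; clamp at the edges.
            idx = min(max(i - 2 + k, 0), n - 1)
            acc += tap * row[idx]
        # Round, divide by 32, and clip to the 8-bit sample range.
        out.append(min(max((acc + 16) >> 5, 0), 255))
    return out
```

On the step edge `[0, 0, 0, 255, 255, 255]` the half-sample between the third and fourth pixels evaluates to 128. The coefficients sum to 32, so flat regions pass through unchanged.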

  • FPGA-based real time processing of the plenoptic wavefront sensor for the European Solar Telescope (EST)

    Publication Year: 2010 , Page(s): 87 - 92

    This paper describes the development of the plenoptic wavefront sensor for an adaptive optics system proposed for the future European Solar Telescope (EST). The plenoptic sensor offers additional optical information compared to traditional sensors, at the expense of a significant increase in image processing. This paper concentrates on the processing required to develop a viable plenoptic sensor, describing the algorithm and its real-time implementation in FPGAs (Field Programmable Gate Arrays). The aim of this work is to demonstrate that, by using the advantages of FPGAs in terms of parallel processing, speed and cost, real-time processing for the plenoptic sensor is viable. Consequently, the proposed system appears to be a competitive alternative to traditional wavefront sensor systems.

  • Architecture for binary mathematical morphology reconfigurable by genetic programming

    Publication Year: 2010 , Page(s): 93 - 98
    Cited by:  Papers (2)

    Mathematical morphology supplies powerful tools for low-level image analysis, with applications in robotic vision, visual inspection, medicine, texture analysis and many other areas. Many of these applications require dedicated hardware for real-time execution. In this paper, the development of novel reconfigurable hardware is proposed, using logical and morphological instructions generated automatically by a linear approach based on genetic programming. The hardware is capable of processing binary images at high speed. The developed system is based on high-capacity PLDs. Possible applications include automatic construction of image filters and intelligent pattern recognition, to name just a few. Some applications using the developed reconfigurable system are presented, and the results are discussed and compared with other approaches.
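The morphological instructions referred to here are built from two primitives, erosion and dilation, applied in sequence. The sketch below shows those primitives on a binary image and runs a short instruction list; the list is written by hand as a stand-in for a sequence that genetic programming would produce, and the 3 × 3 square structuring element, the treatment of pixels outside the image as background, and all names are assumptions.

```python
SQUARE_3X3 = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _pixel(img, y, x):
    """Read a pixel, treating everything outside the image as background."""
    return img[y][x] if 0 <= y < len(img) and 0 <= x < len(img[0]) else 0


def erode(img, offsets=SQUARE_3X3):
    """Keep a pixel only if the structuring element fits entirely in foreground."""
    return [[int(all(_pixel(img, y + dy, x + dx) for dy, dx in offsets))
             for x in range(len(img[0]))] for y in range(len(img))]


def dilate(img, offsets=SQUARE_3X3):
    """Set a pixel if the reflected structuring element touches any foreground."""
    return [[int(any(_pixel(img, y - dy, x - dx) for dy, dx in offsets))
             for x in range(len(img[0]))] for y in range(len(img))]


def run_program(img, program):
    """Apply a linear sequence of image instructions, one after another."""
    for instruction in program:
        img = instruction(img)
    return img
```

Running `run_program(image, [erode, dilate])` performs a morphological opening, which removes foreground specks smaller than the structuring element while restoring the larger shapes.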

  • An optimized label-broadcast parallel algorithm for connected components labeling

    Publication Year: 2010 , Page(s): 99 - 104

    This paper presents a simple and fast algorithm for labeling connected components in binary images, based on a parallel label-broadcast paradigm. A grid of processing units (called spiders) is used, and each element is responsible for updating its label value during a specific number of iterations. We describe the design and implementation of an embedded architecture for real-time labeling of black and white images based on FPGA technology. Since the image is divided and processed independently by processing elements, it is possible to use the proposed algorithm in an FPGA platform attached to an image sensor and obtain a circuit that resembles a focal-plane processor.
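A basic form of label broadcasting can be modelled in a few lines: every foreground pixel starts with a unique label and repeatedly adopts the smallest label among itself and its foreground neighbours. The sketch below is that unoptimized baseline, not the paper's optimized algorithm; it assumes 4-connectivity, iterates until nothing changes rather than for a fixed iteration count, and uses hypothetical names.

```python
NEIGHBOURS_4 = ((-1, 0), (1, 0), (0, -1), (0, 1))


def label_components(img):
    """Label 4-connected foreground regions by synchronous minimum broadcast."""
    h, w = len(img), len(img[0])
    # Unique non-zero start label per foreground pixel; 0 marks background.
    labels = [[y * w + x + 1 if img[y][x] else 0 for x in range(w)]
              for y in range(h)]
    changed = True
    while changed:
        changed = False
        updated = [row[:] for row in labels]  # all cells update at once
        for y in range(h):
            for x in range(w):
                if labels[y][x] == 0:
                    continue
                best = labels[y][x]
                for dy, dx in NEIGHBOURS_4:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and labels[ny][nx]:
                        best = min(best, labels[ny][nx])
                if best != labels[y][x]:
                    updated[y][x] = best
                    changed = True
        labels = updated
    return labels
```

Each cell reads only its own label and those of its direct neighbours, so the inner update maps naturally onto a grid of identical processing elements working in lockstep.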

  • Session 5.1 - FPGA architectures for specific applications

    Publication Year: 2010 , Page(s): 105 - 106
    Freely Available from IEEE