Programmable Logic (SPL), 2012 VIII Southern Conference on

Date 20-23 March 2012

Displaying Results 1 - 25 of 48
  • Authors index

    Page(s): 1 - 10
    Freely Available from IEEE
  • Committees

    Page(s): 1 - 2
    Freely Available from IEEE
  • SPL 2012 VIII Southern Programmable Logic Conference [Copyright notice]

    Page(s): 1
    Freely Available from IEEE
  • SPL 2012 VIII Southern Programmable Logic Conference [Front cover]

    Page(s): c1
    Freely Available from IEEE
  • Real time QFHD motion estimation architecture for DMPDS algorithm

    Page(s): 1 - 6

    This paper presents an efficient hardware architecture for the motion estimation (ME) process in high resolution digital videos. The architecture uses the new Dynamic Multi-Point Diamond Search (DMPDS) algorithm, a fast algorithm that increases ME quality compared with other fast algorithms for high resolution video processing. DMPDS achieves better digital video quality by reducing falls into local minima, especially in high definition videos. The designed architecture is focused on high performance, targeting real time processing at 30 frames per second (fps) in QFHD (Quad Full High Definition) resolution. The architecture was described in VHDL and synthesized to an Altera Stratix 4 FPGA. The synthesis results show that the architecture is able to process QFHD videos at 34 fps.

  • FPGA implementation of hardware countermeasures

    Page(s): 1 - 6

    An FPGA implementation of a frequency sensor is presented. Its main purpose is to determine whether the frequency of a test signal is within an allowed range. The implementation has the following configurable parameters: timing resolution and allowed frequency range. The component operates in real time with a delay of only one operation cycle. Countermeasures against clock glitch attacks are one of its possible applications. Experimental results on a Spartan-3AN700 device show a minimum allowed period of about 16 ns and a minimum resolution of about 4 ns. The implementation of the sensor has been verified in an electronic lock, used as a case study. This system has been attacked with clock glitches, showing its behavior with and without the sensor.
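The range check described in this abstract can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' hardware design: the function name `period_in_range`, the use of rising-edge timestamps as input, and the quantization of each period to whole resolution steps are all hypothetical choices made for illustration.

```python
def period_in_range(edge_times_ns, resolution_ns, min_period_ns, max_period_ns):
    """Hypothetical software model of a frequency (period) range sensor.

    edge_times_ns: timestamps of successive rising clock edges, in ns.
    Each period is quantized to whole multiples of resolution_ns, then
    compared against the allowed [min_period_ns, max_period_ns] window.
    Returns one boolean per measured period (True = within range).
    """
    flags = []
    for prev, cur in zip(edge_times_ns, edge_times_ns[1:]):
        ticks = (cur - prev) // resolution_ns   # period in resolution steps
        measured_ns = ticks * resolution_ns     # quantized period
        flags.append(min_period_ns <= measured_ns <= max_period_ns)
    return flags


# A 20 ns clock with one injected glitch (an 8 ns period) between 40 and 48 ns.
edges = [0, 20, 40, 48, 68]
print(period_in_range(edges, resolution_ns=4, min_period_ns=16, max_period_ns=24))
```

In this example the glitch shortens one period below the allowed minimum, so only that period is flagged as out of range.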

  • An open-source framework for heterogeneous MPSoC generation

    Page(s): 1 - 6

    The design of a Multiprocessor System-on-Chip (MPSoC) is a complex task, including steps such as application development, platform configuration, code generation, task mapping onto the platform and debugging. An integrated environment covering most of these steps is missing from the literature. The present work first details an MPSoC architecture that supports the execution of distributed applications, including an operating system enabling multitask execution at each processing element. The MPSoC is heterogeneous, since it supports different processor architectures. Then, a framework able to cover the design steps previously mentioned is presented. The framework enables design space exploration for applications to be executed on the MPSoC, varying, for example, the number and type of processors, the memory size, and the task mapping. Results demonstrate correct operation for different MPSoC configurations generated by the proposed framework. Such an open-source framework enables the research community to investigate new subjects related to MPSoC and Network on Chip (NoC) design, as well as to evaluate distributed applications in a multiprocessor environment.

  • A fast interpolative wordlength optimization method for DSP systems

    Page(s): 1 - 6

    As Digital Signal Processing (DSP) systems grow in complexity, the classical simulation-based approaches to the wordlength optimization (WLO) problem for fixed-point data representation can no longer be used due to unaffordable execution times. Thus, it is necessary to accelerate the computations and significantly reduce the number of simulations performed in order to obtain optimized solutions in reasonable times. In this paper, a new interpolative method is presented. This technique uses the information obtained in previous steps of the WLO process to guide the search so that the number of required simulations is minimized. Experimental results show that this process provides optimized results several times faster than the traditional approaches without any significant penalty on the quality of the solutions.

  • Integration of IPs into the M8051 microcontroller

    Page(s): 1 - 6

    This paper presents the implementation and integration of the AES 128 data encryption IP and the I2C serial communication interface IP into the M8051 microcontroller IP. We detail each block and validate them through testbench simulation. We performed functional testing on an FPGA to verify the correct operation of the IPs and their integration.

  • A co-design methodology for processor-centric embedded systems with hardware acceleration using FPGA

    Page(s): 1 - 8

    In this work, a co-design flow for processor-centric embedded systems with hardware acceleration using FPGAs is proposed. This flow helps to reduce design effort by raising the abstraction level while not requiring engineers to learn new languages and tools. The whole system is designed using well-established high level modeling techniques, languages and tools from the software domain: an OOP design approach expressed in UML and implemented in C++. Software coding effort is reduced since the C++ implementation not only provides a golden reference model, but may also be used as part of the final embedded software. Hardware coding effort is also reduced. The modular OOP design helps the engineer use profiling tools to find the exact methods that need to be accelerated by hardware, preventing useless translations to hardware. Moreover, the two-process structured VHDL design method used for hardware implementation has proven to reduce man-years, code lines and bugs in many major developments. A real-time image processing application for multiple robot localization is presented as a case study. The overall time improvement from the original software solution to the final hardware accelerated solution is 9.7×, with only a 4% increase in area (143 extra slices). The embedded solution achieved by following the proposed methodology runs 17% faster than on a standard PC, and it is a much smaller, cheaper and less power-consuming solution.

  • Secure configuration schemes for FPGA-based systems with simple key management

    Page(s): 1 - 6

    The continuing improvement in field programmable gate array (FPGA) performance encourages embedded systems designers to incorporate FPGA devices extensively in their systems. This expanding use makes FPGA-based systems more attractive to attackers and hence vulnerable to a number of threats. In this paper, we propose two protection schemes against design theft for SRAM-based FPGA devices, considering the key management issue at the customer facility. The first scheme proposes some improvements to a previously reported scheme to reduce its implementation cost. The second differs from others by combining both symmetric and asymmetric encryption. A comparison shows that our proposed schemes are advantageous over other works and present the best tradeoff between hardware, security and key management, making good use of different cryptography features.

  • FPGA implementation of robust asynchronous wrappers for Globally-Asynchronous systems (GALS)

    Page(s): 1 - 6

    Contemporary digital systems must be based on the “System-on-Chip - SoC” concept. An interesting style for SoC design is the GALS paradigm (Globally Asynchronous, Locally Synchronous), which can be used to implement circuits in FPGAs (Field Programmable Gate Arrays), but the implementation of asynchronous interfaces (asynchronous wrapper - AW) constitutes a major drawback for this kind of device. Although there is a typical AW design style that is based on asynchronous controllers (called ports) and provides communication between modules, port controllers are subject to essential hazards when implemented in FPGAs. In this context, this paper proposes a new asynchronous GALS wrapper architecture for FPGAs that is essentially free from hazards, requires no special care in implementation regarding LUT choice, and is fully compatible with FPGAs. Additional advantages of the proposed architecture are the total autonomy that synchronous modules achieve when interacting with the asynchronous wrapper; its ports can be synthesized in the direct mapping style (so without knowledge of asynchronous logic synthesis); and its ports interact in Ib/Ob mode, requiring no timing analysis and being more robust than GFM.

  • An efficient packing algorithm based on constraint satisfaction problem technique

    Page(s): 1 - 5

    In this paper, an efficient packing algorithm based on the constraint satisfaction problem technique is proposed for contemporary FPGA CLB architectures. No matter how complex the architecture is, there is a limited number of patterns that can implement all functionalities of the FPGA CLB logic. All the patterns are pre-designed and known as reference circuits. The proposed algorithm then matches the reference circuits against the given user logic circuit using specific constraints. Due to the complex architecture of FPGAs, enumerating all the reference circuits in a fine-grain manner is impractical. Consequently, a coarse-grain approach is adopted in the paper to overcome this problem. The experimental results show that the proposed algorithm achieves area and speed comparable to results in the literature.

  • Historic behavior of the electronic technology: The Wave of Makimoto and Moore's Law in the Transistor's Age

    Page(s): 1 - 5

    The intention of this paper is to show how the history of electronic technology, like that of science, can be seen through revolutions. Such changes can be predicted by means of two projections: Moore's Law and Makimoto's Wave. The first, which is at present normative in nature, indicates the course that the semiconductor industry must follow. The second, which is analytical, describes the industry's behavior as a consequence of observation.

  • Real-time scheduling coprocessor for NIOS II processor

    Page(s): 1 - 6

    In this paper we describe and analyze the main features of the Hardware Real-Time Scheduler Coprocessor unit (HRTSC) for the NIOS II processor. We describe how the HRTSC supports time, events, tasks and priorities. The HRTSC was designed as an SOPC component to incorporate real-time features for embedded real-time applications. The hardware architecture integrates easily with the IDE programming environment. The Avalon interface proved to be an efficient specification for sharing memory and for data communication among memory, processor and HRTSC. The performance of the HRTSC architecture is analyzed considering real-time flexibility, programmability and power consumption reduction.

  • Generic construction of monitors for Floating Point Unit designs

    Page(s): 1 - 8

    This paper proposes a set of well-defined steps to design functional verification monitors intended to verify Floating Point Units (FPU) described in HDL. The first step consists of defining the input and output domain coverage. Next, the corner cases are defined. Finally, an already verified reference model is used to test the correctness of the Device Under Verification (DUV). As a case study, a monitor for an IEEE754-2008 compliant design is implemented. This monitor is built to be easily instantiated in verification frameworks such as OVM. Two different designs were verified, reaching complete input coverage and successful compliance results.
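The three steps in this abstract (coverage definition, corner cases, comparison against a reference model) can be sketched as a small software monitor. This is a loose illustration under stated assumptions, not the paper's OVM monitor: the operand classes, the names `classify`, `same_result` and `run_monitor`, and the use of Python's native double-precision addition as the reference model are all hypothetical.

```python
import math
import struct
import sys


def classify(x):
    """Assumed coverage bins for a double-precision operand."""
    if math.isnan(x):
        return "nan"
    if math.isinf(x):
        return "inf"
    if x == 0.0:
        return "zero"
    if abs(x) < sys.float_info.min:   # below the smallest normal double
        return "subnormal"
    return "normal"


def bits(x):
    """Raw 64-bit pattern of a double, so that +0.0 and -0.0 differ."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def same_result(a, b):
    """Bit-exact comparison, treating any NaN as equal to any NaN."""
    if math.isnan(a) and math.isnan(b):
        return True
    return bits(a) == bits(b)


def run_monitor(duv_op, ref_op, stimuli):
    """Apply each operand pair to the DUV and the reference model.

    Returns the set of covered (class, class) input bins and the list of
    operand pairs on which the DUV disagreed with the reference.
    """
    covered, mismatches = set(), []
    for a, b in stimuli:
        covered.add((classify(a), classify(b)))
        if not same_result(duv_op(a, b), ref_op(a, b)):
            mismatches.append((a, b))
    return covered, mismatches


def reference_add(a, b):
    return a + b


def buggy_add(a, b):
    # Deliberately wrong on the corner case (-0.0) + (-0.0), which must give -0.0.
    return 0.0 if (a == 0.0 and b == 0.0) else a + b


corner_cases = [(1.0, 2.0), (-0.0, -0.0), (math.inf, -math.inf)]
covered, mismatches = run_monitor(buggy_add, reference_add, corner_cases)
print(sorted(covered), mismatches)
```

The bit-level comparison matters here: an ordinary `==` check would treat +0.0 and -0.0 as equal and miss the sign error that the corner case is meant to expose.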

  • The impact of operating system adoption in an embedded project: A case study

    Page(s): 1 - 7

    The use of an operating system (OS) is advocated as a means to simplify software development, freeing programmers from managing low-level hardware and providing a simpler programming interface for common tasks. The high complexity of modern desktop computers makes an OS indispensable; embedded systems, on the other hand, are limited architectures, usually severely cost- and power-constrained. Because of the additional demands imposed by an OS, embedded developers are faced with the crucial decision of whether to adopt an OS or not. In this paper, we present a case study in which a sample application (an embedded weather station) was developed under three different scenarios: without any OS, using the μC/OS-II real-time OS, and using the uClinux general-purpose OS. An FPGA and an SoPC were used to provide a flexible hardware platform able to accommodate all three configurations. The adoption of an OS provided a reduction of up to 48% in development time; on the other hand, it increased program memory requirements by at least 71%.

  • FPGA design of H.264/AVC intra-frame prediction architecture for high resolution video encoding

    Page(s): 1 - 6

    Video coding applications are found in a wide range of devices and require application-specific hardware support to deal with the ever increasing computational complexity of advanced video coding standards. An application-specific circuit for the intra-frame prediction module of the H.264/AVC standard is the most efficient solution; however, it makes future design changes difficult and costly. This work presents an H.264/AVC intra-frame prediction hardware architecture targeting Field-Programmable Gate Arrays (FPGAs). The performance of our architecture is improved by taking advantage of the heterogeneous resources of the FPGA, e.g. embedded memory and digital signal processing blocks. Storing intermediate data in block RAM memories reduces the number of cycles needed to process a macroblock by up to 73% and the memory bandwidth by 75%. The use of DSP blocks improves the critical path, increasing the maximum frequency, which enables the architecture to process 60 HD1080p frames per second.

  • Adapting a low complexity datapath to MIPS-1

    Page(s): 1 - 6

    This paper presents the process of implementing the MIPS-1 ISA on a simple didactic processor without increasing the datapath complexity. This implementation may be desirable for academic purposes or for the use of datapaths of different complexity and performance in MPSoC (Multiprocessor System-on-Chip) design. The paper shows the physical changes needed in the target datapath to fit the features of the new ISA. The techniques used to maintain the datapath simplicity are also shown. Finally, we present a simple implementation example used to validate this datapath, with simulation and synthesis results on FPGA.

  • Memory-mapped I/O over dual port BRAM on FPGA

    Page(s): 1 - 6

    Nowadays, Direct Memory Access (DMA) is one of the most used mechanisms for data transfer between a processor and its peripherals. Another possibility is to map peripherals directly into the memory space, which has the disadvantage of requiring dual port memories when the device handles large quantities of data. This is typically the case in video and network applications. In this work we propose using the dual port BRAM often available in modern FPGAs to implement a core using memory-mapped I/O (MMIO). As a case study, we present the development of an AVR microcontroller core with a built-in Ethernet Media Access Controller (MAC). It is capable of running the uIP TCP/IP stack, with a Web server as an example application. Additionally, we discuss the advantages of moving the program code to an external memory that uses the Common Flash Interface (CFI) standard. This design was simulated with Free Software tools and verified in hardware using a Xilinx Virtex 4 FPGA.

  • HardNoC: A platform to validate networks on chip through FPGA prototyping

    Page(s): 1 - 6

    The use of intrachip buses to build interconnection architectures for complex integrated circuits is no longer a consensus choice. Networks on chip (NoCs) are used in several real designs. However, the distributed nature of NoCs and the huge number of wires and interfaces in large NoCs can make debugging the system and its interconnection architecture a nightmare. This work accelerates the NoC validation process using FPGA prototyping. HardNoC is a platform based on simple modules to inject traffic and collect basic statistics of NoCs. It can be used for early validation of NoC designs and to provide initial numerical results for NoC evaluation and design.

  • Clock gating and clock enable for FPGA power reduction

    Page(s): 1 - 5

    This paper presents experimental measurements of power consumption using different techniques to turn off part of a system and switch between active and standby modes. The main ideas analyzed are clock gating, clock enable, and blocking inputs. The laboratory work is described, including the measurement setups and the benchmark circuits. Quantitative measurements on both a 65 nm CMOS Cyclone III FPGA and a 45 nm CMOS Spartan 6 FPGA are presented. The circuits selected as benchmarks are different types of multipliers. Results of power consumption in active and standby modes are presented and compared.

  • Image convolution processing: A GPU versus FPGA comparison

    Page(s): 1 - 6

    Convolution is one of the most important operators used in image processing. With the constant need to increase performance in high-end applications and the rising popularity of parallel architectures, such as GPUs and those implemented in FPGAs, comes the need to compare these architectures in order to determine which of them performs better and in what scenario. In this article, convolution was implemented on each of the aforementioned architectures with the following languages: CUDA for GPUs and Verilog for FPGAs. In addition, the same algorithms were also implemented in MATLAB, using predefined operations, and in C, using a regular x86 quad-core processor. Comparative performance measures, considering the execution time and the clock ratio, are reported and discussed in the paper. Overall, it was possible to achieve a CUDA speedup of roughly 200× in comparison to C, 70× in comparison to MATLAB and 20× in comparison to FPGA.
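For reference, the operator being benchmarked can be written as a direct (unoptimized) 2D convolution. The sketch below is a plain Python illustration, not any of the paper's CUDA, Verilog, MATLAB or C implementations; the function name `convolve2d` and the choice of "valid" output size (no border padding) are assumptions made here.

```python
def convolve2d(image, kernel):
    """Direct 2D convolution of a list-of-lists image with a kernel.

    The kernel is flipped in both axes (true convolution, not correlation)
    and only positions where the kernel fits entirely inside the image are
    computed, so the output is (H - kh + 1) x (W - kw + 1).
    """
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            acc = 0
            for j in range(kh):
                for i in range(kw):
                    acc += image[y + j][x + i] * kernel[kh - 1 - j][kw - 1 - i]
            out[y][x] = acc
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
print(convolve2d(image, kernel))
```

Every output pixel is independent of the others, which is what makes the operator a natural fit for the parallel architectures compared in the paper.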

  • A high performance and low memory bandwidth architecture for motion estimation targeting high definition digital videos

    Page(s): 1 - 6

    This work presents a high performance and low memory bandwidth hardware architecture based on the Full Search block matching algorithm for motion estimation in high definition digital videos. Motion estimation is the most computationally intensive module of the video encoder and requires, besides high processing throughput, very high bandwidth to external memory. The presented architecture exploits parallelism to achieve high processing rates and uses a memory hierarchy to reuse data, reducing the required bandwidth to external memory. The architecture was described in VHDL and synthesized for a Xilinx Virtex 4 FPGA, achieving an operating frequency of 292 MHz and processing more than 38 high definition 1080 frames (1920×1080 pixels) per second, surpassing the requirements for real time processing.
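Full Search block matching itself is simple to state in software: every candidate displacement inside the search window is evaluated and the one with the lowest matching cost wins. The sketch below assumes the sum of absolute differences (SAD) as the cost, which the abstract does not specify; the function names, the frame representation (lists of rows) and the motion vector sign convention are illustrative assumptions, and the code says nothing about the paper's hardware organization.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of cur at
    (bx, by) and the n x n block of ref at (rx, ry)."""
    total = 0
    for dy in range(n):
        for dx in range(n):
            total += abs(cur[by + dy][bx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Exhaustively test every displacement within +/- search_range.

    Returns (best_mv, best_sad), where best_mv = (mvx, mvy) is the offset
    from the current block position to its best match in the reference
    frame. Candidates falling outside the reference frame are skipped.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_sad = (0, 0), None
    for mvy in range(-search_range, search_range + 1):
        for mvx in range(-search_range, search_range + 1):
            rx, ry = bx + mvx, by + mvy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best_sad is None or cost < best_sad:
                best_mv, best_sad = (mvx, mvy), cost
    return best_mv, best_sad


# Synthetic 8x8 reference frame; the current frame is the same content
# shifted so that the block at (2, 2) matches the reference at (1, 3).
ref = [[x * x + 10 * y * y for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][max(x - 1, 0)] for x in range(8)] for y in range(8)]
print(full_search(cur, ref, bx=2, by=2, n=2, search_range=2))
```

The nested loops make the cost of the exhaustive approach visible: each block requires (2 × search_range + 1)² SAD evaluations, and overlapping candidates reread the same reference pixels, which is the redundancy that data reuse through a memory hierarchy targets.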

  • FPGA based design for motion vector prediction in H.264/AVC encoders targeting HD1080p resolution

    Page(s): 1 - 6

    Motion vector coding is an important issue in low bitrate video coding, since it increases the efficiency of modern video encoders. Motion vector prediction exploits the correlation between the motion of neighbor blocks, since they may represent the same object and therefore present the same movement direction. Motion vector prediction takes the difference between the current motion vector and the predictive motion vector (PMV), which is generated using the neighbor blocks as reference. This way, only the motion vector difference (MVD) is sent to the bit stream. Due to its performance, motion vector prediction is defined as an obligatory tool in the H.264/AVC standard. This work presents an FPGA-based hardware architecture for the H.264/AVC motion vector predictor targeting HD1080p resolution. The architecture was described in VHDL and synthesized to a Xilinx xc5vlx30 Virtex V FPGA. The results were compared with one motion vector prediction architecture from the literature. Our design has shown better results than the related work in terms of hardware usage and throughput. In addition, we used a motion estimation and motion compensation architecture composing a whole inter-frame prediction module to better evaluate the results generated by our proposed motion vector predictor architecture. The results show that our architecture uses few hardware resources and can process up to 52 HD1080p frames per second.
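The PMV and MVD computation described above can be sketched in a few lines. The abstract does not say how the PMV is derived from the neighbor blocks; the sketch assumes the component-wise median of the left, top and top-right neighbors, which is the general-case predictor in H.264/AVC, and ignores the standard's special cases (unavailable neighbors, 16×8 and 8×16 partitions, skip mode). The function names are hypothetical.

```python
def median3(a, b, c):
    """Median of three integers."""
    return sorted((a, b, c))[1]


def predict_mv(left, top, top_right):
    """Assumed general-case predictor: component-wise median of the
    motion vectors (x, y) of the three neighbor blocks."""
    return (median3(left[0], top[0], top_right[0]),
            median3(left[1], top[1], top_right[1]))


def mv_difference(mv, pmv):
    """Motion vector difference (MVD): the only part sent to the bit stream."""
    return (mv[0] - pmv[0], mv[1] - pmv[1])


pmv = predict_mv(left=(4, -2), top=(6, 0), top_right=(5, 3))
mvd = mv_difference((7, 1), pmv)
print(pmv, mvd)
```

When neighboring blocks move together, the MVD components stay close to zero, which is why coding the difference is cheaper than coding the vector itself.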
