Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC), 2011 6th International Workshop on

Date 20-22 June 2011

Displaying Results 1 - 25 of 57
  • [Title page]

    Publication Year: 2011 , Page(s): 1
    Freely Available from IEEE
  • [Copyright notice]

    Publication Year: 2011 , Page(s): 1
    Freely Available from IEEE
  • Asymmetric cache coherency: Improving multicore performance for non-uniform workloads

    Publication Year: 2011 , Page(s): 1 - 8

    Asymmetric coherency is a new concept to support non-uniform workloads in multicore processors. We present the theory behind asymmetric coherency policies and show that our design requires no additional hardware over an existing system. Asymmetric coherency is designed to provide better performance when a workload is asymmetric, which applies to SoC multicores, where applications are often not evenly spread among the processors. Its low cost and complexity make it a desirable new coherency policy for future work. Our results show up to a 60% reduction in coherency costs for unshared data and up to a 174% improvement in memory access time for shared data.
  • Exploiting multicast messages in cache-coherence protocols for NoC-based MPSoCs

    Publication Year: 2011 , Page(s): 1 - 6
    Cited by:  Papers (1)

    MPSoCs are widely used in embedded systems, allowing complex systems to be designed within a short time-to-market. The shift in the communication infrastructure from buses to networks-on-chip (NoCs) adds new design challenges. Standard directory-based cache coherence protocols are a performance bottleneck because of the number of transactions in the network, which reduces performance and increases energy consumption. State-of-the-art works investigate new protocols at abstract levels (e.g. TLM) to optimize the performance of the memory organization. Unlike previous work, we investigate the benefits NoCs can bring to directory-based cache coherence protocols using RTL modeling. The main functionality NoCs may provide for the protocols is the way messages are sent through the network. Most NoCs support multicast as a set of unicast messages. Such a method is not suitable for cache coherence protocols, because transactions such as block invalidate and block update are naturally multicast. This work proposes the use of multicast messages to reduce the number of transactions and so improve the performance of cache coherence protocols in NoC-based MPSoCs. Results show that the performance of some transactions improves by up to 32% when using multicast messages.
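The difference between multicast emulated as a set of unicasts and a true multicast can be illustrated with a simple link-traversal count. This is only a sketch under stated assumptions: a 2D mesh, XY routing, and a multicast that shares common path prefixes are all assumptions made here for illustration, and the node coordinates are hypothetical. It does not reproduce the RTL model or the metrics of the paper.

```python
def xy_path(src, dst):
    """Return the links traversed by XY routing from src to dst on a 2D mesh."""
    (x, y), (dx, dy) = src, dst
    links = []
    while x != dx:                      # travel along X first
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:                      # then along Y
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def unicast_cost(src, dests):
    """Link traversals when one separate message is sent per destination."""
    return sum(len(xy_path(src, d)) for d in dests)


def multicast_cost(src, dests):
    """Link traversals when a single message is replicated only where paths diverge."""
    shared_links = set()
    for d in dests:
        shared_links.update(xy_path(src, d))
    return len(shared_links)


# Hypothetical example: a directory at (0, 0) invalidates a block held by three sharers.
directory = (0, 0)
sharers = [(3, 0), (3, 1), (3, 2)]
print(unicast_cost(directory, sharers))    # 12
print(multicast_cost(directory, sharers))  # 5
```

In this toy case the three invalidations share most of their route, so the multicast tree uses 5 link traversals where three unicasts use 12.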
  • Evaluating the feasibility of network coding for NoCs

    Publication Year: 2011 , Page(s): 1 - 5
    Cited by:  Papers (3)

    Network coding is a novel technique that can increase network throughput by linearly combining data packets in intermediate nodes and recovering them before delivery at their destinations. This paper discusses the applicability of network coding to NoCs and evaluates the potential advantages of that technique when supporting multicast communication.
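A minimal sketch of the idea of linearly combining packets, using XOR (addition over GF(2)) as the combination. The packet contents and the two-destination scenario are hypothetical; the coding scheme evaluated in the paper is not stated in the abstract and may differ.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Combine two equal-length packets bytewise with XOR (a linear code over GF(2))."""
    if len(a) != len(b):
        raise ValueError("packets must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Two source packets that both destinations want (hypothetical payloads).
packet_a = b"\x12\x34"
packet_b = b"\xab\xcd"

# An intermediate node forwards one coded packet instead of two plain ones.
coded = xor_bytes(packet_a, packet_b)

# Destination 1 already received packet_a on another path and recovers packet_b.
recovered_b = xor_bytes(coded, packet_a)
# Destination 2 already received packet_b and recovers packet_a.
recovered_a = xor_bytes(coded, packet_b)

print(recovered_a == packet_a, recovered_b == packet_b)  # True True
```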
  • FPGA SDK for nanoscale architectures

    Publication Year: 2011 , Page(s): 1 - 8

    As CMOS technology approaches its physical limits, several emerging technologies are being investigated to find the right replacement for future computing systems. A number of different fabrics and architectures are currently under investigation. Unfortunately, no unified modeling exists at this time that offers sound support for algorithmic design space exploration without compromising on device feasibility. This work presents a NASIC-compliant application-specific computing architecture template along with its performance models and optimization policies. From the tool-flow perspective, this architecture is similar to antifuse configurable architectures; hence we propose an FPGA SDK-based programming environment that supports domain-space exploration.
  • FPGA physical-design automation using Model-Driven Engineering

    Publication Year: 2011 , Page(s): 1 - 6

    Physical design automation is a difficult problem because of the huge number of devices and the physical constraints to be met. The Model-Driven Engineering (MDE) approach aims to tackle the complexity of software development using a high-level method based on models and transformations. While this approach is used for high-level circuit synthesis, no work has been reported on the lower part of the circuit design automation flow, namely physical-design automation. In this work, we use the MDE approach to model the physical synthesis process, with a focus mainly on reconfigurable architectures. We present a model for island-style FPGAs along with the transformations needed for physical synthesis. The main result of this work is to show the feasibility of the MDE approach for the physical design automation problem. We argue that this methodology enables orthogonal composition of the architecture / algorithms / application design space, which in turn enables incremental exploration based on quantitative evaluations.
  • Fast prototyping environment for embedded reconfigurable units

    Publication Year: 2011 , Page(s): 1 - 8

    To cope with the increasing complexity of embedded applications and their fast evolution, flexible high-performance systems are mandatory. In this context, reconfigurable system-on-chip solutions that meet application needs have become common. However, the innovation race shrinks time-to-market and puts high pressure on designers. Designers therefore need appropriate methodologies and tools to efficiently perform design space exploration of reconfigurable units. This paper addresses this issue with an ADL-based toolsuite for fast prototyping of reconfigurable units. Starting from a high-level model, it supports design space exploration of different architectural solutions, produces a hardware prototype and generates the application tools needed to exploit it. The benefits and feasibility of the approach are demonstrated by the complete prototyping and implementation of a reconfigurable unit supporting resource virtualization.
  • A prototyping environment for high performance reconfigurable computing

    Publication Year: 2011 , Page(s): 1 - 8
    Cited by:  Papers (4)

    In the face of the power wall and high performance requirements, designers of hardware architectures are turning more and more towards reconfigurable computing using heterogeneous CPU/FPGA systems. In such architectures, multi-core processors provide high computation rates while the reconfigurable logic offers high performance per watt and adaptability to the application constraints. However, the design of heterogeneous architectures faces extremely challenging requirements, such as an appropriate programming model, design tools, and rapid system prototyping. Focusing on this issue, we present a prototyping environment for heterogeneous CPU/FPGA systems. Within this environment, we conceived a generic and scalable architecture based on a multi-core processor tightly connected to an FPGA in order to meet performance, power and flexibility goals. Furthermore, front-end interfaces are presented that establish communication, data sharing, and synchronisation between the different software and hardware processing units. Finally, we defined a design methodology that eases the development of applications on heterogeneous systems. Our environment is built from a standard host machine coupled with a Xilinx Virtex 6 FPGA through the standard PCI Express bus. In the experimental part, we first evaluate the reliability of different CPU/FPGA communication solutions in order to bring real-time capabilities to our system. Second, we demonstrate the efficiency of the presented design methodology for heterogeneous systems on a FIR signal processing application.
  • HeMPS-S: A homogeneous NoC-based MPSoCs framework prototyped in FPGAs

    Publication Year: 2011 , Page(s): 1 - 8
    Cited by:  Papers (4)

    Current chip transistor density enables the design of multiprocessor systems-on-chip (MPSoCs). MPSoCs are an alternative for creating complex computational systems because they reduce the cost, area, power dissipation and design time per chip. Because of the complexity of such systems and the huge design space to explore, CAD tools and frameworks to customize MPSoCs are mandatory. The main goal of this paper is to present an open source platform for MPSoC development, named HeMPS Station (HeMPS-S). HeMPS-S is derived from the MPSoC HeMPS. In its present state, HeMPS-S includes the platform (NoC, processors, DMA, NI), embedded software (microkernel and applications) and a dedicated CAD tool to generate the required binaries and perform debugging. Experiments show the execution of a real application running in HeMPS-S.
  • Achieving hardware security for reconfigurable systems on chip by a proof-carrying code approach

    Publication Year: 2011 , Page(s): 1 - 8
    Cited by:  Papers (3)

    Reconfigurable systems on chip are increasingly deployed in security- and safety-critical contexts. When downloading and configuring new hardware functions, we want to make sure that modules adhere to certain security specifications and do not, for example, contain hardware Trojans. As a possible approach to achieving hardware security, we propose and demonstrate the concept of proof-carrying hardware, inspired by previous work on proof-carrying code techniques in the software domain. In this paper, we discuss the hardware trust and threat models behind proof-carrying hardware and then present our experimental setup. We detail the open-source tool chain employed for the runtime verification of combinational equivalence and our bitstream format for an abstract FPGA architecture, which together allow us to experimentally validate the feasibility of our approach.
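What checking combinational equivalence means can be sketched with an exhaustive truth-table comparison. This is an illustration only: the full-adder functions below are hypothetical, and exhaustive enumeration is a stand-in chosen for brevity, not the open-source tool chain or proof format used in the paper.

```python
from itertools import product


def equivalent(f, g, n_inputs):
    """Return True if f and g produce the same outputs for every input combination."""
    return all(
        f(*bits) == g(*bits)
        for bits in product((False, True), repeat=n_inputs)
    )


def spec_full_adder(a, b, cin):
    """Specification: arithmetic definition of a 1-bit full adder (sum, carry)."""
    total = int(a) + int(b) + int(cin)
    return (bool(total & 1), total >= 2)


def impl_full_adder(a, b, cin):
    """Gate-level implementation of the same function."""
    s = a ^ b ^ cin
    carry = (a and b) or (cin and (a ^ b))
    return (s, carry)


def tampered_full_adder(a, b, cin):
    """Hypothetical Trojan: misbehaves only for one rare input pattern."""
    s, carry = impl_full_adder(a, b, cin)
    if a and b and cin:
        carry = False
    return (s, carry)


print(equivalent(spec_full_adder, impl_full_adder, 3))      # True
print(equivalent(spec_full_adder, tampered_full_adder, 3))  # False
```

Exhaustive enumeration grows as 2^n in the number of inputs, so it is only practical for very small functions.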
  • Secure extensions of FPGA soft core processors for symmetric key cryptography

    Publication Year: 2011 , Page(s): 1 - 8

    When used in cryptographic applications, general-purpose processors are often complemented by a cryptographic accelerator (a crypto-coprocessor). Secret keys are usually stored in the internal registers of the processor and are vulnerable to attacks on protocols, software/firmware or cache memory. The paper presents three ways of extending soft general-purpose processors for cryptographic applications. The proposed extension is aimed at symmetric key cryptography and guarantees secure key management. Three security zones are created and physically separated in each of the three configurations: processor, cipher and key storage zones. In the three zones, the secret keys are handled differently: in clear or enciphered, as common data or as keys. The security zones are separated at the protocol, system, architectural and physical levels. The proposed principle is validated on Altera NIOS II, Xilinx MicroBlaze and Actel Cortex M1 soft core processor extensions. The NIOS II processor needs fewer clock cycles per data block encryption, because the security module is included in the processor's data path. The data path of the MicroBlaze is unchanged and thus shorter, but additional clock cycles are necessary for data transfers between the processor and the security module. The Cortex M1 processor is connected via an AHB bus, and the cryptographic extension is accessed as an ordinary peripheral (a coprocessor). Although the interfacing is different, the three processors with their extensions attain the required high security level.
  • Secure remote reconfiguration of an FPGA-based embedded system

    Publication Year: 2011 , Page(s): 1 - 6
    Cited by:  Papers (4)

    This paper describes the protocol, architecture, and implementation details of an FPGA-based embedded system that can securely reconfigure the FPGA remotely over a TCP/IP connection. By security we mean data confidentiality, explicit key authentication and data origin authentication. Since these aspects are overhead for the main application, the system should be as small as possible. We have therefore focused on compactness rather than on speed in the implementation. The implemented solution consists of two components: a communication part and a cryptographic part. Thanks to its simple and modular architecture, the system can easily be integrated at any point in the design of an FPGA-based embedded system.
  • Monitoring communication channels on a shared memory multi-processor system on chip

    Publication Year: 2011 , Page(s): 1 - 8
    Cited by:  Papers (2)

    To meet performance requirements, streaming applications have been mapped to Multi-Processor Systems on Chip (MPSoCs). The Kahn Process Network (KPN) paradigm is sufficient when dealing with pipeline parallelism, but its point-to-point channels are impractical in the presence of massive task farm parallelism. Multi Writer Multi Reader (MWMR) channels generalize KPN so that multiple writers and multiple readers access the same channel. They are implemented as software channels stored in on-chip memory to accommodate access by hardware and software tasks alike. The price to pay for this implementation is increased traffic to and from memory. Typical representatives are telecommunication applications, which may handle hundreds or thousands of flows at a time and apply the same chain of treatments to every packet. The latency of this treatment depends on the packet's content and thus cannot be foreseen. When multiple tasks access an MWMR channel, the time to obtain a lock is variable. As a consequence, the fill states of MWMR channels vary heavily, and it is crucial to monitor them in order to detect potential bottlenecks. We show how this can be done early in the design process by using SoCLib/DSX.
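The idea of a lock-protected channel shared by several writers and readers, with its fill state recorded for monitoring, can be sketched as follows. The class name, method names and the sampling scheme are hypothetical and written for illustration; they are not the SoCLib/DSX API, and Python threads stand in for hardware and software tasks.

```python
import threading
from collections import deque


class MwmrChannel:
    """Bounded FIFO shared by multiple writers and readers, guarded by one lock."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = deque()
        self._lock = threading.Lock()
        self.fill_samples = []          # fill level recorded after every access

    def try_write(self, item):
        """Append item if there is room; return False when the channel is full."""
        with self._lock:
            if len(self._items) >= self.capacity:
                return False
            self._items.append(item)
            self.fill_samples.append(len(self._items))
            return True

    def try_read(self):
        """Pop the oldest item; return None when the channel is empty."""
        with self._lock:
            if not self._items:
                return None
            item = self._items.popleft()
            self.fill_samples.append(len(self._items))
            return item


channel = MwmrChannel(capacity=100)


def writer(writer_id):
    for i in range(25):
        channel.try_write((writer_id, i))


# Four writer tasks fill the channel concurrently.
threads = [threading.Thread(target=writer, args=(w,)) for w in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Drain the channel and report the peak fill level seen by the monitor.
drained = []
while (item := channel.try_read()) is not None:
    drained.append(item)

print(len(drained), max(channel.fill_samples))  # 100 100
```

A peak fill level close to the capacity is the kind of signal that would point to a reader-side bottleneck.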
  • Dynamic resource management in modern multicore SoCs by exposing NoC services

    Publication Year: 2011 , Page(s): 1 - 7
    Cited by:  Papers (4)

    Emerging multicore chips containing tens or even hundreds of cores require modern interconnect solutions with increased programmability to support dynamic resource management. Modern embedded devices that employ reconfigurable architectures or application-specific hardware modules are appearing, but runtime QoS optimizations and dynamic power management still require more flexibility from the underlying hardware infrastructure and the corresponding middleware. This paper demonstrates a methodology to expose NoC services for adaptive management of hardware resources through a software platform based on Spidergon STNoC technology, consisting of a low-level driver layer and libraries accessible at user level. The system designer can thus exploit the runtime-programmable services of a Network-on-Chip to provide differentiated network services to multiple independent applications. This methodology can easily be extended to any NoC technology. Spidergon STNoC allows customized topologies to be designed through the iNoC GUI tool, which is extended to generate the appropriate driver for the Linux kernel. Moreover, an integrated C API allows the developer to capture application-specific requirements and dynamically adjust the QoS settings of the NoC. In this paper we improve the design methodology to facilitate dynamic management of SoC resources with the aid of appropriate driver and library extensions; we present tools that offer extreme flexibility and real examples of software applications that can exploit the NoC configurability, running in both typical Linux and Android environments.
  • Multi-objective mapping for matrix-based nanocomputer architectures

    Publication Year: 2011 , Page(s): 1 - 7

    In this paper, we propose a method for the multi-objective mapping of applications onto matrix-based nanocomputer architectures. These architectures are composed of reconfigurable logic cells interconnected according to a given topology. The power consumption and data propagation delay of each cell depend on its internal function, e.g. NAND, OR, etc. By taking these cell characteristics into account, the mapping method optimizes the power consumption, critical path delay and area of the whole system. We show experimentally that the proposed method efficiently generates mapping solutions with a good trade-off between the optimized metrics. Furthermore, the method allows matrix sizes and interconnect topologies in nanocomputer architectures to be compared, and thus aims to facilitate the development of such architectures. Experimental results demonstrate a 38% power reduction for a systolic array and a 44% critical path delay improvement for the “Cell Matrix”.
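One common way to reason about trade-offs between several minimized metrics is Pareto dominance: a mapping is kept only if no other mapping is at least as good in every metric and strictly better in one. The sketch below assumes this formulation for illustration; the abstract does not say which multi-objective technique the paper uses, and the candidate names and (power, delay, area) values are made up.

```python
def dominates(a, b):
    """True if cost vector a is no worse than b everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    front = {}
    for name, cost in candidates.items():
        dominated = any(
            dominates(other_cost, cost)
            for other_name, other_cost in candidates.items()
            if other_name != name
        )
        if not dominated:
            front[name] = cost
    return front


# Hypothetical mappings with (power, critical path delay, area), all minimized.
candidates = {
    "A": (10.0, 5.0, 16),
    "B": (8.0, 7.0, 16),
    "C": (12.0, 6.0, 20),   # worse than A in every metric
    "D": (9.0, 4.5, 25),
}

print(sorted(pareto_front(candidates)))  # ['A', 'B', 'D']
```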
  • Dataflow programming model for reconfigurable computing

    Publication Year: 2011 , Page(s): 1 - 8
    Cited by:  Papers (7)

    This paper addresses the problem of implementing image processing algorithms on dynamically reconfigurable architectures. Today, these Systems-on-Chip (SoCs) make it possible to implement several heterogeneous processing elements in a single chip: several processors, a few hardware accelerators, and the communication media between all these components. Applications for this kind of platform are described with software threads running on processors and specific hardware accelerators running on hardware partitions. This paper focuses on the complex problem of managing communication between software and hardware actors for dataflow-oriented processing, and proposes solutions to address this issue.
  • Evaluation of speculative execution techniques for high-level language to hardware compilation

    Publication Year: 2011 , Page(s): 1 - 8
    Cited by:  Papers (4)

    The PreCoRe approach allows the automatic generation of application-specific microarchitectures from C, thus supporting complex speculative execution on reconfigurable computers. In this work, we present the PreCoRe capability of using data-value speculation to reduce the latency of memory reads, as well as the lightweight extension of static datapath controllers to the dynamic replay of misspeculated operations. The experimental evaluation considers the performance / area impact of the approach and also discusses the individual effects of combining different speculation mechanisms.
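The interplay of data-value speculation and replay can be sketched in software. The example assumes a simple last-value predictor and a dependent "add one" operation, both chosen here purely for illustration; the predictor and controller design actually used by PreCoRe are not described in the abstract.

```python
def run_with_value_speculation(addresses, memory):
    """Process reads in order, speculating on each loaded value.

    A dependent operation (here: value + 1) starts from the predicted value
    before the read completes; it is replayed when the prediction was wrong.
    Returns the final results and the number of replays.
    """
    last_value = {}          # per-address last-value predictor (assumed scheme)
    results = []
    replays = 0
    for addr in addresses:
        predicted = last_value.get(addr, 0)
        speculative_result = predicted + 1       # computed early, hiding read latency
        actual = memory[addr]                    # the slow read finally returns
        if actual != predicted:
            replays += 1
            speculative_result = actual + 1      # replay the misspeculated operation
        last_value[addr] = actual
        results.append(speculative_result)
    return results, replays


# Hypothetical memory contents and access sequence.
memory = {0: 7, 1: 0}
results, replays = run_with_value_speculation([0, 0, 1, 0], memory)
print(results, replays)  # [8, 8, 1, 8] 1
```

Only the first read of address 0 is mispredicted; the later reads hit the predictor, so their dependent operations never need to be replayed.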
  • A reconfigurable fabric supporting full C/C++ input

    Publication Year: 2011 , Page(s): 1 - 6

    Reconfigurable architectures have been widely used to improve the performance of embedded applications. As most applications in this domain (e.g. video and audio standards) are traditionally specified in high-level programming languages (e.g. C/C++, Java, etc.), optimizing them with a hardware accelerator relies on translating source code from high-level to low-level languages (e.g. VHDL and Verilog) to program the reconfigurable fabric. However, there is no automatic process to perform such a translation. Moreover, the lower the level of the programming language, the harder it is to specify the algorithm manually, which can greatly affect the tight time-to-market imposed by the embedded market. In this paper, we present an easily programmed reconfigurable fabric that accelerates embedded applications in a totally transparent fashion. We propose the use of run-time binary translation hardware that translates C/C++ source code to the reconfigurable fabric code without human intervention. Experimental results show great speedups w.r.t. a general-purpose processor and an advantageous tradeoff between performance and software productivity w.r.t. a traditional FPGA.
  • A dependable and dynamic network on chip suitable for FPGA-based reconfigurable systems

    Publication Year: 2011 , Page(s): 1 - 6

    In this paper, we present a new reliable and dynamic NoC-based communication approach called RCuNoC, designed for FPGA-based reconfigurable systems. The originality of the RCu routers lies in their online capacity to detect data packet errors and to determine whether the errors come from the input ports or from inside the routers, while distinguishing between temporary and permanent errors. Our reliable router, which requires a low-area architecture, is based on the centralization of the buffer, routing logic and error detection/correction/localization blocks. We present the basic concept of the RCu switches and their main advantages with regard to the other main dynamic NoC approaches already proposed, and we demonstrate their feasibility on examples through simulation. Performance evaluation and FPGA implementation results are also given.
  • High-performance on-chip network platform for memory-on-processor architectures

    Publication Year: 2011 , Page(s): 1 - 6
    Cited by:  Papers (3)

    Three Dimensional Integrated Circuits (3D ICs) are emerging to improve existing Two Dimensional (2D) designs by providing smaller chip areas, higher performance and lower power consumption. Stacking memory layers on top of a multiprocessor layer (logic layer) is a potential solution to reduce wire delay and increase bandwidth. To fully exploit this capability, an efficient on-chip communication platform must be integrated in the logic layer. In this paper, we present an on-chip network platform for the logic layer that uses an efficient network interface to exploit the potential bandwidth of stacked memory-on-processor architectures. Experimental results demonstrate that the platform equipped with the presented network interface increases performance considerably.
  • Temperature-based covert channel in FPGA systems

    Publication Year: 2011 , Page(s): 1 - 7
    Cited by:  Papers (1)

    This paper reports a temperature-based covert communication channel implemented in an FPGA system. The channel enables bidirectional transmission and exchange of an arbitrary bit stream between two electrically separated parts of the FPGA circuit during its normal operation. Transmission to and from the FPGA device is also reported. The transmitter and receiver modules are based on ring oscillators, which use 60 and 51 look-up tables, respectively. The proof of concept was implemented in a Xilinx Spartan-IIE device and allows a transmission speed of 1/8 bit/s between the FPGA and an external transceiver. Internal communication is faster and allows transmission of up to 1 bit per second.
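To put the reported rates in perspective, the short calculation below estimates how long a payload would take to leak at 1/8 bit/s (external) and 1 bit/s (internal). The 128-bit payload is a hypothetical example, and the estimate ignores any framing or synchronization overhead, which the abstract does not describe.

```python
from fractions import Fraction

# Rates reported in the abstract, in bits per second.
EXTERNAL_RATE = Fraction(1, 8)   # FPGA <-> external transceiver
INTERNAL_RATE = Fraction(1)      # between two parts of the same FPGA (upper bound)


def transfer_seconds(n_bits, rate):
    """Time needed to send n_bits at the given rate, ignoring protocol overhead."""
    return n_bits / rate


key_bits = 128  # hypothetical payload, e.g. a 128-bit key
print(transfer_seconds(key_bits, EXTERNAL_RATE))  # 1024
print(transfer_seconds(key_bits, INTERNAL_RATE))  # 128
```

At the external rate, a 128-bit payload needs 1024 seconds, or roughly 17 minutes.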
  • Partial reconfiguration in the implementation of autonomous radio receivers for space

    Publication Year: 2011 , Page(s): 1 - 6
    Cited by:  Papers (1)

    In space missions there are different scenarios where autonomous radio systems are very useful. In this paper we consider one such scenario, related to the communication infrastructure for planetary exploration. The basic idea is to obtain significant advantages in the implementation of autonomous radio receivers by using FPGA dynamic partial reconfiguration. By implementing the most significant part of the radio, we show how this technique can be used and what advantages it offers. In particular, this design methodology reduces system complexity and power consumption and improves overall system reliability (mainly in relation to possible SEUs induced by ions in the configuration memory).
  • Self-reparable system on FPGA for single event upset recovery

    Publication Year: 2011 , Page(s): 1 - 6
    Cited by:  Papers (1)

    Mission-critical and reliable systems on FPGAs require error mitigation and recovery techniques to protect them from errors caused by high-energy radiation, also known as Single Event Upsets (SEUs). Different solutions have been reported with different trade-offs between area overhead and fault latency. We propose a self-reparable procedure with low area overhead, based on an internal error recovery mechanism that is monitored by an external watchdog timer in the role of diagnostic hardcore. The proposed procedure has been verified by extensive fault emulation experiments.
  • A self-reconfigurable platform for general purpose image processing systems on low-cost spartan-6 FPGAs

    Publication Year: 2011 , Page(s): 1 - 9
    Cited by:  Papers (3)

    There is still no partial reconfiguration tool support for low-cost Field Programmable Gate Arrays (FPGAs) such as the old-fashioned Spartan-3 and state-of-the-art Spartan-6 FPGA families by Xilinx. This forces designers and engineers who use the partial reconfiguration capability of FPGAs to choose expensive families such as Virtex-4, Virtex-5 and Virtex-6, which are officially supported by partial reconfiguration (PR) software. Moreover, Xilinx still does not offer a portable, dedicated self-reconfiguration engine for all of its FPGAs. Self-reconfiguration is achieved with general-purpose processors such as MicroBlaze and PowerPC, which are overqualified for this purpose. In this study, we propose a new self-reconfiguration mechanism for Spartan-6 FPGAs. This mechanism can be used to implement large and complex designs on small FPGAs, as chip area can be dramatically reduced by exploiting the dynamic partial reconfiguration feature for on-demand functionality loading and maximal utilization of the hardware. This approach is highly attractive for designing low-cost compute-intensive applications such as high-performance image processing systems. For Spartan-6 FPGAs, we have developed hard macros and exploited the self-reconfiguration engine, compressed Parallel Configuration Access Port (cPCAP) [1], that we designed for Spartan-3. The modified cPCAP core with block RAM controller, bitstream decompressor unit and Internal Configuration Access Port (ICAP) Finite State Machine (FSM) occupies only about 82 of 6,822 slices (1.2% of the whole device) on a Spartan-XC6SLX45 FPGA, and it achieves the maximum theoretical reconfiguration speed of 200MB/s (ICAP, 16-bit at 100MHz) proposed by Xilinx. We have also implemented a Reconfigurable Processing Element (RPE) whose arithmetic unit can be reconfigured on the fly. Multiple RPEs can be used to design a General Purpose Image Processing System (GPIPS) that can implement a number of different algorithms at runtime. As an illustrative example, we programmed the GPIPS on Spartan-6 to switch on demand between two applications, two-dimensional filtering and block matching.
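The two figures quoted in the abstract can be checked with a few lines of arithmetic. The slice counts, port width and clock frequency come from the abstract; the 100 KB partial bitstream size is a hypothetical value used only to show how a load time would be estimated.

```python
# Figures taken from the abstract.
SLICES_USED = 82
SLICES_TOTAL = 6822
ICAP_WIDTH_BITS = 16
ICAP_CLOCK_HZ = 100_000_000

# Share of the device occupied by the modified cPCAP core.
utilization_percent = 100 * SLICES_USED / SLICES_TOTAL

# Peak ICAP throughput: one 16-bit word (2 bytes) per clock cycle.
throughput_bytes_per_s = (ICAP_WIDTH_BITS // 8) * ICAP_CLOCK_HZ

# Hypothetical partial bitstream size, to estimate a best-case load time.
partial_bitstream_bytes = 100_000
load_time_ms = 1000 * partial_bitstream_bytes / throughput_bytes_per_s

print(round(utilization_percent, 1))   # 1.2
print(throughput_bytes_per_s)          # 200000000, i.e. 200 MB/s
print(load_time_ms)                    # 0.5
```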