
Proceedings of the Seventh International Workshop on Hardware/Software Codesign (CODES '99), 1999

Date 3-3 March 1999

  • Proceedings of the Seventh International Workshop on Hardware/Software Codesign (CODES'99) (IEEE Cat. No.99TH8450)

    Freely Available from IEEE
  • Author index

    Page(s): 213
    Freely Available from IEEE
  • A compilation-based software estimation scheme for hardware/software co-simulation

    Page(s): 85 - 89

    High-level cost and performance estimation, coupled with a fast hardware/software co-simulation framework, is a key enabler of a fast embedded system design cycle. Unfortunately, the problem of deriving such estimates without a detailed implementation available is very difficult. In this paper we focus on embedded software performance estimation. Current approaches use either behavioral simulation with (often manual) timing annotations, or a clock cycle-accurate model of instruction execution (e.g., an instruction set simulator). The former provides greater flexibility (no need to perform a detailed design) and high simulation speed, but cannot easily consider effects such as compiler optimization and processor architecture. The latter provides high accuracy, but requires a more detailed implementation model, and is much slower in general. We therefore developed a hybrid approach that incorporates some aspects of both. It provides a flexible and fast simulation platform that also considers compilation issues and processor features. The key idea is to use the GNU-C compiler (GCC) to generate “assembler-level” C code. This code can be annotated with timing information, and used as a very precise, yet fast, software simulation model. We report some experimental results that show the effectiveness of our approach, and we propose some future improvements.
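
The idea of a timing-annotated simulation model can be illustrated with a minimal sketch. It is written in Python rather than the assembler-level C the abstract describes, and the block names and cycle costs in `BLOCK_CYCLES` are invented for illustration; they are not taken from the paper. Each basic block charges a fixed cycle cost when it executes, so running the functional model also yields a cycle estimate.

```python
# Hypothetical per-basic-block cycle costs. In the scheme described above such
# numbers would be derived from compiler output; these values are invented.
BLOCK_CYCLES = {"entry": 4, "loop_body": 7, "exit": 2}


class CycleCounter:
    """Accumulates the estimated cycle count of an annotated run."""

    def __init__(self):
        self.cycles = 0

    def charge(self, block):
        # Add the fixed cost of one execution of the named basic block.
        self.cycles += BLOCK_CYCLES[block]


def sum_array(values, counter):
    """Functional model of a loop, annotated with one charge per basic block."""
    counter.charge("entry")
    total = 0
    for v in values:
        counter.charge("loop_body")
        total += v
    counter.charge("exit")
    return total


counter = CycleCounter()
result = sum_array([1, 2, 3], counter)
print(result, counter.cycles)
```

With three loop iterations the estimate is 4 + 3 * 7 + 2 = 27 cycles. A fixed cost per block ignores data-dependent effects such as cache misses, which a real estimator would have to model separately.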

  • A probabilistic performance metric for real-time system design

    Page(s): 90 - 94

    In the system-level design of a real-time embedded system, a major issue is to identify, from alternative architectures, the best one that satisfies the timing constraints. This issue leads to the need for a metric that is capable of evaluating the overall system timing performance. Some previous work in related areas focuses on predicting the system's timing performance based on a fixed computation time model. These approaches are often too pessimistic. Those that do consider varying computation times for each task are only concerned with the timing behavior of each individual task. Such predictions may not properly capture the timing behavior of the entire system. In this paper, we introduce a metric that reflects the overall timing behavior of real-time embedded systems (RTES). Applying this metric allows a comprehensive comparison of alternative system-level designs.

  • Power estimation for architectural exploration of HW/SW communication on system-level buses

    Page(s): 152 - 156

    The power consumption due to the HW/SW communication on system-level buses represents one of the major contributions to the overall power budget. A model to estimate the switching activity of the on-chip and off-chip buses at the system level has been defined to evaluate the power dissipation and to compare the effectiveness of power optimization techniques. The paper aims at providing a framework for architectural exploration of a system design, focusing on the power consumption estimation of memory communication. Experimental results, conducted on bus streams generated by a real microprocessor and a stream generator, show how the variation of cache parameters and the introduction of bus encoding at different levels of the memory hierarchy can affect the system power dissipation. Therefore, the proposed model can be effectively adopted to appropriately configure the memory hierarchy and the system bus architecture from the power standpoint.
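
As a rough illustration of what estimating bus switching activity involves, the sketch below counts bit toggles between consecutive words on a bus and compares plain binary addresses with a Gray-coded stream. The abstract does not say which bus encodings the paper evaluates; Gray coding is used here only as a familiar example, and the function names are hypothetical.

```python
def switching_activity(words, width=32):
    """Count bit toggles between consecutive words on a bus of the given width."""
    mask = (1 << width) - 1
    toggles = 0
    for prev, curr in zip(words, words[1:]):
        # XOR marks the bus lines that change; count the set bits.
        toggles += bin((prev ^ curr) & mask).count("1")
    return toggles


def to_gray(word):
    """Standard binary-to-Gray conversion (an assumed example encoding)."""
    return word ^ (word >> 1)


# A sequential address stream, as instruction fetches often produce.
addresses = list(range(8))
binary_toggles = switching_activity(addresses, width=3)
gray_toggles = switching_activity([to_gray(a) for a in addresses], width=3)
print(binary_toggles, gray_toggles)
```

For this sequential stream the binary bus toggles 11 times and the Gray-coded bus 7 times. Dynamic power scales with switching activity, so a count like this is the kind of quantity such a model can feed into a comparison of encodings.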

  • Aspects on system-level design

    Page(s): 209 - 210

    There are probably as many descriptions of system-level design as there are system designers and codesign researchers. To define or even try to describe system-level design in a few paragraphs is not an easy task. However, the early stages of any system design effort have a few characteristics in common, and two of the most important are incompleteness and exploration. We discuss some aspects related to the exploration of incompletely described electronic systems and indicate areas that deserve attention. The discussion is based on our industrial experience, and it is important to understand that not all the requirements on system-level design come from the application domain itself. Rather, they depend heavily on the economic and organisational context in which systems are developed.

  • Designing digital video systems: Modeling and scheduling

    Page(s): 64 - 68

    An advanced Digital Video Broadcasting (DVB) system is used as a design driver for an IF-based real-time design methodology explored in the ESPRIT/OMI COSY project. The design methodology is supported by the Felix VCC environment, provided by COSY partner Cadence, and a tool set developed for COSY. In this paper, we focus on two key aspects of the design: behavior modeling and code generation. For the behavior modeling, we present the model of computation used to represent the DVB and the technique for expressing this particular model with the more general model of computation supported by the Felix technology. In a companion paper, the architecture selection and communication refinement are described. Once the architecture is selected and a partitioning has been decided, the implementation phase starts. In this phase, for most system designs, a great deal of software has to be written to “customize” the programmable components of the architecture. Obtaining an optimized and correct-by-construction software implementation is fundamental in an effective design methodology. Here we focus on a software generation technique which aims to reduce run-time overhead for functions executed on a single CPU, by generating a minimal number of run-time tasks.

  • Optimizing geographically distributed timed cosimulation by hierarchically grouped messages

    Page(s): 100 - 104

    This paper presents a concept called hierarchically grouped messages to improve the performance of geographically distributed timed cosimulation. In the proposed method, messages which are transferred between simulators in a short period of simulated time are hierarchically grouped into a physical message to reduce the number of rollbacks in optimistic simulation as well as the communication overhead of message transfer. Experiments show the efficiency of the proposed method in an internationally distributed cosimulation environment.
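
The basic effect of grouping can be sketched as follows. This is a single-level simplification, not the hierarchical scheme of the paper: messages whose simulated timestamps fall within a window of the first message in a group are packed into one physical message. The `Message` type, the `group_by_window` function and the window value are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Message:
    timestamp: float  # simulated time at which the message is sent
    payload: str


def group_by_window(messages, window):
    """Pack messages into groups spanning at most `window` units of simulated time."""
    groups = []
    for msg in sorted(messages, key=lambda m: m.timestamp):
        # Join the current group if within `window` of its first message.
        if groups and msg.timestamp - groups[-1][0].timestamp <= window:
            groups[-1].append(msg)
        else:
            groups.append([msg])
    return groups


msgs = [Message(0.0, "a"), Message(0.4, "b"), Message(0.9, "c"), Message(5.0, "d")]
groups = group_by_window(msgs, window=1.0)
print([[m.payload for m in g] for g in groups])
```

Here four logical messages become two physical transfers. On a long-distance link where per-message latency dominates, fewer transfers is where a saving of this kind would come from; the rollback behaviour of optimistic simulation is not modelled in this sketch.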

  • Compiling Esterel into sequential code

    Page(s): 147 - 151

    This paper presents a novel compiler for Esterel, a concurrent synchronous imperative language. It generates fast, small object code by compiling away concurrency, producing a single C function requiring no operating system support for threads. It translates an Esterel program into an acyclic concurrent control-flow graph from which code is synthesized that runs instructions in an order respecting inter-thread communication. Exceptions and preemption constructs become conditional branches. Variables save control state; conditional branches restore it. Although designed for Esterel, this approach could be applied to compiling other synchronous concurrent languages.

  • A statechart based HW/SW codesign system

    Page(s): 162 - 166

    The Codesign Finite State Machine (CFSM) formal model provides a suitable approach for the description of hardware/software systems. The POLIS tool from Berkeley implements the CFSM methodology but currently relies on the textually based Esterel specification language as a high-level language for the description of individual CFSMs. The designer must then use the Ptolemy simulator to interconnect the CFSM network and perform co-simulation. This paper describes work in progress in developing a system which instead aims to use Statemate™, a statechart-based tool, for seamless specification and co-simulation of the entire CFSM network, whilst using the POLIS tool for C and VHDL code generation and performance estimation. This technique should give the clear advantages of using a graphical specification language together with a uniform co-simulation framework.

  • Scheduling with optimized communication for time-triggered embedded systems

    Page(s): 178 - 182

    We present an approach to process scheduling for synthesis of safety-critical distributed embedded systems. Our system model captures both the flow of data and that of control. The communication model is based on a time-triggered protocol. We take into consideration overheads due to communication and the execution environment. Communications have been optimized through packaging of messages into slots with a properly selected order and lengths. Several experiments demonstrate the efficiency of the approach.

  • Development of an optimizing compiler for a Fujitsu fixed-point digital signal processor

    Page(s): 2 - 6

    A common design methodology for embedded DSP systems is the integration of one or more digital signal processors (DSPs), program memory, and ASIC circuitry onto a single IC. Consequently, program memory size being limited, the criterion for optimality is that the embedded software must be very dense. We describe the development of an optimizing compiler, based on a retargetable compiler infrastructure, for the Fujitsu Elixir, a fixed-point DSP that is primarily used in cellular telephones. For small DSP benchmark programs (25-90 lines of C code), the average ratio of the size of compiler-generated code to the size of hand-written assembly code is 1.18. For a much larger program (more than 800 lines of C code), the ratio of the size of compiled code to the size of hand-written code is similar (1.14).

  • Instruction set selection for ASIP design

    Page(s): 7 - 11

    We describe an approach for application-specific processor design based on an extendible microprocessor core. Core-based design allows application-specific instruction processors to be derived from a common base architecture with low non-recurring engineering cost. The results of this application-specific customization of a common base architecture are families of related and largely compatible processors. These families can share support tools and even binary-compatible code which has been written for the common base architecture. Critical code portions are customized using the application-specific instruction set extensions. We describe a hardware/software co-design methodology which can be used with this design approach. The presented approach uses the processor core to allow early evaluation of ASIP design options using rapid prototyping techniques. We demonstrate this approach with two case studies, based on the implementation and evaluation of application-specific processor extensions for Prolog program execution, and memory prefetching for vector and matrix operations.

  • Graph based communication analysis for hardware/software codesign

    Page(s): 131 - 135

    In this paper we present a coarse-grain CDFG (Control/Data Flow Graph) model suitable for hardware/software partitioning of single processes. We demonstrate that various transformations must be performed on the graph structure before partitioning in order to achieve a structure that allows for accurate estimation of communication overhead between nodes mapped to different processors. In particular, we demonstrate how various transformations of control structures can lead to a more accurate communication analysis and more efficient implementations. The purpose of the transformations is to obtain a CDFG structure that is sufficiently fine grained to support a correct communication analysis, but not more fine grained than necessary, as this would increase partitioning and analysis time.

  • Timing-driven HW/SW codesign based on task structuring and process timing simulation

    Page(s): 203 - 207

    Task structuring is the process of determining the individual tasks of a system, leading to the system's description as a task graph. This paper shows that RADHA-RATAN, our rate derivation algorithms, can be used to validate various tradeoffs made during task structuring, making this step timing aware. We show how RADHA-RATAN enables construction of a high-level timing model of the system leading to a process timing simulation of the entire system. An interesting aspect of process timing simulation is that it provides the ability to observe system-level timing behavior based on timing requirements and analysis before an implementation of the tasks has been carried out. Based on task structuring and process timing simulation we propose a codesign methodology by which a system designer can gain insight into the system's timing performance. This approach enables the designer to reduce expensive timing-driven design iterations. We have implemented this methodology in the RADHA-RATAN framework. We illustrate its application by an example.

  • An MPEG-2 decoder case study as a driver for a system level design methodology

    Page(s): 33 - 37

    We present a case study on the design of a heterogeneous architecture for MPEG-2 video decoding. The primary objective of the case study is the validation of the SPADE methodology for architecture exploration. The case study demonstrates that this methodology provides a structured approach to the efficient evaluation of the performance of candidate architectures for selected benchmark applications. We learned that the MPEG-2 decoder can conveniently be modeled as a Kahn process network using a simple API. Abstract models of architectures can be constructed efficiently using a library of generic building blocks. A trace-driven simulation technique enables the use of these abstract models for performance analysis with correct handling of data-dependent behavior. We performed a design space exploration to derive how the performance of the decoder depends on the busload and the frame rate.

  • Hardware/software co-design of an avionics communication protocol interface system: an industrial case study

    Page(s): 48 - 52

    Hardware/software co-design is not a new idea, since designers have long mixed programmable and specific hardware components for algorithm implementation. However, with the growing complexity of systems, a computer-aided co-design methodology becomes essential. This paper presents an application from the avionics domain: the ARINC communication protocol interface system. The co-design approach is based on the POLIS framework, coupled with the Esterel specification language.

  • Fast prototyping: a system design flow for fast design, prototyping and efficient IP reuse

    Page(s): 69 - 73

    This paper describes a new design flow that significantly reduces time-to-market for highly complex multiprocessor-based system-on-chip (SOC) designs. This flow, called fast prototyping and put in place within STMicroelectronics, allows concurrent hardware and software development and early verification, and enables the productive re-use of intellectual property. We describe how this innovative system design flow, which combines technologies such as C modeling, emulation, hard virtual component re-use and CoWare N2C, achieves better productivity on a multiprocessor SOC design.

  • Worst-case analysis of discrete systems based on conditional abstractions

    Page(s): 115 - 119

    Recently, a methodology for worst-case analysis of systems with discrete observable signals has been proposed. We extend this methodology to make use of conditional system abstractions that are valid only in some system states. We show that the response-time analysis for single-processor systems is particularly well suited for use of such abstractions. We use an example to demonstrate that significantly better response-time bounds can be obtained using conditional abstractions.

  • Peer-based multithreaded executable co-specification

    Page(s): 105 - 109

    We are integrating language-based software and hardware behaviors in C/pthreads and Verilog for unrestricted peer execution of the domains, including bounded (finite) and unbounded notions of computer system modeling. Since we do not restrict the modeling currently available in each domain, our co-specification is inclusive of both reactive and data-intensive systems. By viewing all mixed system state as shared memory accessible by threads in each domain, we differentiate domains by system resource inferences. We introduce a unified multithreading model for execution and motivate the need to expand the specification capabilities currently available in each domain for mixed systems using widely accepted languages as a basis. We discuss specific aspects of our cosimulator, provide examples and results, and indicate future directions of our work.

  • Resource constrained dataflow retiming heuristics for VLIW ASIPs

    Page(s): 12 - 16

    This paper addresses issues in code generation of time-critical loops for VLIW ASIPs with heterogeneous distributed register structures. We discuss a code generation phasing whereby one first considers binding options that minimize the significant delays that may be incurred on such processors. Given such a binding, we consider retiming, subject to code size constraints, so as to enhance performance. Finally, a compatible schedule, minimizing latency, is sought. Our main focus in this paper is on the role retiming plays in this complex code generation problem. We propose heuristic algorithms for exploring code size/performance tradeoffs through retiming. Experimental results are presented indicating that the heuristics perform well on a sample of dataflows.

  • Using codesign techniques to support analog functionality

    Page(s): 79 - 83

    With the growth of System on a Chip (SoC), the functionality of analog components must also be considered in the design process. This paper describes some of the design implementation partitioning issues and experiences using analog and digital techniques for embedded systems. To achieve a quick turnaround for new embedded system development, a design methodology was extended for analog codesign based on the specify-explore-refine paradigm and system-level design methodology. Many system-level issues were addressed, including hardware/software codesign trade-offs.

  • Automatic detection of recurring operation patterns

    Page(s): 22 - 26

    An important problem in the area of processor design for embedded systems is determining the proper instruction set architecture. Trade-offs have to be made between programmability and reusability of dedicated hardware for special functionality on the one hand, and a high-performance dedicated instruction set on the other hand. This paper addresses the question of how to find specialized ISA extensions for a set of applications. We describe the application of a new pattern matching technique to the problem of the identification of recurring patterns of operations. By implementing frequently occurring operation patterns in hardware, and using this hardware as special function units, a fine-grained hardware/software partitioning can be found. The fine granularity, and the fact that patterns are taken from a number of different target applications rather than a single one, increase the opportunities for reuse of the special-purpose hardware. We illustrate our technique with experiments on a number of benchmarks from the DSP domain.
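
A very reduced version of recurring-pattern identification is sketched below. It is not the paper's pattern matching technique: it only counts producer/consumer operation pairs across several small dataflow graphs and reports the most frequent one. The graph representation and the two example graphs (`fir`, `dot`) are invented for illustration.

```python
from collections import Counter


def count_pair_patterns(graph):
    """Count (producer op, consumer op) pairs over the edges of a dataflow graph."""
    ops, edges = graph  # ops: node id -> operation name; edges: (src, dst) pairs
    return Counter((ops[src], ops[dst]) for src, dst in edges)


# Two hypothetical application kernels as tiny dataflow graphs.
fir = ({0: "mul", 1: "add", 2: "mul", 3: "add"}, [(0, 1), (2, 3), (1, 3)])
dot = ({0: "mul", 1: "add", 2: "shift"}, [(0, 1), (1, 2)])

# Accumulate pattern counts over all target applications, not just one.
totals = Counter()
for g in (fir, dot):
    totals += count_pair_patterns(g)

best_pattern, best_count = totals.most_common(1)[0]
print(best_pattern, best_count)
```

In this toy input the multiply-then-add pair occurs three times across both graphs, which makes it the obvious candidate for a fused special function unit. Patterns larger than two operations would need proper subgraph matching, which this sketch does not attempt.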

  • Optimized rapid prototyping for real-time embedded heterogeneous multiprocessors

    Page(s): 74 - 78

    This paper presents an enhancement of our “Algorithm Architecture Adequation” (AAA) prototyping methodology, which makes it possible to rapidly develop and optimize the implementation of a reactive real-time dataflow algorithm on an embedded heterogeneous multiprocessor architecture, predict its real-time behavior, and automatically generate the corresponding distributed and optimized static executive. It describes a new optimization heuristic that supports heterogeneous architectures and accurately takes into account inter-processor communications, which are usually neglected but may dramatically reduce multiprocessor performance.
