Hardware/Software Codesign, 2002. CODES 2002. Proceedings of the Tenth International Symposium on

Date 8-8 May 2002

Displaying Results 1 - 25 of 38
  • Proceedings of the Tenth International Symposium on Hardware/Software Codesign. CODES 2002 (IEEE Cat. No.02TH8627)

    Publication Year: 2002

    The following topics are dealt with: advances in system specification and system design frameworks; system design methods: analysis and verification; design space exploration and architectural design of HW/SW systems; co-design architecture and synthesis; system partitioning and timing analysis; energy efficiency in system design; system design methods: scheduling advances.
  • Algorithmic transformation techniques for efficient exploration of alternative application instances

    Publication Year: 2002 , Page(s): 7 - 12
    Cited by:  Papers (12)

    Following the Y-chart paradigm for designing a system, an application and an architecture are modeled separately and mapped onto each other in an explicit design step. Next, a performance analysis of alternative application instances, architecture instances, and mappings has to be done, thereby exploring the design space of the target system. Deriving alternative application instances is not trivial. Nevertheless, many instances of a single application exist that are worth deriving for exploration. We present algorithmic transformation techniques for systematic and fast generation of alternative application instances that express the task-level concurrency hidden in an application with some degree of explicitness. These techniques help a system designer to significantly speed up the design space exploration process.
  • Authors index

    Publication Year: 2002 , Page(s): 217
    Freely Available from IEEE
  • Communication speed selection for embedded systems with networked voltage-scalable processors

    Publication Year: 2002 , Page(s): 169 - 174
    Cited by:  Papers (9)

    High-speed serial network interfaces are gaining wide use in connecting multiple processors and peripherals in modern embedded systems, thanks to their size advantage and power efficiency. Many such interfaces also support multiple data rates, and this ability is opening a new dimension in the power/performance trade-offs between communication and computation on voltage-scalable embedded processors. To minimize energy consumption in these networked architectures, designers must not only perform functional partitioning but also carefully balance the speeds of communication and computation, which compete for time and energy. Minimizing communication power without considering computation may actually lead to higher energy consumption at the system level due to elongated on-time as well as lost opportunities for dynamic voltage scaling on the processors. We propose a speed selection methodology for globally optimizing the energy consumption in embedded networked architectures. We formulate a multidimensional optimization problem by modeling communication dependencies between processors and their timing budgets. This enables engineers to systematically solve the problem of optimal speed selection for global energy reduction. We demonstrate the effectiveness of our speed selection approach with an image processing application mapped onto a multi-processor architecture with a multi-speed Ethernet.
  • Pruning-based energy-optimal device scheduling for hard real-time systems

    Publication Year: 2002 , Page(s): 175 - 180
    Cited by:  Papers (4)

    Dynamic power management (DPM) provides a simple, elegant and flexible method for reducing energy consumption in embedded real-time systems. However, I/O-centric DPM techniques have been studied largely for non-real-time environments. We present an offline device scheduling technique for real-time systems that generates an energy-optimal device schedule for a given task set while guaranteeing that all real-time deadlines are met. Our method takes as inputs a task set and a device-usage list for each task, and it schedules the tasks such that the energy consumed by the set of I/O devices is minimized. We compare our algorithm to an exhaustive enumeration method and show that the proposed algorithm is very efficient in terms of memory usage and computation time. We also present case studies to show that I/O-centric DPM methods can result in significant energy savings.
  • Hardware support for real-time embedded multiprocessor system-on-a-chip memory management

    Publication Year: 2002 , Page(s): 79 - 84
    Cited by:  Papers (7)  |  Patents (2)

    The aggressive evolution of the semiconductor industry (smaller process geometries, higher densities, and greater chip complexity) has provided design engineers the means to create complex, high-performance Systems-on-a-Chip (SoC) designs. Such SoC designs typically have more than one processor and huge memory, all on the same chip. Dealing with global on-chip memory allocation/de-allocation in a dynamic yet deterministic way is an important issue for the upcoming billion-transistor multiprocessor SoC designs. To achieve this, we propose a memory management hierarchy we call Two-Level Memory Management. To implement this memory management scheme, which presents a paradigm shift in the way designers look at on-chip dynamic memory allocation, we present a System-on-a-Chip Dynamic Memory Management Unit (SoCDMMU) for allocation of the global on-chip memory, which we refer to as Level Two memory management (Level One is the operating system management of memory allocated to a particular on-chip Processing Element). In this way, processing elements (heterogeneous or non-heterogeneous hardware or software) in an SoC can request and be granted portions of the global memory in fast, deterministic time (for an example SoC with four processing elements, dynamic allocation of the global on-chip memory takes sixteen cycles per allocation/deallocation in the worst case). In this paper, we show how to modify an existing Real-Time Operating System (RTOS) to support the proposed SoCDMMU. Our example shows that a multiprocessor SoC that utilizes the SoCDMMU achieves a 440% overall speedup of the application transition time over a fully shared memory that does not utilize the SoCDMMU.
  • Energy frugal tags in reprogrammable I-caches for application-specific embedded processors

    Publication Year: 2002 , Page(s): 181 - 186
    Cited by:  Papers (1)

    This paper presents a software-directed customization methodology for minimizing the energy dissipation in the instruction cache (I-cache), one of the most power-consuming microarchitectural components of high-end embedded processors. We target the instruction cache tag operations in particular and show that an exceedingly small number of tag bits, if any, are needed to compute the miss/hit behavior for the most frequently executed application loops, thus minimizing the energy needed to perform the tag reads and comparisons. The proposed methodology exploits the fact that the code layout structure of the program loops can be identified after compile and link, and that it typically resides in a very confined memory location, for which very few bits from the effective address can be utilized as a tag. Subsequently, we present an efficient, programmable implementation to apply the suggested energy minimization technique. The experimental results show a significant decrease in energy dissipation for a set of real-life applications.
  • Multi-objective design space exploration using genetic algorithms

    Publication Year: 2002 , Page(s): 67 - 72
    Cited by:  Papers (18)

    In this work, we provide a technique for efficiently exploring a parameterized system-on-a-chip (SoC) architecture to find all Pareto-optimal configurations in a multi-objective design space. Globally, our approach uses a parameter dependency model of our target parameterized SoC architecture to extensively prune non-optimal subspaces. Locally, our approach applies genetic algorithms (GAs) to discover Pareto-optimal configurations within the remaining design points. The computed Pareto-optimal configurations will represent the range of performance (e.g., timing and power) tradeoffs that are obtainable by adjusting parameter values for a fixed application that is mapped on the parameterized SoC architecture. We have successfully applied our technique to explore Pareto-optimal configurations for a number of applications mapped on a parameterized SoC architecture.
  • Scratchpad memory: a design alternative for cache on-chip memory in embedded systems

    Publication Year: 2002 , Page(s): 73 - 78
    Cited by:  Papers (130)  |  Patents (19)

    In this paper we address the problem of on-chip memory selection for computationally intensive applications by proposing scratchpad memory as an alternative to cache. Area and energy for different scratchpad and cache sizes are computed using the CACTI tool, while performance was evaluated using the trace results of the simulator. The target processor chosen for evaluation was the AT91M40400. The results clearly establish scratchpad memory as a low-power alternative in most situations, with an average energy reduction of 40%. Further, the average area-time reduction for the scratchpad memory was 46% of the cache memory.
  • Dynamic run-time HW/SW scheduling techniques for reconfigurable architectures

    Publication Year: 2002 , Page(s): 205 - 210
    Cited by:  Papers (2)  |  Patents (1)

    Dynamic run-time scheduling in System-on-Chip platforms has recently become an active area of research because of the performance and power requirements of new applications. Moreover, dynamically reconfigurable logic (DRL) architectures are an exciting alternative for embedded systems design. However, all previous approaches to DRL multi-context scheduling and HW/SW scheduling for DRL architectures are based on static scheduling techniques. In this paper, we address this problem and present: (1) a dynamic scheduler hardware architecture, and (2) four dynamic run-time scheduling algorithms for DRL-based multi-context platforms. The scheduling algorithms have been integrated in our codesign environment, where a large number of experiments have been carried out. Results demonstrate the benefits of our approach.
  • Holistic scheduling and analysis of mixed time/event-triggered distributed embedded systems

    Publication Year: 2002 , Page(s): 187 - 192
    Cited by:  Papers (22)

    This paper deals with specific issues related to the design of distributed embedded systems implemented with mixed event-triggered and time-triggered task sets, which communicate over bus protocols consisting of both static and dynamic phases. Such systems are emerging as the new standard for automotive applications. We have developed a holistic timing analysis and scheduling approach for this category of systems. We have also identified several new design problems characteristic of such hybrid systems. An example related to bus access optimization in the context of a mixed static/dynamic bus protocol is presented. Experimental results prove the efficiency of such an optimization approach.
  • Transformation of SDL specifications for system-level timing analysis

    Publication Year: 2002 , Page(s): 121 - 126
    Cited by:  Papers (2)

    Complex embedded systems are typically specified using multiple domain-specific languages. After code-generation, the implementation is simulated and tested. Validation of non-functional properties, in particular timing, remains a problem because full test coverage cannot be achieved for realistic designs. The alternative, formal timing analysis, requires a system representation based on key application and architecture properties. These properties must first be extracted from a system specification to enable analysis. In this paper we present a suitable transformation of SDL specifications for system-level timing analysis. We show ways to vary modeling accuracy in order to apply available formal techniques. A practical approach utilizing a recently developed system model is presented.
  • Energy savings through compression in embedded Java environments

    Publication Year: 2002 , Page(s): 163 - 168
    Cited by:  Papers (3)  |  Patents (1)

    Limited energy and memory resources are important constraints in the design of an embedded system. Compression is a useful and widely employed mechanism to reduce the memory requirements of the system. As the leakage energy of a memory system increases with its size and because of the increasing contribution of leakage to overall system energy, compression also has a significant effect on reducing energy consumption. However, storing compressed data/instructions has a performance and energy overhead associated with decompression at runtime. The underlying compression algorithm, the corresponding implementation of the decompression, and the ability to reuse decompressed information critically impact this overhead. In this paper, we explore the influence of compression on overall memory energy using a commercial embedded Java virtual machine (JVM) and a customized compression algorithm. Our results show that, for most applications, compression is effective in reducing energy even when considering the runtime decompression overheads.
  • Hardware-software bipartitioning for dynamically reconfigurable systems

    Publication Year: 2002 , Page(s): 145 - 150
    Cited by:  Papers (3)  |  Patents (3)

    The main unique feature of dynamically reconfigurable systems is the ability to time-share the same reconfigurable hardware resources. However, the energy-delay cost associated with reconfiguration must be accounted for during hardware-software partitioning. We propose a method for mapping nodes of an application control flow graph either to software or reconfigurable hardware, explicitly targeting minimization of the energy-delay cost due to both computation and configuration. The addressed problems are energy-delay product minimization, delay-constrained energy minimization, and energy-constrained delay minimization. We show how these problems can be tackled by using network flow techniques, after transforming the original control flow graph into an equivalent network. If there are no constraints, as in the case of the energy-delay product minimization, we are able to generate an optimal solution in polynomial time.
  • Compiler-directed customization of ASIP cores

    Publication Year: 2002 , Page(s): 97 - 102
    Cited by:  Papers (2)

    This paper presents an automatic method to customize embedded application-specific instruction processors (ASIPs) based on compiler analysis. ASIPs, also known as embedded soft cores, allow certain hardware parameters in the processor to be customized for a specific application domain. They offer low design cost as they use pre-designed and verified components. Our design goal is to choose parameter values for the fastest runtime within a given silicon area budget for a particular application set. Present-day technologies for choosing parameter values rely on exhaustive simulation of the application set on all possible combinations of parameter values, a time-consuming and non-scalable procedure. We propose a compiler-based method that automatically derives the optimal values of parameters without simulating any configuration. Further, we expand the space of parameters that can be changed from the limited set available today, and evaluate the importance of each. Results show that for our benchmarks, the runtimes for different configurations are predicted with an average error of 2.5%. In the two area-constrained customization problems we evaluate, our method is able to recommend the same configuration that is recommended by brute-force exhaustive simulation.
  • Locality-conscious process scheduling in embedded systems

    Publication Year: 2002 , Page(s): 193 - 198
    Cited by:  Papers (2)

    In many embedded systems, the existence of a data cache might influence the effectiveness of process scheduling policy significantly. Consequently, a scheduling policy that takes inter-process data reuse into account might result in large performance benefits. In this paper, we focus on array-intensive embedded applications and present a locality-conscious scheduling strategy where we first evaluate the potential data reuse between processes, and then, using the results of this evaluation, select an order for process executions. We also show how process codes can be transformed by an optimizing compiler for increasing inter-process data reuse, thereby making locality-conscious scheduling more effective. Our experimental results obtained using two large, multi-process application codes indicate significant runtime benefits.
  • The design context of concurrent computation systems

    Publication Year: 2002 , Page(s): 19 - 24

    The design for performance optimization of programmable, semicustom SoCs requires the ability to model and optimize the behavior of the system as a whole. Neither the hardware-testbench style nor the software-benchmark style is adequate to capture completely the design interactions required in concurrent software-on-hardware systems. We use a formal relationship between a computer system design content and its external context to motivate the need to consider a more effective modeling framework to which concurrent software-on-hardware computer systems are designed.
  • Fast system-level power profiling for battery-efficient system design

    Publication Year: 2002 , Page(s): 157 - 162

    An increasing disparity between the energy requirements of portable electronic devices and available battery capacities is driving the development of new design methodologies for battery-efficient systems. A crucial requirement for battery-efficient system design is to be able to efficiently and accurately estimate battery life for candidate system architectures. Recently, efficient techniques have been developed to estimate battery life under given profiles of system power consumption over time. However, techniques for generating the power profiles themselves are either too cumbersome for system-level exploration or too inaccurate for battery life estimation. In this paper, we present a new methodology for efficiently and accurately generating power profiles for different system-level architectures. The designer can specify the manner in which (i) system tasks are mapped to a set of available implementations, and (ii) system communications are mapped to a specified communication architecture. For a given architecture, a power profile is automatically generated by analyzing an abstract representation of the system execution traces, while taking into account the selected implementations of the system's computations and communications. Experiments conducted on the design of an IEEE 802.11 MAC processor indicate that the power profiling approach offers run times that are several orders of magnitude lower than a simulation-based power profiling technique, while sustaining negligible loss of accuracy (average profiling error was observed to be less than 3.4%).
  • A novel codesign approach based on distributed virtual machines

    Publication Year: 2002 , Page(s): 109 - 114

    This paper describes a hardware/software codesign approach for the design of embedded systems based on digital signal processors and FPGAs. Our approach is based on distributed virtual machines for simulation and verification of the application on a Linux cluster and for running the application on different target architectures (DSPs, FPGAs) as well. The main focus is the description of the virtual machine, which was designed to make DSP applications portable across different platforms while maintaining optimal code.
  • Codesign-extended applications

    Publication Year: 2002 , Page(s): 1 - 6

    We challenge the widespread assumption that an embedded system's functionality can be captured in a single specification and then partitioned among software and custom hardware processors. The specification of some functions in software is very different from the specification of the same functions in hardware, too different to conceive of automatically deriving one from the other. We illustrate this concept using a digital camera example. We introduce the idea of codesign-extended applications to deal with this situation, wherein critical functions are written in multiple versions and integrated such that simple compiler/synthesis flags instantiate a particular version along with the necessary control and communication behavior. By capturing a specification as a codesign-extended application, a designer enables smooth migration among platforms with increasing amounts of on-chip configurable logic.
  • FPGA resource and timing estimation from Matlab execution traces

    Publication Year: 2002 , Page(s): 31 - 36
    Cited by:  Papers (7)  |  Patents (8)

    We present a simulation-based technique to estimate the area and latency of an FPGA implementation of a Matlab specification. During simulation of the Matlab model, a trace is generated that can be used for multiple estimations. For estimation, the user provides some design constraints such as the rate and bit width of data streams. In our experience, the runtime of the estimator is only about 1/10 of the simulation time, which is typically fast enough to generate dozens of estimates within a few hours and to build cost-performance trade-off curves for a particular algorithm and input data. In addition, the estimator reports on the scheduling and resource binding used for estimation. This information can be utilized not only to assess the estimation quality, but also as a starting point for the final implementation.
  • Extended quasi-static scheduling for formal synthesis and code generation of embedded software

    Publication Year: 2002 , Page(s): 211 - 216
    Cited by:  Papers (1)

    With the computerization of most daily-life amenities such as home appliances, the software in a real-time embedded system now accounts for as much as 70% of a system design. On one hand, this increase in software has made embedded systems more accessible and easy to use, while on the other hand, it has also necessitated further research on how complex embedded software can be designed automatically and correctly. Enhancing recent advances in this research, we propose an Extended Quasi-Static Scheduling (EQSS) method for formally synthesizing and automatically generating code for embedded software, using the Complex-Choice Petri Nets (CCPN) model. Our method improves on previous work in three ways: (1) by removing model restrictions to cover a much wider range of applications, (2) by proposing an extended algorithm to schedule the less restricted model, and (3) by implementing a code generator that can produce multi-threaded embedded software programs. The requirements of the embedded software are specified by a set of CCPNs, which is scheduled using EQSS such that the schedules satisfy limited embedded memory requirements and task precedence constraints. Finally, a POSIX-based multi-threaded embedded software program is generated in the C programming language. Through an example, we illustrate the feasibility and advantages of the proposed EQSS method.
  • HW/SW partitioning and code generation of embedded control applications on a reconfigurable architecture platform

    Publication Year: 2002 , Page(s): 151 - 156
    Cited by:  Papers (11)

    This paper studies the use of a reconfigurable architecture platform for embedded control applications aimed at improving real-time performance. The HW/SW codesign methodology from POLIS is used. It starts from high-level specifications, optimizes an intermediate model of computation (extended finite state machines), and derives both hardware and software based on performance constraints. We study a particular architecture platform, which consists of a general-purpose processor core augmented with a reconfigurable function unit and data-path to improve run-time performance. A new mapping flow and algorithms to partition hardware and software are proposed to generate implementations that best utilize this architecture. Encouraging preliminary results are shown for automotive electronic control examples.
  • Hardware-software cosynthesis of multi-mode multi-task embedded systems with real-time constraints

    Publication Year: 2002 , Page(s): 133 - 138
    Cited by:  Papers (13)

    An embedded system is called multi-mode when it supports multiple applications by dynamically reconfiguring the system functionality. This paper proposes a hardware-software cosynthesis technique for multi-mode multi-task embedded systems with real-time constraints. The cosynthesis problem involves three subproblems: selection of appropriate processing elements, mapping and scheduling of function modules to the selected processing elements, and schedule analysis. The proposed cosynthesis framework defines an iteration loop of three steps that solve the subproblems separately. One of the key benefits of such a modular approach is extensibility and adaptability. Moreover, unlike previous approaches, the proposed technique considers task sharing between modes and hardware sharing between tasks at the same time. We demonstrate the usefulness of the proposed technique with a realistic multi-mode embedded system that supports three modes of operation with five different tasks.
  • Large exploration for HW/SW partitioning of multirate and aperiodic real-time systems

    Publication Year: 2002 , Page(s): 85 - 90

    This paper addresses the domain of fine- and coarse-grain HW/SW codesign for real-time systems-on-chip. We propose a new method for the real-time scheduling and the HW/SW partitioning of multi-rate or aperiodic tasks. The large design space exploration is based on parallelism/delay trade-off curves.