Signal Processing Systems, 2002. (SIPS '02). IEEE Workshop on

Date 16-18 Oct. 2002

  • IEEE Workshop on Signal Processing Systems (SIPS'02) (Cat. No.02TH8638)

    Freely Available from IEEE
  • Author index

    Page(s): 285 - 287
    Freely Available from IEEE
  • VLSI architecture design of rake receivers for cdma2000 systems

    Page(s): 183 - 188

    We propose a low-complexity architecture for rake receivers in cdma2000 systems. The hardware cost of rake receivers increases significantly in cdma2000 systems because the receivers must demodulate multi-path signals transmitted through multiple sub-carriers. We therefore present a novel architecture that adopts a multi-finger structure, arithmetic units shared among the fingers, and time-deskew buffers using a pre-combining technique. The results show that the proposed receiver reduces hardware complexity by about 49.4% compared with a conventional one.

  • Integration of MPEG-4 video tools onto multi-DSP architectures using AVSynDEx fast prototyping methodology

    Page(s): 207 - 212

    MPEG-4 is a response to the growing need for a coding method that facilitates access to visual objects in natural and synthetic moving pictures. Future real-time audio-visual applications using MPEG-4 will face very tight time constraints, which can be met by using several computation units. Sequential software solutions developed for single processors are hard to map onto multiprocessor architectures: doing so adds source code and computation and makes sub-optimal use of the architecture's parallelism. A functional data-flow description of the application is therefore a well-suited front end for an optimal multi-component implementation. This paper presents an MPEG-4 decoder described with such a formalism, which allows incremental building and easy updating of the algorithms. Furthermore, we show that our AVSynDEx methodology enables an optimized implementation on a multi-C6X platform.

  • The poly-phase shift filter for a time-shared DAC in a digital convergence system

    Page(s): 213 - 217

    A digital CRT convergence control system usually employs an independent digital-to-analog converter (DAC) for each of the red, green, and blue signals. If the DAC is time-shared among the R, G, and B signals, the complexity and size of the hardware can be greatly reduced. However, sharing the DAC introduces a time delay among the control signals, resulting in misconvergence and poor picture quality in projection TV. This paper proposes a digital convergence system that employs poly-phase shift filters to eliminate the time-delay problem in a time-shared DAC. The proposed system has been patented, successfully implemented in an ASIC, and is currently used in commercial production of projection TVs.

  • Speaking partner: an ARM7-based multimedia handheld device

    Page(s): 218 - 221

    We have implemented MP3 audio decoding, speech recognition, and image compression programs for a handheld foreign-language learning device, Speaking Partner. The hardware is based on a relatively low-performance RISC CPU, the ARM7TDMI. Several previously known software optimization techniques for RISC processors, as well as a few algorithm-specific optimization methods, are employed. Employing block data transfer instructions reduces the number of clock cycles more significantly than reducing the accuracy of multiplication does. The implementation results show that the ARM7-based CPU can run multimedia applications in a multi-tasking mode. The power consumption for each activity and each hardware component is also analyzed.

  • Design approaches for MPEG engines for broadband and mobile applications

    Page(s): 3 - 8

    This paper presents recent advances in MPEG-2/4 audio and video (AV) codec implementations for broadband and mobile applications. Several design approaches used to implement MPEG engines are introduced, based on special-purpose hardwired LSIs, programmable DSPs, media processors, embedded MPUs, and MPUs for personal computers (PCs). Then, design approaches for low-power, low-cost AV codec LSIs for broadband and mobile applications are discussed through recent implementation examples. Finally, future trends and challenges are discussed.

  • Battery aware task scheduling for a system-on-a-chip using voltage/clock scaling

    Page(s): 201 - 206

    Battery lifetime is a critical parameter in the operation of mobile computing devices. The lifetime of such devices is directly dependent on the battery discharge profile. In this paper we address the problem of task scheduling in single processor and multiprocessor systems such that the battery lifetime is maximized. We propose a procedure that achieves this by shaping the current load profile. The shaping algorithm makes extensive use of voltage/clock scaling and is guided by heuristics that are derived from the properties of the battery model. Simulations show that the proposed algorithm improves the battery lifetime significantly.

  • Parallel decoding of interleaved single parity check turbo product codes

    Page(s): 27 - 32

    In this paper, both complexity and performance aspects of serially concatenated 2-D single parity check turbo product codes are investigated. The extremely simple Max-Log-MAP decoding is alternatively derived with only three additions needed to compute each bit's extrinsic information. A parallel decoding structure is proposed to increase the decoding throughput, while a new helical interleaver is constructed to further improve the coding gain. For performance evaluation, (16, 14, 2)^2 single parity check turbo product codes with code rate 0.766 over an AWGN channel using QPSK are considered. The simulation results using Max-Log-MAP decoding show that the scheme can achieve a BER of 10^-5 at an SNR of 3.8 dB with 8 iterations. Compared to a turbo product code of the same rate and codeword length composed of extended Hamming codes, the considered scheme can achieve similar performance with much less complexity. Other implementation issues, such as finite-precision analysis and efficient sorting circuit design, are also presented.
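
As a rough illustration of why single parity check component codes decode so cheaply, the sketch below applies the textbook Max-Log-MAP (min-sum) rule to one even-parity check and computes the rate of a two-dimensional product code. It is not the three-addition derivation from the paper, whose details the abstract does not give. The function names and the sign convention (a positive log-likelihood ratio means bit 0) are assumptions made for this example.

```python
def spc_extrinsic(llrs):
    """Min-sum extrinsic LLR for each bit of one even-parity check.

    For bit i the magnitude is the smallest |LLR| among the other bits,
    and the sign is the product of the other bits' signs.
    """
    out = []
    for i in range(len(llrs)):
        others = llrs[:i] + llrs[i + 1:]
        sign = 1.0
        for v in others:
            if v < 0:
                sign = -sign
        out.append(sign * min(abs(v) for v in others))
    return out


def product_code_rate(n, k, dims=2):
    """Rate of a product code built from the same (n, k) code in each dimension."""
    return (k / n) ** dims


if __name__ == "__main__":
    print(spc_extrinsic([2.0, -1.0, 3.0]))
    # The quoted rate of 0.766 matches (14/16)^2 rounded to three places.
    print(round(product_code_rate(16, 14), 3))
```

The rate helper only reproduces the arithmetic behind the figure quoted in the abstract; the extrinsic rule is the generic one for a single parity check and says nothing about the interleaving used in the paper.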

  • Reconfigurable low power cell search engine for UMTS-FDD mobile terminals

    Page(s): 171 - 176

    This paper addresses initial acquisition for the UMTS-FDD W-CDMA standard. It presents an innovative cell searcher architecture optimized for acquisition speed and for low power consumption. The proposed architecture utilizes a memory-based digital matched filter with permuted processing order. The same filtering hardware can be reconfigured to process all the steps of the UMTS-FDD initial cell search. It can also perform other functions such as initial delay profiling, neighboring cell search and idle-mode timing alignment, which are typically carried out by the Rake receiver. These additional capabilities allow for further system-level power reduction since they avoid activating the Rake.

  • Initial memory complexity analysis of the AVC codec

    Page(s): 222 - 227

    The Advanced Video Codec (AVC), currently being defined in a joint standardisation effort of ISO/IEC MPEG and ITU-T VCEG, aims at enhanced compression efficiency and network friendliness. To achieve these goals, a motion-compensated hybrid DCT algorithm is introduced using advanced and complicated compression tools. As video coding is typically a data-dominated process, we quantify the complexity cost in a memory-centric way. The AVC codec is characterised by a large memory footprint and an increased data transfer rate (an order of magnitude for the encoder) compared to previous video coding standards. Motion estimation and compensation are the initial implementation bottlenecks.

  • Parallel architecture for video processing in a smart camera system

    Page(s): 9 - 14

    In this paper, we present our research on parallel architectures for a smart camera system. We analyze the available data independencies for a particular application, namely human detection and activity recognition, and discuss the potential architectures to exploit the parallelism resulting from these independencies. Three architectures (VLIW, symmetric parallel, and macro-pipeline) are discussed and their performance is presented.

  • Low-power turbo equalizer architecture

    Page(s): 33 - 38

    In this paper, we propose a low-complexity architecture for turbo equalizers. Turbo equalizers jointly equalize and decode the received signal by exchanging soft information iteratively. The proposed architecture terminates the iterative process early when doing so does not impact the bit-error rate (BER). Early termination enables powering down parts of the soft-input soft-output (SISO) equalizer and decoder, thereby saving power. Simulation results show that the complexity is reduced by 20% to 59% in equalization and by 8% to 58% in decoding. In addition, the number of iterations is reduced by 30% to 47% with negligible degradation in BER.

  • The Gauss-Seidel fast affine projection algorithm

    Page(s): 109 - 114

    In this paper we propose a new stable fast affine projection algorithm based on Gauss-Seidel iterations (GSFAP). We investigate its implementation using the logarithmic number system (LNS) and compare it with two other fast affine projection (FAP) algorithms. Simplified and multi-input GSFAP versions are also proposed. We show that the algorithm is only marginally more complex than NLMS and simpler than other FAP algorithms. Its application for acoustic echo cancellation is also investigated.
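
The Gauss-Seidel iteration that gives GSFAP its name can be sketched as follows. This is a generic solver for a linear system R p = b, not the GSFAP update itself: the abstract does not spell out how the iteration is embedded in the affine projection recursion, so the function name, the example matrix, and the sweep count are illustrative assumptions.

```python
def gauss_seidel(R, b, p=None, sweeps=1):
    """Run Gauss-Seidel sweeps on R p = b, starting from p (zeros by default).

    Each component is updated in place, so later components in the same
    sweep already use the freshly updated earlier ones.
    """
    n = len(b)
    p = [0.0] * n if p is None else list(p)
    for _ in range(sweeps):
        for i in range(n):
            s = sum(R[i][j] * p[j] for j in range(n) if j != i)
            p[i] = (b[i] - s) / R[i][i]
    return p


if __name__ == "__main__":
    # Small diagonally dominant system, chosen only for illustration.
    R = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(gauss_seidel(R, b, sweeps=25))  # approaches [1/11, 7/11]
```

Passing the previous solution as the starting point `p` and running a single sweep per sample is one plausible way such an iteration keeps per-sample cost low, though whether the paper does exactly this is not stated in the abstract.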

  • Audio application implementations on a block-floating-point DSP

    Page(s): 51 - 56

    Hierarchical block-floating-point (H-BFP) arithmetic is applied to a configurable DSP architecture. This new arithmetic has been proposed to resolve the trade-off between complexity and accuracy in implementing conventional block-floating-point arithmetic. This paper describes an actual implementation of the DSP architecture on a field-programmable gate array (FPGA) platform. Signal processing quality evaluation results are also presented for two audio applications realized on the DSP architecture.

  • A/D precision requirements for an ultra-wideband radio receiver

    Page(s): 270 - 275

    Ultra-wideband (UWB) radio is a new wireless technology that uses narrow pulses to transmit information. Implementing an "all-digital" UWB receiver has numerous potential benefits, ranging from low cost and ease of design to flexibility. Digitizing an RF signal near the antenna, however, introduces its own set of challenges and has traditionally been considered infeasible. A high-speed, high-resolution analog-to-digital converter (ADC) is difficult to design and extremely power-hungry. The viability of an "all-digital" architecture therefore hinges upon the specifications of this block. In this paper, we demonstrate that 4 bits of resolution are sufficient for reliable detection of a typical UWB signal that is swamped in noise and interference.

  • A 54 Mbps (3,6)-regular FPGA LDPC decoder

    Page(s): 127 - 132

    Applying a joint code and decoder design methodology, we develop a high-speed, partly parallel decoder architecture for (3, k)-regular LDPC codes, based on which a 9216-bit, rate-1/2 (3,6)-regular LDPC code decoder is implemented on a Xilinx FPGA device. When performing a maximum of 18 iterations per code block, this partly parallel decoder supports a maximum symbol throughput of 54 Mbps and achieves a BER of 10^-6 at 2 dB over an AWGN channel.

  • Design space exploration of streaming multiprocessor architectures

    Page(s): 228 - 234

    In this paper, we compare two design-space exploration approaches in terms of (1) speed of simulation versus accuracy of performance numbers, and (2) connection to trajectories for detailed design. The two approaches are the trace-driven approach and the control data flow graph approach. The first approach leads to the shortest simulation time but is insufficiently accurate in the performance numbers it provides. It also does not connect well to a trajectory for detailed design. The second approach leads to rather long simulation times, yet it can give fairly accurate performance numbers, and it produces results that can readily be taken as input for further design. The two approaches are somewhat extreme, in that several in-between methods can be conceived. We also describe our search for an exploration trajectory that would be in some sense "optimal" in terms of speed versus accuracy and closeness to a design trajectory. As expected, this trajectory lies somewhere between the two extremes mentioned above.

  • System-on-a-chip design for broadband communications


    Summary form only given. A brief overview of the broadband communications industry is given and several examples of broadband SoC designs are presented. The key challenges faced by design engineers are discussed.

  • QVGA/CIF resolution MPEG-4 video codec based on a low-power and general-purpose DSP

    Page(s): 15 - 20

    This paper describes a QVGA/CIF resolution MPEG-4 video codec based on a low-power, general-purpose DSP (NEC μPD77210, 160 MHz, 80 mW, 1.5 V). To enhance video codec performance, the codec employs fast algorithms, including, in motion estimation, a successive similarity detection algorithm (SSDA; a fast block-matching method) whose decision timing for termination of block matching is optimized. Further, the use of a software DMA queue reduces the wasteful DSP wait cycles that can result from massive access to external frame memories. The resulting implementation runs a QVGA × 15 fps codec, or CIF × 15 fps encoding, at 384 kbps in real time; these performance levels are sufficient for next-generation wireless videotelephony.
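
The early-termination idea behind SSDA-style block matching can be sketched as below: the sum of absolute differences (SAD) for a candidate is abandoned as soon as it can no longer beat the best match found so far. The abstract does not say when the paper's codec makes that decision, so checking once per row is an assumption here, as are the function names and the restriction to non-negative offsets (which keeps the indexing simple).

```python
def sad_with_early_exit(cur, ref, dx, dy, best):
    """SAD between block `cur` and the block of `ref` at offset (dx, dy).

    Returns None as soon as the running total reaches `best`, because
    this candidate can then no longer become the new best match.
    """
    total = 0
    for y, row in enumerate(cur):
        for x, pixel in enumerate(row):
            total += abs(pixel - ref[y + dy][x + dx])
        if total >= best:  # termination check, here once per row
            return None
    return total


def ssda_search(cur, ref, search):
    """Try every offset in [0, search] x [0, search]; return (offset, SAD)."""
    best, best_mv = float("inf"), None
    for dy in range(search + 1):
        for dx in range(search + 1):
            sad = sad_with_early_exit(cur, ref, dx, dy, best)
            if sad is not None:
                best, best_mv = sad, (dx, dy)
    return best_mv, best


if __name__ == "__main__":
    ref = [[4 * y + x for x in range(4)] for y in range(4)]
    cur = [row[1:3] for row in ref[2:4]]  # block copied from offset (1, 2)
    print(ssda_search(cur, ref, 2))
```

Checking more often aborts bad candidates sooner but adds comparison overhead, which is presumably the trade-off behind optimizing the decision timing.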

  • Fine-grain instruction scheduling for low energy

    Page(s): 258 - 263

    Energy consumption is becoming an increasingly important metric in designing computing systems. This paper presents an instruction scheduling algorithm to reduce energy consumption in processor data-paths. The unique aspect of our algorithm is that it is fine-granular; i.e., it works at pipeline-stage granularity. This is in contrast to current energy-aware instruction scheduling techniques, which work at instruction granularity. Our preliminary experimental results indicate that our fine-granular approach both leads to schedules with lower energy consumption (compared to coarse-grain techniques) and helps us better estimate the absolute data-path energy consumed by the code.

  • A VLSI architecture for interpolation in soft-decision list decoding of Reed-Solomon codes

    Page(s): 39 - 44

    The Koetter-Vardy algorithm is an algebraic soft-decision decoder for Reed-Solomon codes which is based on the Guruswami-Sudan list decoder. There are three main steps: 1) multiplicity calculation, 2) interpolation and 3) root finding. The Koetter-Vardy algorithm is challenging to implement due to the high cost of interpolation. We propose a VLSI architecture for interpolation that uses a transformation of the received word to reduce the number of iterations of the interpolation algorithm. We also show how the memory requirements can be reduced and how an important operation, the Hasse derivative, can be implemented efficiently in VLSI.
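
For readers unfamiliar with the Hasse derivative mentioned above, a minimal software sketch follows. It evaluates the (r, s)-th Hasse derivative of a bivariate polynomial Q(x, y) at a point (a, b) using the standard definition, the sum over i >= r and j >= s of C(i, r) C(j, s) q_ij a^(i-r) b^(j-s). To keep the arithmetic simple it works over a prime field GF(p), whereas Reed-Solomon decoders typically use GF(2^m); the dictionary representation of the polynomial and the function name are also assumptions for this example, and nothing here reflects the paper's VLSI structure.

```python
from math import comb


def hasse_derivative(coeffs, r, s, a, b, p):
    """(r, s)-th Hasse derivative of Q at (a, b) over GF(p), p prime.

    `coeffs` maps (i, j) to the coefficient of x^i y^j.
    """
    total = 0
    for (i, j), q in coeffs.items():
        if i >= r and j >= s:
            term = comb(i, r) * comb(j, s) * q
            term *= pow(a, i - r, p) * pow(b, j - s, p)
            total = (total + term) % p
    return total


if __name__ == "__main__":
    # Q(x, y) = (x - 1)^2 = x^2 - 2x + 1 over GF(7); -2 is 5 mod 7.
    Q = {(2, 0): 1, (1, 0): 5, (0, 0): 1}
    # Q passes through (1, 0) with multiplicity 2: the (0,0) and (1,0)
    # Hasse derivatives vanish there, while the (2,0) one does not.
    print([hasse_derivative(Q, r, 0, 1, 0, 7) for r in range(3)])
```

In interpolation, requiring all Hasse derivatives with r + s below a chosen multiplicity to vanish at a point is what forces the polynomial through that point with the given multiplicity.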

  • An adaptive filtering algorithm using mean field annealing techniques

    Page(s): 115 - 120

    We present a new approach to discrete adaptive filtering based on the mean field annealing algorithm. The main idea is to find the discrete filter vector that solves the matrix form of the Wiener-Hopf equations in a least-squares sense by a generalized mean field annealing algorithm. Simulations indicate that this approach, with complexity O(M^2) where M is the filter length, finds a solution comparable to the one obtained by the recursive least squares (RLS) algorithm but without the transient behavior of the RLS algorithm. Further advantages of the proposed algorithm over other methods such as RLS are that the filter coefficients are always bounded and that it facilitates fast recovery after an abrupt system change.

  • A single-chip IPSEC cryptographic processor

    Page(s): 133 - 138

    The need to secure the Internet has become a fundamental issue over the last decade, and the Internet Protocol Security (IPSec) standard, which incorporates cryptographic algorithms, has been developed as one solution to this problem. Typically, hardware implementations of cryptographic algorithms provide physical security and high speeds. In this paper a novel single-chip hardware IPSec cryptographic design is described, which comprises the Rijndael encryption algorithm and the HMAC-SHA-1 authentication algorithm. In particular, the design supports the cryptographic requirements of the IP Authentication Header (AH) and Encapsulating Security Payload (ESP) and any combination of these two protocols. Indeed, it is capable of supporting any application requiring authentication and/or encryption, such as wireless local area networks (WLANs), the Secure Socket Layer (SSL) protocol, virtual private networks (VPNs) and firewalls. The IPSec cryptographic design can provide both the necessary security and performance for phone line modems, T1 wireless and 10 Mbit/s Ethernet networks.
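
The authentication half of such a design can be illustrated in software with Python's standard `hmac` and `hashlib` modules. IPsec AH and ESP commonly use HMAC-SHA-1 truncated to 96 bits (HMAC-SHA-1-96, RFC 2404); the abstract does not say whether the chip truncates, so the truncation, the helper names, and the key and payload values below are assumptions for illustration only.

```python
import hashlib
import hmac


def hmac_sha1_96(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA-1 over `message`, truncated to the first 96 bits (12 bytes)."""
    return hmac.new(key, message, hashlib.sha1).digest()[:12]


def verify(key: bytes, message: bytes, icv: bytes) -> bool:
    """Recompute the integrity check value and compare in constant time."""
    return hmac.compare_digest(hmac_sha1_96(key, message), icv)


if __name__ == "__main__":
    key = b"\x0b" * 20          # example key, not a real secret
    packet = b"example payload"  # stands in for the authenticated bytes
    icv = hmac_sha1_96(key, packet)
    print(len(icv), verify(key, packet, icv), verify(key, packet + b"x", icv))
```

A hardware implementation would compute the same function in dedicated logic; the software form is shown only to make the input and output of the authentication step concrete.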

  • Efficient pseudo-noise sequence generation for spread-spectrum applications

    Page(s): 80 - 86

    Novel approaches for linear feedback shift register (LFSR) based pseudo-noise generators are presented. The principal approaches studied in this paper include 1) state-transition matrix based approaches, 2) state-transition matrix based approaches with lookahead, 3) state storage based approaches, and 4) polynomial multiplication based approaches. Synthesis results are provided to illustrate the area-delay trade-offs among the approaches.
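
The first two approaches in that list can be sketched in software as they are commonly understood: an LFSR step is a multiplication of the state vector by a fixed matrix A over GF(2), and lookahead by k steps uses the matrix power A^k to jump ahead in one multiplication. The feedback polynomial x^4 + x + 1, the tap convention, and all function names are assumptions chosen for this example, and the code does not model the paper's hardware architectures or their area and delay.

```python
def transition_matrix(n, taps):
    """State-transition matrix of an n-bit Fibonacci LFSR over GF(2).

    Rows 0..n-2 shift the state; the last row XORs the tapped bits.
    """
    A = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = 1
    for t in taps:
        A[n - 1][t] = 1
    return A


def mat_vec(A, v):
    """Matrix-vector product over GF(2)."""
    return [sum(a & b for a, b in zip(row, v)) % 2 for row in A]


def mat_mul(A, B):
    """Matrix-matrix product over GF(2)."""
    cols = list(zip(*B))
    return [[sum(a & b for a, b in zip(row, col)) % 2 for col in cols]
            for row in A]


def mat_pow(A, k):
    """A**k over GF(2) by repeated squaring (the lookahead matrix)."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    while k:
        if k & 1:
            result = mat_mul(result, A)
        A = mat_mul(A, A)
        k >>= 1
    return result


def pn_sequence(A, state, length):
    """Emit `length` output bits, one per single-step state update."""
    out = []
    for _ in range(length):
        out.append(state[0])
        state = mat_vec(A, state)
    return out


if __name__ == "__main__":
    A = transition_matrix(4, (0, 1))  # recurrence s[t+4] = s[t] ^ s[t+1]
    print(pn_sequence(A, [1, 0, 0, 0], 15))
    print(mat_vec(mat_pow(A, 8), [1, 0, 0, 0]))  # state after 8 steps
```

With a primitive polynomial of degree 4 the sequence repeats every 15 bits, and multiplying by the lookahead matrix gives the same state as stepping one bit at a time.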
