Signal Processing Systems, 2004. SIPS 2004. IEEE Workshop on

Date 13-15 Oct. 2004

  • 2004 IEEE Workshop on Signal Processing Systems Design and Implementation

    Page(s): 0_1
    Freely Available from IEEE
  • 2004 IEEE Workshop on Signal Processing Systems Design and Implementation (IEEE Cat. No.04TH8751)

    Freely Available from IEEE
  • Copyright page

    Page(s): ii
    Freely Available from IEEE
  • Organizing Committee

    Page(s): iii
    Freely Available from IEEE
  • Table of contents

    Page(s): v - ix
    Freely Available from IEEE
  • [Breaker page]

    Page(s): x
    Freely Available from IEEE
  • A low complexity algorithm for proportional resource allocation in OFDMA systems

    Page(s): 1 - 6

    Orthogonal frequency division multiple access (OFDMA) basestations allow multiple users to transmit simultaneously on different subcarriers during the same symbol period. This paper considers basestation allocation of subcarriers and power to each user to maximize the sum of user data rates, subject to constraints on total power, bit error rate, and proportionality among user data rates. Previous allocation methods have been iterative nonlinear methods suitable for offline optimization. In the special high subchannel SNR case, an iterative root-finding method has linear-time complexity in the number of users and N log N complexity in the number of subchannels. We propose a non-iterative method that is made possible by our relaxation of strict user rate proportionality constraints. Compared to the root-finding method, the proposed method waives the restriction of high subchannel SNR, has significantly lower complexity, and in simulation, yields higher user data rates.

  • A novel pipelined fast Fourier transform architecture for double rate OFDM systems

    Page(s): 7 - 11

    A high-throughput fast Fourier transform/inverse fast Fourier transform (FFT/IFFT) processor for double-rate wireless LAN, based on double-rate OFDM communication systems, is proposed. It is an efficiently pipelined radix-2 FFT architecture, which doubles the throughput with significant hardware reduction. The utilization rates of the multipliers and the processing elements reach 100%. The core size is 10 mm² with a power consumption of 208 mW at 20 MHz for data inputs with 15-bit word length, using 0.35 μm 1P4M CMOS technology.

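    For readers unfamiliar with the radix-2 decomposition that the abstract above builds on, the following is a minimal software sketch of a recursive radix-2 decimation-in-time FFT. It is not the pipelined hardware architecture of the paper; the function name `fft_radix2` is hypothetical and the input length is assumed to be a power of two.

```python
import cmath

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT (illustrative only).

    Assumes len(x) is a power of two.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])   # transform of even-indexed samples
    odd = fft_radix2(x[1::2])    # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: one twiddle multiplication feeds two outputs.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

    Each butterfly produces two outputs from one complex multiplication, which is the structure a pipelined hardware design maps onto its processing elements.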
  • Globally optimal tradeoff curves for OFDM PAR reduction [peak-to-average power ratio]

    Page(s): 12 - 17

    This paper describes an efficient convex optimization technique for computing the globally optimal tradeoff curves between OFDM peak-to-average power ratio (PAR), constellation error, and free carrier power. The OFDM system designer can select a suitable PAR reduction method by comparing the achieved performance of various algorithms with these optimal tradeoff curves. Simulation results are presented for the 802.11a/g WLAN standard. The power wasted in the free carriers can be substantially reduced by taking advantage of the allowed constellation error and by backing off 1 dB from the globally minimum PAR. A convex interior-point method reaches the desired tradeoff point within two iterations for both QPSK and 64-QAM.

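    As a reminder of the quantity being traded off in the abstract above, the sketch below computes the PAR of a single OFDM symbol from its subcarrier values. It evaluates the PAR at the Nyquist rate only (no oversampling) and uses a naive inverse DFT; the helper names `idft` and `par_db` are hypothetical and do not come from the paper.

```python
import cmath
import math

def idft(freq):
    """Naive O(N^2) inverse DFT, adequate for a small illustration."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]

def par_db(time_samples):
    """Peak-to-average power ratio of a sampled symbol, in dB."""
    powers = [abs(s) ** 2 for s in time_samples]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))

# Identical symbols on all 8 carriers add coherently into a single time-domain peak.
worst_case = par_db(idft([1 + 0j] * 8))
# A single active carrier gives a constant-envelope signal.
best_case = par_db(idft([1 + 0j] + [0j] * 7))
```

    With N carriers loaded identically the PAR equals N (about 9 dB for N = 8), while a single carrier yields 0 dB; practical symbols fall between these extremes.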
  • Energy-aware radio link control for OFDM-based WLAN

    Page(s): 18 - 23

    Next-generation wireless local area network (WLAN) terminals have to cope with increasing performance requirements while energy budgets are more and more constrained by portability. Next to low-power circuit and architecture design, system-level power management is a key technology to fill this gap. Recently, radio link control techniques have been proposed, not only as a way to maximize performance but also to reach energy awareness. Transmit rate and power are adapted to meet exactly the user requirements while minimizing the average power consumption. However, schemes proposed so far do not exploit the characteristics of the specific modulation scheme considered in most recent WLAN standards: orthogonal frequency division multiplexing (OFDM). In this paper, we design a practical energy-aware radio link control scheme, optimized for OFDM transceivers and compatible with current standards. Simulation results show up to 80% transceiver power reduction when compared with throughput-maximizing schemes.

  • Design of a lattice decoder for MIMO systems in FPGA

    Page(s): 24 - 29

    Hardware implementation of lattice decoding algorithms becomes a challenging task as the complexity of MIMO systems increases. This paper presents the design and implementation of a Schnorr-Euchner strategy based lattice decoder using an FPGA. This lattice decoding algorithm has high data dependency during the iterative closest lattice point search procedures. The parallelism of the algorithm is explored and efficient hardware architectures are developed with the decoding function on FPGA and the data preprocessing on DSP. The system prototype of the decoder shows that it supports 2.7 Mbits/s data rate on a Virtex2-1000 FPGA, and is about 4 times faster than a DSP-based lattice decoder. The bit error rate (BER) performance is also tested and verified with software simulation.

  • Reduced-complexity sphere decoding via detection ordering for linear multi-input multi-output channels

    Page(s): 30 - 35

    Sphere decoding is a powerful approach for maximum-likelihood (ML) detection over Gaussian multi-input multi-output (MIMO) linear channels. We propose a new detection ordering approach, which minimizes the corresponding diagonal element of the upper-triangular matrix R over all possible column permutations in each step of the QR decomposition. Compared with the previously proposed V-BLAST ZF-DFE ordering approach, our approach has two major advantages: (1) it is efficiently embedded in the QR decomposition with a small computational overhead, rendering itself suitable for fast-varying channels, while the V-BLAST ZF-DFE ordering is not suitable for fast-varying channels since it incurs large computation overhead; (2) the sphere decoder with our proposed detection ordering achieves 17%-69% and 9%-59% reductions in the number of multiplications and the number of additions, respectively, in comparison to that with the V-BLAST ZF-DFE ordering.

  • On the complexity and performance of strategies for symbol detection in multi-antenna wireless LAN

    Page(s): 36 - 41

    This paper discusses the symbol detection complexity of several orthogonal frequency division multiplexing wireless communication schemes using multiple-input multiple-output channels. We investigate spatial diversity (SD) and spatial multiplexing (SM) architectures. For the SD case we discuss an efficient example of space-time block coding, namely the Alamouti scheme. For the SM case we examine linear filtering, using both the zero-forcing and minimum mean squared error design criteria, and also the nonlinear maximum likelihood detection method. A comparison of the packet error rate performance of the various systems is then provided, followed by a thorough analysis of implementation complexity.

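    The Alamouti scheme mentioned in the abstract above has a particularly simple detector, which the sketch below illustrates for two transmit antennas and one receive antenna. It assumes a noiseless, flat channel that stays constant over two symbol periods and is perfectly known at the receiver; the function names are hypothetical.

```python
def alamouti_encode(s1, s2):
    """Return the (antenna 1, antenna 2) symbols for two consecutive slots."""
    return [(s1, s2), (-s2.conjugate(), s1.conjugate())]

def channel(slots, h1, h2):
    """Noiseless flat channel: one received sample per slot."""
    return [h1 * a + h2 * b for a, b in slots]

def alamouti_combine(r1, r2, h1, h2):
    """Linear combining that separates the two symbols."""
    gain = abs(h1) ** 2 + abs(h2) ** 2
    s1_hat = (h1.conjugate() * r1 + h2 * r2.conjugate()) / gain
    s2_hat = (h2.conjugate() * r1 - h1 * r2.conjugate()) / gain
    return s1_hat, s2_hat

s1, s2 = 1 + 1j, -1 + 1j                 # two QPSK symbols
h1, h2 = 0.8 - 0.3j, -0.2 + 0.9j         # example channel gains (assumed)
r1, r2 = channel(alamouti_encode(s1, s2), h1, h2)
est1, est2 = alamouti_combine(r1, r2, h1, h2)
```

    Detection needs only a handful of complex multiplications per symbol pair, which is why the scheme serves as the low-complexity reference point for spatial diversity.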
  • QRD and SVD processor design based on an approximate rotations algorithm

    Page(s): 42 - 47

    A silicon implementation of the approximate rotations algorithm capable of carrying the computational load of algorithms such as QRD and SVD, within the real-time realisation of applications such as adaptive beamforming, is described. A modification to the original approximate rotations algorithm to simplify the method of optimal angle selection is proposed. Analysis shows that fewer iterations of the approximate rotations algorithm are required compared with the conventional CORDIC algorithm to achieve similar degrees of accuracy. The silicon design studies undertaken provide direct practical evidence of superior performance with the approximate rotations algorithm, requiring approximately 40% of the total computation time of the conventional CORDIC algorithm, for a similar silicon area cost.

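    The conventional CORDIC algorithm that the abstract above uses as its baseline can be sketched as follows. This is the textbook rotation-mode iteration in floating point, not the approximate rotations algorithm of the paper; the function name `cordic_rotate` and the default of 16 iterations are assumptions made for illustration.

```python
import math

def cordic_rotate(x, y, angle, iterations=16):
    """Rotate (x, y) by `angle` radians using conventional CORDIC.

    Each step rotates by +/- atan(2^-i), which in hardware costs only
    shifts and adds. Converges for |angle| up to roughly 1.74 rad.
    """
    # Accumulated scaling introduced by the micro-rotations.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1 + 2.0 ** (-2 * i))
    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)
    return x / gain, y / gain
```

    Every iteration is executed regardless of the target angle and adds roughly one bit of accuracy, which is the iteration count the approximate rotations approach is reported to reduce.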
  • Oversampled channelized receiver for transmitted reference UWB system in the presence of narrowband interference

    Page(s): 48 - 52

    An oversampled frequency channelized receiver for UWB radio in a transmitted reference (TR) system is presented. Unlike previous work that assumes a white input noise, this paper includes the effects of the automatic gain controller (AGC) and the analog-to-digital converter (ADC) when large narrowband interference (NBI) is present. A detection method for the frequency channelized receiver when input noise is colored in the TR UWB system is proposed. The proposed receiver significantly outperforms the full band receiver.

  • Implementing a receiver for terrestrial digital video broadcasting in software on an application-specific DSP

    Page(s): 53 - 58

    Terrestrial digital video broadcasting (DVB-T) is currently being introduced in many European countries and is planned to supplement or replace current analogue broadcasting schemes in a large part of the world. It is also considered as an additional downlink medium for third-generation UMTS mobile telephones, where a special variant, DVB-H, is under development. Current DVB-T receivers are still built upon dedicated application-specific integrated circuits (ASICs). However, designing an ASIC is a tedious and expensive task. We show that it is possible to implement a DVB-T receiver in software on an application-specific digital signal processor (AS-DSP). We analyze the computational requirements of a DVB-T receiver and investigate its potential for parallelization. Further, we present our AS-DSP, the M5-DSP, which is based on a novel architecture and design methodology, and report on implementing the core algorithms of a DVB-T receiver on it.

  • An iterative decorrelating receiver for DS-UWB multiple access systems using biphase modulation

    Page(s): 59 - 64

    In this paper, we consider an application of an iterative decorrelating receiver to direct sequence ultra wideband (DS-UWB) multiple access systems, which utilize biphase modulation. As the number of users increases in the DS-UWB system, multiple access interference becomes a dominant source of performance degradation. In order to efficiently suppress multiple access interference, a multiuser receiver is required. The high computational complexity of the optimal multiuser receiver prohibits its application. The iterative decorrelating receiver approximates the conventional decorrelating receiver with lower computational complexity. According to the simulation results, the proposed decorrelating receiver clearly improves the system performance. In addition, the convergence characteristics of the proposed iterative decorrelator are investigated in terms of the optimal convergence constant and the error bound.

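    The abstract above does not spell out the iteration it uses. One common way to approximate a decorrelator without inverting the code correlation matrix R is a first-order (Richardson) iteration with a step size mu playing the role of a convergence constant. The sketch below shows that generic technique under this assumption; it is not taken from the paper, and the function name is hypothetical.

```python
def iterative_decorrelator(R, y, mu, iterations):
    """Approximate b = R^-1 y by b <- b + mu * (y - R b).

    R is the (symmetric, positive definite) code correlation matrix and
    y the matched-filter outputs. The iteration converges when
    0 < mu < 2 / lambda_max(R).
    """
    n = len(y)
    b = [0.0] * n
    for _ in range(iterations):
        residual = [y[i] - sum(R[i][j] * b[j] for j in range(n)) for i in range(n)]
        b = [b[i] + mu * residual[i] for i in range(n)]
    return b

# Two users with cross-correlation 0.3 sending bits +1 and -1 (noiseless).
R = [[1.0, 0.3], [0.3, 1.0]]
y = [1.0 * 1 + 0.3 * -1, 0.3 * 1 + 1.0 * -1]
soft = iterative_decorrelator(R, y, mu=1.0, iterations=30)
bits = [1 if v >= 0 else -1 for v in soft]
```

    For this kind of iteration the error shrinks by at most max|1 - mu * lambda| per step over the eigenvalues lambda of R, so the step size 2 / (lambda_min + lambda_max) gives the fastest convergence; in the example the eigenvalues are 0.7 and 1.3, which makes mu = 1 optimal.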
  • Two-stage interleaving network analysis to design area- and energy-efficient 3GPP-compliant receiver architectures

    Page(s): 65 - 70

    Interleaving is a key component of many digital communication systems where the encoded data is reshuffled prior to transmission to protect against burst errors. Coupled with multiplexing schemes, such multi-stage subsystems achieve the necessary quality and flexibility to support a variety of different services. In 3GPP, a 2-stage multiplexing channel interleaver network is adopted. Its state-of-the-art implementation is both memory- and control-intensive, since the deinterleaving is done explicitly, implying dedicated storage and processing units at each stage. In this paper, we show that the C-fold decimation property which characterizes typical block interleavers is preserved in 2-stage interleaving networks. Thus, the underlying architecture not only results in significant memory size and access rate reductions but also greatly simplifies control processing. A decline in memory size of up to 31% and in access energy of up to 54% has been observed for STMicroelectronics' 0.13 μm CMOS technology for various 3GPP capability classes.

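    A plain block interleaver illustrates the decimation property referred to in the abstract above: writing data row by row into C columns and reading column by column is the same as concatenating the C decimated subsequences of the input. The sketch assumes a basic interleaver without the inter-column permutation that the 3GPP interleavers add, and the function names are hypothetical.

```python
def block_interleave(data, columns):
    """Write row-wise into `columns` columns, then read column-wise."""
    if len(data) % columns != 0:
        raise ValueError("data length must be a multiple of the column count")
    rows = [data[i:i + columns] for i in range(0, len(data), columns)]
    return [row[c] for c in range(columns) for row in rows]

def decimation_view(data, columns):
    """Same permutation expressed as C-fold decimation of the input."""
    out = []
    for c in range(columns):
        out.extend(data[c::columns])  # every C-th sample, offset c
    return out

data = list(range(12))
interleaved = block_interleave(data, 4)
```

    Because the permutation reduces to strided addressing, the original order can be recovered through address generation rather than an explicit deinterleaving buffer.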
  • Packet transmission policies for battery operated communication systems

    Page(s): 71 - 76

    In this paper, we address the problem of designing battery-friendly packet transmission policies for wireless data transmission. Our objective is to maximize the battery lifetime of wireless devices subject to certain delay constraints. We present three packet transmission schemes and evaluate them with respect to battery performance. The first scheme, based on combining multiple packets, utilizes the battery charge recovery due to long idle periods. The second scheme, based on a modified version of lazy packet scheduling, draws lower current and is battery efficient. The third scheme, which is based on a combination of the above two schemes, has superior battery performance at the expense of larger average packet delay. All three schemes were simulated for a wireless communication framework with Internet traffic, and the results were validated.

  • Transport level performance-energy trade-off in wireless networks and consequences on the system-level architecture and design paradigm

    Page(s): 77 - 82

    Low power consumption is imperative to enable the deployment of broadband wireless connectivity in portable devices such as PDAs or smart telephones. Next to low-power circuit and architecture design, system-level power management is revealed to be a key technology for low power consumption. Recently, "lazy scheduling" has been proposed for system-level power reduction. It has been shown to be very effective and complementary to more traditional shutdown-based approaches. So far, analysis has been carried out from the viewpoint of medium access control (MAC) and data link control (DLC) layers. Yet, effective power management in radio communication requires consideration of end-to-end cross-layer interactions. In this paper, we analyze the implication of "lazy scheduling" from the transport layer perspective. It is shown that a key trade-off between queuing delay and physical layer energy drives the global trade-off between user throughput and system power. Conditions under which "lazy scheduling" is efficient are established and important conclusions on effective system-level architecture and cross-layer power management are drawn.

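    The delay-energy trade-off behind "lazy scheduling" can be seen from the Shannon capacity formula alone: sending a fixed number of bits more slowly requires less total energy. The sketch below assumes an ideal AWGN link with normalized bandwidth and noise density and ignores circuit power and path loss; it is an illustration of the principle, not the model used in the paper, and `transmit_energy` is a hypothetical name.

```python
def transmit_energy(bits, duration, bandwidth=1.0, noise_psd=1.0):
    """Minimum energy to deliver `bits` in `duration` over an AWGN link.

    From C = W * log2(1 + P / (N0 * W)), the power needed for rate R is
    P = N0 * W * (2**(R / W) - 1); energy is power times duration.
    """
    rate = bits / duration
    power = noise_psd * bandwidth * (2 ** (rate / bandwidth) - 1)
    return power * duration

# Same 8-bit payload, progressively more relaxed deadlines.
energies = [transmit_energy(8, t) for t in (1, 2, 4)]
```

    In these normalized units the energy falls from 255 to 30 to 12 as the allowed time doubles twice, at the cost of the packet spending longer in the queue.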
  • Novel low power pipelined FFT based on subexpression sharing for wireless LAN applications

    Page(s): 83 - 88

    This paper proposes a novel low-power multiplierless radix-4 single-path delay commutator (R4SDC) FFT processor architecture for wireless LAN (IEEE 802.11 standard) applications, where short FFTs are utilised in the implementation of the physical layer. The multiplierless architecture uses shift and addition operations to realize complex multiplications. By combining a new commutator architecture and low-power butterfly architectures with this approach, the resulting power savings are around 19% and 35% for 64-point and 16-point radix-4 FFTs respectively, as compared to a conventional FFT architecture based on a non-Booth-coded Wallace tree multiplier.

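    The shift-and-add idea in the abstract above can be illustrated with a single fixed twiddle coefficient. The sketch multiplies an integer sample by cos(pi/4) using the 8-bit approximation 181/256; the coefficient word length and the function name are assumptions for illustration, and the paper's subexpression-sharing scheme is not reproduced here.

```python
def mul_by_cos_pi_4(x):
    """Approximate x * cos(pi/4) with shifts and adds only.

    181 = 128 + 32 + 16 + 4 + 1, so x * 181 is five shifted copies of x
    summed together; the final right shift divides by 256.
    """
    return ((x << 7) + (x << 5) + (x << 4) + (x << 2) + x) >> 8
```

    For example, `mul_by_cos_pi_4(1000)` returns 707, against an exact value of about 707.1. Because short FFTs need only a few distinct twiddle factors, each one can be hard-wired in this way instead of using a general multiplier.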
  • A low power localization architecture and system for wireless sensor networks

    Page(s): 89 - 94

    Localization (or locationing) is a central concern for ubiquitous self-configuring sensor networks. The implementation of a distributed, least-squares-based localization algorithm is presented. Low power and energy dissipation are key requirements for sensor networks. As part of the sensor network, the localization system must also conform to these requirements. An ultra-low-power and dedicated hardware implementation of the localization system is therefore presented. The cost of fixed-point implementation is also investigated. The design is implemented in a 0.13 μm CMOS process. It dissipates 1.7 mW of active power and 0.122 nJ/op of active energy with a silicon area of 0.55 mm². The mean calculated location error due to fixed-point implementation is shown to be 6%.

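    A least-squares position fix of the kind the abstract above refers to can be sketched in a few lines. The version below linearizes the range equations by subtracting the last anchor's equation and solves the 2-D normal equations directly, in floating point and at a single node; the distributed and fixed-point aspects of the paper are not modelled, and the function name `locate` is hypothetical.

```python
import math

def locate(anchors, distances):
    """Least-squares 2-D position from anchor positions and ranges.

    Requires at least three non-collinear anchors.
    """
    (xn, yn), dn = anchors[-1], distances[-1]
    rows, rhs = [], []
    for (xi, yi), di in zip(anchors[:-1], distances[:-1]):
        # Subtracting the last range equation removes the x^2 + y^2 terms.
        rows.append((2 * (xi - xn), 2 * (yi - yn)))
        rhs.append(xi**2 - xn**2 + yi**2 - yn**2 - di**2 + dn**2)
    # Normal equations (A^T A) p = A^T b for the unknown position p.
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * v for r, v in zip(rows, rhs))
    b2 = sum(r[1] * v for r, v in zip(rows, rhs))
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
truth = (3.0, 4.0)
ranges = [math.hypot(truth[0] - ax, truth[1] - ay) for ax, ay in anchors]
estimate = locate(anchors, ranges)
```

    The computation reduces to multiply-accumulate operations and one 2 x 2 solve, which is the kind of workload a small dedicated datapath can handle.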
  • ELMMA: a new low power high-speed adder for RNS

    Page(s): 95 - 100

    Modular adders are fundamental arithmetic components that are employed in residue number system (RNS) based digital signal processing (DSP) systems. They are widely used in modular multipliers, residue to binary converters and in implementing other arithmetic operations such as scaling. In addition, increasing operating frequencies, as well as a growing demand for portable electronics, have brought power reduction to the forefront of modern design methodologies. Thus, the design of power efficient modular adders is of great significance if RNS circuits are to be utilized in future DSP systems. We propose a new modular adder that is based on the ELM addition algorithm. VLSI implementations using 0.13 μm standard-cell technology show that the proposed architecture not only exhibits power efficiency, but also delay × area efficiency when compared to existing modular adder designs in the literature.

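    For context, the function a modular adder implements can be written as a compute-both-and-select operation, which is the generic structure many hardware modular adders follow. The sketch below shows only that behaviour; it does not model the ELM-based carry computation of the proposed design, and the function name is hypothetical.

```python
def modular_add(a, b, modulus):
    """Return (a + b) mod modulus for residues 0 <= a, b < modulus.

    Hardware versions typically form a + b and a + b - modulus in
    parallel and let the sign of the second result choose the output.
    """
    total = a + b
    reduced = total - modulus
    return reduced if reduced >= 0 else total
```

    Since both inputs are already reduced, a single conditional subtraction suffices and no division is needed; for instance, `modular_add(5, 6, 7)` gives 4.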
  • Fast factorization architecture in soft-decision Reed-Solomon decoding

    Page(s): 101 - 106

    Reed-Solomon (RS) codes are among the most widely utilized block error-correcting codes in modern communication and computer systems. Compared to hard-decision decoding, soft-decision decoding offers considerably higher error-correcting capability. Among the soft-decision decoding algorithms, the polynomial time complexity Koetter-Vardy (KV) algorithm can achieve substantial coding gain for high-rate RS codes. In the KV algorithm, the factorization step can consume a major part of the decoding latency. A novel architecture based on root-order prediction is proposed to speed up the factorization step. As a result, the time-consuming exhaustive-search-based root computation in each iteration of the factorization step is circumvented with more than 99% probability. Using the proposed architecture, a speedup of 141% can be achieved over prior efforts for a (255, 239) RS code, while the area consumption is reduced to 31.9%.

  • A reduced complexity decoder architecture via layered decoding of LDPC codes

    Page(s): 107 - 112

    We apply layered belief propagation decoding to our previously devised irregular partitioned permutation LDPC codes. These codes have a construction that easily accommodates a layered decoding and we show that the decoding performance is improved by a factor of two in the number of iterations required. We show how our previous flexible decoding architecture can be adapted to facilitate layered decoding. This results in a significant reduction in the number of memory bits and memory instances required, in the range of 45-50%. The faster decoding speed means the decoder logic can also be reduced by nearly 50% to achieve the same throughput and error performance. In total, the overall decoder architecture can be reduced by nearly 50%.
