Circuits and Systems, IEEE Transactions on

Issue 10 • Date Oct 1989

  • A real-time video signal processor suitable for motion picture coding applications

    Page(s): 1259 - 1266

    A real-time video signal processor (VSP) system has been developed. The system employs three multiprocessor clusters, each of which has 12 video signal processor modules (VSPMs). A VSPM in a cluster processes its assigned subimage by using an overlap-save technique. Each cluster uses the same multiprocessor structure: homogeneous processor modules connected in parallel to input, output, and feedback buses, plus two bus switch units. By controlling these units, the clusters can be combined in pipeline and/or parallel forms. Each cluster also uses a variable delay unit which achieves up to one frame delay on the feedback bus. By using this unit, interframe processing can be carried out without using internal data memories in VSPMs for the frame delay. The bus switches and the variable delay unit increase flexibility for a variety of signal processing algorithms. The system performs 500 million operations per second and is currently used as a real-time evaluation system for low-bit-rate picture encoders.

  • A DSP architectural design for low bit-rate motion video codec

    Page(s): 1267 - 1274

    A new digital signal processor (DSP) architecture is presented. This DSP consists of the usual components, such as instruction set, buses, data memories, execution unit, address generators, sequencer, and direct memory access controller, optimized for video signal processing. A 24-bit 50-ns DSP called the digital image signal processor (DISP) has been developed using 1-μm CMOS technology. The performance of the DSP is evaluated by a benchmark test based on an actual video coding sequence. A multi-DSP configuration for a video codec that allows flexible algorithms and variable picture formats is studied. A low-bit-rate motion video codec can be built very easily using the DSPs presented by the authors.

  • A memory control chip for formatting data into blocks suitable for video coding applications

    Page(s): 1275 - 1280

    Most compression algorithms for motion television require large data storage, usually several television fields, and typically operate on blocks of data. A chip has been built to support both of these features. It generates, from a single clock source, all of the control and address signals required by standard off-the-shelf dynamic RAMs (DRAMs). This includes data packing and unpacking and automatic refresh when required. Counters are provided to address the data into and out of the memories in the form of blocks. The block sizes and field dimensions are programmable and are independent for read and write operations. Thus, one set of counters can be programmed for sequentially scanned data coming from a camera or going to a television monitor, and the other set of counters can be programmed for the block size employed in the compression hardware. Blocks of data can be accessed either continuously or one at a time. When data are read from the memories, a single pel-width pulse marks the start of valid data. Signals marking both the end of the block and the end of the field have also been provided to ease system interfacing.

  • A VLSI architecture for a pel recursive motion estimation algorithm

    Page(s): 1291 - 1300

    Motivated by a growing demand for an efficient motion-compensated (MC) coder operating in real time, the authors propose a VLSI architecture based on parallel and pipelined processing for implementing the pel-recursive motion estimation algorithm for predictive coding of time-varying images. In order to maximize the processing concurrency, the displacement estimation process is divided into its integer and fractional part calculations, and the displacement estimation and the interpolation calculations are decoupled so that each calculation can be computed on a separate processor. The proposed architecture, which exploits pipelining, parallelism, and simple adjacent-neighbor interprocessor wiring, is appropriate for VLSI implementation. The performance of the proposed architecture on real image sequences is evaluated. Issues regarding the fixed-point arithmetic and coding are discussed.

  • Parameterizable VLSI architectures for the full-search block-matching algorithm

    Page(s): 1309 - 1316

    Systolic VLSI architectures for implementing the full-search block-matching algorithm are described. A large range of data rates can be efficiently covered by the proposed architectures. The input bandwidth problem for the search-area data is solved by on-chip line buffers, allowing a low frame-buffer access rate. An architecture for block-scan data input is described in detail. A VLSI realization with a low transistor count can be achieved by linear arrays in conjunction with compact memory blocks based on three-transistor cells.
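
    For readers unfamiliar with the algorithm these architectures target, full-search block matching can be sketched in software as below. This is only an illustrative, sequential sketch and not the systolic design from the paper: the sum-of-absolute-differences (SAD) cost, the function names `sad` and `full_search`, and the parameters `n` (block size) and `p` (search range) are assumptions chosen for the example.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n-by-n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, bx, by, n, p):
    """Test every displacement in [-p, p] x [-p, p] and return
    (best_dx, best_dy, best_cost). Candidates that would fall outside
    the reference frame are skipped."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue  # candidate block lies partly outside the frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

    Each block costs (2p + 1)² candidate positions times n² pixel differences, which is the kind of regular, compute-heavy workload that dedicated array hardware is meant to absorb.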

  • A family of VLSI designs for the motion compensation block-matching algorithm

    Page(s): 1317 - 1325

    A family of modular VLSI architectures and chip implementations of the motion-compensation full-search block-matching algorithm is described. This set of application-specific integrated circuits is motivated by the intensive computations required to perform motion compensation in real time. The architectures are based on data-flow designs, which allow sequential inputs but perform parallel processing with 100% efficiency. On the basis of these architectures, a programmable chip can be designed for motion vector estimation with different block sizes. The chips can be cascaded for a larger tracking range or for a video source with a higher pixel sampling rate. A chip-pair design is also derived for calculating fractional motion vectors with quarter-pel precision. The chip-pair design has been laid out, and the chip characteristics are given. Test circuitry is also included to increase the testability of the chips.

  • A 50-MHz CMOS geometrical mapping processor

    Page(s): 1360 - 1364

    A recently developed microprogrammable, high-resolution, real-time geometrical mapping processor VLSI is presented. The processor computes pixel addresses within the frame-buffer memory according to user-specified geometrical mapping functions. Its architecture permits high-speed operations and library extensions through a combination of elementary functions. It includes a CORDIC function generator, consisting of a one-dimensional pipeline array with high-speed parallel arithmetic circuits, and a pipeline control method. This results in a 50-MHz throughput rate with an accuracy of 20 bits using 1.2-μm CMOS technology. The processor will be useful in high-definition television (HDTV) systems.
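
    As background on the CORDIC technique named above, the following is a minimal rotation-mode sketch that computes a cosine and sine using only shift-like scalings and additions. It is not the chip's implementation: the abstract does not describe the number format or iteration count, so the floating-point arithmetic, the function name `cordic_rotate`, and the default of 20 iterations are assumptions made for illustration.

```python
import math


def cordic_rotate(angle, iterations=20):
    """Rotation-mode CORDIC: return (cos(angle), sin(angle)).

    Valid for angles within the CORDIC convergence range, which
    includes [-pi/2, pi/2]."""
    # Elementary rotation angles atan(2^-i), normally stored in a small table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector; precompute the total gain.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    # Start from a pre-scaled unit vector so the result has length 1.
    x, y, z = 1.0 / gain, 0.0, angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0      # rotate toward zero residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x, y
```

    Because every iteration has the same shape (two scaled additions and a table lookup), the loop unrolls naturally into one pipeline stage per iteration.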

  • Array architectures for block matching algorithms

    Page(s): 1301 - 1308

    VLSI architectures for block-matching algorithms utilizing systolic array processors are described. A well-known mapping procedure has been applied to derive the array processors from the algorithm. Examples of two- and one-dimensional systolic arrays are presented. The transistor count of the architectures using presently available CMOS technology and their maximum processable frame rates for real-time computation of video signals have been estimated.

  • A VLSI architecture for real-time and flexible image template matching

    Page(s): 1336 - 1342

    A modular and flexible architecture that realizes a parallel algorithm for real-time image template matching is described. Symmetrically permuted template data (SPTD) are employed in this algorithm to obtain a processing structure with a high degree of parallelism and pipelining, reduce the number of memory accesses to a minimum, and eliminate the use of delay elements that render the dimensions of the search area to be processed unchangeable. The inherent temporal parallelism and spatial parallelism of the algorithm are fully exploited in developing the hardware architecture. The latter, which is mainly constructed from two types of basic cells, exhibits a high degree of modularity and regularity. The architecture is especially suitable for applications in which adjustments to the dimensions of the search area are constantly required. A hardware prototype has been constructed using standard integrated circuits for moving-object detection and interframe motion estimation. It is capable of operating on a search area of size up to 256×256 pixels in real time.

  • An efficient ASIC architecture for real-time edge detection

    Page(s): 1350 - 1359

    An efficient application-specific architecture is presented for a real-time edge detection system. The architecture is based on the cooperating-data-path model, which allows both the throughput and the area to be optimized for this recursive algorithm. Careful scheduling of the operations on the partly parallel, partly shared hardware has allowed the load to be balanced on each of the four data paths. In this way, the inherently high degree of concurrency in the algorithm has been effectively exploited in the parallel pipelined hardware. The layout of the data paths has been generated by means of powerful CAD tools and the use of a parameterizable functional-building-block library. The corresponding global controller has been partitioned in order to optimize the critical path. This has increased the achievable clock rate even further, up to 10 MHz. The stringent I/O requirements have been taken into account. The resulting ASIC has been verified by register-transfer simulation. It is more than twice as fast as existing designs. The effectiveness of the cooperating-data-path model is thus clearly substantiated by this large, practical test vehicle.

  • VLSI architecture for digital picture comparison

    Page(s): 1326 - 1335

    The authors propose a VLSI architecture consisting of m×n processing elements with extensive parallel and pipelined computational capabilities. The worst-case time complexity is reduced to O(max(m, n)), which is a significant improvement over the uniprocessor approach. The algorithm partition problem, an important issue in VLSI design, and the verification of the proposed architecture are also studied. A series of experiments conducted to verify the proposed algorithms is described.

  • A high speed image codec VLSI for document retrieval

    Page(s): 1343 - 1349

    A new high-speed expansion scheme for binary image data compressed using the MMR code is presented. The proposed scheme enables the output image data to flow at nearly the maximum data flow rate of 1 byte per machine cycle, even for a rather complicated image, such as CCITT reference document No. 4. A key to this performance is the proposed concept of parallel image streaming combined with bit-parallel, concurrent decoding of the next codeword. Based on these concepts, a VLSI chip was designed, fabricated, and evaluated. The effectiveness of the scheme is evidenced by the fact that any of the CCITT reference documents are expanded in 0.11 to 0.13 s.

  • Bit-serial VLSI implementation of vector quantizer for real-time image coding

    Page(s): 1281 - 1290

    A practical high-throughput architecture and its implementation for real-time coding of television-quality signals are presented. The architecture is directed toward the implementation of multistage vector quantization (VQ), as the authors' simulation results show that the latter is more suitable for real-time coding. However, the implementation is suitable for both single-stage and multistage VQ. The functional blocks of the VQ encoder system have been designed and implemented in VLSI technology. The VQ encoding scheme designed has an encoding delay of 25 clock cycles and is independent of the codebook size.
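
    The idea of multistage VQ can be sketched in a few lines: each stage quantizes the residual left by the previous stage, and the decoder sums the selected codewords. The sketch below is a plain sequential search with a squared-error distance, not the bit-serial hardware from the paper; the function names and the list-of-lists codebook layout are assumptions made for illustration.

```python
def nearest(codebook, vec):
    """Index of the codeword with the smallest squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((c - v) ** 2 for c, v in zip(codebook[k], vec)),
    )


def msvq_encode(vec, codebooks):
    """Quantize `vec` stage by stage; return one codeword index per stage."""
    indices = []
    residual = list(vec)
    for codebook in codebooks:
        k = nearest(codebook, residual)
        indices.append(k)
        # The next stage sees only what this stage failed to represent.
        residual = [r - c for r, c in zip(residual, codebook[k])]
    return indices


def msvq_decode(indices, codebooks):
    """Reconstruct the vector as the sum of the selected codewords."""
    out = [0.0] * len(codebooks[0][0])
    for k, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[k])]
    return out
```

    With S stages of K codewords each, the encoder searches S·K codewords while representing up to K^S distinct reconstruction vectors, which is one common argument for multistage over single-stage VQ.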

  • Optimal image computations on reduced VLSI architectures

    Page(s): 1365 - 1375

    A communication-efficient parallel organization with a reduced number of processors is considered for problems in image processing and computer vision. The organization consists of n processors having row and column access to an n×n array of memory modules which stores an n×n image. It can be looked upon as a reduced mesh-of-trees organization in which the n² leaf processors are replaced by n² memory locations and each row (column) tree is replaced by a single processor with a row (column) bus. The class of image problems considered here requires dense data movement as well as global operations on image pixels. Examples include histogramming, image labeling, computing convexity and nearest neighbors. It is shown that while such problems can be solved in O(n) time on a two-dimensional mesh-connected computer with n² processors, they can also be solved on the proposed organization in O(n) time using n processors only. In addition, all of the parallel solutions presented are processor-time optimal solutions.
