
IEEE Transactions on Circuits and Systems for Video Technology

Issue 11 • Nov. 2003


Displaying Results 1 - 13 of 13
  • Introduction to the special issue on image-based modeling, rendering, and animation

    Publication Year: 2003 , Page(s): 1017 - 1019
    PDF (226 KB) | HTML
    Freely Available from IEEE
  • Spectral analysis for sampling image-based rendering data

    Publication Year: 2003 , Page(s): 1038 - 1050
    Cited by:  Papers (35)  |  Patents (1)
    PDF (1727 KB) | HTML

    Image-based rendering (IBR) has become a very active research area in recent years, but the spectral analysis problem for IBR has not been completely solved. In this paper, we present a new way to parameterize the problem that is applicable to general-purpose IBR spectral analysis. We observe that any plenoptic function is generated by light rays emitted, reflected, or refracted from object surfaces. We therefore introduce the surface plenoptic function (SPF), which represents the light rays originating from the object surface. Since radiance along a light ray does not change unless the ray is blocked, the SPF reduces the dimension of the original plenoptic function to 6-D. We can then map or transform the SPF to IBR representations captured along any camera trajectory. By assuming certain properties of the SPF, we can analyze the properties of IBR for generic scenes, such as scenes with Lambertian or non-Lambertian surfaces and scenes with or without occlusions, and for different sampling strategies such as light fields and concentric mosaics. We find that, in most cases, even though the SPF may be band-limited, the frequency spectrum of IBR is not. We show that non-Lambertian reflections, depth variations, and occlusions can all broaden the spectrum, with the latter two being more significant. The SPF is defined for scenes with known geometry, but spectral analysis remains possible when the geometry is unknown: using a "truncating windows" analysis together with conclusions obtained from the SPF, the spectrum expansion caused by non-Lambertian reflections and occlusions can be quantitatively estimated even when the scene geometry is not explicitly known. Given the spectrum of IBR, we also study how to sample IBR data more efficiently. Our analysis is based on generalized periodic sampling theory with arbitrary sampling geometry. We show that the sampling efficiency can be up to twice that of rectangular sampling. The advantages and disadvantages of generalized periodic sampling for IBR are also discussed.
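
    For reference, a sketch of the dimension counting implied above (the notation here is assumed, not taken from the paper): the full plenoptic function records radiance for every viewpoint, direction, wavelength, and time, while the SPF restricts the ray origin to the 2-D object surface:

    \[
    P = P(V_x, V_y, V_z, \theta, \phi, \lambda, t) \qquad \text{(7-D plenoptic function)}
    \]
    \[
    P_s = P_s(u, v, \theta, \phi, \lambda, t) \qquad \text{(6-D SPF, with } (u,v) \text{ parameterizing the surface)}
    \]

    Because radiance is constant along an unblocked ray, every sample of \(P\) traces back to a sample of \(P_s\), which is what makes the 6-D representation sufficient.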

  • Image-based rendering by joint view triangulation

    Publication Year: 2003 , Page(s): 1051 - 1063
    Cited by:  Papers (4)
    PDF (2167 KB)

    The creation of novel views from prestored images, or image-based rendering, has many potential applications, such as visual simulation, virtual reality, and telepresence, for which traditional computer graphics based on geometric modeling would be unsatisfactory, particularly for very complex three-dimensional scenes. This paper presents a new image-based rendering system that tackles the two most difficult problems of image-based modeling: pixel matching and visibility handling. We first introduce the joint view triangulation (JVT), a novel representation for pairs of images that handles the visibility and occlusion problems created by the parallaxes between the images. The JVT is built from matched planar patches regularized by local smoothness constraints encoded by plane homographies. We then introduce an incremental edge-constrained construction algorithm. Finally, we present a pseudo-painter's rendering algorithm for the JVT and demonstrate the performance of these methods experimentally.
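
    As a reminder of the basic operation behind homography-regularized matching, here is a minimal sketch (NumPy; the matrix values are illustrative, not from the paper) of predicting where points on a planar patch in one view should appear in the other:

    import numpy as np

    def map_with_homography(H, pts):
        """Map 2-D points between views under a 3x3 plane homography H."""
        pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coordinates
        mapped = pts_h @ H.T                              # apply the homography
        return mapped[:, :2] / mapped[:, 2:3]             # de-homogenize

    # Points on a matched planar patch in view 1 predict their matches in view 2:
    H = np.array([[1.01, 0.02,  3.0],
                  [0.00, 0.99, -2.0],
                  [1e-5, 0.00,  1.0]])
    print(map_with_homography(H, np.array([[100.0, 50.0], [120.0, 60.0]])))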

  • Terminal QoS for real-time 3-D visualization using scalable MPEG-4 coding

    Publication Year: 2003 , Page(s): 1136 - 1143
    PDF (636 KB) | HTML

    Terminal quality of service is the process of optimally scaling down the decoding and rendering computations to the available processing power while maximizing the overall perceived quality. This process is applied in a real-time three-dimensional (3-D) decoding and rendering engine exploiting scalable MPEG-4 coding algorithms. We derive a relation between the quality of the 3-D rendered objects and their processing requirements, expressed by simple CPU-time models. The performance dependency parameters are in direct relation to high-level 3-D content characteristics (number of triangles, number of rendered screen pixels per object) and are calibrated for the platform under test. Examples show that, for any pre-established frame rate, the platform workload can reliably be anticipated so that content and process parameters can be adjusted accordingly.
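
    The "simple CPU-time models" suggest a cost that is linear in the content characteristics the abstract lists. A hypothetical sketch (the coefficients and their values are invented calibration constants, not the paper's):

    # Hypothetical linear CPU-time model: T = a * n_triangles + b * n_pixels.
    # Coefficients would be calibrated on the target platform; these values are invented.
    A_TRI = 2.0e-7   # seconds per rendered triangle (assumed)
    B_PIX = 5.0e-9   # seconds per rendered screen pixel (assumed)

    def frame_time(n_triangles, n_pixels):
        return A_TRI * n_triangles + B_PIX * n_pixels

    def triangle_budget(target_fps, n_pixels):
        """Largest triangle count that still meets the target frame rate."""
        budget = 1.0 / target_fps - B_PIX * n_pixels
        return max(0, int(budget / A_TRI))

    print(triangle_budget(target_fps=30, n_pixels=640 * 480))  # scale content to fit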

  • Registration and partitioning-based compression of 3-D dynamic data

    Publication Year: 2003 , Page(s): 1144 - 1155
    Cited by:  Papers (10)
    PDF (769 KB) | HTML

    Generation and transmission of complex animation sequences can benefit substantially from tools for handling the large amounts of data associated with dynamic three-dimensional (3-D) models. Previous work on 3-D dynamic compression considers only the simplest situation, in which the connectivity does not change with time. In this paper, we present an approach for compressing 3-D dynamic models in which both the vertex data and the connectivity can change with time. Using our framework, 3-D animation sequences generated with commercial graphics tools, or dynamic range data captured with range scanners, can be compressed significantly. We use 3-D registration to identify the changes in the vertex data and the connectivity of the 3-D geometry between successive frames. The interframe motion is then encoded using affine motion parameters and a differential pulse code modulation (DPCM) predictor. Our work is the first to exploit the temporal coherence in the connectivity data between frames, and it presents a detailed encoding scheme for 3-D dynamic data. We also discuss inserting I-frames into the compressed data for better performance. We show that our algorithm far outperforms existing techniques for both vertex compression and connectivity compression of 3-D dynamic datasets.
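
    A minimal sketch of the interframe step described above: predict each vertex from the previous frame through an affine motion model, then DPCM-encode only the quantized residual (the affine parameters A and t are assumed to come from registration; the quantization step is illustrative):

    import numpy as np

    def encode_frame(prev_v, curr_v, A, t, step=1e-3):
        """Affine prediction + DPCM: quantized vertex residuals between frames."""
        predicted = prev_v @ A.T + t                       # affine motion prediction
        residual = curr_v - predicted                      # DPCM: encode only the error
        return np.round(residual / step).astype(np.int32)  # uniform quantization

    def decode_frame(prev_v, q_residual, A, t, step=1e-3):
        return prev_v @ A.T + t + q_residual * step        # invert prediction + residual

    # Toy usage: identity motion with a small deformation between frames.
    prev_v = np.zeros((4, 3))
    q = encode_frame(prev_v, prev_v + 0.01, np.eye(3), np.zeros(3))
    print(decode_frame(prev_v, q, np.eye(3), np.zeros(3)))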

  • Compression of illumination-adjustable images

    Publication Year: 2003 , Page(s): 1107 - 1118
    Cited by:  Papers (9)
    PDF (1033 KB)

    Image-based modeling and rendering (IBMR) approaches allow the time complexity of synthesizing novel images to be independent of scene complexity. Unfortunately, illumination control (relighting) is no longer trivial under the image-based framework. To relight image-based scenery, the scene must be captured under various illumination conditions, which drastically increases the data volume; data compression is therefore a must. In this paper, we describe a compression algorithm for an illumination-adjustable image representation, focusing on compressing constant-viewpoint images. We propose a divide-and-conquer approach: the compression algorithm consists of three parts, which exploit the intrapixel, interpixel, and interchannel data correlations. Experimental results show that the proposed method not only compresses the data effectively but also outperforms standard image and video coding methods.
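
    "Intrapixel" correlation here means the redundancy across illumination conditions at a fixed pixel. A generic illustration of exploiting it (a DCT across light directions on toy data; the actual transform used in the paper may differ):

    import numpy as np
    from scipy.fft import dct, idct

    # Toy data: one pixel's intensity under 64 light directions (Lambertian-like, smooth).
    angles = np.linspace(0.0, np.pi, 64)
    pixel_responses = np.clip(np.cos(angles), 0.0, None)

    coeffs = dct(pixel_responses, norm='ortho')  # decorrelate across illuminations
    coeffs[8:] = 0.0                             # keep only the 8 lowest-frequency terms
    approx = idct(coeffs, norm='ortho')          # relightable reconstruction
    print(np.abs(approx - pixel_responses).max())  # small error from an 8:1 reduction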

  • Large environment rendering using plenoptic primitives

    Publication Year: 2003 , Page(s): 1064 - 1073
    Cited by:  Papers (1)
    PDF (877 KB)

    One of the most difficult tasks in computer graphics is to enable virtual walkthroughs of very large and complicated environments that are photorealistic, seamless, and real-time. Current image-based rendering techniques, while capable of photorealism and interactive speeds, have failed in practice to extend to visualizations of such environments. We demonstrate an approach that defines a virtual walkthrough experience using plenoptic primitives (PPs). A PP can be any type of local visual experience: a 360° static panorama, a panoramic video (PV), a lumigraph/light field representation, or concentric mosaics (CMs). By combining them judiciously, a walkthrough can be authored with significantly reduced effort while maintaining a high-quality user experience. We illustrate our technique on synthetic and real environments using PVs and CMs and show how smooth transitions among PVs and CMs can be achieved using position-dependent local geometries.

  • Learning and synthesizing MPEG-4 compatible 3-D face animation from video sequence

    Publication Year: 2003 , Page(s): 1119 - 1128
    Cited by:  Papers (7)  |  Patents (1)
    PDF (981 KB) | HTML

    We present a new system that applies an example-based learning method to learn facial motion patterns from a video sequence of individual facial behavior, such as lip motion and facial expressions, and uses them to create vivid three-dimensional (3-D) face animation according to the definition of MPEG-4 face animation parameters. The system consists of three key modules: face tracking, pattern learning, and face animation. In face tracking, to reduce the complexity of the tracking process, a novel coarse-to-fine strategy combined with a Kalman filter is proposed for localizing key facial landmarks in each image of the video. The landmark sequence is normalized into a visual feature matrix and fed to the next stage. In pattern learning, during the pretraining stage, the parameters of the camera that captured the video are supplied with the training data, so the system can estimate, with the aid of computer-vision methods, the basic mapping from a normalized two-dimensional (2-D) visual feature matrix to the 3-D MPEG-4 face animation parameter space. In the practice stage, since camera parameters are usually not provided with video data, the system uses machine learning to recover the incomplete 3-D information needed to represent face orientation. The example-based learning in this system integrates several methods, including clustering, HMMs, and ANNs, to achieve a better 2-D to 3-D conversion and a better estimate of the missing 3-D information; the resulting mapping is then used to drive the face animation. In face animation, the system can synthesize face animation following any type of face motion in the video. Experiments show that our system produces more vivid face-motion animation than earlier systems.
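
    For reference, a minimal constant-velocity Kalman filter of the kind commonly paired with coarse-to-fine landmark search (the state model and noise values below are assumed; the abstract does not specify them):

    import numpy as np

    # State [x, y, vx, vy]; measurement [x, y]; constant-velocity motion model.
    F = np.array([[1., 0., 1., 0.],
                  [0., 1., 0., 1.],
                  [0., 0., 1., 0.],
                  [0., 0., 0., 1.]])
    Hm = np.array([[1., 0., 0., 0.],
                   [0., 1., 0., 0.]])
    Q = np.eye(4) * 1e-2   # process noise (assumed)
    R = np.eye(2) * 1.0    # measurement noise (assumed)

    def kalman_step(x, P, z):
        x, P = F @ x, F @ P @ F.T + Q                       # predict landmark motion
        K = P @ Hm.T @ np.linalg.inv(Hm @ P @ Hm.T + R)     # Kalman gain
        x = x + K @ (z - Hm @ x)                            # correct with detected landmark z
        P = (np.eye(4) - K @ Hm) @ P
        return x, P

    x, P = np.zeros(4), np.eye(4)
    x, P = kalman_step(x, P, np.array([1.0, 2.0]))  # one track/update cycle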

  • Survey of image-based representations and compression techniques

    Publication Year: 2003 , Page(s): 1020 - 1037
    Cited by:  Papers (94)  |  Patents (25)
    PDF (1049 KB)

    We survey techniques for image-based rendering (IBR) and for compressing image-based representations. Unlike traditional three-dimensional (3-D) computer graphics, in which the 3-D geometry of the scene is known, IBR techniques render novel views directly from input images. IBR techniques can be classified into three categories according to how much geometric information is used: rendering without geometry, rendering with implicit geometry (i.e., correspondence), and rendering with explicit geometry (either approximate or accurate). We discuss the characteristics of these categories and their representative techniques. IBR techniques demonstrate a surprisingly diverse range in their use of images and geometry to represent 3-D scenes. We explore the tradeoff between images and geometry by revisiting plenoptic-sampling analysis and the notions of view dependency and geometric proxies. Finally, we highlight compression techniques specifically designed for image-based representations; such techniques are important in making IBR practical.

  • Creating and encoding of cartoons using MPEG-4 BIFS: methods and results

    Publication Year: 2003 , Page(s): 1129 - 1135
    Cited by:  Papers (1)
    PDF (293 KB)

    Our work focuses on the encoding of two-dimensional (2-D) animated graphics cartoons with MPEG-4. It stems from a need to stop encoding cartoons as video and to start using methods better adapted to animated 2-D vector-graphics sequences, with the expectation of efficient coding and good versatility across various types of terminals. First, we present some technical specifications for such cartoons. Next, we give an overview of MPEG-4 BIFS encoding methods. We then analyze how to optimize the use of these methods; some of our proposals could be useful for 2-D graphics applications other than animated cartoons. Finally, we show results with significantly better encoding efficiency than simple BIFS encoding.

  • Interactive rendering from compressed light fields

    Publication Year: 2003 , Page(s): 1080 - 1091
    Cited by:  Papers (11)
    PDF (636 KB)

    A light field is a collection of multiview images that represent a three-dimensional scene. Rendering from a light field provides a simple and efficient way to generate arbitrary new views of the scene, bypassing the difficult problem of acquiring accurate geometric and photometric models. The enormous amount of data required by a light field poses a key challenge in rendering. Tree-structured vector quantization (TSVQ) provides a moderate compression ratio of around 24:1, which alleviates, but does not solve, the problem. Compression schemes based on video coding techniques exploit the data redundancy very effectively but do not provide adequate random access for rendering. This paper analyzes the data-access pattern during rendering and describes a compression scheme that supports interactive rendering directly from compressed light-field data. The proposed algorithm achieves a compression ratio of as much as ten times that of TSVQ while slowing rendering by a factor of less than two.
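
    The random-access property that makes VQ-style schemes attractive for rendering: fetching one light-field sample is a single table lookup, with no neighboring data decoded. A minimal sketch (block size and codebook contents are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    CODEBOOK = rng.random((256, 4, 4, 3))          # 256 codewords: 4x4 RGB blocks (toy)
    indices = rng.integers(0, 256, size=(64, 64))  # compressed image: one index per block

    def sample(u, v):
        """Random access: read one pixel straight from the compressed representation."""
        block = CODEBOOK[indices[v // 4, u // 4]]  # one lookup; neighbors stay encoded
        return block[v % 4, u % 4]

    print(sample(100, 37))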

  • Multi-view coding for image-based rendering using 3-D scene geometry

    Publication Year: 2003 , Page(s): 1092 - 1106
    Cited by:  Papers (34)  |  Patents (2)
    PDF (1453 KB) | HTML

    To store and transmit the large amount of image data necessary for image-based rendering (IBR), efficient coding schemes are required. This paper presents two approaches that exploit three-dimensional scene geometry for multi-view compression. In texture-based coding, images are converted to view-dependent texture maps for compression. In model-aided predictive coding, scene geometry is used for disparity compensation and occlusion detection between images. While both coding strategies attain compression ratios exceeding 2000:1, individual coding performance is found to depend on the accuracy of the available geometry model. Experiments with real-world as well as synthetic image sets show that texture-based coding is more sensitive to geometry inaccuracies than predictive coding. A rate-distortion-theoretic analysis of both schemes supports these findings. For reconstructed approximate geometry models, model-aided predictive coding performs best, while texture-based coding yields superior results when the scene geometry is exactly known.
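
    The core of geometry-based disparity compensation: use depth and the camera parameters to warp a reference view into the view being coded, then encode only the prediction residual. A minimal per-pixel sketch (camera matrices, pose, and depth are assumed inputs):

    import numpy as np

    def predict_pixel(u, v, depth, K_ref, K_tgt, R, t):
        """Project a reference-view pixel into the target view using scene depth."""
        ray = np.linalg.inv(K_ref) @ np.array([u, v, 1.0])  # back-project to a viewing ray
        X = depth * ray                                     # 3-D point (reference camera frame)
        x = K_tgt @ (R @ X + t)                             # reproject into the target view
        return x[:2] / x[2]                                 # predicted pixel; residual is coded

    K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
    print(predict_pixel(320.0, 240.0, 2.0, K, K, np.eye(3), np.array([0.1, 0.0, 0.0])))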

  • Specialized hardware for deformable object modeling

    Publication Year: 2003 , Page(s): 1074 - 1079
    Cited by:  Papers (1)
    PDF (456 KB) | HTML

    Deformable object modeling is a technology that will enable a broad range of new applications. Unfortunately, current techniques are far from offering interactive performance for realistic scenes. We examine the idea of using low-cost, highly specialized hardware to accelerate deformable object modeling, and we present details of two generations of such experimental systems.


Aims & Scope

The emphasis is on, but not limited to:
1. Video A/D and D/A
2. Video Compression Techniques and Signal Processing
3. Multi-Dimensional Filters and Transforms
4. High-Speed Real-Time Circuits
5. Multi-Processor Systems: Hardware and Software
6. VLSI Architecture and Implementation for Video Technology


Full Aims & Scope

Meet Our Editors

Editor-in-Chief
Dan Schonfeld
Multimedia Communications Laboratory
ECE Dept. (M/C 154)
University of Illinois at Chicago (UIC)
Chicago, IL 60607-7053
tcsvt-eic@tcad.polito.it

Managing Editor
Jaqueline Zelkowitz
tcsvt@tcad.polito.it