IEEE Transactions on Multimedia

Issue 2 • February 2013

  • Table of contents

    Page(s): C1 - C4
  • IEEE Transactions on Multimedia publication information

    Page(s): C2
  • Guest Editorial: Special Section on New Software/Hardware Paradigms for Error-Tolerant Multimedia Systems

    Page(s): 241
  • A Scalable Precision Analysis Framework

    Page(s): 242 - 256

    In embedded computing, some form of silicon area or power budget typically restricts the achievable performance. For algorithms with limited dynamic range, custom hardware accelerators extract significant additional performance within such a budget by mapping operations in the algorithm to fixed-point arithmetic. For complex applications requiring floating-point computation, however, the potential performance improvement over software is reduced. Nonetheless, custom hardware can still customize the precision of floating-point operators, unlike software, which is restricted to IEEE standard single or double precision, to increase overall performance at the cost of increasing the error observed in the final computational result. Unfortunately, because it is difficult to determine whether this error increase is tolerable, this task is rarely performed. We present a new analytical technique to calculate bounds on the range or relative error of output variables, enabling custom hardware accelerators to be tolerant of floating-point errors by design. In contrast to existing tools that perform this task, our approach scales to larger examples and obtains tighter bounds within a smaller execution time. Furthermore, it allows a user to trade the quality of the bounds against the execution time of the procedure, making it suitable for both small- and large-scale algorithms.

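
The range analysis this abstract describes can be illustrated with naive interval arithmetic, the textbook baseline that such precision-analysis frameworks refine. The `Interval` class and the example expression below are an illustrative sketch, not the authors' framework, which computes far tighter bounds on relative floating-point error.

```python
# Minimal interval-arithmetic sketch of static range analysis: each variable
# carries a [lo, hi] bound, and bounds are propagated through + and *.
class Interval:
    def __init__(self, lo, hi):
        assert lo <= hi
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        # The products of all endpoint pairs bound the result of x*y.
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(min(p), max(p))

    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

# Range of f(x, y) = x*y + x for x in [-1, 2], y in [3, 4]:
x, y = Interval(-1, 2), Interval(3, 4)
out = x * y + x
print(out)  # [-5, 10]
```

Naive intervals like these grow pessimistic quickly under correlated variables; tightening them at scale is exactly the problem the paper addresses.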
  • Robust and Energy Efficient Multimedia Systems via Likelihood Processing

    Page(s): 257 - 267

    This paper presents likelihood processing (LP) for designing robust and energy-efficient multimedia systems in the presence of nanoscale non-idealities. LP exploits the error statistics of the underlying hardware to compute the probability of a particular bit being a one or a zero. Multiple output observations are generated via: 1) modular redundancy (MR); 2) estimation; or 3) exploiting spatio-temporal correlation. The energy efficiency and robustness of a 2D discrete cosine transform (DCT) image codec employing LP are studied. Simulations in a commercial 45-nm CMOS process show that LP can tolerate up to 100× and 5× greater component error probability than conventional and triple-MR (TMR)-based systems, respectively, while achieving a peak signal-to-noise ratio (PSNR) of 30 dB at a pre-correction error rate of 20%. Furthermore, LP achieves energy savings of 71% over TMR at a PSNR of 28 dB while tolerating a pre-correction error rate of 4%.

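
The core LP step, combining redundant noisy observations with known hardware error statistics into a soft bit probability, can be sketched as a likelihood combination under a uniform prior. The symmetric bit-flip error model below is an illustrative assumption, not the paper's exact formulation:

```python
# Soft "vote" over redundant bit observations: each read comes from hardware
# that flips the true bit with probability p_err. A uniform prior on the bit
# is assumed; the result is P(true bit = 1 | observations).
def bit_posterior(observations, p_err):
    l1 = l0 = 1.0
    for obs in observations:
        l1 *= (1 - p_err) if obs == 1 else p_err   # P(obs | bit = 1)
        l0 *= p_err if obs == 1 else (1 - p_err)   # P(obs | bit = 0)
    return l1 / (l1 + l0)

# Three redundant reads (as in modular redundancy), one corrupted:
print(bit_posterior([1, 1, 0], p_err=0.1))  # 0.9: the soft vote favors 1
```

Unlike a hard majority vote, the posterior degrades gracefully as `p_err` rises, which is what lets LP out-tolerate TMR in the paper's experiments.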
  • Markov Decision Process Based Energy-Efficient On-Line Scheduling for Slice-Parallel Video Decoders on Multicore Systems

    Page(s): 268 - 278

    We consider the problem of energy-efficient on-line scheduling for slice-parallel video decoders on multicore systems with Dynamic Voltage and Frequency Scaling (DVFS)-enabled processors. In the past, scheduling and DVFS policies in multicore systems have been formulated heuristically due to the inherent complexity of the on-line multicore scheduling problem. The key contribution of this paper is that we rigorously formulate the problem as a Markov decision process (MDP), which simultaneously takes into account the on-line scheduling and per-core DVFS capabilities; the power consumption of the processor cores and caches; and the loss-tolerant and dynamic nature of the video decoder. The objective of the MDP is to minimize long-term power consumption subject to a minimum Quality of Service (QoS) constraint related to the decoder's throughput. We evaluate the proposed on-line scheduling algorithm in Matlab using realistic video decoding traces generated from a cycle-accurate multiprocessor ARM simulator.

  • Branch and Data Herding: Reducing Control and Memory Divergence for Error-Tolerant GPU Applications

    Page(s): 279 - 290

    Control and memory divergence between threads within the same execution bundle, or warp, have been shown to cause significant performance bottlenecks for GPU applications. In this paper, we exploit the observation that many GPU applications exhibit error tolerance to propose branch and data herding. Branch herding eliminates control divergence by forcing all threads in a warp to take the same control path. Data herding eliminates memory divergence by forcing each thread in a warp to load from the same memory block. To safely and efficiently support branch and data herding, we propose a static analysis and compiler framework to prevent exceptions when control and data errors are introduced, a profiling framework that aims to maximize performance while maintaining acceptable output quality, and hardware optimizations to improve the performance benefits of exploiting error tolerance through branch and data herding. Our software implementation of branch herding on an NVIDIA GeForce GTX 480 improves performance by up to 34% (13% on average) for a suite of NVIDIA CUDA SDK and Parboil benchmarks. Our hardware implementation of branch herding improves performance by up to 55% (30% on average). Data herding improves performance by up to 32% (25% on average). Observed output quality degradation is minimal for several applications that exhibit error tolerance, especially visual computing applications.

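
The core of branch herding, forcing a whole warp down its majority control path, can be sketched in a few lines. The warp size and the simple majority rule below are illustrative simplifications of the paper's compiler and hardware mechanisms:

```python
# Toy model of branch herding: all threads in a warp take the majority branch
# outcome, eliminating control divergence at the cost of "herding" the
# minority threads onto a path they would not otherwise take.
def herd_branch(predicates):
    """Return the single branch outcome the whole warp will follow."""
    taken = sum(predicates)
    return taken * 2 >= len(predicates)

warp = [True, True, False, True, True, False, True, True]
decision = herd_branch(warp)
print(decision)                           # True: the warp follows the majority
print(sum(p != decision for p in warp))   # 2 threads are herded off their path
```

The two herded threads produce approximate results; the paper's profiling framework is what checks that such errors stay within acceptable output quality.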
  • Error Tolerant Multimedia Stream Processing: There's Plenty of Room at the Top (of the System Stack)

    Page(s): 291 - 303

    There is a growing realization that the fault rates and energy dissipation expected from continued CMOS integration will lead to the abandonment of traditional system reliability in favor of approaches that offer resilience to hardware-induced errors across the application, runtime support, architecture, device, and integrated-circuit (IC) layers. Commercial stakeholders of multimedia stream processing (MSP) applications, such as information retrieval, stream mining systems, and high-throughput image and video processing systems, already feel the strain of inadequate system-level scaling and robustness under ever-increasing user demand. While such applications can tolerate certain imprecision in their results, today's MSP systems do not provide a systematic way to exploit this tolerance for cross-layer system resilience. However, research is emerging that attempts to utilize the error-tolerant nature of MSP applications for this purpose. This is achieved by modifications to all layers of the system stack, from algorithms and software to the architecture and device layers, and even the IC digital logic synthesis itself. Unlike conventional processing, which aims for worst-case performance and accuracy guarantees, error-tolerant MSP attempts to provide guarantees for the expected performance and accuracy. In this paper we review recent advances in this field from an MSP and a system (layer-by-layer) perspective, and attempt to foresee some of the components of future cross-layer error-tolerant system design that may influence the multimedia and general computing landscapes within the next ten years.

  • Compressing 3D Trees With Rendering Efficiency Based on Differential Data

    Page(s): 304 - 315

    Trees and forests are indispensable and ubiquitous in virtual outdoor environments. Artists usually use high polygon counts to construct detailed 3D tree models to increase realism; however, this requires large memory spaces and considerable computation power for rendering. This paper proposes a compression method for 3D tree models that simultaneously achieves rendering efficiency, reduced memory consumption, and high visual fidelity. Trees are clustered using an automatic K-medoids clustering method, and each member tree in a cluster can be reconstructed from the representative tree plus differential data. An abstract representation of an ordered rooted tree is introduced for each tree model, and the similarity between trees is measured using a modified tree edit distance. Moreover, an LOD priority is associated with each component to facilitate the LOD mechanism by considering its contribution to visual perception after rendering. To accelerate rendering, the GPU is exploited to support the LOD mechanism, and geometry instancing facilitates the rendering of component instances shared among member trees. The effectiveness of the compression was tested on four sample forests. As demonstrated by the experimental results, our method saves substantial memory space, retains the visual fidelity of the reconstructed trees, and accelerates rendering.

  • Reversible Data Hiding With Optimal Value Transfer

    Page(s): 316 - 325

    In reversible data hiding techniques, the values of the host data are modified according to particular rules, and the original host content can be perfectly restored after extraction of the hidden data on the receiver side. In this paper, the optimal rule of value modification under a payload-distortion criterion is found using an iterative procedure, and a practical reversible data hiding scheme is proposed. The secret data, as well as the auxiliary information used for content recovery, are carried by the differences between the original pixel values and the corresponding values estimated from their neighbors, with the estimation errors modified according to the optimal value transfer rule. The host image is divided into a number of pixel subsets, and the auxiliary information of a subset is always embedded into the estimation errors of the next subset. A receiver can successfully extract the embedded secret data and recover the original content by processing the subsets in inverse order. In this way, good reversible data hiding performance is achieved.

  • Latent Mixture of Discriminative Experts

    Page(s): 326 - 338

    In this paper, we introduce a new model called the Latent Mixture of Discriminative Experts (LMDE), which can automatically learn the temporal relationship between different modalities. Since we train separate experts for each modality, LMDE is capable of improving prediction performance even with a limited amount of data. For model interpretation, we present a sparse feature ranking algorithm that exploits L1 regularization. An empirical evaluation is provided on the task of listener backchannel prediction (i.e., head nods). We introduce a new error evaluation metric, called User-adaptive Prediction Accuracy, that takes into account differences in people's backchannel responses. Our results confirm the importance of combining five types of multimodal features: lexical, syntactic structure, part-of-speech, visual, and prosody. The LMDE model outperforms previous approaches.

  • Real-Time, Full 3-D Reconstruction of Moving Foreground Objects From Multiple Consumer Depth Cameras

    Page(s): 339 - 358

    The problem of robust, realistic and, especially, fast 3-D reconstruction of objects, although extensively studied, remains a challenging research task. Most state-of-the-art approaches that target real-time applications, such as immersive reality, mainly address the problem of synthesizing intermediate views for given viewpoints rather than generating a single complete 3-D surface. In this paper, we present a multiple-Kinect capturing system and a novel methodology for the creation of accurate, realistic, full 3-D reconstructions of moving foreground objects, e.g., humans, to be exploited in real-time applications. The proposed method generates multiple textured meshes from multiple RGB-Depth streams, applies a coarse-to-fine registration algorithm, and finally merges the separate meshes into a single 3-D surface. Although the Kinect sensor has attracted the attention of many researchers and home enthusiasts and has already appeared in many applications over the Internet, none of the works presented so far can produce full 3-D models of moving objects from multiple Kinect streams in real time. We present the capturing setup, the methodology for its calibration, and the details of the proposed algorithm for real-time fusion of multiple meshes. The experimental results verify the effectiveness of the approach with respect to both 3-D reconstruction quality and the achieved frame rates.

  • Patch-Based Image Warping for Content-Aware Retargeting

    Page(s): 359 - 368

    Image retargeting is the process of adapting images to fit displays with various aspect ratios and sizes. Most studies on image retargeting focus on shape preservation, but they do not fully consider the preservation of structure lines, to which the human visual system is sensitive. In this paper, a patch-based retargeting scheme with an extended significance measurement is introduced to preserve the shapes of both visually salient objects and structure lines while minimizing visual distortion. In the proposed scheme, a similarity transformation constraint forces visually salient content to undergo as-rigid-as-possible deformation, while an optimization process smoothly propagates distortions. These processes enable our approach to yield pleasing content-aware warping and retargeting. Experimental results and a user study show that our results are better than those generated by state-of-the-art approaches.

  • Learning Semantic Signatures for 3D Object Retrieval

    Page(s): 369 - 377

    In this paper, we propose two kinds of semantic signatures for 3D object retrieval (3DOR). Humans are capable of describing an object using attribute terms like “symmetric” and “flyable”, or using its similarities to known object classes. We convert such qualitative descriptions into an attribute signature (AS) and a reference set signature (RSS), respectively, and use them for 3DOR. We also show that AS and RSS can be understood as two different quantizations of the same semantic space of human descriptions of objects. The advantages of the semantic signatures are threefold. First, they are much more compact than low-level shape features yet achieve comparable retrieval accuracy; the proposed semantic signatures therefore require less storage space and lower computation cost in retrieval. Second, the high-level signatures are a good complement to low-level shape features, so by incorporating the signatures we can improve the performance of state-of-the-art 3DOR methods by a large margin. To the best of our knowledge, we obtain the best results on two popular benchmarks. Third, the AS enables us to build a user-friendly interface with which the user can trigger a search by simply clicking attribute bars instead of finding a 3D object as the query. This interface is of great significance in 3DOR, considering that a user searching the database usually does not have at hand a 3D query similar to the targeted objects.

  • Multimodal Analysis for Identification and Segmentation of Moving-Sounding Objects

    Page(s): 378 - 390

    In this paper, we propose a novel method that exploits the correlation between the audio and visual dynamics of a video to segment and localize the objects that are the dominant source of audio. Our approach consists of a two-step spatiotemporal segmentation mechanism that relies on the velocity and acceleration of moving objects as visual features. Each frame of the video is segmented into regions based on motion and appearance cues using the QuickShift algorithm; these regions are then clustered over time using K-means to obtain a spatiotemporal video segmentation. The video is represented by motion features computed over the individual segments, while the Mel-frequency cepstral coefficients (MFCC) of the audio signal and their first-order derivatives represent the audio. The proposed framework assumes a non-trivial correlation between these audio features and the velocity and acceleration of the moving, sounding objects. Canonical correlation analysis (CCA) is utilized to identify the moving objects that are most correlated with the audio signal. Beyond moving-sounding object identification, the same framework is also exploited to solve the problem of audio-video synchronization and to aid interactive segmentation. We evaluate the performance of the proposed method on challenging videos. Our experiments demonstrate significant qualitative and quantitative improvement over the state of the art and validate the feasibility and superiority of our approach.

  • Affective Labeling in a Content-Based Recommender System for Images

    Page(s): 391 - 400

    Affective labeling of multimedia content has proved useful in recommender systems. In this paper we present a methodology for the implicit acquisition of affective labels for images. It is based on an emotion detection technique that takes as input video sequences of users' facial expressions, extracts Gabor low-level features from the video frames, and employs a k-nearest-neighbors machine learning technique to generate affective labels in the valence-arousal-dominance space. We performed a comparative study of the performance of a content-based recommender (CBR) system for images that uses three types of metadata to model the users and the items: (i) generic metadata, (ii) explicitly acquired affective labels, and (iii) affective labels implicitly acquired with the proposed methodology. The results show that the CBR performs best when explicit labels are used. However, implicitly acquired labels yield significantly better CBR performance than generic metadata while remaining an unobtrusive feedback tool.

  • Script-to-Movie: A Computational Framework for Story Movie Composition

    Page(s): 401 - 414

    Traditional movie production has always been highly professional work that requires team collaboration, advanced devices and techniques, and vast investments of time and money. These high entry requirements not only prevent amateur enthusiasts from entering the field, but also hinder professionals from quickly previewing their conceived story plots. In this paper, we present a novel application, named script-to-movie (S2M) composition, that automatically produces new movies from existing videos in accordance with a user-created script. Our motivation is to liberate producers from complex filming and editing operations, so that a story idea can be instantly converted into a vivid movie. To support this novel “What You Dream Is What You See” (WYDIWYS) production mode, we first propose a hierarchical alignment method to automatically construct a video material database with detailed semantic descriptions. Considering the diverse story plots in user-designed scripts, the database contains abundant video materials covering different characters appearing at various times and places. On this basis, S2M composition is formulated as a constrained optimization problem in which the semantic story plot and syntactic visual content are jointly considered to identify a group of optimal video segments that narrate the user's script. Both quantitative and qualitative experimental results are reported to illustrate the effectiveness of the proposed S2M application.

  • Smooth Nonnegative Matrix Factorization for Unsupervised Audiovisual Document Structuring

    Page(s): 415 - 425

    This paper introduces a new paradigm for unsupervised audiovisual document structuring. In this paradigm, a novel Nonnegative Matrix Factorization (NMF) algorithm is applied to histograms of counts (relating to a bag-of-features representation of the content) to jointly discover latent structuring patterns and their activations in time. Our NMF variant employs the Kullback-Leibler divergence as a cost function and imposes a temporal smoothness constraint on the activations. It is solved by a majorization-minimization technique. The proposed approach is meant to be generic and is particularly well suited to applications in which the structuring patterns may overlap in time. As such, it is evaluated on two person-oriented video structuring tasks (one using the visual modality and the other the audio modality) using a challenging database of political debate videos. Our results outperform reference results obtained by a method based on hidden Markov models. Further, we show the potential of our general approach for audio speaker diarization.

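
The KL-divergence NMF at the heart of the method can be sketched with the standard multiplicative updates. The temporal smoothness penalty and its majorization-minimization solver from the paper are omitted, so this is only the unpenalized KL baseline on synthetic data:

```python
import numpy as np

# Standard multiplicative-update NMF under the Kullback-Leibler divergence:
# V (features x time) is approximated by W @ H with W, H nonnegative.
def kl_div(V, WH, eps=1e-9):
    return np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH)

def nmf_kl(V, k, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((F, k)) + 0.1
    H = rng.random((k, N)) + 0.1
    for _ in range(iters):
        WH = W @ H
        H *= (W.T @ (V / WH)) / W.sum(axis=0)[:, None]   # activation update
        WH = W @ H
        W *= ((V / WH) @ H.T) / H.sum(axis=1)[None, :]   # pattern update
    return W, H

V = np.random.default_rng(1).random((20, 30))            # toy histogram matrix
W, H = nmf_kl(V, k=4)
d_flat = kl_div(V, np.full_like(V, V.mean()))            # flat baseline fit
d_nmf = kl_div(V, W @ H)
print(d_nmf < d_flat)                                    # factorization fits better
```

The paper's variant would add a penalty coupling `H[:, n]` to `H[:, n-1]` so activations vary smoothly in time, which is what makes the discovered patterns usable as structuring segments.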
  • Beyond Text QA: Multimedia Answer Generation by Harvesting Web Information

    Page(s): 426 - 441

    Community question answering (cQA) services have gained popularity over the past years. They not only allow community members to post and answer questions but also enable general users to seek information from a comprehensive set of well-answered questions. However, existing cQA forums usually provide only textual answers, which are not informative enough for many questions. In this paper, we propose a scheme that enriches textual answers in cQA with appropriate media data. Our scheme consists of three components: answer medium selection, query generation for multimedia search, and multimedia data selection and presentation. The approach automatically determines which type of media information should be added to a textual answer and then automatically collects data from the web to enrich the answer. By processing a large set of QA pairs and adding them to a pool, our approach enables a novel multimedia question answering (MMQA) paradigm, as users can find multimedia answers by matching their questions with those in the pool. Unlike many MMQA research efforts that attempt to answer questions directly with image and video data, our approach builds on community-contributed textual answers and is thus able to deal with more complex questions. We have conducted extensive experiments on a multi-source QA dataset; the results demonstrate the effectiveness of our approach.

  • Query-Adaptive Image Search With Hash Codes

    Page(s): 442 - 453

    Scalable image search based on visual similarity has been an active research topic in recent years. State-of-the-art solutions often use hashing methods to embed high-dimensional image features into Hamming space, where search can be performed in real time based on the Hamming distance of compact hash codes. Unlike traditional metrics (e.g., Euclidean) that offer continuous distances, Hamming distances are discrete integer values; as a consequence, a large number of images often share equal Hamming distances to a query, which largely hurts search results for which fine-grained ranking is important. This paper introduces an approach that enables query-adaptive ranking of returned images with equal Hamming distances to the query. This is achieved by first learning, offline, bitwise weights of the hash codes for a diverse set of predefined semantic concept classes. We formulate the weight learning process as a quadratic programming problem that minimizes intra-class distance while preserving the inter-class relationships captured by the original raw image features. Query-adaptive weights are then computed online by evaluating the proximity between a query and the semantic concept classes. With the query-adaptive bitwise weights, returned images can be ordered by weighted Hamming distance at a finer-grained hash-code level rather than at the original Hamming distance level. Experiments on a Flickr image dataset show clear improvements from the proposed approach.

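
The ranking refinement is easy to sketch: plain Hamming distance ties many codes, and per-bit weights (learned offline per semantic class in the paper; fixed arbitrary values here) break the ties:

```python
# Weighted vs. plain Hamming distance on tiny 4-bit codes.
def hamming(a, b):
    return bin(a ^ b).count("1")

def weighted_hamming(a, b, weights):
    # Sum the weight of every bit position where the codes differ.
    x = a ^ b
    return sum(w for i, w in enumerate(weights) if (x >> i) & 1)

query = 0b1011
db = [0b1010, 0b0011, 0b1111]
w = [0.9, 0.2, 0.4, 0.6]  # illustrative bitwise weights, bit 0 = LSB

print([hamming(query, c) for c in db])                    # [1, 1, 1] -- all tied
print(sorted(db, key=lambda c: weighted_hamming(query, c, w)))  # [15, 3, 10]
```

All three database codes are indistinguishable at Hamming distance 1, but the weights induce a strict order, which is the finer-grained ranking the paper computes with query-adaptive weights.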
  • Continuous Birdsong Recognition Using Gaussian Mixture Modeling of Image Shape Features

    Page(s): 454 - 464

    Traditional birdsong recognition approaches use acoustic features based on the acoustic model of speech production or the perceptual model of the human auditory system to identify the associated bird species. In this paper, a new feature descriptor based on image shape features is proposed to identify bird species through the recognition of fixed-duration birdsong segments whose spectrograms are viewed as gray-level images. The MPEG-7 angular radial transform (ART) descriptor, which can compactly and efficiently describe the gray-level variations within an image region in both the angular and radial directions, is employed to extract shape features from the spectrogram image. To effectively capture both frequency and temporal variations within a birdsong segment using ART, a sector expansion algorithm is proposed that transforms the spectrogram image into a corresponding sector image, so that the frequency and temporal axes of the spectrogram align with the radial and angular directions of the ART basis functions, respectively. For the classification of 28 bird species using Gaussian mixture models (GMM), the best classification accuracy is 86.30% for 3-second and 94.62% for 5-second birdsong segments using the proposed ART descriptor, which outperforms traditional descriptors such as LPCC, MFCC, and TDMFCC.

  • An Effective CU Size Decision Method for HEVC Encoders

    Page(s): 465 - 470

    The emerging High Efficiency Video Coding (HEVC) standard adopts the quadtree-structured coding unit (CU). Each CU allows recursive splitting into four equal sub-CUs. At each depth level (CU size), the HEVC test model (HM) performs motion estimation (ME) with different sizes, including 2N × 2N, 2N × N, N × 2N, and N × N. The ME process in HM is performed over all possible depth levels and prediction modes to find the one with the least rate-distortion (RD) cost using a Lagrange multiplier. This achieves the highest coding efficiency but requires very high computational complexity. In this paper, we propose a fast CU size decision algorithm for HM. Since the optimal depth level is highly content-dependent, it is not efficient to use all levels. We can determine the CU depth range (including the minimum and maximum depth levels) and skip specific depth levels rarely used in the previous frame and in neighboring CUs. In addition, the proposed algorithm introduces early termination methods based on motion homogeneity checking, RD cost checking, and SKIP mode checking to skip ME for unnecessary CU sizes. Experimental results demonstrate that the proposed algorithm significantly reduces computational complexity while maintaining almost the same RD performance as the original HEVC encoder.

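
The quadtree CU decision that the algorithm accelerates can be sketched as a recursion comparing the RD cost of coding a block whole against the summed cost of its four sub-CUs. The `rd_cost` function below is a toy stand-in for HM's Lagrangian cost J = D + λ·R; the paper's depth-range and early-termination heuristics would prune branches of exactly this recursion:

```python
# Exhaustive quadtree CU partitioning by RD cost, as the HM reference does:
# split a block only when the four sub-CUs are jointly cheaper to code.
def decide_cu(rd_cost, x, y, size, min_size=8):
    whole = rd_cost(x, y, size)
    if size <= min_size:
        return whole, [(x, y, size)]
    half = size // 2
    split_cost, split_parts = 0.0, []
    for dx, dy in ((0, 0), (half, 0), (0, half), (half, half)):
        c, p = decide_cu(rd_cost, x + dx, y + dy, half, min_size)
        split_cost += c
        split_parts += p
    if split_cost < whole:
        return split_cost, split_parts
    return whole, [(x, y, size)]

# Illustrative cost model: blocks whose origin lies in a "detailed" corner are
# expensive to code whole, so the recursion splits finely only there.
cost = lambda x, y, size: size * (4.0 if x < 16 and y < 16 else 0.9)
total, partition = decide_cu(cost, 0, 0, 64)
print(len(partition))  # 7: fine CUs near the detailed corner, coarse elsewhere
```

The exhaustive recursion visits every depth for every block; skipping rarely used depths and terminating early on homogeneous motion, as the paper proposes, removes most of these `rd_cost` evaluations.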
  • TMM EDICS

    Page(s): 471
  • IEEE Transactions on Multimedia information for authors

    Page(s): 472 - 473
  • Special issue on statistical parametric speech synthesis

    Page(s): 474

Aims & Scope

The scope of the periodical covers the various aspects of research in multimedia technology and its applications.


Meet Our Editors

Editor-in-Chief
Chang Wen Chen
State University of New York at Buffalo