IEEE Transactions on Multimedia

Issue 3 • April 2013

  • Table of Contents

    Page(s): C1 - C4
  • IEEE Transactions on Multimedia publication information

    Page(s): C2
  • A New Fast Encoding Algorithm Based on an Efficient Motion Estimation Process for the Scalable Video Coding Standard

    Page(s): 477 - 484

    In this paper, a new fast encoding algorithm based on an efficient motion estimation (ME) process is proposed to accelerate the encoding speed of the scalable video coding (SVC) standard. Through analysis of the ME process performed in the enhancement layer, we found that some MEs are redundant and others can be unified within the fully overlapped search range (FOSR). To make the unified ME more efficient, we theoretically derive a skip criterion that determines whether the computation of the rate-distortion cost can be omitted. In the proposed algorithm, the unnecessary MEs are removed and a unified ME with the skip criterion is applied in the FOSR. Simulation results show that the proposed algorithm achieves computational savings of approximately 46% without coding performance degradation when compared with the original SVC encoder.
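
    To illustrate the flavor of an ME skip criterion, the sketch below runs a full block search in which a candidate is discarded whenever a cheap lower bound on its rate-distortion cost (here, just the motion-vector rate term) already exceeds the best cost found. The SAD metric, block size, and rate model are illustrative assumptions, not the paper's derived criterion.

```python
import numpy as np

def sad(a, b):
    # Sum of absolute differences between two equally sized blocks.
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def motion_search(cur, ref, bx, by, bs=16, rng=8, lam=4.0):
    # Full search around (bx, by); cost = SAD + lam * |mv| as a crude
    # rate proxy. A candidate is skipped when the rate term alone
    # already exceeds the best cost, so its SAD is never computed.
    block = cur[by:by + bs, bx:bx + bs]
    best_cost, best_mv = float("inf"), (0, 0)
    for dy in range(-rng, rng + 1):
        for dx in range(-rng, rng + 1):
            rate = lam * (abs(dx) + abs(dy))
            if rate >= best_cost:          # skip criterion
                continue
            cand = ref[by + dy:by + dy + bs, bx + dx:bx + dx + bs]
            cost = sad(block, cand) + rate
            if cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv, best_cost

cur = np.random.default_rng(0).integers(0, 256, (64, 64))
ref = np.roll(cur, (2, -3), axis=(0, 1))   # known shift
print(motion_search(cur, ref, 24, 24))     # recovers (dx, dy) = (-3, 2)
```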

  • Efficient Fine-Granular Scalable Coding of 3D Mesh Sequences

    Page(s): 485 - 497

    An efficient fine-granular scalable coding algorithm for 3-D mesh sequences in low-latency streaming applications is proposed in this work. First, we decompose a mesh sequence into spatial and temporal layers to support scalable decoding. To support the finest-granular spatial scalability, we decimate only a single vertex at each layer to obtain the next layer. Then, we predict the coordinates of decimated vertices spatially and temporally based on a hierarchical prediction structure. Last, we quantize and transmit the spatio-temporal prediction residuals using an arithmetic coder, for which we propose an efficient context model. Experimental results show that the proposed algorithm provides significantly better compression performance than conventional algorithms, while supporting finer-granular spatial scalability.

  • A Self-Learning Approach to Single Image Super-Resolution

    Page(s): 498 - 508

    Learning-based approaches for image super-resolution (SR) have attracted considerable attention from researchers in the past few years. In this paper, we present a novel self-learning approach for SR. In our proposed framework, we advance support vector regression (SVR) with image sparse representation, which offers excellent generalization in modeling the relationship between images and their associated SR versions. Unlike most prior SR methods, our framework does not require collecting low- and high-resolution training image data in advance, nor do we assume the recurrence (or self-similarity) of image patches within an image or across image scales. With theoretical support from Bayes decision theory, we verify that our SR framework learns and selects the optimal SVR model when producing an SR image, which results in the minimum SR reconstruction error. We evaluate our method on a variety of images and obtain very promising SR results. In most cases, our method quantitatively and qualitatively outperforms bicubic interpolation and state-of-the-art learning-based SR approaches.
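
    The regression step at the heart of such a framework maps local low-resolution evidence to high-resolution intensities. Below is a minimal sketch using scikit-learn's SVR on synthetic patch data; the features, targets, and hyper-parameters are placeholders rather than the paper's sparse-representation pipeline.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.random((500, 9))                  # flattened 3x3 low-res patches
y = X.mean(axis=1) + 0.05 * rng.standard_normal(500)  # stand-in HR targets

# Fit one SVR mapping patch features to the high-res center intensity.
model = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X, y)
print(model.predict(X[:3]))               # predicted high-res intensities
```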

  • A Multimodal Approach to Speaker Diarization on TV Talk-Shows

    Page(s): 509 - 520

    In this article, we propose solutions to the problem of speaker diarization of TV talk-shows, a problem for which adapted multimodal approaches, relying on data streams beyond audio alone, remain largely underexploited. We propose an original system that leverages prior knowledge of the structure of this type of content, especially the visual information relating to the active speakers, for improved diarization performance. The architecture of this system can be decomposed into two main stages. First, a reliable training set is created, in an unsupervised fashion, for each participant of the TV program being processed. This data is assembled by associating visual and audio descriptors carefully selected in a clustering cascade. Then, Support Vector Machines are used to classify the speech data of a given TV program. The performance of this new architecture is assessed on two French talk-show collections: Le Grand Échiquier and On n'a pas tout dit. The results show that our new system outperforms state-of-the-art methods, evidencing the effectiveness of kernel-based methods, as well as visual cues, in multimodal approaches to speaker diarization of challenging content such as TV talk-shows.

  • VideoPuzzle: Descriptive One-Shot Video Composition

    Page(s): 521 - 534

    A large number of short, single-shot videos are created by personal camcorders every day, such as the small video clips in family albums, so a solution for presenting and managing these video clips is highly desired. From the perspective of professionalism and artistry, long-take video, also termed one-shot video, can present events, persons, or scenic spots in an informative manner. This paper presents a novel video composition system, "VideoPuzzle", which generates aesthetically enhanced long-shot videos from short video clips. Our task is to automatically compose several related single shots into a virtual long-take video with spatial and temporal consistency. We propose a novel framework to compose a descriptive long-take video from content-consistent shots retrieved from a video pool. For each video, a frame-by-frame search is performed over the entire pool to find start-end content correspondences through a coarse-to-fine partial matching process. The content correspondence here is general and can refer to matched regions or objects, such as human bodies and faces. The content consistency of these correspondences enables us to design several shot transition schemes that seamlessly stitch one shot to another in a spatially and temporally consistent manner. The entire long-take video thus comprises several single shots with consistent content and fluent transitions. Meanwhile, with the generated matching graph of videos, the proposed system can also provide an efficient video browsing mode. Experiments are conducted on multiple video albums and the results demonstrate the effectiveness and usefulness of the proposed scheme.

  • Edge-Preserving Texture Suppression Filter Based on Joint Filtering Schemes

    Page(s): 535 - 548

    Obtaining a texture-smoothing and edge-preserving filtered output is important for image decomposition. Although edges and textures are saliently different in human vision, automatically distinguishing them is difficult, for they have similar intensity differences and gradient responses. State-of-the-art edge-preserving smoothing (EPS) based decomposition approaches struggle to obtain satisfactory results. We propose a novel edge-preserving texture suppression filter that exploits the joint bilateral filter as a bridge to achieve both texture smoothing and edge preservation. We develop iterative asymmetric sampling and a local linear model to produce a degenerate image that suppresses the texture, and apply an edge correction operator to preserve edges. An efficient accelerated implementation is introduced to improve the filtering response time. The experiments demonstrate that our filter produces satisfactory outputs with both texture-smoothing and edge-preserving properties when compared, in signal, visual, and timing analyses, with the results of other popular EPS approaches. Finally, we extend our filter to a variety of image processing applications.
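
    The joint (cross) bilateral filter used as the bridge smooths one image while drawing its range weights from another, so edges present in the guide survive the smoothing. A minimal, unaccelerated sketch for 2-D float arrays in [0, 1], with illustrative parameter values:

```python
import numpy as np

def joint_bilateral(src, guide, radius=3, sigma_s=2.0, sigma_r=0.1):
    # Average values from `src`, but compute range weights from `guide`,
    # so edges present in the guide are preserved in the output.
    h, w = src.shape
    src_p = np.pad(src, radius, mode="reflect")
    gui_p = np.pad(guide, radius, mode="reflect")
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs**2 + ys**2) / (2 * sigma_s**2))
    out = np.empty_like(src)
    for i in range(h):
        for j in range(w):
            win_s = src_p[i:i + 2*radius + 1, j:j + 2*radius + 1]
            win_g = gui_p[i:i + 2*radius + 1, j:j + 2*radius + 1]
            wgt = spatial * np.exp(-(win_g - guide[i, j])**2
                                   / (2 * sigma_r**2))
            out[i, j] = (wgt * win_s).sum() / wgt.sum()
    return out

img = np.random.default_rng(0).random((40, 40))
print(joint_bilateral(img, img)[0, 0])   # guide == src: plain bilateral
```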

  • Example-Based Color Transfer for Gradient Meshes

    Page(s): 549 - 560

    Editing a photo-realistic gradient mesh is a tough task. Even editing only the colors of an existing gradient mesh can be exhausting and time-consuming. To facilitate user-friendly color editing, we develop an example-based color transfer method for gradient meshes, which transfers the color characteristics of an example image to a gradient mesh. We start by exploiting the constraints of the gradient mesh, and accordingly propose a linear-operator-based color transfer framework. Our framework operates only on the colors and color gradients of the mesh points and preserves the topological structure of the gradient mesh. With this framework in mind, we build our approach on PCA-based color transfer. After relieving the color range problem, we incorporate a fusion-based optimization scheme to improve color similarity between the reference image and the recolored gradient mesh. Finally, a multi-swatch transfer scheme is provided to enable more user control. Our approach is simple, effective, and much faster than applying color transfer directly to the rasterized gradient mesh. The experimental results also show that our method can generate pleasing recolored gradient meshes.
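
    PCA-based color transfer aligns the second-order color statistics of a source with those of a reference. The per-pixel sketch below conveys the linear operator involved; the paper itself applies such an operator to mesh-point colors and color gradients rather than to raster pixels.

```python
import numpy as np

def pca_color_transfer(src, ref):
    # Map the color distribution of `src` (N x 3) onto that of `ref`:
    # decorrelate with the eigenvectors of the channel covariance,
    # rescale each principal axis, then rotate into the reference frame.
    mu_s, mu_r = src.mean(0), ref.mean(0)
    es, Us = np.linalg.eigh(np.cov(src.T))
    er, Ur = np.linalg.eigh(np.cov(ref.T))
    T = Ur @ np.diag(np.sqrt(np.maximum(er, 1e-12))) \
           @ np.diag(1.0 / np.sqrt(np.maximum(es, 1e-12))) @ Us.T
    return (src - mu_s) @ T.T + mu_r

src = np.random.default_rng(0).random((1000, 3))
ref = 0.5 * np.random.default_rng(1).random((800, 3)) + 0.3
out = pca_color_transfer(src, ref)
print(out.mean(0), ref.mean(0))   # transferred stats track the reference
```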

  • Feature Processing and Modeling for 6D Motion Gesture Recognition

    Page(s): 561 - 571

    A 6D motion gesture is represented by a 3D spatial trajectory augmented by three further dimensions of orientation. Depending on the tracking technology, the motion can be tracked explicitly, with position and orientation, or implicitly, with acceleration and angular speed. In this work, we address the problem of motion gesture recognition for command-and-control applications. Our main contribution is to investigate the relative effectiveness of various feature dimensions for motion gesture recognition in both user-dependent and user-independent cases. We introduce a statistical feature-based classifier as the baseline and propose an HMM-based recognizer, which offers more flexibility in feature selection and achieves better recognition accuracy than the baseline system. Our motion gesture database, which contains both explicit and implicit motion information, allows us to compare the recognition performance of different tracking signals on common ground. This study also gives insight into the attainable recognition rate with different tracking devices, which is valuable for system designers choosing the proper tracking technology.
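
    An HMM-based recognizer of this kind typically trains one model per gesture class and labels a new trajectory by maximum log-likelihood. A toy sketch using the hmmlearn package, with synthetic 6-D trajectories and model sizes standing in for real tracker data and tuned configurations:

```python
import numpy as np
from hmmlearn import hmm   # assumed available: pip install hmmlearn

rng = np.random.default_rng(0)

def make_gesture(offset, n=40):
    # Toy 6-D frame sequence (3-D position + 3-D orientation).
    t = np.linspace(0, 1, n)[:, None]
    return np.hstack([np.sin(t + offset), np.cos(t + offset),
                      t, t + offset, np.sin(2*t), np.cos(2*t)]) \
           + 0.05 * rng.standard_normal((n, 6))

# Train one HMM per gesture class on several example sequences.
models = {}
for label, off in [("circle", 0.0), ("swipe", 1.5)]:
    seqs = [make_gesture(off) for _ in range(10)]
    m = hmm.GaussianHMM(n_components=4, covariance_type="diag", n_iter=25)
    models[label] = m.fit(np.vstack(seqs), [len(s) for s in seqs])

# Classify a new sequence by maximum log-likelihood.
test = make_gesture(1.5)
print(max(models, key=lambda k: models[k].score(test)))   # -> swipe
```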

  • Multi-Feature Fusion via Hierarchical Regression for Multimedia Analysis

    Page(s): 572 - 581

    Multimedia data are usually represented by multiple features. In this paper, we propose a new algorithm, namely Multi-feature Learning via Hierarchical Regression, for multimedia semantics understanding, in which two issues are considered. First, labeling large amounts of training data is labor-intensive, so it is meaningful to effectively leverage unlabeled data to facilitate multimedia semantics understanding. Second, given that multimedia data can be represented by multiple features, it is advantageous to develop an algorithm that combines evidence obtained from different features to infer reliable multimedia semantic concept classifiers. We design a hierarchical regression model to exploit the information derived from each type of feature, which is then collaboratively fused to obtain a multimedia semantic concept classifier. Both the label information and the data distributions of the different features representing the multimedia data are considered. The algorithm can be applied to a wide range of multimedia applications, and experiments are conducted on video data for video concept annotation and action recognition. Experimental results on the TRECVID and CareMedia video datasets show that it is beneficial to combine multiple features, and that the performance of the proposed algorithm is remarkable when only a small amount of labeled training data is available.

  • Learning to Reassemble Shredded Documents

    Page(s): 582 - 593

    In this paper, we address the problem of automatically assembling shredded documents. We propose a two-step algorithmic framework. First, we digitize each fragment of a given document and extract shape- and content-based local features. Based on these multimodal features, we identify pairs of corresponding points on all pairs of fragments using an SVM classifier. Each pair is considered a point of attachment for aligning the respective fragments. To restore the layout of the document, we create a document graph in which nodes represent fragments and edges correspond to alignments. We assign weights to the edges by evaluating the alignments using a set of inter-fragment constraints that take into account shape- and content-based information. Finally, we use an iterative algorithm that chooses the edge having the highest weight during each iteration. However, since selecting an edge corresponds to combining groups of fragments and thus provides new evidence, we reevaluate the edge weights after each iteration. We quantitatively evaluate the effectiveness of our approach by conducting experiments on a novel dataset comprising a total of 120 pages taken from two magazines, shredded and annotated manually. We thus provide the means for a quantitative evaluation of assembly algorithms which, to the best of our knowledge, has not been done before.
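
    The iterative step (commit the best alignment, merge the fragment groups, then re-score the remaining edges) can be sketched as greedy agglomeration over a weighted graph. The rescore callback below is a hypothetical stand-in for the paper's constraint re-evaluation:

```python
def assemble(n_frags, edges, rescore):
    # edges: {(i, j): weight}. Repeatedly commit the highest-weight
    # alignment, merge the two fragment groups, and let `rescore`
    # update the remaining weights in light of the new evidence.
    groups = {i: {i} for i in range(n_frags)}
    while edges:
        (i, j), _ = max(edges.items(), key=lambda kv: kv[1])
        del edges[(i, j)]
        if groups[i] is groups[j]:
            continue                       # already assembled together
        merged = groups[i] | groups[j]
        for k in merged:
            groups[k] = merged
        edges = rescore(groups, edges)
    return {frozenset(g) for g in groups.values()}

# Toy run: the rescore callback drops alignments that fall below 0.5,
# so fragments 0-1 merge while fragment 2 stays separate.
print(assemble(3, {(0, 1): 0.9, (1, 2): 0.1},
               lambda g, e: {k: w for k, w in e.items() if w > 0.5}))
```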

  • Interactive Multimodal Visual Search on Mobile Device

    Page(s): 594 - 607

    This paper describes a novel multimodal interactive image search system for mobile devices. The system, Joint search with ImaGe, Speech, And Word Plus (JIGSAW+), takes full advantage of the multimodal input and natural user interactions of mobile devices. It is designed for users who already have pictures in their minds but have no precise descriptions or names for them. By describing the desired picture in speech and then refining the recognized query through a visual query composed interactively from exemplary images, the user can easily find the desired images through a few natural multimodal interactions with his/her mobile device. Compared with our previous work, JIGSAW, the algorithm has been significantly improved in three aspects: 1) segmentation-based image representation is adopted to remove the artificial block partitions; 2) relative position checking replaces the fixed position penalty; and 3) an inverted index is constructed instead of brute-force matching. The proposed JIGSAW+ achieves a 5% gain in search performance and is ten times faster.
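
    The third improvement, an inverted index over visual words, works like a text-search index: each word points to the images that contain it, so only images sharing words with the query are ever touched. A minimal sketch with illustrative word and image ids:

```python
from collections import defaultdict

# Inverted index: visual word id -> list of image ids containing it,
# replacing brute-force matching against every database image.
index = defaultdict(list)

def add_image(image_id, visual_words):
    for w in set(visual_words):
        index[w].append(image_id)

def query(visual_words):
    votes = defaultdict(int)
    for w in set(visual_words):
        for image_id in index[w]:
            votes[image_id] += 1          # one vote per shared word
    return sorted(votes, key=votes.get, reverse=True)

add_image("img1", [3, 17, 42])
add_image("img2", [17, 99])
print(query([17, 42]))                    # -> ['img1', 'img2']
```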

  • A Gram-Based String Paradigm for Efficient Video Subsequence Search

    Page(s): 608 - 620

    The unprecedented increase in the generation and dissemination of video data has created an urgent demand for large-scale video content management systems that can quickly retrieve videos of interest to users. Traditionally, video sequence data are managed by high-dimensional indexing structures, most of which suffer from the well-known “curse of dimensionality” and lack support for subsequence retrieval. Inspired by the high efficiency of string indexing methods, in this paper we present a string paradigm called VideoGram for large-scale video sequence indexing with fast similarity search. In VideoGram, the feature space is modeled as a set of visual words, and each database video sequence is mapped into a string. A gram-based indexing structure is then built to counter the effect of the “curse of dimensionality” and support video subsequence matching. Given a high-dimensional query video sequence, retrieval is performed by transforming the query into a string and then searching for matching strings in the index structure. By doing so, expensive high-dimensional similarity computations can be completely avoided. An efficient sequence search algorithm with upper-bound pruning power is also presented. We conduct an extensive performance study on real-life video collections to validate the novelty of our proposal.
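
    The paradigm can be sketched in a few lines: quantize frame features to visual-word characters, concatenate them into strings, and index q-grams together with their positions. The 1-D codebook below is a toy stand-in for a high-dimensional visual vocabulary:

```python
from collections import defaultdict

def to_string(frames, codebook):
    # Map each frame feature to its nearest visual word (toy 1-D case).
    return "".join(min(codebook, key=lambda c: abs(codebook[c] - f))
                   for f in frames)

def build_gram_index(strings, q=2):
    index = defaultdict(list)
    for vid, s in strings.items():
        for i in range(len(s) - q + 1):
            index[s[i:i + q]].append((vid, i))   # gram -> postings
    return index

codebook = {"a": 0.1, "b": 0.5, "c": 0.9}
db = {"v1": to_string([0.1, 0.5, 0.9, 0.5], codebook),
      "v2": to_string([0.9, 0.9, 0.1, 0.5], codebook)}
idx = build_gram_index(db)
print(idx[to_string([0.5, 0.9], codebook)])      # -> [('v1', 1)]
```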

  • An Unsupervised Hierarchical Feature Learning Framework for One-Shot Image Recognition

    Page(s): 621 - 632

    One-shot recognition has attracted increasing attention recently, inspired by the fact that human cognitive systems can perform recognition tasks well given only one or a few labeled training samples, in contrast to conventional object recognition systems that require a large number of labeled training images. One-shot recognition is a visual classification task in which only one training sample is available for each object category in the target test domain, with the help of prior-knowledge data from a source domain. In this paper, we tackle this challenging one-shot recognition problem under a more demanding setting, using only unlabeled images as prior knowledge; this requires less labeling effort than previous works, which adopt fully labeled data and/or a sophisticated attribute table designed by human experts. We propose a novel unsupervised hierarchical feature learning framework to learn a feature pyramid from the prior-knowledge domain. The proposed feature learning method can also be applied across multiple feature spaces. Furthermore, we propose using pyramid-matching kernels to combine multilevel features. Examining the “Animals with Attributes” and Caltech-4 datasets in our one-shot recognition setting, we show that the proposed unsupervised feature learning approach, given very limited information, achieves performance comparable to that of supervised approaches.
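
    A pyramid-matching kernel counts feature matches at increasingly coarse histogram resolutions and weights matches formed at finer levels more heavily. A minimal sketch for 1-D feature sets; real use would bin multi-dimensional descriptors:

```python
import numpy as np

def pyramid_match(x, y, levels=3, bins0=16):
    # Histogram-intersection matches at each level; matches that first
    # appear at level l (coarser bins) get weight 1 / 2**l.
    k, prev = 0.0, 0.0
    for l in range(levels + 1):
        nbins = max(bins0 >> l, 1)
        hx, _ = np.histogram(x, bins=nbins, range=(0.0, 1.0))
        hy, _ = np.histogram(y, bins=nbins, range=(0.0, 1.0))
        inter = np.minimum(hx, hy).sum()     # matches at this level
        k += (inter - prev) / (2 ** l)       # only newly formed matches
        prev = inter
    return k

a = np.random.default_rng(0).random(30)
b = np.random.default_rng(1).random(30)
print(pyramid_match(a, b), pyramid_match(a, a))  # self-match is largest
```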

  • Video-to-Shot Tag Propagation by Graph Sparse Group Lasso

    Page(s): 633 - 646

    Traditional approaches to video tagging propagate tags at the same level: for example, assigning video-level tags from training videos to a test video, or assigning tags to a test shot given a collection of annotated shots. This paper focuses on automatic shot tagging given a collection of videos annotated only at the video level; in other words, we aim to assign specific tags from the training videos to the test shot. We solve this video-to-shot (V2S) problem by assigning to the test shot tags drawn from subsets of the tags of a subset of the training videos. To achieve this, the paper first proposes a novel Graph Sparse Group Lasso (GSGL) model to linearly reconstruct the visual feature of the test shot from the visual features of the training videos, i.e., to find the correlations between the test shot and the training videos. The paper then proposes a new tag propagation rule that assigns video-level tags to the test shot according to the learned correlations. Moreover, to effectively build the reconstruction model, the proposed GSGL simultaneously takes several constraints into account, such as inter-group sparsity, intra-group sparsity, the temporal-spatial prior knowledge in the training videos, and the local structure of the test shot. Extensive experiments on public video datasets clearly demonstrate the effectiveness of the proposed method for video-to-shot tag propagation.
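
    The propagation idea, reconstructing the test shot from the training videos and passing tags along the reconstruction weights, can be sketched with a plain Lasso standing in for the proposed Graph Sparse Group Lasso; the features and tags below are synthetic:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
# 5 training videos, 20-D visual features, video-level tag sets.
V = rng.random((5, 20))
tags = [{"beach"}, {"beach", "dog"}, {"city"}, {"dog"}, {"city", "car"}]
shot = 0.6 * V[1] + 0.4 * V[3] + 0.01 * rng.standard_normal(20)

# Sparse linear reconstruction of the test shot from the videos
# (plain Lasso as a stand-in for the paper's GSGL model).
w = Lasso(alpha=1e-3, positive=True).fit(V.T, shot).coef_

scores = {}
for wi, t in zip(w, tags):
    for tag in t:
        scores[tag] = scores.get(tag, 0.0) + wi   # propagate by weight
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```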

  • Segmentation and Rectification of Pictures in the Camera-Captured Images of Printed Documents

    Page(s): 647 - 660

    This paper presents an algorithm that segments and rectifies pictures in camera-captured document images. Most conventional methods for this purpose require the 3-D shape of the document surface, which is usually measured or inferred with a depth-measuring device, structured light, or a stereo system. Unlike these methods, ours requires only a single-view image and a rough user-provided bounding box around the picture. Hence, the main features of the proposed algorithm are simple user interaction and short processing time: a mega-pixel image can be segmented and rectified within 1-2 s of receiving the user's bounding box. To achieve this, we develop a novel boundary extraction algorithm that exploits the specific properties of printed material: a set of boundary candidates is generated, and the optimal boundary is found using an alternating optimization scheme. In addition to the segmentation method, we also propose a new rectification method, which can largely remove perspective distortions. Experimental results on a variety of images show that our method is efficient, robust, and easy to use.
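
    Once the picture's boundary quadrilateral is found, removing the perspective distortion reduces to warping by a homography. A minimal OpenCV sketch on a synthetic page image; in the paper the quadrilateral comes from the boundary extraction step rather than being given:

```python
import cv2
import numpy as np

# Synthetic "document photo": a dark quadrilateral standing in for a
# perspective-distorted picture region on a white page.
img = np.full((480, 640, 3), 255, np.uint8)
quad = np.float32([[120, 80], [520, 60], [560, 400], [100, 430]])
cv2.fillConvexPoly(img, quad.astype(np.int32), (40, 90, 160))

# Warp the quadrilateral into a fronto-parallel rectangle.
w, h = 450, 360
rect = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
H = cv2.getPerspectiveTransform(quad, rect)     # 3x3 homography
out = cv2.warpPerspective(img, H, (w, h))
print(out.shape)                                # (360, 450, 3)
```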

  • Feature Selection for Multimedia Analysis by Sharing Information Among Multiple Tasks

    Page(s): 661 - 669

    While much progress has been made in multi-task classification and subspace learning, multi-task feature selection has long been largely unaddressed. In this paper, we propose a new multi-task feature selection algorithm and apply it to multimedia (e.g., video and image) analysis. Instead of evaluating the importance of each feature individually, our algorithm selects features in a batch mode, so that feature correlations are considered. While feature selection has received much research attention, less effort has been devoted to improving its performance by leveraging knowledge shared among multiple related tasks. Our algorithm builds upon the assumption that different related tasks have common structures: multiple feature selection functions for the different tasks are learned simultaneously in a joint framework, which enables the algorithm to use the common knowledge of multiple tasks as supplementary information to facilitate decision making. An efficient iterative algorithm with guaranteed convergence is proposed for the optimization. Experiments on different databases demonstrate the effectiveness of the proposed algorithm.
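
    One common way to realize batch-mode, cross-task selection is to learn per-task weight vectors and rank features by the norm of each feature's weights across tasks. The sketch below fits independent ridge regressions for brevity, so it is only a rough stand-in for the paper's joint objective:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 30))        # shared 30-D features
W = []
for t in range(3):                        # three related tasks
    # Each task's target depends on the same first five features.
    y = X[:, :5] @ rng.random(5) + 0.1 * rng.standard_normal(100)
    w = np.linalg.solve(X.T @ X + 1.0 * np.eye(30), X.T @ y)  # ridge
    W.append(w)

# Score each feature by the l2 norm of its weights across tasks:
# features useful for all tasks score high.
scores = np.linalg.norm(np.stack(W, axis=1), axis=1)
print(np.argsort(scores)[::-1][:5])       # top shared features (~0..4)
```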

  • Online Allocation of Communication and Computation Resources for Real-Time Multimedia Services

    Page(s): 670 - 683

    In a network, the location of the node on which a service is computed is inextricably linked to the locations of the paths through which the service communicates. Hence, service location can have a profound effect on quality of service, especially for communication-centric applications such as real-time multimedia. In this paper, we propose an online algorithm that uses pricing to consider server load, route congestion, and propagation delay jointly when locating servers and routes for real-time multimedia services in a network with fixed computing and communication capacities. The algorithm is online in the sense that it sequentially allocates resources for services of long and unknown duration as demands arrive, without the benefit of looking ahead to later demands. By formulating the problem as one of lowest-cost subgraph packing, we prove that our algorithm is nevertheless C-competitive with the optimal algorithm that looks ahead, meaning that our performance is within a constant factor C of optimal, as measured by the total number of service demands satisfied, or total user utility. Using mixing services as an example, we show through experimental results that our algorithm can adapt to cross traffic and automatically route around congestion and failures of nodes and edges, can reduce latency by 40% or more, and can pack 20% more sessions, or alternatively double the number of sessions before significant call rejection, compared with conventional approaches.
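
    The pricing idea can be sketched on a toy graph: each edge's price grows exponentially with its utilization, and each arriving session is routed along the currently cheapest path, so traffic shifts away from filling edges. Topology and constants are illustrative, not the paper's algorithm:

```python
import heapq, math

cap = {("a", "b"): 10, ("b", "c"): 10, ("a", "c"): 2}
load = {e: 0 for e in cap}
MU = 8.0   # steepness of the congestion price

def route(src, dst):
    # Dijkstra under current prices; edge e costs MU**(load/cap), so
    # nearly full edges look expensive and are routed around.
    dist, prev, pq = {src: 0.0}, {}, [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            break
        for (x, y), c in cap.items():
            if x == u:
                nd = d + MU ** (load[(x, y)] / c)
                if nd < dist.get(y, math.inf):
                    dist[y], prev[y] = nd, u
                    heapq.heappush(pq, (nd, y))
    path, u = [dst], dst
    while u != src:
        u = prev[u]
        path.append(u)
    return path[::-1]

for _ in range(4):                 # admit four sessions from a to c
    p = route("a", "c")
    print(p)                       # shifts to a->b->c as a->c fills up
    for e in zip(p, p[1:]):
        load[e] += 1
```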

  • A Differential Coding-Based Scheduling Framework for Wireless Multimedia Sensor Networks

    Page(s): 684 - 697

    In wireless multimedia sensor networks (WMSNs), visual correlation exists among multiple nearby cameras, leading to considerable redundancy in the collected images. This paper proposes a differential coding-based scheduling framework for efficiently gathering visually correlated images. The framework consists of two components: MinMax Degree Hub Location (MDHL) and Maximum Lifetime Scheduling (MLS). The MDHL problem aims to find optimal locations for the multimedia processing hubs, which operate on different channels to concurrently collect images from adjacent cameras, such that the number of channels required for frequency reuse is minimized. After associating camera sensors with proper hubs, the MLS problem aims to design a schedule for the cameras that maximizes their network lifetime by letting highly correlated cameras perform differential coding on the fly. We prove that the MDHL problem is NP-complete and the MLS problem is NP-hard, and consequently propose approximation algorithms with bounded performance. Since the designed algorithms take only the camera settings as inputs, they are independent of specific multimedia applications. Experiments and simulations show that the proposed differential coding-based scheduling can effectively enhance the network throughput and the energy efficiency of camera sensors.

  • Joint Social and Content Recommendation for User-Generated Videos in Online Social Network

    Page(s): 698 - 709

    Online social networks are emerging as a promising alternative for users to directly access video content. By allowing users to import videos and re-share them through their social connections, a large number of videos become available to users in the online social network. The rapid growth of user-generated videos provides enormous potential for users to find ones that interest them, while the convergence of online social network services and online video sharing services makes it possible to perform recommendation using social and content factors jointly. In this paper, we design a joint social-content recommendation framework to suggest to users which videos to import or re-share in the online social network. In this framework, we first propose a user-content matrix update approach that updates and fills in cold user-video entries to provide the foundation for recommendation. Then, based on the updated user-content matrix, we construct a joint social-content space to measure the relevance between users and videos, which provides high accuracy for video importing and re-sharing recommendation. We conduct experiments using real traces from Tencent Weibo and Youku to verify our algorithm and evaluate its performance. The results demonstrate the effectiveness of our approach and show that it can substantially improve recommendation accuracy.

  • An Optimization Framework for QoS-Enabled Adaptive Video Streaming Over OpenFlow Networks

    Page(s): 710 - 715

    OpenFlow is a programmable network protocol, with associated hardware, designed to effectively manage and direct traffic by decoupling the control and forwarding layers of routing. This paper presents an analytical framework for optimizing forwarding decisions at the control layer to enable dynamic Quality of Service (QoS) over OpenFlow networks, and discusses the application of this framework to QoS-enabled streaming of scalably encoded videos with two QoS levels. We pose and solve the optimization of dynamic QoS routing as a constrained shortest path problem, where the base layer of the scalably encoded video is treated as a level-1 QoS flow, while the enhancement layers can be treated as level-2 QoS or best-effort flows. We provide experimental results showing that the proposed dynamic QoS framework achieves significant improvement in the overall quality of streaming of scalably encoded videos under various coding configurations and network congestion scenarios.
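
    Treating QoS routing as a constrained shortest path problem means minimizing one metric (e.g., cost) subject to a bound on another (e.g., delay). Below is a minimal pseudo-polynomial dynamic program over integer delay budgets; the topology and numbers are illustrative:

```python
import math

edges = {("s", "a"): (1, 2), ("a", "t"): (1, 2),   # (cost, delay)
         ("s", "b"): (3, 1), ("b", "t"): (3, 1),
         ("s", "t"): (10, 1)}
D = 3                                              # delay bound

# best[(n, d)] = cheapest cost to reach n using total delay <= d.
nodes = {n for e in edges for n in e}
best = {(n, d): math.inf for n in nodes for d in range(D + 1)}
for d in range(D + 1):
    best[("s", d)] = 0.0

for _ in range(len(nodes)):                        # Bellman-Ford sweeps
    for (u, v), (c, dl) in edges.items():
        for d in range(dl, D + 1):
            if best[(u, d - dl)] + c < best[(v, d)]:
                best[(v, d)] = best[(u, d - dl)] + c

print(best[("t", D)])   # -> 6.0: s->b->t meets the bound; s->a->t does not
```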

  • TMM EDICS

    Page(s): 716
  • IEEE Transactions on Multimedia information for authors

    Page(s): 717 - 718
  • IEEE Transactions on Multimedia 2013 Prize Paper Award

    Page(s): 719

Aims & Scope

The scope of the Periodical is the various aspects of research in multimedia technology and applications of multimedia.


Meet Our Editors

Editor-in-Chief
Chang Wen Chen
State University of New York at Buffalo