IEEE Transactions on Multimedia

Issue 4 • June 2010

  • Table of contents

    Page(s): C1 - C4
    PDF (44 KB)
    Freely Available from IEEE
  • IEEE Transactions on Multimedia publication information

    Page(s): C2
    PDF (36 KB)
    Freely Available from IEEE
  • A Bayesian Approach to Automated Creation of Tactile Facial Images

    Page(s): 233 - 246
    PDF (1921 KB) | HTML

    Portrait photos (facial images) play important social and emotional roles in our lives. This type of visual media is unfortunately inaccessible to users with visual impairment. This paper proposes a systematic approach for automatically converting human facial images into a tactile form that can be printed on a tactile printer and explored by a user who is blind. We propose a deformable Bayesian Active Shape Model (BASM), which integrates anthropometric priors with shape and appearance information learnt from a face dataset. We design an inference algorithm under this model for processing new face images to create an input-adaptive face sketch. The model is further enhanced with input-specific details through semantic-aware processing. We report experiments evaluating the accuracy of face alignment using the proposed method, with comparisons to other state-of-the-art results. Furthermore, subjective evaluations of the produced tactile face images were performed by 17 persons, including six visually-impaired users, confirming the effectiveness of the proposed approach in conveying the vital visual information in a face image via haptics.

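    Fitting a shape model under a Bayesian prior, as in the abstract above, can be illustrated with a linear-Gaussian toy version. The sketch below computes a MAP estimate of PCA shape coefficients from noisy landmark observations and blends the result with a Gaussian anthropometric prior; it is not the paper's BASM or its inference algorithm, and every variable name, array shape, and noise value here is an assumption.

    import numpy as np

    def map_shape_estimate(observed, mean_shape, basis, eigvals,
                           prior_mean, prior_var, obs_var=4.0):
        """Hypothetical MAP shape update: observed = mean_shape + basis @ b + noise,
        with b ~ N(0, diag(eigvals)); the constrained shape is then precision-weighted
        with an anthropometric prior N(prior_mean, prior_var). Illustrative only."""
        P = basis                                    # (2N, K) PCA shape basis
        A = P.T @ P / obs_var + np.diag(1.0 / eigvals)
        b = np.linalg.solve(A, P.T @ (observed - mean_shape) / obs_var)
        shape = mean_shape + P @ b                   # model-constrained landmarks
        w = (1.0 / obs_var) / (1.0 / obs_var + 1.0 / prior_var)
        return w * shape + (1.0 - w) * prior_mean    # blend with anthropometric prior
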
  • A Real-Time Framework for Video Time and Pitch Scale Modification

    Page(s): 247 - 256
    PDF (1023 KB) | HTML

    A framework is presented which addresses the issues related to the real-time implementation of synchronized video and audio time-scale and pitch-scale modification algorithms. It allows for seamless real-time transitions between continually varying, independent time-scale and pitch-scale parameters arising from manual or automatic intervention. We illuminate the problems which arise in a real-time context and provide novel solutions to prevent artifacts, minimize latency, and improve synchronization. The time and pitch scaling approach is based on a modified phase vocoder with optional phase locking and an integrated transient detector which enables high-quality transient preservation in real time. A novel method for audio/visual synchronization was implemented to ensure no perceptible latency between audio and video while real-time time scaling and pitch shifting are applied. Evaluation results are reported which demonstrate both high audio quality and minimal synchronization error.

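    The time and pitch scaling above builds on a modified phase vocoder. The sketch below is a bare-bones, offline phase-vocoder time stretch in Python/NumPy, shown only to illustrate the basic mechanism (magnitude interpolation plus accumulated instantaneous phase); it omits the paper's phase locking, transient preservation, and real-time synchronization, and assumes a mono float signal longer than one FFT frame. Pitch shifting can then be approximated by resampling the stretched output; all parameter values are illustrative defaults.

    import numpy as np

    def phase_vocoder_stretch(x, rate, n_fft=2048, hop=512):
        """Time-stretch x by `rate` (>1 shortens, <1 lengthens) with a textbook
        phase vocoder. Not the paper's algorithm; a minimal illustration only."""
        win = np.hanning(n_fft)
        n_frames = 1 + (len(x) - n_fft) // hop
        stft = np.array([np.fft.rfft(win * x[i * hop:i * hop + n_fft])
                         for i in range(n_frames)])
        omega = 2 * np.pi * np.arange(stft.shape[1]) * hop / n_fft   # expected phase advance per hop
        t_steps = np.arange(0, n_frames - 1, rate)
        phase = np.angle(stft[0])
        out = np.zeros(len(t_steps) * hop + n_fft)
        for k, t in enumerate(t_steps):
            i, frac = int(t), t - int(t)
            mag = (1 - frac) * np.abs(stft[i]) + frac * np.abs(stft[i + 1])
            dphi = np.angle(stft[i + 1]) - np.angle(stft[i]) - omega
            dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))         # wrap deviation to [-pi, pi]
            out[k * hop:k * hop + n_fft] += win * np.fft.irfft(mag * np.exp(1j * phase))
            phase += omega + dphi                                    # accumulate true instantaneous phase
        return out
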
  • An Image-Based Approach to Video Copy Detection With Spatio-Temporal Post-Filtering

    Page(s): 257 - 266
    PDF (1136 KB) | HTML

    This paper introduces a video copy detection system which efficiently matches individual frames and then verifies their spatio-temporal consistency. The approach for matching frames relies on a recent local feature indexing method, which is at the same time robust to significant video transformations and efficient in terms of memory usage and computation time. We match either keyframes or uniformly sampled frames. To further improve the results, a verification step robustly estimates a spatio-temporal model between the query video and the potentially corresponding video segments. Experimental results evaluate the different parameters of our system and measure the trade-off between accuracy and efficiency. We show that our system obtains excellent results for the TRECVID 2008 copy detection task.

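    The spatio-temporal verification step can be pictured, in its simplest 1-D form, as checking that frame-level matches agree on one temporal offset between query and candidate. The snippet below votes on that offset and keeps the strongest bin; it is a deliberately simplified stand-in for the paper's spatio-temporal model estimation, and the match format and bin width are assumptions.

    from collections import defaultdict

    def temporal_consistency_score(frame_matches, bin_width=1.0):
        """frame_matches: iterable of (t_query, t_candidate, score) for one
        candidate video. Matches that agree on a single temporal shift are
        likely a true copy; isolated matches are filtered out by the voting."""
        votes = defaultdict(float)
        for t_q, t_c, s in frame_matches:
            votes[round((t_c - t_q) / bin_width)] += s
        if not votes:
            return 0.0, None
        best = max(votes, key=votes.get)
        return votes[best], best * bin_width

    # Example: three matches agree on an offset of about 12 s, one is an outlier.
    matches = [(1.0, 13.0, 0.9), (2.0, 14.1, 0.8), (3.0, 14.9, 0.7), (5.0, 40.0, 0.95)]
    print(temporal_consistency_score(matches))
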
  • A Stochastic Approach to Image Retrieval Using Relevance Feedback and Particle Swarm Optimization

    Page(s): 267 - 277
    PDF (860 KB) | HTML

    Understanding the subjective meaning of a visual query, by converting it into numerical parameters that can be extracted and compared by a computer, is the paramount challenge in the field of intelligent image retrieval, also referred to as the "semantic gap" problem. In this paper, an innovative approach is proposed that combines a relevance feedback (RF) approach with an evolutionary stochastic algorithm, called particle swarm optimizer (PSO), as a way to grasp the user's semantics through optimized iterative learning. The retrieval uses human interaction to achieve a twofold goal: 1) to guide the swarm particles in the exploration of the solution space towards the cluster of relevant images; 2) to dynamically modify the feature space by appropriately weighting the descriptive features according to the user's perception of relevance. Extensive simulations show that the proposed technique outperforms traditional deterministic RF approaches of the same class, thanks to its stochastic nature, which allows a better exploration of complex, nonlinear, and high-dimensional solution spaces.

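    The core PSO loop referred to above is easy to sketch: particles move under inertia plus attraction to their personal and global bests, with fitness measured against the user-marked relevant images. The code below is a generic PSO refinement of a query point, not the paper's scheme (which also re-weights individual features); the fitness function, parameter values, and use of the result for re-ranking are assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def pso_refine(database, relevant, n_particles=30, n_iters=50, w=0.7, c1=1.5, c2=1.5):
        """database: (M, d) feature matrix; relevant: (m, d) user-marked relevant images.
        Returns a refined query point that can be used to re-rank the database."""
        dim = database.shape[1]
        lo, hi = database.min(0), database.max(0)
        pos = rng.uniform(lo, hi, (n_particles, dim))
        vel = np.zeros_like(pos)

        def fitness(p):                              # closer to relevant cluster = better
            return -np.linalg.norm(relevant - p, axis=1).mean()

        pbest = pos.copy()
        pbest_f = np.array([fitness(p) for p in pos])
        gbest = pbest[pbest_f.argmax()].copy()
        for _ in range(n_iters):
            r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
            vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
            pos = np.clip(pos + vel, lo, hi)
            f = np.array([fitness(p) for p in pos])
            improved = f > pbest_f
            pbest[improved], pbest_f[improved] = pos[improved], f[improved]
            gbest = pbest[pbest_f.argmax()].copy()
        return gbest
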
  • Image Classification With Kernelized Spatial-Context

    Page(s): 278 - 287
    PDF (1089 KB) | HTML

    The goal of image classification is to classify a collection of unlabeled images into a set of semantic classes. Many methods approach this goal by leveraging the visual appearance of local patches in images. However, the spatial context between these local patches also provides significant information for improving classification accuracy. Traditional spatial contextual models, such as the two-dimensional hidden Markov model, attempt to construct one common model for each image category to depict the spatial structure of the images in that class. However, due to large intra-class variance within an image category, a single model has difficulty representing the various spatial contexts found in different images. In contrast, we propose to construct a prototype set of spatial contextual models by leveraging kernel methods, rather than only one model. Such an algorithm combines the rich representation ability of spatial contextual models with the powerful classification ability of kernel methods. In particular, we propose a new distance measure between different spatial contextual models that integrates joint appearance-spatial image features. This distance measure can be computed efficiently in a recursive formulation that scales well with image size. Extensive experiments demonstrate that the proposed approach significantly outperforms state-of-the-art approaches.

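    As a much-simplified illustration of joint appearance-spatial matching only (not the paper's model-to-model distance or its recursive computation), the sketch below scores the similarity of two images, each represented by local patch descriptors plus normalized patch positions, using a generic sum-match kernel; all names and bandwidths are assumptions.

    import numpy as np

    def joint_appearance_spatial_kernel(desc_a, pos_a, desc_b, pos_b,
                                        sigma_app=0.5, sigma_pos=0.2):
        """desc_*: (N, D) patch descriptors; pos_*: (N, 2) patch positions in [0, 1].
        Pairs of patches contribute when they agree in both appearance and location."""
        app = np.exp(-np.square(desc_a[:, None, :] - desc_b[None, :, :]).sum(-1)
                     / (2 * sigma_app ** 2))
        pos = np.exp(-np.square(pos_a[:, None, :] - pos_b[None, :, :]).sum(-1)
                     / (2 * sigma_pos ** 2))
        return float((app * pos).mean())
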
  • Constructing Concept Lexica With Small Semantic Gaps

    Page(s): 288 - 299
    PDF (1553 KB) | HTML

    In recent years, constructing mathematical models of visual concepts from content features, i.e., color, texture, shape, or local features, has driven the fast development of concept-based multimedia retrieval. In concept-based multimedia retrieval, defining a good lexicon of high-level concepts is the first and an important step. However, which concepts should be used for data collection and model construction is still an open question. It is generally agreed that concepts which can be easily described by low-level visual features make a good lexicon; these are called concepts with small semantic gaps. Unfortunately, there has been very little research on semantic gap analysis and on automatically choosing multimedia concepts with small semantic gaps, even though the differences in semantic gap among concepts are well worth investigating. In this paper, we propose a method to quantitatively analyze semantic gaps and develop a novel framework to identify high-level concepts with small semantic gaps from a large-scale web image dataset. Images with small semantic gaps are first selected and clustered by defining a confidence score and a content-context similarity matrix in visual space and textual space. Then, from the surrounding descriptions (titles, categories, and comments) of these images, concepts with small semantic gaps are automatically mined. In addition, considering that semantic gap analysis depends on both features and content-contextual consistency, we construct a lexicon family of high-level concepts with small semantic gaps (LCSS) based on different low-level features and different consistency measurements. The lexica in this family are independent of each other and mutually complementary. LCSS is very helpful for data collection, feature selection, annotation, and modeling for large-scale image retrieval. It also shows promising application potential for image annotation refinement and rejection. The experimental results demonstrate the validity of the developed concept lexica.

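    One way to picture the "small semantic gap" criterion is to ask whether images that look alike are also described alike. The sketch below scores each image by the tag overlap of its visual nearest neighbors; it is only an illustrative stand-in for the paper's confidence score and content-context similarity matrix, and every name and parameter is an assumption.

    import numpy as np

    def semantic_gap_confidence(features, tag_sets, k=10):
        """features: (n, d) visual feature matrix; tag_sets: list of n sets of tags.
        Images whose visual neighbors share tags (high mean Jaccard overlap) are
        candidates for concepts with small semantic gaps."""
        n = len(features)
        scores = np.zeros(n)
        for i in range(n):
            d = np.linalg.norm(features - features[i], axis=1)
            nn = np.argsort(d)[1:k + 1]                    # skip the image itself
            overlaps = [len(tag_sets[i] & tag_sets[j]) /
                        max(1, len(tag_sets[i] | tag_sets[j])) for j in nn]
            scores[i] = float(np.mean(overlaps))
        return scores   # rank images by score; mine frequent tags of the top-ranked set
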
  • An Adaptive Computational Model for Salient Object Detection

    Page(s): 300 - 316
    PDF (2939 KB) | HTML

    Salient object detection is a basic technique for many computer vision applications. In this paper, we propose an adaptive computational model to detect the salient object in color images. First, three human observation behaviors and scalable subtractive clustering techniques are used to construct an attention Gaussian mixture model (AGMM) and a background Gaussian mixture model (BGMM). Second, a Bayesian framework is employed to classify each pixel as salient object or background. Third, the expectation-maximization (EM) algorithm is used to update the parameters of the AGMM, BGMM, and Bayesian framework based on the detection results. Finally, the classification and update procedures are repeated until the detection results evolve to a steady state. Experiments on a variety of images demonstrate the robustness of the proposed method, and extensive quantitative evaluations and comparisons show that it significantly outperforms state-of-the-art methods.

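    The abstract describes an iterate-until-steady-state loop over two Gaussian mixture models and a Bayesian pixel classifier. The sketch below mimics that loop with scikit-learn's GaussianMixture, refitting the mixtures each round instead of performing the paper's EM parameter updates; the initial mask, the component count, and the steady-state check are assumptions.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def detect_salient(pixels, init_mask, n_components=5, n_rounds=5):
        """pixels: (H*W, 3) colors; init_mask: boolean (H*W,) initial guess of the
        salient region (e.g., from a center or contrast heuristic containing both
        classes). Alternately fit foreground/background GMMs and reclassify pixels
        with Bayes' rule until the labels stop changing."""
        mask = init_mask.copy()
        for _ in range(n_rounds):
            if mask.all() or not mask.any():
                break                                   # degenerate split, stop
            agmm = GaussianMixture(n_components=n_components).fit(pixels[mask])
            bgmm = GaussianMixture(n_components=n_components).fit(pixels[~mask])
            prior_fg = np.clip(mask.mean(), 1e-3, 1 - 1e-3)
            log_fg = agmm.score_samples(pixels) + np.log(prior_fg)
            log_bg = bgmm.score_samples(pixels) + np.log(1 - prior_fg)
            new_mask = log_fg > log_bg
            if np.array_equal(new_mask, mask):          # steady state reached
                break
            mask = new_mask
        return mask
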
  • An Adaptive Strategy for Mobile Ad Hoc Media Streaming

    Page(s): 317 - 329
    PDF (536 KB) | HTML

    Mobile devices are increasingly popular and many of them are capable of handling multimedia content. Users enjoy the ability to access their collection of media objects anywhere. Wireless connectivity is often integrated into these handhelds, providing the opportunity to stream multimedia content among mobile and ad hoc peers. An important consideration is to transfer a multimedia object in its entirety. This is often challenging, since the transmission time for such an object can be considerable due to an unfavorable combination of a large object size and limited available bandwidth. We previously introduced a strategy to improve the probability of successfully streaming a video sequence based on studying the minimum buffer size. This strategy takes advantage of layered video encoding schemes such as scalable video coding (SVC) or multiple description coding (MDC), adaptively selecting the number of layers to be streamed so that more frames are delivered before the wireless link disconnects while keeping the video quality high. In the current study, we simplify this strategy by using the streaming probability alone to dynamically adjust the number of layers to be delivered. Our proposed technique improves the prediction accuracy by incorporating the 802.11 Auto-Rate Fallback (ARF) scheme along with two popular mobility models: the random waypoint and the random walk models. Although ARF, which steps down the sending rate when consecutive transmission errors occur, is implemented in all hardware that follows the popular IEEE 802.11 standard, it is not commonly modeled in existing work. In addition, our approach can retransmit missing layers if peers reconnect after a link break, improving the rendering quality. Extensive simulations validate our technique, and the results show an improvement in streaming probability as well as in the number of layers that are transmitted.

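    The layer-selection idea can be illustrated as picking the largest number of layers whose transfer is likely to finish before the link breaks. The sketch below does exactly that under a toy connectivity model; the function names, the default probability model, and the thresholds are all assumptions, not the paper's streaming-probability estimate (which incorporates ARF and the mobility models).

    def choose_layers(layer_rates_kbps, est_bandwidth_kbps, contact_time_s,
                      clip_length_s, p_threshold=0.9, p_connected=None):
        """Pick the largest number of layers whose aggregate data can be delivered
        before the predicted link break. `p_connected(t)` estimates the probability
        the peers stay connected for t more seconds (here a crude linear decay)."""
        if p_connected is None:
            p_connected = lambda t: max(0.0, 1.0 - t / contact_time_s)
        best = 0
        for n in range(1, len(layer_rates_kbps) + 1):
            total_kbits = sum(layer_rates_kbps[:n]) * clip_length_s
            transfer_time = total_kbits / est_bandwidth_kbps
            if p_connected(transfer_time) >= p_threshold:
                best = n
        return best

    # Toy example: base layer at 300 kb/s plus two 200 kb/s enhancement layers;
    # with these illustrative numbers only the base layer meets the 0.9 target.
    print(choose_layers([300, 200, 200], est_bandwidth_kbps=2000,
                        contact_time_s=120, clip_length_s=60))
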
  • In-Image Accessibility Indication

    Page(s): 330 - 336
    PDF (1487 KB) | HTML

    About 8% of men and 0.8% of women suffer from colorblindness. Due to the loss of certain color information, regions or objects in some images cannot be recognized by these viewers, which may degrade their perception and understanding of the images. This paper introduces an in-image accessibility indication scheme, which aims to automatically point out regions of a manually designed image whose content can hardly be recognized by colorblind viewers. The proposed method first establishes a set of points around which the patches are not prominent enough for colorblind viewers due to the loss of color information. The inaccessible regions are then detected from these points via a regularization framework. This scheme can be applied to check the accessibility of designed images and consequently can help designers improve them, for example by modifying the colors of certain objects or components. To the best of our knowledge, this is the first work that attempts to detect regions with accessibility problems for colorblind viewers in images. Experiments are conducted on 1994 poster images, and the empirical results demonstrate the effectiveness of our approach.

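    A rough way to flag potentially inaccessible regions is to simulate a color-vision deficiency and look for patches whose local color contrast collapses. The sketch below uses a deliberately crude red-green collapse rather than a physiological dichromacy model, and it is not the paper's seed-point-plus-regularization method; all parameters and thresholds are assumptions.

    import numpy as np

    def simulate_red_green_loss(img):
        """Very crude stand-in for a dichromacy simulation: average the red and
        green channels so the red-green axis carries no information."""
        lum = (img[..., 0] + img[..., 1]) / 2.0
        return np.stack([lum, lum, img[..., 2]], axis=-1)

    def flag_inaccessible_patches(img, patch=8, ratio=0.35):
        """img: float RGB in [0, 1], shape (H, W, 3). Flag patches whose color
        variance largely collapses under the simulated deficiency, a rough
        indicator of content a colorblind viewer may not distinguish."""
        sim = simulate_red_green_loss(img)
        H, W, _ = img.shape
        flags = np.zeros((H // patch, W // patch), bool)
        for i in range(0, H - patch + 1, patch):
            for j in range(0, W - patch + 1, patch):
                v0 = img[i:i + patch, j:j + patch].reshape(-1, 3).var(0).sum()
                v1 = sim[i:i + patch, j:j + patch].reshape(-1, 3).var(0).sum()
                if v0 > 1e-4 and v1 < ratio * v0:
                    flags[i // patch, j // patch] = True
        return flags
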
  • Improvements on Sun's Conditional Access System in Pay-TV Broadcasting Systems

    Page(s): 337 - 340
    PDF (177 KB) | HTML

    A conditional access system (CAS) proposed by Sun has a critical security weakness: it cannot preserve backward secrecy, so a former subscriber can still access programs despite his or her change in status. This weakness in Sun's CAS arises because 1) no change is made to the group key after a new member arrives, and 2) updates of group keys are done in an insecure manner. We show how simple protocol changes can fix these weaknesses and thus render Sun's CAS capable of preserving backward secrecy.

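    The fix described above amounts to rekeying the group securely whenever membership changes. The toy sketch below (using the Python cryptography package's Fernet purely for illustration) regenerates the group key on every join or leave and wraps it under each remaining subscriber's individual key, so a departed subscriber cannot decrypt later broadcasts; it is not Sun's protocol or the paper's exact improvement, and all class and method names are assumptions.

    from cryptography.fernet import Fernet

    class GroupKeyManager:
        """Toy CAS head-end rekeying sketch: every membership change triggers a
        fresh group key, delivered to each current subscriber encrypted under
        that subscriber's own key."""

        def __init__(self):
            self.member_keys = {}                # subscriber id -> individual key
            self._rekey()

        def join(self, sid):
            self.member_keys[sid] = Fernet.generate_key()
            self._rekey()                        # new arrivals must not read past traffic
            return self.member_keys[sid]

        def leave(self, sid):
            self.member_keys.pop(sid, None)
            self._rekey()                        # departures must not read future traffic

        def _rekey(self):
            self.group_key = Fernet.generate_key()
            # Rekey messages: group key wrapped with each member's individual key
            self.rekey_msgs = {sid: Fernet(k).encrypt(self.group_key)
                               for sid, k in self.member_keys.items()}

        def encrypt_program(self, data: bytes) -> bytes:
            return Fernet(self.group_key).encrypt(data)
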
  • IEEE Transactions on Multimedia EDICS

    Page(s): 341
    PDF (16 KB)
    Freely Available from IEEE
  • IEEE Transactions on Multimedia Information for authors

    Page(s): 342 - 343
    PDF (46 KB)
    Freely Available from IEEE
  • Special issue on Using the Physical Layer for Securing the Next Generation of Communication Systems

    Page(s): 344
    PDF (140 KB)
    Freely Available from IEEE
  • IEEE Transactions on Multimedia society information

    Page(s): C3
    PDF (27 KB)
    Freely Available from IEEE

Aims & Scope

The scope of the Periodical is the various aspects of research in multimedia technology and applications of multimedia.


Meet Our Editors

Editor-in-Chief
Chang Wen Chen
State University of New York at Buffalo