MultiMedia, IEEE

Early Access Articles

Early Access articles are new content made available in advance of the final electronic or print versions, and result from IEEE's Preprint or Rapid Post processes. Preprint articles are peer-reviewed but not fully edited. Rapid Post articles are peer-reviewed and edited but not paginated. Both types of Early Access article are fully citable from the moment they appear in IEEE Xplore.

Displaying Results 1 - 25 of 29
  • Cross-Platform Social Event Detection

    Publication Year: 2015 , Page(s): 1

    A large part of the media shared on online platforms such as Flickr and YouTube is captured at various social events (e.g., music festivals, exhibitions, and sports events). While it is quite easy to share personal impressions online, it is much more challenging to identify content that is related to the same social event across different platforms. In this paper we focus on the detection of social events in a data collection from Flickr and YouTube. We propose an unsupervised, multi-staged approach that explores commonly available, real-world metadata for the detection and linking of social events across sharing platforms. The proposed methodology and the performed experiments allow for a thorough evaluation of the usefulness of available metadata in the context of social event detection in both single-platform and cross-platform scenarios.

  • Manipulating Ultra High Definition Video Traffic

    Publication Year: 2015 , Page(s): 1

    The Ultra High Definition (UHD) video format was recently defined by Recommendation ITU-R BT.2020. Compared to the widely deployed HD video format, UHD defines video parameters associated with higher spatial resolutions, higher frame rates, higher sample bit depths, and a wider color gamut. The UHD format promises to significantly enhance user experience with pictures that offer the "look out the window" effect. However, the promise of increased picture quality comes at the cost of the increased bandwidth required to deliver UHD video. Although broadband network capacities continue to increase rapidly around the globe, the potential burden of delivering UHD video service to the home requires advanced video compression, storage, and delivery solutions. This article explores a set of suitable solutions based on the latest video compression and delivery technologies. Focusing on the on-demand video streaming use case, the proposed solution uses the Scalable extensions of the High Efficiency Video Coding standard (SHVC) to efficiently compress, store, and deliver UHD video content in a large-scale streaming system. With the scalability features offered by SHVC, the proposed solution can intrinsically maintain backward compatibility with legacy devices, thereby ensuring that the quality of existing HD video services is not degraded.

  • Designing an Interactive Audio Interface for Climate Science

    Publication Year: 2015 , Page(s): 1

    This paper presents a user-centred design approach to creating an audio interface in the context of climate science. Contextual inquiry, including think-aloud protocols, was used to gather information about scientists' workflows in a first round of interviews. Furthermore, focus groups were used to assess data about the specific use of language by climate experts. The interviews were analysed for their language content as well. The goal is to help realise a domain-specific description of the sonifications and to identify climate metaphors that help build a metaphoric sound identity for the sonification. An audio interface shall enrich the perceptualisation possibilities, based on the language metaphors derived from the interviews. Later, in a separate set of experiments, participants were asked to pair sound stimuli with climate terms extracted from the first interviews and to evaluate the sound samples aesthetically. They were asked to choose sound textures (from a set of sounds given to them) that best express a specific climate parameter and to rate the relevance of the sound to the metaphor. Correlations between climate terminology and sound stimuli for the sonification tool are assessed to improve the sound design.

  • Sonic trampoline: the effect of audio feedback on the user experience during an exercise of jumping

    Publication Year: 2015 , Page(s): 1

    In this paper we examine the influence of auditory augmentation on the act of jumping on an elastic trampoline. To do so we have developed a system that interactively augments the sound of a trampoline while a user is jumping on it. The sound design is inspired by iconic jumping sounds from games, and the synthesis engine allows for parametric control of the sound features. The sensing technology is based on a combination of motion tracking with a depth camera and audio-based contact sensing between the feet and the trampoline. The system lets the user interactively control the sound by jumping. We conducted a study to evaluate the effect of manipulating the auditory feedback during the jumping exercises. Results show that our interactive sonification positively affects the user experience during the exercise and stimulates changes in the user's behavior toward increased performance. Our system and study provide evidence that interactive sonification can act as a motivational tool in training and add an extra fun factor to body-controlled games.

  • An Instrumented Ankle-Foot Orthosis with Auditory Biofeedback for Blind and Sighted Individuals

    Publication Year: 2015 , Page(s): 1

    GaitEcho, a wearable auditory biofeedback device using an instrumented ankle-foot orthosis for gait rehabilitation, was developed. Its feasibility for rehabilitating sighted and blind individuals was investigated using a reference-tracking task for an ankle-joint exercise. Experimental results suggested that GaitEcho offers similarly adequate functionality (i.e., angle controllability, timing controllability, and task difficulty) for both blind and sighted participants in conducting ankle-joint exercises. Furthermore, blind participants reported higher understandability and enjoyment than sighted participants, suggesting a positive emotional effect of auditory biofeedback for blind users.

  • Interactive Sonification in Rowing: An Application of Acoustic Feedback for On-Water Training

    Publication Year: 2015 , Page(s): 1

    Feedback systems used in elite sport mainly provide visual information. A different approach displays the information audibly, using sonification to create coherency between action and reaction. This paper describes the experiences of using sonification as acoustic feedback (AF) in on-water rowing training with elite athletes. On the theoretical basis of an ecological dynamics approach, the audio-motor relationship was elucidated for understanding expertise and skill acquisition in sport. The results gained from athlete surveys in the years 2009-2013 are presented to determine whether AF reflects specific sections within the rowing motion comprehensibly for the athletes, and whether the information is provided appropriately enough to be useful for technique training. The final aim is to provide criteria and recommendations for the development of sonification-based applications within a moving context for both sports and rehabilitation.

  • Rhythmic Walking Interaction with Auditory Feedback: Ecological Approaches in a Tempo Following Experiment

    Publication Year: 2015 , Page(s): 1

    This study presents a system capable of rhythmic walking interactions via auditory display. The feedback is based on footstep sounds, and either follows detected footsteps or suggests a tempo, which is either constant or adapts to the walker. The auditory display contains simple sinusoidal tones or ecological, physically based synthetic walking sounds. In the tempo-following experiment, we investigate the different interaction modes (step versus constant or adaptive tempo) and auditory feedback (sinusoidal tones versus ecological walking sounds) with respect to their effect on walking tempo. Quantitatively, we calculate the mean square error (MSE) between the performed and target tempo, and the stability of the performed tempo. The results indicate that the MSE with ecological sounds is better than or comparable to that with the sinusoidal tones, yet ecological sounds are considered more natural. Allowing deviations from the cues in the adaptive conditions results in a tempo that is still stable but closer to the natural walking pace of the subjects. These results have implications for the design of interactive entertainment and rehabilitation applications.
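    As a rough illustration of the quantitative measures described above (a minimal sketch under assumed inputs, not the authors' implementation), the performed tempo, its MSE against a target tempo, and its stability can be computed from footstep timestamps:

```python
import numpy as np

def tempo_from_steps(step_times):
    # instantaneous walking tempo (steps per minute) derived from
    # consecutive footstep timestamps given in seconds
    return 60.0 / np.diff(step_times)

def tempo_mse(performed, target):
    # mean square error between the performed and the target tempo
    return float(np.mean((np.asarray(performed) - target) ** 2))

def tempo_stability(performed):
    # standard deviation of the performed tempo (lower = more stable)
    return float(np.std(performed))
```

For a walker stepping exactly every 0.5 s, `tempo_from_steps` yields a constant 120 steps per minute, giving zero MSE against a 120-spm cue and zero deviation.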

  • Data-Driven Scene Understanding with Adaptively Retrieved Exemplars

    Publication Year: 2015 , Page(s): 1

    This article investigates a data-driven approach for semantic scene understanding, without pixelwise annotation or classifier pre-training. Our framework parses a target image in two steps: (i) retrieving its exemplars (i.e., references) from an image database, where all images are unsegmented but annotated with tags; (ii) recovering its pixel labels by propagating semantics from the references. We present a novel framework making the two steps mutually conditional and bootstrapped under a probabilistic Expectation-Maximization (EM) formulation. In the first step, the references are selected by jointly matching their appearances with the target as well as the semantics (i.e., the assigned labels of the target and the references). We process the second step via a combinatorial graphical representation, in which the vertices are superpixels extracted from the target and its selected references. We then derive the potentials of assigning labels to one vertex of the target, which depend upon the graph edges that connect the vertex to its spatial neighbors in the target and to similar vertices in the references. Both steps can be solved analytically, and the inference is conducted in a self-driven fashion. In the experiments, we validate our approach on two public databases and demonstrate superior performance over state-of-the-art methods.

  • Exploring Effects of Auditory Feedback on Menu Selection in Hand-Gesture Interfaces

    Publication Year: 2015 , Page(s): 1

    With the increasing number of new motion-sensing devices, improving the usability of gesture interfaces is becoming more important. However, people do not accept gesture interfaces as comfortably as traditional ones because they lack the tactile or haptic feedback that traditional input methods such as mice or keyboards provide, and tactile feedback in gesture interfaces is not possible unless users wear a device with actuators. Therefore, auditory feedback is an appropriate and unique alternative for assisting visual feedback in gesture interfaces. In this article, we propose various types of novel auditory feedback methods and explore their effects as secondary feedback complementing visual feedback. We performed a user study for a menu selection task, and experimental results show that the proposed auditory feedback is significantly more efficient and more effective than visual-only feedback or conventional auditory feedback in terms of time and accuracy.

  • Sonification of virtual and real surface tapping: evaluation of behavior changes, surface perception and emotional indices

    Publication Year: 2015 , Page(s): 1

    The audio feedback resulting from object interaction provides information about the material of the surface and about one's own motor behavior. With the current developments in interactive sonification, it is now possible to digitally change this audio feedback, and thus the use of interactive sonification becomes a compelling approach to shaping tactile surface interactions. Here, we present a prototype for a sonic interactive surface, capable of delivering surface tapping sounds in real time when triggered by the user's taps on a real surface or on an imagined, "virtual" surface. In this system, the delivered audio feedback can be varied so that the heard tapping sounds correspond to different levels of applied tapping strength. We also propose a multi-dimensional measurement approach to evaluate user experiences of multi-modal interactive systems. We evaluated our system by looking at the effect of the altered tapping sounds on emotional action-related responses, users' way of interacting with the surface, and perceived surface hardness. Results show the influence of the sonification of tapping at all levels: emotional, behavioural and perceptual. These results have implications for the design of interactive sonification displays and tangible auditory interfaces aiming to change perceived material properties and subsequent motor behaviour.

  • Selecting Interesting Image Regions to Automatically Create Cinemagraphs

    Publication Year: 2015 , Page(s): 1

    A cinemagraph is a novel medium that infuses a static image with the dynamics of one or several particular image regions. It is in many ways between a photograph and a video, and it has numerous potential applications, such as the creation of dynamic scenes in computer games and interactive environments. However, creating cinemagraphs is a time-consuming process requiring high proficiency in photo-editing techniques. This paper presents a novel framework for automatically creating cinemagraphs from video sequences, with specific emphasis on determining the composition of masks and layers in creating aesthetically pleasing cinemagraphs. Treating video as a spatiotemporal data volume, the problem is considered a type of constrained optimization problem involving the discovery of a connected subgraph in video frames with maximal cumulative interestingness scores. The proposed framework accommodates multiple criteria describing qualities of interest in local image patches based on appearance and motion. Furthermore, the selected regions are not limited to certain shapes: the proposed approach facilitates capturing arbitrary objects. Experiments demonstrate the performance of the proposed approach. The findings of this study provide valuable information regarding various design choices for developing an easy and versatile authoring tool for cinemagraphs.

  • Saliency-guided deep framework for image quality assessment

    Publication Year: 2014 , Page(s): 1

    Image quality assessment (IQA) has thrived for decades and remains significant in the fields of image processing and computer vision. However, beyond pure engineering applications, researchers have become keener to explore how the human brain perceives visual stimuli. Extensive psychological evidence shows that human beings prefer qualitative descriptions when evaluating image quality; nevertheless, most IQA research still concentrates on numerical ones. Furthermore, hand-crafted features are widely used in this community, which constrains the models' flexibility. Therefore, a novel model is proposed with two major advantages: 1) saliency-guided feature learning is able to learn features in an unsupervised manner; 2) the deep framework recasts IQA as a classification problem, analogous to human qualitative evaluation. Experiments are conducted on popular databases to validate the effectiveness of the proposed model.

  • Media contracts formalization using a standardized contract expression language

    Publication Year: 2014 , Page(s): 1

    Contract Expression Languages allow representing business contracts in a digital, structured form. Some examples of XML-based languages are the Content Reference Forum format, the OASIS eContracts standard and a proposed extension of MPEG-21 Part 5 for contracts. These formats have influenced the design of MPEG-21 Part 20, the Contract Expression Language (CEL), and Part 21, the Media Contract Ontology (MCO), which were recently specified by modelling the most relevant clauses found in a large set of contracts in the audiovisual sector. The MPEG-21 CEL, described in this paper, defines a language for representing media contracts as XML. It is structured in two schemas: a core defining the structural elements of a contract, and an extension with vocabulary for specific applications. An exemplary mapping of a contract instance is discussed.

  • Bi-directional Mesh-based Frame Rate Up-conversion with a Dense Motion Vector Map

    Publication Year: 2014 , Page(s): 1

    In this paper, we propose a new frame rate up-conversion (FRUC) method for temporal video quality enhancement. The proposed algorithm generates an interpolated frame between two given frames based on a bi-directional mesh interpolation (BMI), in order to cope not only with translation but also with scale and rotation changes. BMI performance is highly influenced by the accuracy of the correspondences between the control points in the two frames. To achieve an accurate dense motion vector map (MVM) through bi-directional and uni-directional motion estimation, an initial MVM is formed from the motion vectors transmitted in the coded bitstream, with low computational complexity. Then, the interpolated frame is generated by frame-based BMI with the dense MVM. In our experiments, we found that the proposed algorithm is about 2 dB better than several conventional FRUC methods. Furthermore, block artifacts and blur artifacts are significantly diminished by the proposed algorithm.
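    The bi-directional idea can be caricatured in a few lines (a toy block-based sketch with an assumed dense motion vector map; the paper's method uses mesh-based interpolation, which also handles scale and rotation): each block of the middle frame averages pixels fetched halfway along its motion vector from both neighbouring frames.

```python
import numpy as np

def fruc_midframe(f0, f1, mv, block=8):
    # interpolate the temporally middle frame between f0 and f1:
    # each block fetches pixels halfway along its motion vector from
    # both frames and averages them (zero vector = plain frame blending)
    h, w = f0.shape
    out = np.zeros((h, w), dtype=np.float64)
    for by in range(0, h, block):
        for bx in range(0, w, block):
            dy, dx = mv[by // block, bx // block]
            y0 = int(np.clip(by - dy // 2, 0, h - block))
            x0 = int(np.clip(bx - dx // 2, 0, w - block))
            y1 = int(np.clip(by + dy // 2, 0, h - block))
            x1 = int(np.clip(bx + dx // 2, 0, w - block))
            out[by:by + block, bx:bx + block] = 0.5 * (
                f0[y0:y0 + block, x0:x0 + block]
                + f1[y1:y1 + block, x1:x1 + block])
    return out
```

With an all-zero motion vector map this reduces to averaging the two input frames, which is the degenerate case the mesh-based method improves upon.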

  • Near-duplicate Video Retrieval: Current Research and Future Trends

    Publication Year: 2013 , Page(s): 1
    Cited by:  Papers (1)

    The exponential growth of online videos, along with increasing user involvement in video-related activities, has been a constant phenomenon during the last decade. Users' time spent on video capturing, editing, uploading, searching and viewing has grown to an unprecedented level. The massive publishing and sharing of videos has given rise to an already large amount of near-duplicate content. This imposes urgent demands on near-duplicate video retrieval, which plays a key role in tasks such as video search, video copyright protection, video recommendation, and many more. Driven by its significance, near-duplicate video retrieval has recently attracted considerable attention. As discovered in recent works, the latest improvements and progress in near-duplicate video retrieval, as well as related topics including low-level feature extraction, signature generation and high-dimensional indexing, are employed to assist the process. As we survey the work in near-duplicate video retrieval, we comparatively investigate existing variants of the definition of a near-duplicate video, describe a generic framework, summarize state-of-the-art practices, and explore the emerging trends of this research topic.

  • GeoDec: A Framework to Effectively Visualize and Query Geospatial Data for Decision-Making

    Publication Year: 2010 , Page(s): 1

    In this paper, we discuss GeoDec, our end-to-end system that enables geospatial decision-making by virtualizing real-world geolocations. With GeoDec, the geolocation of interest is first rapidly and realistically simulated, and all relevant geospatial data are accurately fused and embedded in the virtualized model. Subsequently, users can interactively formulate abstract decision-making queries in terms of a wide range of fundamental spatiotemporal queries supported by GeoDec, and evaluate the queries in order to verify decisions in the virtual world prior to executing them in the real world. GeoDec blends a variety of techniques developed in the fields of databases, artificial intelligence, computer graphics and computer vision into an integrated three-tier architecture. We elaborate on the various components of this architecture, which includes an extensive, multimodal and dynamic data tier; an efficient, expressive and extensible query-interface tier; and an immersive, flexible and effective presentation tier.

  • Example-Based Objective Quality Estimation for Compressed Images

    Publication Year: 2009 , Page(s): 1
    Cited by:  Papers (1)

    Quantization noise is one of the dominant distortions of image compression, and its amplitude usually needs to be estimated for image quality assessment, restoration and enhancement. One such estimate, the peak signal-to-noise ratio (PSNR), has commonly been used as an objective quality measure. However, this measure has limitations in practical applications, as it requires the original image as a reference, which is not always available to end users. To overcome this limitation, blind or no-reference PSNR estimation has received much attention in the literature, as it requires not the original image but some statistics of it, such as the probability density functions (PDFs) of the original discrete cosine transform (DCT) coefficients. Assuming that the PDFs of DCT coefficients follow a Laplacian distribution, we propose a new method to estimate the key parameter of the distribution from a set of training data consisting of a variety of typical images compressed with various quantization parameters. Our experimental results show that the proposed method can estimate the PSNR of a given image more accurately, with smaller estimation bias and variance, than existing methods.
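    A hedged sketch of the underlying idea (the paper's estimator learns the Laplacian parameter from training data; here it is fitted directly by maximum likelihood for illustration): fit the Laplacian scale to the DCT coefficients, integrate the quantization error against that density to obtain a no-reference MSE estimate, and convert it to PSNR.

```python
import numpy as np

def fit_laplacian_scale(dct_coeffs):
    # ML estimate of the Laplacian scale b for zero-mean DCT coefficients
    return float(np.mean(np.abs(dct_coeffs)))

def quantization_mse(b, q, grid=200001, span=50.0):
    # numerically integrate the squared error of a uniform quantizer with
    # step q under a Laplacian(0, b) density (simple Riemann sum)
    x = np.linspace(-span * b, span * b, grid)
    dx = x[1] - x[0]
    pdf = np.exp(-np.abs(x) / b) / (2.0 * b)
    err = x - np.round(x / q) * q
    return float(np.sum(err ** 2 * pdf) * dx)

def estimated_psnr(b, q, peak=255.0):
    # blind PSNR estimate: no reference image needed, only b and q
    return float(10.0 * np.log10(peak ** 2 / quantization_mse(b, q)))
```

When the coefficient spread is much larger than the step (b >> q), the estimate approaches the classical uniform-quantizer value q^2/12, as expected.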

  • Multimedia Presentation for Computer Games and Web 3.0

    Publication Year: 2009 , Page(s): 1
    Cited by:  Patents (1)

    The HTML standard is an old, text-based format designed for narrowband networks. New enhancements to HTML, as well as new formats, are being evaluated. We present a multimedia presentation system, format, and use cases for a wide range of contexts, including Web 3.0 and computer games. The system is compact and enables high-performance 2D and 3D graphics. Use cases include in-game heads-up displays, remote playing of games, and 3D authoring for Web 3.0.

  • An MPEG-7 Compatible Video Retrieval System with Integrated Support for Complex Multimodal Queries

    Publication Year: 2012 , Page(s): 1
    Cited by:  Papers (1)

    We present BilVideo-7, an MPEG-7-compatible video indexing and retrieval system that supports complex multimodal queries in a unified framework. An MPEG-7 profile is developed to represent videos by decomposing them into Shots, Keyframes, Still Regions and Moving Regions. The MPEG-7-compatible XML representations of videos according to this profile are obtained by the MPEG-7-compatible video feature extraction and annotation tool of BilVideo-7, and stored in a native XML database. Users can formulate text-based semantic, color, texture, shape, location, motion and spatio-temporal queries on an intuitive, easy-to-use Visual Query Interface, whose Composite Query Interface can be used to specify very complex queries containing any type and number of video segments with their descriptors. The multi-threaded Query Processing Server parses incoming queries into subqueries and executes each subquery in a separate thread. It then fuses the subquery results in a bottom-up manner to obtain the final query result. The whole system is unique in that it provides very powerful querying capabilities with a wide range of descriptors and multimodal query processing in an MPEG-7-compatible, interoperable environment. We present sample queries to demonstrate the capabilities of the system.

  • Diversifying Image Retrieval by Affinity Propagation Clustering on Visual Manifolds

    Publication Year: 2009 , Page(s): 1

    Many image retrieval users are concerned about the diversity of the retrieval results, as well as their relevance. In this paper, we develop a post-processing system, based on affinity propagation clustering on manifolds, to improve the diversity of the retrieval results without reducing their relevance. In order to obtain top-20 outputs (usually only the top 20 retrieval results are shown to users) containing diverse items representing different sub-topics, a modified affinity propagation clustering on manifolds, whose parameters are optimized by minimizing the Davies-Bouldin criterion, is proposed and performed on the top hundreds of output images of the preceding support vector machine (SVM) system. Finally, after obtaining the clusters, to diversify the top retrieval results we put the image with the lowest rank in each cluster at the top of the answer list. We test our proposed system on the ImageCLEF PHOTO 2008 task. The experimental results show that our method performs better at enhancing the diversity of the image retrieval results than other diversifying methods such as K-means, Quality Threshold (QT) and date clustering (ClusterDMY). Furthermore, our method does not lead to any loss of relevance in the retrieval results.
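    The final diversification step can be sketched independently of the clustering algorithm (below, a simple leader-clustering stand-in plays the role of affinity propagation, and one representative per cluster — assumed here to be its first-encountered, best-ranked image — is promoted to the head of the list):

```python
def greedy_clusters(features, threshold, dist):
    # leader clustering: assign each item to the first existing cluster
    # centre within `threshold`, else start a new cluster
    centers, labels = [], []
    for f in features:
        for ci, c in enumerate(centers):
            if dist(f, c) <= threshold:
                labels.append(ci)
                break
        else:
            centers.append(f)
            labels.append(len(centers) - 1)
    return labels

def diversify_top(ranked_ids, labels):
    # promote the best-ranked item of each cluster to the head of the
    # list, then append the remaining items in original rank order
    seen, head, tail = set(), [], []
    for rid, lab in zip(ranked_ids, labels):
        if lab not in seen:
            seen.add(lab)
            head.append(rid)
        else:
            tail.append(rid)
    return head + tail
```

The re-ranking never drops an item, which is one way a diversifier can avoid losing relevance in the full result list.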

  • Managing and querying efficiently distributed semantic multimedia metadata collections

    Publication Year: 2009 , Page(s): 1
    Cited by:  Papers (2)  |  Patents (1)

    Currently, much multimedia content is acquired and stored in real time and at different locations. In order to efficiently retrieve the desired information and to avoid centralizing all metadata, we propose to compute a centralized metadata resume, i.e., a concise version of the whole metadata, which locates desired multimedia content on remote servers. The originality of this resume is that it is automatically constructed from the extracted metadata. In this paper, we present a method to construct such a resume and illustrate our framework with current Semantic Web technologies, such as RDF and SPARQL, for representing and querying semantic metadata. Some experimental results are provided to show the benefits of indexing and retrieving multimedia content without centralizing the content or its associated metadata, and to prove the efficiency of a metadata resume.

  • Web-based Music Lecture Database Framework with Aligned MIDI Score and Real Performance Audio

    Publication Year: 2009 , Page(s): 1

    This paper presents a framework for authoring, storing, retrieving, and presenting music lectures on the Web. For a synchronized presentation of the score and the recorded performance audio, we propose a dynamic programming-based algorithm for MIDI-to-Wave alignment that explores the temporal relations between MIDI and the corresponding performance recording. With rapid advances in music transcription technology, it has become more feasible to align MIDI and wave in the symbolic domain. However, transcription errors usually occur when transcribing polyphonic or multi-instrument music, because of the complex harmonics of different instruments. The proposed alignment algorithm works in the symbolic domain even when many transcription errors have occurred. The aligned MIDI and wave can be attached to many kinds of teaching materials. With a synchronized presentation, learners can read music scores and get instructional information while listening to certain sections of music pieces. We built an evaluation system for conducting a subjective evaluation. The percentage of bars regarded as aligned perfectly or within acceptable limits was 97.08%. The questionnaire in the evaluation system also elicited positive opinions from both engineers and musicians.
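    The dynamic-programming alignment can be sketched as classic DTW over two symbolic feature sequences (an illustrative toy with scalar features, not the paper's transcription-error-tolerant algorithm):

```python
import numpy as np

def dtw_align(seq_a, seq_b):
    # cost[i, j]: minimal accumulated distance aligning the first i
    # elements of seq_a with the first j elements of seq_b
    n, m = len(seq_a), len(seq_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(seq_a[i - 1] - seq_b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],
                                 cost[i, j - 1],
                                 cost[i - 1, j - 1])
    # backtrack to recover the alignment path as (index_a, index_b) pairs
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1],
                              cost[i - 1, j],
                              cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return float(cost[n, m]), path[::-1]
```

The recovered path maps MIDI-time indices to audio-time indices, which is the temporal relation a synchronized score/audio presentation needs.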

  • A Novel 2D Urban Map Search Framework Based on Attributed Graph Matching

    Publication Year: 2009 , Page(s): 1

    This paper presents a novel framework for urban map search. The search capabilities of existing GIS systems are restricted to text-based search, neglecting significant topological and semantic information. We propose a framework that extends these capabilities with sketch-based search. First, the urban maps are processed in an offline step to extract their topological information in the form of an attributed graph. In the online step, the user queries the system by sketching the desired network structure. The search algorithm matches the attributed query graph against the attributed graphs of the urban maps and supports both partial and global matching of the query. Experimental results illustrate the excellent performance of the system for intuitive map search.
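    Attributed graph matching of this kind can be sketched as a backtracking search for injective mappings of the query graph into a map graph, requiring node attributes to agree and every query edge to be preserved. The graphs, attributes, and the exhaustive strategy here are illustrative assumptions (the paper's matcher also supports partial matches):

    ```python
    # A sketch of attributed (sub)graph matching by backtracking.
    # Nodes are given as {id: attribute}; edges as a set of undirected pairs.
    def match(query_nodes, query_edges, map_nodes, map_edges):
        q_ids = list(query_nodes)
        results = []

        def has_edge(a, b):
            return (a, b) in map_edges or (b, a) in map_edges

        def extend(assign):
            if len(assign) == len(q_ids):
                results.append(dict(assign))
                return
            q = q_ids[len(assign)]
            for m, attr in map_nodes.items():
                if m in assign.values() or attr != query_nodes[q]:
                    continue  # already used, or attribute mismatch
                # every query edge touching q and an assigned node must
                # correspond to an edge in the map graph
                ok = all(has_edge(assign[p], m)
                         for (p, r) in query_edges if r == q and p in assign)
                ok = ok and all(has_edge(m, assign[r])
                                for (p, r) in query_edges if p == q and r in assign)
                if ok:
                    assign[q] = m
                    extend(assign)
                    del assign[q]

        extend({})
        return results

    map_nodes = {1: "junction", 2: "junction", 3: "roundabout"}
    map_edges = {(1, 2), (2, 3)}
    print(match({"a": "junction", "b": "roundabout"}, {("a", "b")},
                map_nodes, map_edges))  # [{'a': 2, 'b': 3}]
    ```

    Real urban networks are far too large for naive backtracking, so a practical matcher would prune with richer attributes (road class, junction degree) and index structures; the sketch only shows the compatibility conditions being enforced.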

  • Semantic MPEG Query Format Validation and Processing

    Publication Year: 2009 , Page(s): 1

    The MPEG Query Format (MPQF) has been developed by members of MPEG (ISO/IEC JTC1/SC29/WG11) to provide a standardized interface to multimedia document repositories. The MPQF schema specifies a format for the queries and replies interchanged between clients and servers in a multimedia search-and-retrieval environment. This paper presents a validation and processing architecture for the MPEG Query Format, whose framework consists of two parts. The first is a syntactic and semantic validator, which checks syntactic and semantic rules against the underlying XML schema in order to produce a semantically valid request; in particular, the paper introduces methods for evaluating MPQF semantic validation rules that cannot be expressed by syntactic means within the XML schema. The second is a first prototype implementation of an MPQF-capable processing engine that evaluates the QueryByFreeText, QueryByXQuery, QueryByDescription, and QueryByMedia query types on a set of MPEG-7-based image annotations.
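    The distinction between schema validation and semantic validation can be sketched with a rule that relates two parts of an already schema-valid document, something an XML schema alone cannot express. The element names, attributes, and the rule itself are illustrative simplifications, not the standard's definitions:

    ```python
    # A sketch of a semantic validation pass over a schema-valid query:
    # cross-attribute rules that plain XML Schema cannot state.
    import xml.etree.ElementTree as ET

    QUERY = """
    <Query>
      <OutputDescription maxResults="10" maxPageEntries="25"/>
      <QueryByMedia><MediaResource uri="http://example.org/img.jpg"/></QueryByMedia>
    </Query>
    """

    def semantic_errors(doc):
        errors = []
        out = doc.find("OutputDescription")
        if out is not None:
            max_results = int(out.get("maxResults", "0"))
            max_page = int(out.get("maxPageEntries", "0"))
            # illustrative rule: a result page cannot exceed the result set
            if max_page > max_results:
                errors.append("maxPageEntries exceeds maxResults")
        return errors

    print(semantic_errors(ET.fromstring(QUERY)))
    # ['maxPageEntries exceeds maxResults']
    ```

    A validator in the architecture described above would run many such rules after schema validation and reject the request (or report the violations) before it ever reaches the processing engine.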

  • A Media Value Chain Ontology for MPEG-21

    Publication Year: 2009 , Page(s): 1

    This paper describes the Media Value Chain Ontology we have specified, a semantic representation of intellectual property along the value chain that is in the process of being standardized as MPEG-21 Part 19. The model defines a minimal set of kinds of intellectual property, the roles of the users interacting with them, and the actions relevant to intellectual property law. In addition, it lays a basis for authorizations along the chain, and the model is ready to manage class instances representing real objects and users. The computer representation has been made publicly available so that applications can interoperate around this common, shared basis.
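    The combination of IP kinds, user roles, and permitted actions can be sketched as a role-based authorization check along the chain. The role, action, and IP-kind names below are illustrative assumptions, not the ontology's normative terms:

    ```python
    # A sketch of value-chain authorization: whether an action on a kind of
    # intellectual-property object is allowed depends on the user's role.
    PERMISSIONS = {
        ("Creator",     "Work"): {"CreateWork", "Adapt"},
        ("Adaptor",     "Work"): {"Adapt"},
        ("Distributor", "Copy"): {"Distribute"},
        ("EndUser",     "Copy"): {"Consume"},
    }

    def authorized(role, action, ip_kind):
        return action in PERMISSIONS.get((role, ip_kind), set())

    print(authorized("Distributor", "Distribute", "Copy"))  # True
    print(authorized("EndUser", "Distribute", "Copy"))      # False
    ```

    In the ontology itself such facts would be expressed as class instances and axioms (e.g. in OWL), so that a reasoner, rather than a lookup table, derives which actions a given user may perform.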


Aims & Scope

The magazine contains technical information covering a broad range of issues in multimedia systems and applications.

Full Aims & Scope

Meet Our Editors

Editor-in-Chief
John R. Smith
IBM T.J. Watson Research Center