Multimedia and Expo, 2000. ICME 2000. 2000 IEEE International Conference on

Date: July 30 - Aug. 2, 2000

Displaying Results 1 - 25 of 132
  • 2000 IEEE International Conference on Multimedia and Expo - ICME 2000 [front matter]

    Page(s): i - xxxv
    Freely Available from IEEE
  • Joint downlink beamforming, power control, and data rate allocation for DS-CDMA mobile radio with multimedia services

    Page(s): 1455 - 1458 vol.3

    Power control and rate allocation are two fundamental issues for improving the spectrum efficiency of downlink transmission for wireless multimedia communications. Downlink beamforming, on the other hand, can be used to suppress the stronger interference induced by high rate users, thereby improving the system performance. We present a joint downlink beamforming, power control and rate allocation technique suitable for DS-CDMA systems with multimedia services. To reduce the computational complexity, a practical rate allocation algorithm is also proposed. Computer simulation results are given to evaluate the downlink capacity of DS-CDMA systems using base station antennas and the proposed algorithm.

  • Author index

    Page(s): 0_5 - 0_10
    Freely Available from IEEE
  • VeriNet Web-speaker verification for the World Wide Web

    Page(s): 1497 - 1500 vol.3

    VeriNet Web is a software system that allows Web site resources to be protected using speaker verification. Speaker verification is the most practical biometric technology available for Internet security since the required collection hardware is readily available on most personal computers (PCs) being distributed today. VeriNet Web employs a three-tier client-server architecture that distributes verification processing across multiple components. It integrates into a Web application by supplying standard extensions to a commercial Web server and browser. These components work together to manage the authentication process, maintain a user's authentication status, and ensure that protected resources are only accessed by authenticated users. When developing this multimedia Internet application, there were numerous implementation issues regarding the user interface, secure data transmissions, and user PC configuration that affected the success of the system. The paper discusses these issues. In addition, the paper examines the verification performance of VeriNet Web for a trial conducted with company employees. For experiments in this environment, the system yielded very good performance, and the equal error rate was always less than 5%.
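
    The equal error rate quoted above is the operating point at which the false acceptance rate and the false rejection rate coincide. The following is a minimal sketch of how such a figure can be estimated from verification scores. The function name and the sample scores are hypothetical, and nothing here is taken from VeriNet Web itself.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the equal error rate (EER) from two lists of scores.

    genuine:  scores from true-speaker trials (higher means more likely accepted)
    impostor: scores from impostor trials
    A trial is accepted when its score is >= the threshold.
    """
    best_gap, best_eer = None, None
    # Try every observed score as a candidate threshold.
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)      # false rejections
        far = sum(s >= t for s in impostor) / len(impostor)   # false acceptances
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Hypothetical scores, for illustration only.
eer = equal_error_rate([0.9, 0.8, 0.7, 0.6], [0.1, 0.2, 0.3, 0.65])
```

    With real trial data the two error rates rarely cross exactly at an observed score, so the sketch reports the midpoint at the threshold where they are closest.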

  • Flexible multimedia system for multimedia communication services

    Page(s): 1359 - 1362 vol.3

    A distributed multimedia system that integrates multimedia information distributed over computer networks, and delivers it according to user requirements during real-time multimedia communication, must guarantee the user-requested quality of service (QoS) even when computer and network resources change statically or dynamically. We have proposed a flexible multimedia system (FMS) which is based on an agent-oriented architecture and is able to organize the required functions itself. This paper describes the FMS architecture for a multimedia teleconference service with adaptive QoS guarantee functions. In a multimedia teleconference service, FMS is able to realize multimedia communication services flexibly in response to service requests and QoS requests from users.

  • Issues in data embedding and synchronization for digital television

    Page(s): 1233 - 1236 vol.3

    As digital television matures, applications which utilize data embedded in the broadcast television signal are likely to be an important feature of the new medium. The paper explores some of the issues involved in the implementation of systems for the utilization of data which accompanies digital television's video and audio content. Chief among these issues are those which pertain to: (i) how data can be embedded into the television signal in ways which utilize bandwidth and buffer space efficiently; and (ii) how the embedded data can be synchronized with its associated video and audio content.

  • Common time reference for interactive multimedia applications

    Page(s): 1679 - 1682 vol.3

    A delay of about 100 ms gives human communicators the feeling of live interaction. Since in a global network the propagation delay alone is about 100 ms, every other delay component, such as processing and queuing, should be kept as short as possible. Moreover, the deployment of new high bandwidth multimedia applications will boost network traffic and consequently the demand for very high capacity transmission technologies, such as Wavelength Division Multiplexing (WDM). Networks will suffer from: (i) electronic switching bottlenecks among high-speed links; and (ii) communications link bottlenecks between high capacity core technologies and low speed access technologies. The paper addresses the design of interactive systems for applications such as toll quality telephony, videotelephony and videoconferencing, highlighting the benefits brought by the availability of a global common time reference derived from GPS (Global Positioning System). A common time reference is essential to keep the user-perceived delay within the 100 ms bound while avoiding the two above-mentioned bottlenecks. The proposed solution can be applied to both IP and ATM networks, does not require changes to any of the existing protocols, and enables traffic aggregation in the core of the network. It thus does not require nodes to keep state information on microflows, and it provides a guaranteed quality of service to individual applications.

  • Modeling of the coding gain of joint coding for multi-program video transmission

    Page(s): 1309 - 1312 vol.3

    In digital multi-program video transmission, several video programs are compressed (e.g. using MPEG-2), multiplexed, and transmitted over a constant bit rate (CBR) channel. Joint coding (or statistical multiplexing) dynamically distributes the channel capacity among programs according to picture content. This scheme is much more efficient than independent coding, where each channel is allocated a fixed bit rate. The authors present a novel model to calculate the coding gain of joint coding relative to independent coding, in terms of bandwidth savings. The model takes into account statistical variations of video program complexity, and subjective picture quality.
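
    The contrast between joint and independent coding can be sketched in a few lines. The example below splits a fixed channel rate in proportion to a per-program complexity value, which is one simple way to "distribute capacity according to picture content". The function names, the proportional rule, and the numbers are assumptions made for illustration; this is not the authors' coding-gain model.

```python
def allocate_joint(channel_kbps, complexities):
    """Joint coding sketch: share a CBR channel in proportion to complexity."""
    total = sum(complexities)
    if total <= 0:
        raise ValueError("complexities must sum to a positive value")
    return [channel_kbps * c / total for c in complexities]


def allocate_independent(channel_kbps, n_programs):
    """Independent coding: every program gets the same fixed share."""
    return [channel_kbps / n_programs] * n_programs


# Hypothetical complexities for three programs at one instant.
joint_rates = allocate_joint(12000, [1.0, 2.0, 3.0])
fixed_rates = allocate_independent(12000, 3)
```

    In this toy setting the complex program receives more than its fixed share and the simple one less, while the total stays at the channel rate.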

  • Visualization methods for personal photo collections: browsing and searching in the PhotoFinder

    Page(s): 1539 - 1542 vol.3

    Software tools for personal photo collection management are proliferating, but they usually have limited searching and browsing functions. We implemented the PhotoFinder prototype to enable non-technical users of personal photo collections to search and browse easily. PhotoFinder provides a set of visual Boolean query interfaces, coupled with dynamic query and query preview features, which give users powerful search capabilities. Using a scatter plot thumbnail display and a drag-and-drop interface, PhotoFinder is designed to be easy to use for searching and browsing photos.

  • Browsing images based on social and content similarity

    Page(s): 1567 - 1570 vol.3

    We propose an information visualization technique that uses social and content features in a complementary way and helps users explore a database of images. Items in the database are laid out in a 2D map according to content similarity. The social filter dynamically shows the relationship between items in the map based on social similarity. We demonstrate an example of the visualization applied to our Web graphics recommender system.

  • Tracking of multiple faces for human-computer interfaces and virtual environments

    Page(s): 1563 - 1566 vol.3

    The paper describes a real-time face-tracking algorithm. We start with single face tracking based on statistical color modeling and a deformable template. We then expand the algorithm to track multiple faces, possibly with occlusion, by constraining the speed and size changes of the faces. We test the algorithm on sequences with different occlusion patterns and analyze the tracking performance. We also present a tracking software library based on this algorithm. This library can be applied to human-computer interfaces, lip-reading and virtual environments.

  • Live multimedia adaptation through wireless hybrid networks

    Page(s): 1697 - 1700 vol.3

    We present new techniques to intelligently adapt and combine multimedia presentations in real-time, mobile service environments. These include the techniques necessary to perform mobile multimedia processing for multiple video streams with “value-adding” information. The adaptation uses video analysis, content recognition and automated annotation to provide scalable and interactive presentations over hybrid networks to mobile terminals. As a concrete case, we present a mobile surveillance service called Princess, which has several video sources and supports control information, terminal adaptation and end-to-end service.

  • Lossless compression for μ-law (A-law) and IMA ADPCM on the basis of a fast RLS algorithm

    Page(s): 1775 - 1778 vol.3

    Lossless compression methods are introduced for the μ-law (A-law) and IMA ADPCM standards. Lossless compression here means that, for a given original input, we generate exactly the same output as these standards while reducing the bit rates of the compressed files. To keep the same quality, we use the same quantization methods as those in the standards. To reduce the bit rates, we use prediction and entropy coding techniques. The prediction is based on a fast recursive least squares (RLS) algorithm which requires less computation than existing RLS algorithms. The entropy coding is based on Huffman coding. The prediction, quantization and coding steps are integrated into an adaptive scheme, so that the same quality is kept while the number of bits per sample is reduced. Compared with 8 bits per sample for μ-law or A-law, only 3.24 bits per sample are needed for coding an audio signal with a 44100 Hz sampling frequency, and 4.72 bits for speech/audio signals at 11025 Hz. Some improvements for the IMA ADPCM standard are also obtained.
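
    The reported figures can be turned into bit rates and percentage savings with simple arithmetic. The sketch below assumes a single audio channel; the helper names are hypothetical and only the 8, 3.24 and 4.72 bits-per-sample values and the two sampling rates come from the abstract.

```python
def bitrate_kbps(bits_per_sample, sample_rate_hz):
    """Bit rate of one audio channel in kbit/s."""
    return bits_per_sample * sample_rate_hz / 1000.0


def saving_percent(original_bits, compressed_bits):
    """Relative reduction in bits per sample, as a percentage."""
    return 100.0 * (1.0 - compressed_bits / original_bits)


# Figures from the abstract: 8-bit mu-law/A-law versus the lossless scheme.
audio_saving = saving_percent(8, 3.24)     # 44100 Hz audio, about 59.5 %
speech_saving = saving_percent(8, 4.72)    # 11025 Hz speech/audio, about 41 %
audio_before = bitrate_kbps(8, 44100)      # about 352.8 kbit/s
audio_after = bitrate_kbps(3.24, 44100)    # about 142.9 kbit/s
```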

  • Camera tracking for augmented reality media

    Page(s): 1637 - 1640 vol.3

    The paper presents a camera tracking system for the spatial stabilization of augmented reality (AR) media. Our approach integrates both artificial landmarks (fiducials) and natural features for camera tracking. Artificial landmarks are used for system initialization and computation of the initial camera pose. Robust and extendible tracking is achieved by dynamically calibrating the 3D positions of a priori uncalibrated natural features. Analysis and experimental results demonstrate the effectiveness of this approach for presenting stabilized AR media in long camera-motion video sequences.

  • Evaluation of adaptive filtering of MPEG system streams in IP networks

    Page(s): 1313 - 1317 vol.3

    Congestion and large differences in available link bandwidth create challenges for the design of applications that aim to deliver high quality video over the Internet. We investigate the placement of computation agents within the network to provide filtering that adapts MPEG system streams to the available bandwidth without transcoding. The paper presents an evaluation of our implementation in three different operating environments: a networking testbed in a laboratory environment, a home-user scenario (DSL line at 640 kbit/s), and a wide area network spanning the Atlantic (server in Europe, client in the US). Our architecture leverages previous work and introduces efficient MPEG system filtering to achieve high quality video over best-effort networks.
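
    One familiar way to adapt a stream to a bandwidth budget without transcoding is selective frame dropping, and the sketch below illustrates that idea only. It is an assumption that the filter works per group of pictures and drops B frames before P frames; the abstract does not state which filtering rule the authors use, and the function name and frame sizes are hypothetical.

```python
def filter_frames(frames, budget_bits):
    """Drop frames from one group of pictures until it fits the budget.

    frames: list of (frame_type, size_bits) tuples in stream order, where
            frame_type is "I", "P" or "B".
    B frames are dropped first, then P frames; I frames are never dropped.
    Frames are removed from the end of the group first, so a P frame is
    only removed after every later P frame that depends on it.
    """
    kept = list(frames)
    for frame_type in ("B", "P"):
        for i in range(len(kept) - 1, -1, -1):
            if sum(size for _, size in kept) <= budget_bits:
                return kept
            if kept[i][0] == frame_type:
                del kept[i]
    return kept


# Hypothetical group of pictures, sizes in bits.
gop = [("I", 100), ("B", 20), ("B", 20), ("P", 50),
       ("B", 20), ("B", 20), ("P", 50)]
fitted = filter_frames(gop, 230)
```

    If even the I frame alone exceeds the budget, the sketch returns it anyway, since dropping it would leave nothing decodable.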

  • A feature point based scheme for unsupervised video object segmentation in stereoscopic video sequences

    Page(s): 1543 - 1546 vol.3

    The video coding standard MPEG-4 enables content-based functionalities through the introduction of video object planes (VOPs), which represent semantically meaningful objects. A novel, fast, unsupervised semantic segmentation scheme is presented for stereoscopic sequences, which utilizes the provided depth information. Each stereo pair is first analyzed, and the disparity field and occluded areas are estimated. Then a multiresolution implementation of the RSST segmentation algorithm is applied to the depth map to extract the depth segments. For each depth segment except the last, feature points are generated on its contour and a motion geometric space (MGS) is defined for every initial point. Afterwards, one point per MGS is selected which satisfies predefined intensity and curvature constraints, so that the object boundaries are accurately extracted. Experimental results are presented to indicate the good performance of the proposed scheme on real-life stereoscopic video sequences.

  • Skew detection and compensation for Internet audio applications

    Page(s): 1687 - 1690 vol.3

    In long-lived audio streams, such as music broadcasts, small differences in clock rates lead to buffer underflow or overflow events in receiving applications that manifest themselves as audible interruptions. We present a low complexity algorithm for detecting clock skew in network audio applications that function with local clocks and in the absence of a synchronization mechanism. A companion algorithm to perform skew compensation is also presented. The compensation algorithm utilises the temporal redundancy inherent in audio streams to make inaudible playout adjustments. Both algorithms have been implemented in a simulator and in a network audio application. They perform effectively over the range of observed clock rate differences and beyond.
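
    To make the notion of clock skew concrete, the sketch below estimates it as the least-squares slope of the offset between receiver arrival times and sender timestamps. This is a generic textbook estimate chosen for illustration, not the low complexity algorithm of the paper; it ignores network jitter, and the function name and sample values are hypothetical.

```python
def estimate_skew_ppm(sender_ts, receiver_ts):
    """Estimate relative clock skew in parts per million (ppm).

    sender_ts:   packet timestamps from the sender clock, in seconds
    receiver_ts: arrival times of the same packets on the receiver clock
    The skew is the slope of (receiver - sender) against sender time.
    """
    n = len(sender_ts)
    if n < 2 or n != len(receiver_ts):
        raise ValueError("need two equally long lists with at least 2 entries")
    offsets = [r - s for s, r in zip(sender_ts, receiver_ts)]
    mean_x = sum(sender_ts) / n
    mean_y = sum(offsets) / n
    sxx = sum((x - mean_x) ** 2 for x in sender_ts)
    if sxx == 0:
        raise ValueError("sender timestamps must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sender_ts, offsets))
    return 1e6 * sxy / sxx


# Hypothetical trace: constant 0.5 s delay, receiver clock 100 ppm fast.
sender = [float(t) for t in range(10)]
receiver = [0.5 + t * (1 + 100e-6) for t in sender]
skew = estimate_skew_ppm(sender, receiver)
```

    A skew of 100 ppm corresponds to about 0.36 s of drift per hour of audio, which is why long-lived streams eventually under- or overflow a fixed playout buffer.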

  • Dissolve transition detection using B-splines interpolation

    Page(s): 1349 - 1352 vol.3

    We present a novel identification technique for dissolve transitions in video sequences. Our scheme estimates the actual transition curve by polynomial data interpolation, using a B-spline polynomial curve fitting technique. We use the “goodness” of fit to determine the presence of dissolve transitions. Our approach is able to recover the original transition behavior of a dissolve edit effect even if it was distorted by various post-processing stages.

  • Generating optimal video summaries

    Page(s): 1559 - 1562 vol.3

    The paper proposes a novel technique for video summarization based on singular value decomposition (SVD). For the input video sequence, we create a feature frame matrix A, and perform the SVD on it. From this SVD, we are able to derive not only the refined feature space to better cluster visually similar frames, but also a metric to measure the amount of visual content contained in each frame cluster using its degree of visual change. Then, in this refined feature space, we find the most static frame cluster, define it as the content unit, and use the content value computed from this cluster as well as the distance between frames as the thresholds to cluster the rest of the frames. Based on this clustering result, the optimal set of keyframes is generated as the content summary of the original video. Our approach ensures that the summarized video representation contains little redundancy, and gives equal attention to the same amount of visual content. Besides the optimal keyframe set, our system can also generate summarized motion video for an input video sequence with a user-specified time length. Examples of the summarized motion videos can be viewed at <http://www.ccrl.com/video/summary>.

  • Towards a multimodal meeting record

    Page(s): 1593 - 1596 vol.3

    Face-to-face meetings usually encompass several modalities including speech, gesture, handwriting, and person identification. Recognition and integration of each of these modalities is important to create an accurate record of a meeting. However, each of these modalities presents recognition difficulties. Speech recognition must be speaker and domain independent, have low word error rates, and be close to real time to be useful. Gesture and handwriting recognition must be writer independent and support a wide variety of writing styles. Person identification has difficulty with segmentation in a crowded room. Furthermore, in order to produce the record automatically, we have to solve the assignment problem (who is saying what), which involves people identification and speech recognition. This paper examines a multimodal meeting room system under development at Carnegie Mellon University that enables us to track, capture and integrate the important aspects of a meeting, from people identification to meeting transcription. Once a multimedia meeting record is created, it can be archived for later retrieval.

  • Video segmentation with the assistance of audio content analysis

    Page(s): 1507 - 1510 vol.3

    Video structure extraction is essential to automatic content-based organization, retrieval and browsing of video. However, while many robust shot segmentation algorithms have been developed, it is still difficult to extract scene structures or group shots into scenes. We present a novel audio-assisted video segmentation scheme, in which audio and color information is integrated in video scene extraction. A novel audio segmentation scheme is developed to segment audio tracks into speech, music, environmental sound and silence segments. A robust algorithm for shot grouping based on correlation analysis is also developed to further enhance the scene extraction accuracy.

  • An adaptable network architecture for multimedia traffic management and control

    Page(s): 1615 - 1618 vol.3

    We have designed an adaptable network architecture, called ADNET, which provides mechanisms that allow an application to adapt to resource constraints to achieve improved QoS. In our experiments we compare three schemes for video transmission: IP fragmentation, and ACTP fragmentation with and without active programs. We find that QoS is improved in the ACTP scheme with active programs. Our design aims to unify different QoS control mechanisms to provide a wide range of network services to all users and meet their specific needs.

  • Performance analysis of AB-tree

    Page(s): 1701 - 1704 vol.3

    We present a performance analysis of the AB-tree, an efficient indexing scheme for high dimensional databases. The AB-tree is based on a few interesting data distribution properties, such as the angle and the distance, which were observed in experiments on large databases with high dimensional data. We show a significant performance gain of the AB-tree (about 57 times) over the SS-tree. We also show that the AB-tree performs better than the VA-file, the best known sequential access method.

  • Hardware/software co-design for real-time physical modeling

    Page(s): 1363 - 1366 vol.3

    Physical modeling of a mass-spring system allows for realistic object motion and deformation in a virtual environment. Previous work in this type of physical modeling relies on general-purpose hardware, and cannot offer the performance necessary for real-time human-machine interaction. In this paper, we consider the co-design of software and hardware in order to achieve real-time physical modeling.

  • Mixing realities in Shared Space: an augmented reality interface for collaborative computing

    Page(s): 1641 - 1644 vol.3

    In the Shared Space project, we explore, innovate, design and evaluate future computing environments that will radically enhance interaction between humans and computers as well as interaction between humans mediated by computers. In particular, we investigate how augmented reality enhanced by physical and spatial 3D user interfaces can be used to develop effective face-to-face collaborative computing environments. How will we interact in such collaborative spaces? How will we interact with each other? What new applications can be developed using this technology? These are the questions that we are trying to answer in research on Shared Space. The paper provides a short overview of Shared Space, its directions, technologies and applications.
