
Audio, Language and Image Processing, 2008. ICALIP 2008. International Conference on

Date 7-9 July 2008

  • Apply hybrid method of relevance feedback and EMD algorithm in a color feature extraction CBIR system

    Page(s): 163 - 166

    In the past decade, the rapid spread of multimedia and network technology has driven the development of multimedia information retrieval techniques, especially content-based image retrieval (CBIR). In this paper, we introduce a CBIR system that applies a hybrid method of relevance feedback and the Earth Mover's Distance (EMD) algorithm. Given a query image, the system extracts color features, converting the RGB color space to the HSV color space. To reduce the drawbacks of bin-by-bin distances, the EMD algorithm is used. Moreover, the gap between the features the computer extracts and the user's semantic concepts causes unsatisfying retrieval results. This semantic gap may be narrowed by applying relevance feedback, a powerful query modification technique in CBIR. Experimental results demonstrate an improvement over traditional methods. Finally, a visual demo shows the retrieval results, which are evaluated by the MPEG-7 standard.
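
    As a rough illustration of why a cross-bin distance such as EMD can behave better than a bin-by-bin distance on color histograms, the sketch below builds hue histograms after an RGB-to-HSV conversion and compares two distances. It is not the system described in the abstract: the function names, the 8-bin quantization, and the treatment of hue bins as linearly ordered (ignoring that hue is circular) are all assumptions made for this example.

```python
import colorsys

def hue_histogram(pixels, bins=8):
    """Normalized hue histogram of (r, g, b) tuples with components in 0..255."""
    hist = [0.0] * bins
    count = 0
    for r, g, b in pixels:
        h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hist[min(int(h * bins), bins - 1)] += 1.0
        count += 1
    return [c / count for c in hist]

def l1_distance(p, q):
    """Bin-by-bin distance: only compares bins with the same index."""
    return sum(abs(a - b) for a, b in zip(p, q))

def emd_1d(p, q):
    """EMD between two normalized 1-D histograms.

    Assumes a ground distance of 1 between adjacent bins; in that special
    case the EMD equals the summed absolute difference of the running totals.
    """
    total, carry = 0.0, 0.0
    for a, b in zip(p, q):
        carry += a - b          # mass that still has to move to the right
        total += abs(carry)
    return total

red = hue_histogram([(255, 0, 0)] * 4)
yellow = hue_histogram([(255, 255, 0)] * 4)
blue = hue_histogram([(0, 0, 255)] * 4)
print(l1_distance(red, yellow), l1_distance(red, blue))
print(emd_1d(red, yellow), emd_1d(red, blue))
```

    With these inputs the bin-by-bin distance rates yellow and blue as equally far from red, while the EMD rates yellow as closer, because it accounts for how far histogram mass has to move between bins.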

  • A novel hybrid image inpainting model

    Page(s): 138 - 142

    In this paper, a hybrid image inpainting model is proposed. The model uses the TV-L1 equation to decompose the image into a structure part and a texture part, with more dynamic information contained in the texture part. The equivalence between the inpainting partial differential equation (PDE) and the inviscid Helmholtz vorticity equation is proved, and a bi-directional diffusion PDE is then used to inpaint the structure part. The texture part is restored by an exemplar-based inpainting model constrained by a cross-isophote diffused data term. The hybrid model can inpaint images with texture information and preserve linear structure simultaneously. Both theoretical analysis and experiments verify the validity of the proposed model.

  • A new algorithm for constrained global optimization based on filled function

    Page(s): 189 - 193

    In this paper, we propose a filled function method for constrained global optimization. This filled function contains only one parameter, which is easy to choose. We then investigate the properties of this function and design a new algorithm based on it. Finally, we run a numerical test. The numerical results show the efficiency of this global optimization method.

  • Design and implementation of pico-second pulse generator with multiple parameters adjustment

    Page(s): 185 - 188

    By applying frequency division, phase alignment, pulse-width adjustment, edge condensing, amplitude amplification and frequency-spectrum shielding to a high-frequency clock, a UWB narrow pulse with picosecond edges and jitter is generated. Because programmable high-bandwidth digital elements are adopted, the parameters of the output signal can be adjusted online with high resolution, so the generator can be used for various communication purposes and has broad application prospects.

  • A fast algorithm for moving objects detection based on model switching

    Page(s): 143 - 146

    A new method is proposed to improve background modeling speed. First, the pixels in the current frame are classified into two classes according to the average background to reduce the computing load. Second, different models, for instance kernel- or GMM-based algorithms, are used as necessary to deal with 'dead lock' of the scene. Third, a kernel density estimation based on neighbor correlation is used to decrease false positives. Last, the detection results of the two algorithms are fused to detect moving objects by the pixel labels. In this paper, a novel description of the correlation between a pixel and its surrounding pixels and a background modeling strategy are proposed. Experimental results on complex outdoor scenes demonstrate that the new algorithm is robust to noise and suitable for real-time moving object detection.
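
    The first step above, splitting pixels into two classes against an average background, might look roughly like the following sketch. It covers only that pre-classification and none of the kernel, GMM or fusion stages; the function names and the fixed threshold are assumptions for illustration and do not come from the paper.

```python
def average_background(frames):
    """Per-pixel mean over a list of equally sized grayscale frames."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / n for x in range(width)]
            for y in range(height)]

def classify_pixels(frame, background, threshold):
    """Return a mask where True marks a pixel that deviates from the background.

    Pixels marked False can be skipped by any costlier model that follows.
    """
    return [[abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]

history = [[[10, 10], [10, 10]] for _ in range(3)]
current = [[10, 200], [12, 10]]
mask = classify_pixels(current, average_background(history), threshold=20)
print(mask)
```

    Only the pixels flagged in the mask would need to be passed on to a more expensive background model, which is where the reduction in computing load would come from.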

  • Tracking the small object through clutter with adaptive particle filter

    Page(s): 357 - 362

    Cluttered background and occlusion cause large ambiguity in the tracking of video objects. When the object is small (like a soccer ball in broadcast game video signals), the ambiguity becomes even more severe. In this paper, we propose an adaptive particle filter with an effective proposal distribution to handle these situations. In the proposed tracking approach, motion estimation is embedded into the state transition to tackle abrupt motion changes and generate good proposal distributions. We also propose a mixture model to account for multiple hypotheses in the template correlation surface when estimating the appearance likelihood. In addition, motion continuity and trajectory smoothness are combined with template correlation in the observation likelihood to further filter out visual distractors. As an example of small object tracking, promising results for tracking a ball (as small as 30 pixels) in soccer game videos are presented to illustrate that the proposed scheme handles cluttered background and occlusion effectively.

  • A novel snake model without re-initialization for image segmentation

    Page(s): 147 - 151

    In this paper, we present a new variational formulation of the geometric snake for image segmentation. Our formulation includes an internal energy term that penalizes the deviation of the level set function from a signed distance function and a stopping term related to a particular segmentation of the image instead of the gradient. These terms force the level set function to stay close to a signed distance function and therefore completely eliminate the need for the costly re-initialization procedure. A significantly larger time step can be used for solving the evolution equation, which speeds up the evolution. The level set formulation is easily implemented by a simple finite difference scheme that is computationally more efficient. Moreover, the initial curve can be anywhere in the image, and interior contours can be detected automatically. Experimental results on image segmentation show that our algorithm performs very well.

  • Re-coloring images for dichromats based on an improved adaptive mapping algorithm

    Page(s): 152 - 156

    To improve dichromats' ability to discriminate colors while preserving the naturalness of the original images, a color image transformation method is proposed based on an improved adaptive mapping algorithm. Using the transformation, the color space of ordinary vision is mapped to the color plane of dichromats to avoid the main color confusions for dichromats. Experiments are conducted on 15 color test images and 5 natural images. The results show that information invisible to dichromats in the original images stands out in the re-colored ones, so that the readability of color information for dichromats is enhanced. The improved method performs much better than the original one.

  • Extracting boundaries of ultrasonic breast tumor images based on a coarse-to-fine active contour model

    Page(s): 157 - 162

    Segmentation of ultrasonic breast tumor images is a challenging topic in clinical practice. A novel coarse-to-fine active contour (CFAC) model is proposed to extract the boundaries of breast tumors based on a level-set framework. To apply the CFAC model, a Gaussian pyramid is first constructed to represent images at different resolution levels. Then, on the top pyramid level, a region-based segmentation algorithm incorporating certain edge information is used to obtain a coarse boundary. Finally, the coarse boundary is gradually refined on the images at the other levels according to their more detailed gradient information. Experiments are performed on both synthetic and real ultrasonic breast tumor images. The qualitative and quantitative results verify the efficiency of the CFAC model for the image segmentation task.

  • A multiscale technique for digital halftoning and embedded multiresolution rendering

    Page(s): 913 - 917

    In this paper, a new multiscale technique for digital halftoning and embedded multiresolution rendering is proposed. The proposed algorithm is adaptive: by tuning a threshold value, the dithered patterns used to produce the illusion of grayscale information in constant and slightly varying areas can be preserved, and high-frequency information can be made more visible. In addition, the halftone images possess the property of embedded multiresolution rendering; that is, the downsampled version of a halftone image can be obtained directly from the original halftone image. The proposed algorithm can be used for displaying halftone images at multiple resolutions and also has potential for progressive transmission and image compression.

  • An SIPCA-WCCN method for SVM-based speaker verification system

    Page(s): 1295 - 1299

    Session variability is the most important factor affecting the performance of speaker verification systems. To deal with this variability more efficiently, this paper provides a practical procedure for applying smooth within-class covariance normalization (WCCN) to an SVM-based speaker verification system, where the input samples reside in a low-dimensional session-invariant principal component analysis (SIPCA) feature space. When the SIPCA and smooth WCCN approaches are applied to the NIST 2006 verification task, experimental results show relative reductions of up to 19.7% in EER and 18.4% in minimum decision cost function (DCF) over our previous GMM-mean SVM system. Our approach also has advantages in computational and memory costs compared to state-of-the-art systems.

  • A block matching criterion for interframe coding of video

    Page(s): 133 - 137

    Interframe coding is used to remove temporal redundancy in video data, and motion compensation plays a very significant role in it. Motion compensation based on block matching generally uses the minimum Mean Square Error (MSE) or Mean Absolute Difference (MAD) criterion to find a suitable motion vector. Vector Matching Criterion (VMC) is another such method in the literature. In this paper, a new matching criterion for block-based motion compensation is proposed and compared with existing techniques. The experimental results show that the proposed criterion gives excellent results in comparison with existing block matching criteria.
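
    For readers unfamiliar with the baseline criteria named above, the following sketch runs an exhaustive block-matching search using MAD or MSE as the cost. It illustrates only the conventional criteria, not the new criterion the paper proposes; the helper names, the frame layout (lists of rows) and the search-window parameter are assumptions for this example.

```python
def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def mad(block_a, block_b):
    """Mean Absolute Difference between two equally sized blocks."""
    n = len(block_a) * len(block_a[0])
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b)
               for a, b in zip(ra, rb)) / n

def mse(block_a, block_b):
    """Mean Square Error between two equally sized blocks."""
    n = len(block_a) * len(block_a[0])
    return sum((a - b) ** 2 for ra, rb in zip(block_a, block_b)
               for a, b in zip(ra, rb)) / n

def best_motion_vector(ref, cur, top, left, size, search, cost=mad):
    """Full search: return (dy, dx, cost) of the best match in the reference frame."""
    target = get_block(cur, top, left, size)
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block would fall outside the frame
            c = cost(target, get_block(ref, y, x, size))
            if best is None or c < best[0]:
                best = (c, dy, dx)
    return best[1], best[2], best[0]

# Reference frame with distinct values; current frame is the reference
# shifted down by 1 row and right by 2 columns (zeros fill the border).
ref = [[y * 8 + x for x in range(8)] for y in range(8)]
cur = [[ref[y - 1][x - 2] if y >= 1 and x >= 2 else 0 for x in range(8)]
       for y in range(8)]
print(best_motion_vector(ref, cur, top=3, left=3, size=3, search=2))
```

    Passing `cost=mse` switches the criterion without changing the search itself, which is the sense in which a new matching criterion can be dropped into an existing block-matching scheme.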

  • Simplified factor analysis in speaker verification

    Page(s): 1316 - 1319

    Channel or session variability is one of the major difficulties in speaker recognition systems. In this paper, a simplified factor analysis is proposed to solve the inter-session variability problem. We find that the factor analysis method achieves very good performance in both the GMM_UBM and the SVM systems. The proposed algorithm achieves almost the same performance as the NAP algorithm in the SVM system. In the NIST 2006 SRE core test, the EER of the factor analysis is about 5.0% in the SVM system and 4.9% in the GMM_UBM system.

  • Feature mapping based on GMM supervector

    Page(s): 1081 - 1085

    The channel or inter-session variability problem is one of the most important factors causing recognition errors in speaker recognition systems. In this paper, we propose three methods for estimating the channel supervector in the GMM supervector space to deal with this problem: EM clustering, PCA and NAP. Furthermore, feature mapping is applied to the MFCC features after the channel supervector is estimated. The EER of the feature mapping system decreases by 34% relative to the baseline GMM system in the NIST 2006 SRE core test.

  • Role of clear speech attribute on fricatives perception for the hard of hearing

    Page(s): 87 - 91

    In difficult listening environments or when a listener's dynamic range is severely reduced, speech recognition becomes a challenging task. One way to alleviate this difficulty is to speak 'clearly' as opposed to 'conversationally' to these individuals [5, 6]. Clear speech has an intelligibility advantage over typical slow speech on account of some special attributes. The present study was specifically concerned with the effect of one such acoustic attribute of clear speech, the 'consonant-vowel intensity ratio', on speech perception. A case was made for synthetic clear speech in the context of hearing impairment. English fricatives paired with the cardinal vowels /a i u/ were additively mixed with comb-filtered white noise at three SNRs. A computerized test administration system was developed to study the responses of five listeners. The perceptual analysis was carried out using information transmission analysis measures. The overall information transmission and the transmission of consonant features showed appreciable improvement. Under the adverse noise masking condition (SNR = 6 dB), maximum intelligibility benefits of 24 and 38 percentage points were reported with the two testing procedures.

  • An efficient method for removing random-valued impulse noise

    Page(s): 918 - 922

    In this paper, we propose an efficient two-stage iterative method for removing random-valued impulse noise. In the first phase, we use the adaptive center-weighted median filter to identify pixels that are likely to be corrupted by noise. In the second phase, these noise candidates are restored by an efficient median-filter-based method. The two phases are applied alternately. Extensive computer simulation results indicate that the proposed method provides some improvement over many well-known methods in random-valued noise removal while keeping computational complexity low.
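
    The alternating detect-then-restore structure described above can be sketched as follows. Note the substitution: detection here uses a plain 3x3 median with a fixed threshold rather than the adaptive center-weighted median filter named in the abstract, and restoration takes the median of unflagged neighbors. The function names, threshold and iteration count are assumptions for illustration.

```python
from statistics import median_low

def neighbors(img, y, x):
    """Coordinates of the 3x3 window around (y, x), clipped at the borders."""
    height, width = len(img), len(img[0])
    return [(j, i)
            for j in range(max(0, y - 1), min(height, y + 2))
            for i in range(max(0, x - 1), min(width, x + 2))]

def detect(img, threshold):
    """Phase 1: flag pixels that differ strongly from their window median."""
    return [[abs(img[y][x]
                 - median_low([img[j][i] for j, i in neighbors(img, y, x)]))
             > threshold
             for x in range(len(img[0]))]
            for y in range(len(img))]

def restore(img, mask):
    """Phase 2: replace flagged pixels by the median of unflagged neighbors."""
    out = [row[:] for row in img]
    for y in range(len(img)):
        for x in range(len(img[0])):
            if mask[y][x]:
                clean = [img[j][i] for j, i in neighbors(img, y, x)
                         if not mask[j][i]]
                if clean:  # leave the pixel alone if every neighbor is flagged
                    out[y][x] = median_low(clean)
    return out

def denoise(img, threshold=50, iterations=2):
    """Apply the two phases alternately for a fixed number of rounds."""
    for _ in range(iterations):
        img = restore(img, detect(img, threshold))
    return img

noisy = [[100] * 5 for _ in range(5)]
noisy[2][2] = 255   # impulse in the middle
noisy[0][0] = 0     # impulse in a corner
print(denoise(noisy))
```

    Keeping detection separate from restoration means uncorrupted pixels are never modified, which is the usual motivation for two-phase schemes over filtering every pixel.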

  • Design and implementation of a real-time image processing system with modularization and extendibility

    Page(s): 798 - 802

    A new design method for a real-time image processing system based on a field programmable gate array (FPGA) and digital signal processing (DSP) structure is presented. In the practical system, the FPGA serves as the logic unit for the sensor interface protocol and completes image preprocessing and display of PAL video; the DSP serves as the main processing unit, which is extendable for better performance. Modules are connected by user-configurable I/O interfaces, the external memory interface (EMIF) and the host-port interface (HPI), which configure the system as a loosely coupled, cascaded multi-DSP system. Finally, the design approach shows good performance in the real-time image processing system, with real-time processing capability, extendibility and multiple interfaces, which is significant for practical applications.

  • The design and implementation of video-based interactive motion tracking model

    Page(s): 1565 - 1568

    In this paper, a simple yet efficient method for modeling object motion tracks is designed and implemented. The motion tracks of any objects (humans, animals and other dynamic objects) can be tracked interactively, and a motion model can be formed and adjusted for later use. The model can be saved to files, leaving an open interface for other applications to access. The model is valuable in animation and cartoon design and production, and can be applied in other digital media fields.

  • Adaptive FMO selection strategy for error resilient H.264 coding

    Page(s): 868 - 872

    Flexible macroblock ordering (FMO) is one of the effective error resilience tools in the H.264/AVC video coding standard. Nevertheless, how to arrange macroblocks in a suitable FMO mapping type for different video applications has yet to be clarified and investigated. In this paper, we analyze the tradeoffs and effectiveness of the six fixed FMO types and, based on them, use the joint source-channel rate-distortion optimization (RDO) principle to propose an adaptive FMO type selection strategy for different video scenes and applications. The experimental results show that our method has more compatibility and flexibility than the six fixed FMO types, and better error resilience than most of them.

  • An improved method of RHT to localize circle applied in Intelligent Transportation System

    Page(s): 335 - 338

    Intelligent transportation systems (ITS) are an application of modern information technology in the field of traffic that has attracted worldwide attention. Seat belt recognition is a relatively new direction in ITS. To identify the seat belt correctly, and assuming the vehicle window has already been localized, this paper extracts the centerline of the steering wheel as a feature, which provides a reference for localizing the seat belt. However, it is very difficult to find the accurate location of the steering wheel's centerline using luminance or color, because the vehicle window image is captured in an uncontrolled environment and is blurry. This paper therefore carries out edge detection and then uses the randomized Hough transform (RHT) to find the centerline of the steering wheel. RHT, however, consumes a great deal of computation and memory, which has largely prevented its wider use. This paper proposes a novel method called 'local randomized Hough transform (LRHT)'. Experimental results show that both complexity and efficiency are greatly improved.
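
    As background for the RHT step mentioned above, the sketch below shows the basic randomized Hough idea for circles: repeatedly pick three edge points at random, compute the circle through them, and vote for its rounded parameters. It is the generic RHT, not the paper's LRHT; the function names, trial count, seed and rounding to integer cells are assumptions for this example.

```python
import math
import random
from collections import Counter

def circle_through(a, b, c, eps=1e-9):
    """Center and radius of the circle through three points, or None if collinear."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < eps:
        return None
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return ux, uy, math.hypot(ax - ux, ay - uy)

def rht_circle(edge_points, trials=200, seed=0):
    """Vote over random point triples; return the most voted (cx, cy, r) cell."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(trials):
        circle = circle_through(*rng.sample(edge_points, 3))
        if circle is not None:
            votes[tuple(round(v) for v in circle)] += 1
    return votes.most_common(1)[0][0] if votes else None

# Integer points lying exactly on the circle with center (5, 5) and radius 5.
points = [(10, 5), (5, 10), (0, 5), (5, 0), (8, 9), (9, 8),
          (2, 1), (1, 2), (8, 1), (9, 2), (2, 9), (1, 8)]
print(rht_circle(points))
```

    Because each sample maps to a single cell in parameter space, the accumulator stays sparse compared with the classical Hough transform, but many trials are still needed when the edge map contains many points that do not belong to the circle.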

  • HRS: HPC-based automatic 3D building reconstruction system

    Page(s): 1725 - 1731

    3D digital cities are receiving more and more attention because they are expected to make life and business more convenient, but traditional 3D digital cities are out of sync with reality because their reconstruction is not automatic. Automatic 3D building reconstruction is vital for automatic 3D digital city reconstruction, since a city consists mainly of buildings. Recently studied automatic 3D building reconstruction based on serial computing, whether from stereo-pair images or from a single image, is still too slow to keep a 3D digital city synchronously updated. We therefore propose HRS (HPC-based automatic 3D building reconstruction system), which uses high performance computing (HPC) to speed up automatic 3D building reconstruction.

  • RFID readers low sampling rate frequency shift keying decoding method

    Page(s): 803 - 807

    This paper puts forward a numerical method for decoding frequency shift keying (FSK) at a low sampling rate on UHF radio frequency identification (RFID) readers. The method applies not only when the sampling rate is more than twice the frequency of the detected signal, but also when the sampling rate is less than twice that frequency. It is a better solution when a high symbol rate is sampled at a low rate and the aliased signal is used to detect the FSK signal, and it also solves the multi-signal collision detection problem.

  • The nonlinear distortion of directional loudspeaker

    Page(s): 17 - 21

    A directional audible sound can be generated from the nonlinear interaction of ultrasound in the propagation medium. The demodulated audible signal is proportional to the second time derivative of the square of the ultrasound envelope, as described by Berktay's far-field solution, which makes the sound reproduction process nonlinear. Double sideband (DSB) modulation is proposed to amplitude-modulate the ultrasound carrier with the audio signal. Total harmonic distortion (THD) is first calculated for a single-frequency signal, and it increases with the modulation index. However, the case is more complex for input with two or more frequencies. Besides the interaction between the modulated ultrasound and the carrier, intermodulation distortion (IMD) is generated from the interaction among the multiple frequencies of the modulated signal. Hence, single sideband (SSB) modulation is proposed to reduce THD and IMD. There is no second or higher harmonic for single-frequency input, and the amplitudes of the harmonics depend only on the modulation index for two-frequency or wideband input. SSB is recommended to reduce THD or IMD.
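
    The dependence of THD on the modulation index for a single tone can be checked with a short derivation. This is a sketch under assumed notation that does not appear in the abstract: a DSB envelope $E(t) = 1 + m\sin(\omega t)$ with modulation index $m$ and audio angular frequency $\omega$, and a demodulated pressure $p(t)$ with all constant factors dropped.

```latex
\begin{align*}
p(t) &\propto \frac{\partial^2}{\partial t^2} E^2(t),
  \qquad E(t) = 1 + m\sin(\omega t) \\
E^2(t) &= 1 + 2m\sin(\omega t) + m^2\sin^2(\omega t)
        = 1 + \frac{m^2}{2} + 2m\sin(\omega t) - \frac{m^2}{2}\cos(2\omega t) \\
\frac{\partial^2}{\partial t^2} E^2(t)
       &= -2m\omega^2\sin(\omega t) + 2m^2\omega^2\cos(2\omega t) \\
\frac{\text{second-harmonic amplitude}}{\text{fundamental amplitude}}
       &= \frac{2m^2\omega^2}{2m\omega^2} = m
\end{align*}
```

    Under these assumptions the only distortion product is the second harmonic, and its amplitude relative to the fundamental equals $m$, which agrees with the statement that THD grows with the modulation index.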

  • Using timestamp to realize audio-video synchronization in Real-Time streaming media transmission

    Page(s): 1073 - 1076

    Real-time streaming media transmission is widely used in computer networks, and audio-video synchronization is a key issue. Because the RTP protocol is used for data transmission, the synchronization requirements can be met if the time information in RTP data packets is used effectively. In this paper, based on an actual application, RTP data packets are appropriately simplified at the sending end, and the complete implementation process for various situations at the receiving end is given.
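
    One common way to use RTP time information for synchronization is to map each stream's 32-bit RTP timestamp onto a shared clock and compare the results. The sketch below assumes that a reference pair (RTP timestamp, wall-clock seconds) is already known for each stream, for example from a sender report, and that the clock rates are 8000 Hz for audio and 90000 Hz for video. These values and all function names are assumptions for illustration, not the simplified packet scheme of the paper.

```python
def media_time(rtp_ts, ref_rtp_ts, ref_wallclock, clock_rate):
    """Map a 32-bit RTP timestamp to wall-clock seconds.

    The difference to the reference timestamp is taken modulo 2**32 and
    interpreted as signed, so a single counter wraparound is handled.
    """
    delta = (rtp_ts - ref_rtp_ts) & 0xFFFFFFFF
    if delta >= 0x80000000:
        delta -= 0x100000000
    return ref_wallclock + delta / clock_rate

def av_skew(audio_time, video_time):
    """Positive result: the audio sample is scheduled later than the video frame."""
    return audio_time - video_time

# Hypothetical reference points for both streams at wall-clock time 10.0 s.
audio_t = media_time(rtp_ts=0x00000100, ref_rtp_ts=0xFFFFFF00,
                     ref_wallclock=10.0, clock_rate=8000)
video_t = media_time(rtp_ts=1000 + 9000, ref_rtp_ts=1000,
                     ref_wallclock=10.0, clock_rate=90000)
print(audio_t, video_t, av_skew(audio_t, video_t))
```

    A receiver could delay whichever stream is ahead by the computed skew before rendering; how large a skew is tolerable is an application decision that this sketch leaves open.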

  • A novel approach for face recognition based on stereo image processing algorithm

    Page(s): 1245 - 1249

    Recently, the importance of face recognition has been increasingly emphasized as CCD cameras are deployed in various applications. This paper proposes a robust method for face recognition from two image sequences taken by stereo cameras, so that changes in facial appearance caused by lighting variations or other factors won't lead to serious performance degradation. The feature correspondences between images are extracted and refined automatically using the relation between the stereo cameras. Given a point in the left image, the corresponding point in the right image is the point on the epipolar line with the maximum correlation coefficient. Identical database of the person is synthesized from photometric stereo images of the training data. Some experimental results are presented to demonstrate the proposed technology. The results show that high-resolution 3D data of free surfaces are acquired and higher recognition rates are obtained.
