Speech, Image Processing and Neural Networks, 1994. Proceedings, ISSIPNN '94., 1994 International Symposium on

Date 13-16 April 1994

Displaying Results 1 - 25 of 202
  • Proceedings of ICSIPNN '94. International Conference on Speech, Image Processing and Neural Networks

    Freely Available from IEEE
  • On finite recurrent neural network

    Page(s): 791 - 794 vol.2

    A recurrent network, called the finite recurrence back propagation (FRBP) network because of the finite recurrence in its hidden layer, is presented. It retains some of the abilities of existing recurrent networks while requiring less memory and training faster. The learning algorithm is discussed, together with computational and simulation results.

  • Handwritten character recognition by extended loop neural networks

    Page(s): 460 - 463 vol.2

    An extended loop neural network approach to handwritten character recognition is presented. Experiments show that the method is very effective: its recognition rate is higher than that of a backpropagation network.

  • Parameter estimation of a fractional Brownian motion in a white noise using wavelets

    Page(s): 646 - 649 vol.2

    Discriminating the fractal parameter of a fractional Brownian motion (fBm) embedded in white noise is equivalent to discriminating the composite singularity formed by superimposing a peak singularity upon a Dirac singularity. We use the autocorrelation of the wavelet transform coefficients to characterize the composite singularity and formulate the problem as a nonlinear optimization problem. We modify the internal penalty function method to estimate the parameters of the fBm in white noise efficiently.

  • Blind deconvolution using a mixed projection-gradient approach

    Page(s): 788 - 790 vol.2

    In this paper, we present a set-theoretic approach to the classical blind deconvolution problem. In this new approach, every piece of available information about both the source function and the system impulse response function is expressed as a constraint set. The proposed algorithm has good convergence properties.

  • A novel VQ encoding algorithm based on adaptive searching sequence

    Page(s): 164 - 167 vol.1

    Most VQ encoding algorithms aim to improve codeword elimination efficiency, that is, to reject codewords or stop the codeword search as early as possible using various criteria. The paper proposes an algorithm that dynamically determines the codeword search sequence for a given input vector, which can further improve codeword elimination efficiency. An encoding algorithm based on a similar idea is also proposed.
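
    As a rough illustration of codeword elimination, the Python sketch below uses the well-known partial-distance test: accumulation of the squared distance stops as soon as it reaches the best distance found so far. It is not the algorithm of the paper, whose details are not given in the abstract. The helper `order_by_mean` is a hypothetical search sequence chosen only to show how the visiting order interacts with elimination.

```python
def nearest_codeword(x, codebook, order=None):
    """Return (index, squared_distance) of the codeword closest to x.

    `order` is the sequence in which codewords are examined; a good order
    finds a small distance early, so later codewords are rejected sooner.
    """
    if order is None:
        order = range(len(codebook))
    best_idx, best_dist = -1, float("inf")
    for idx in order:
        dist = 0.0
        for xi, ci in zip(x, codebook[idx]):
            dist += (xi - ci) ** 2
            if dist >= best_dist:
                # Partial distance already too large: reject this codeword.
                break
        else:
            # Loop finished without rejection: this codeword is the new best.
            best_idx, best_dist = idx, dist
    return best_idx, best_dist


def order_by_mean(x, codebook):
    """Hypothetical search sequence: codewords whose mean is closest to the
    mean of x are examined first (an assumption, not the paper's rule)."""
    mx = sum(x) / len(x)
    return sorted(
        range(len(codebook)),
        key=lambda i: abs(sum(codebook[i]) / len(codebook[i]) - mx),
    )


codebook = [[0.0, 0.0], [10.0, 10.0], [5.0, 5.0], [4.0, 6.0]]
x = [5.0, 4.5]
result_plain = nearest_codeword(x, codebook)
result_ordered = nearest_codeword(x, codebook, order_by_mean(x, codebook))
```

    Both calls return the same codeword; only the number of multiply-accumulate steps spent on rejected codewords changes with the order.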

  • Finding of optimal morphological filter on binary images based on greedy and branch & bound searching

    Page(s): 5 - 8 vol.1

    A method that searches the error code graph (ECG) to find the optimal morphological filter for binary images is proposed in this paper. The problem of finding the optimal solution is reduced to that of finding a minimal path in the ECG. Since this graph satisfies some greedy properties, only a few nodes need to be traversed and examined.

  • Fingerprint preclassification using key-points

    Page(s): 308 - 311 vol.1

    A new approach to fingerprint preclassification, named the key-point recognition method (KMF), is proposed. It considers only whether a general feature key-point is present in a certain area and ignores what the feature is. Using this method, an automatic fingerprint recognition system has been developed. Compared with traditional systems, it imposes fewer requirements on preprocessing, is less sensitive to noise, and offers higher capacity and parallelism. As a preclassifier, the system can list the most promising fingerprints.

  • Unstructured to structured error correction using neural nets

    Page(s): 457 - 459 vol.2

    Most transmitted or stored information is subject to occasional errors. In most situations, the source of this information has inherent unstructured redundancy that can be exploited to correct these errors. In addition to the storage requirements, obtaining the source statistics required to perform the error correction may not be easy. In this paper, we propose and evaluate trained neural nets that transform the unstructured redundancy into a structured one. The new approach eliminates the need to store source statistics and also simplifies the decoding process. The idea is applied to correct some of the errors caused by passing printed Arabic text through an optical character recognition (OCR) device. Simulation results demonstrate the effectiveness of this technique.

  • The fast algorithm for the finite length discrete wavelet transform

    Page(s): 642 - 645 vol.2

    The paper presents a structured algorithm for the finite length discrete wavelet transform. The analysis and synthesis filter matrices H and G can be decomposed in Kronecker product form with a cyclic block matrix and a lower-triangular block matrix. The cyclic matrix can be implemented using the FFT, and the lower-triangular matrix is implemented directly. The arithmetic complexity of the algorithm is superior to that of the full-FFT implementation. Since the filter matrix of the two-dimensional discrete wavelet transform separates into the Kronecker product of the filter matrices of the one-dimensional discrete wavelet transform, the algorithm also extends conveniently to the two-dimensional discrete wavelet transform.

  • Segment the metallograph images using Gabor filter

    Page(s): 25 - 28 vol.1

    We have implemented a texture-based supervised segmentation method to segment a variety of metallograph images, which are considered to contain different textured regions. Texture features are computed using a set of even-symmetric Gabor filters, which have previously been used successfully for a variety of texture classification and segmentation tasks. The performance of the proposed method is shown on a number of test images. These results also demonstrate the ability of our improved approach to separate untextured regions from textured regions in the same image.

  • Phonetic and linguistic features of spoken Chinese

    Page(s): 117 - 121 vol.1

    The phonetic and linguistic features of spoken Chinese that are closely related to the research and development of speech I/O systems are discussed at different linguistic levels. It is shown that: (1) the syllable structures of Chinese are an advantage in increasing syllable intelligibility; (2) the rank order of the principal phonetic features is voiced, aspirated, fricative, and places of articulation; (3) the statistical relation between the intelligibility of different speech units can be described by a Bayes probability model; (4) the task of developing a continuous speech recognition system for Chinese faces a number of linguistic barriers.

  • Fast self-organization by query-based algorithm and its applications

    Page(s): 85 - 88 vol.1

    A query-based algorithm is introduced into Kohonen's self-organizing feature maps to find the best matching neurons using their desired states. It differs from the original self-organization algorithm, which finds the best matching neurons using the neurons' current states. The proposed method, which takes a batch view of the self-organization process, is a variant of Kohonen's algorithm. Thus, it can be applied to the data clustering problem with pre-specified input vectors. Moreover, the VLSI circuit placement problem, which has no pre-specified input vectors, can also be solved successfully. Comparisons with the original self-organization algorithm show that the proposed method performs better.

  • A novel algorithm for the restoration of AFM/STM images

    Page(s): 784 - 787 vol.2

    Images generated by a scanning tunneling microscope (STM) or atomic force microscope (AFM) imaging system can show the microstructures of samples. However, the resulting AFM/STM images are sometimes corrupted by streaks. Suppressing such streaks is therefore an important task in the processing of AFM/STM images. We analyze the generation of streaks, introduce a degradation model of the corrupted AFM/STM image, and then propose an adaptive notch filtering algorithm to remove the streaks. Simulation results support the analysis and indicate the performance of the proposed algorithm.

  • An improved JPEG image coder using the adaptive fast approximate Karhunen-Loeve transform (AKLT)

    Page(s): 160 - 163 vol.1

    A new fast approximate Karhunen-Loeve transform (AKLT) [Lan and Reed, 1993; Reed and Lan, 1993] was discovered. This novel transform is derived using the perturbation theory of linear operators. The theoretical advantages of the AKLT over the DCT, the current industrial standard for image compression, are discussed thoroughly in Reed and Lan, and in Lan and Reed, for the first-order Markov image model. In the present paper, an improved JPEG image coder that uses the adaptive AKLT is presented. An improvement of 5% in compression ratio over the DCT-JPEG system is obtained when the desired nominal data rate is above 0.4 bits/pixel for typical real-life images. The rate-distortion curves indicate that the adaptive AKLT scheme is superior to the DCT scheme over all ranges of the data rate. It is worth noting that this new adaptive AKLT-JPEG system can actually be implemented using the existing JPEG chip set with only a slight modification.

  • Solving least squares problems by neural network approach

    Page(s): 539 - 542 vol.2

    In this paper, a neural network approach to least squares (LS) problems is proposed. Linear LS problems are solved using a class of neural networks rather than conventional methods (such as the SVD or the Householder transform). Theoretical analysis and computer simulations show that the method is efficient, reliable, and computationally simple, and that it has a normal structure.
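
    The abstract does not describe the network itself. As an assumption, one common way to pose linear least squares as an iterative, network-like computation is to let the state x follow the negative gradient of the squared residual, x <- x - step * A^T (A x - b). The Python sketch below shows that generic iteration only; the function name, step size, and iteration count are illustrative choices and are not taken from the paper.

```python
def solve_least_squares(A, b, step=0.1, iterations=500):
    """Minimize 0.5 * ||A x - b||^2 by repeated gradient steps.

    A is a list of rows, b a list of targets. The step size must be small
    enough for the iteration to converge (below 2 divided by the largest
    eigenvalue of A^T A).
    """
    n = len(A[0])
    x = [0.0] * n
    for _ in range(iterations):
        # Residual r = A x - b, one entry per equation.
        residual = [
            sum(a * xi for a, xi in zip(row, x)) - bi
            for row, bi in zip(A, b)
        ]
        # Gradient g = A^T r, one entry per unknown.
        gradient = [
            sum(row[j] * r for row, r in zip(A, residual))
            for j in range(n)
        ]
        x = [xi - step * g for xi, g in zip(x, gradient)]
    return x


# Overdetermined but consistent system whose exact solution is x = [1, 2].
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 3.0]
x_hat = solve_least_squares(A, b)
```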

  • A thresholding method using the mixture of normal density functions

    Page(s): 304 - 307 vol.1

    Thresholding techniques are fundamental for image segmentation. It is often realistic to assume that each pixel is drawn from a mixture of several normal distributions. The paper proposes a criterion for thresholding a histogram of gray level intensity. It uses a new variance of the histogram. An algorithm that considers the tails of the probability density functions of the other classes is also presented. The proposed method is compared experimentally with the Kittler-Illingworth method and the Otsu method. The proposed criterion and the Otsu criterion are effective in thresholding handwritten characters. More accurate thresholds are obtained by the algorithm when the data come from a mixture of normal distributions, although the number of computations increases.
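
    For orientation, the Otsu method used as a baseline above can be sketched in a few lines of Python: it picks the gray level that maximizes the between-class variance of the histogram. This is the standard Otsu criterion, not the new criterion proposed in the paper, and the function name and toy histogram are illustrative.

```python
def otsu_threshold(hist):
    """Return the gray level t that maximizes between-class variance.

    hist[i] is the number of pixels with gray level i. Pixels with level
    <= t form the lower class; the rest form the upper class. When several
    levels tie, the smallest one is returned.
    """
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0        # pixel count in the lower class
    sum0 = 0.0    # sum of gray levels in the lower class
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total^2.
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Toy bimodal histogram with 8 gray levels.
threshold = otsu_threshold([5, 5, 0, 0, 0, 0, 5, 5])
```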

  • Human computer interactions using language based technology

    Page(s): I - VII vol.1

    Spoken language offers an attractive alternative for the human-computer interface, since speech is the most natural, efficient, flexible, and economical means of communication among humans. To provide this interface, however, speech recognition technology must be combined with natural language processing technology, so that the verbal input can be not only recognized but also understood, and appropriate actions can be taken. This paper provides an overview of recent research in spoken language understanding, focusing on the activities in the United States sponsored by the Advanced Research Projects Agency of the Department of Defense. Some future research directions are also summarized.

  • Voice message summary for voice services

    Page(s): 622 - 625 vol.2

    This paper describes a voice message summarizing method for retrieving specific voice messages from a large number of voice messages on voice services, such as voice mail and voice bulletin boards. Voice browsing facilities are tools intended to let users handle voice messages as easily and conveniently as browsing books. After surveying methods for voice browsing, the author proposes a new voice message summarizing method that relies on important parts being spoken slowly and having a higher proportion of unvoiced parts. The effectiveness of this method was demonstrated using actual voices from radio programs.

  • Generalization of hierarchical retinotopic networks using stochastic distortion models

    Page(s): 381 - 384 vol.2

    The generalization of hierarchical retinotopic networks is modeled as a type of probability measure called “tail probability” with a stochastic distortion field. Learning in the network memorizes the exemplars in terms of the distribution. Generalization in a hierarchical retinotopic network is characterized by the probability measure of multilevel events and decision making at each abstraction level. The concept is applied to automatically generating a hierarchical retinotopic network during the learning of exemplars. This approach is called Cresceptron, and it has been tested on learning, recognizing and segmenting a variety of real-world objects based on their 2-D images.

  • A theory on optimal construction of dynamic features of speech for HMM-based speech recognition

    Page(s): 351 - 354 vol.1

    The construction of dynamic (delta) features of speech, which in the past has been confined to the pre-processing domain in hidden Markov modelling (HMM), is generalized and formulated as an integrated speech modeling problem. This generalization allows state-dependent weights to be used to transform static speech features into dynamic ones. The author describes a rigorous theoretical framework that naturally incorporates the generalized dynamic-parameter technique, and presents a maximum-likelihood based algorithm for integrated optimization of the conventional HMM parameters and of the time-varying weighting functions that define the dynamic features of speech.
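
    To make the idea of weights that turn static features into dynamic ones concrete, the Python sketch below computes each dynamic frame as a weighted sum of neighbouring static frames. The fixed regression weights shown are the conventional pre-processing choice; in the generalized setting described above the weight vector would depend on the HMM state and be optimized, which this sketch does not attempt. All function names, and the edge handling by frame replication, are assumptions made for illustration.

```python
def regression_weights(K):
    """Conventional delta weights w[k] = k / (2 * sum of j^2), k = -K..K."""
    norm = 2 * sum(k * k for k in range(1, K + 1))
    return [k / norm for k in range(-K, K + 1)]


def dynamic_features(static, weights):
    """Weighted sum of static frames over a window centred on each frame.

    static  : list of frames, each a list of floats
    weights : odd-length list holding w[-K], ..., w[K]
    Frames beyond either end are replaced by the nearest edge frame.
    """
    K = len(weights) // 2
    T = len(static)
    out = []
    for t in range(T):
        frame = [0.0] * len(static[0])
        for j, w in enumerate(weights):
            src = min(max(t + j - K, 0), T - 1)  # replicate edge frames
            for d, value in enumerate(static[src]):
                frame[d] += w * value
        out.append(frame)
    return out


# One-dimensional static feature rising by 1 per frame.
static = [[0.0], [1.0], [2.0], [3.0]]
delta = dynamic_features(static, regression_weights(1))
```

    Passing a different `weights` list per state would be the simplest way to mimic state-dependent weighting with this helper.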

  • Low delay CELP coding at 8 kbps using classified voiced and unvoiced excitation codebooks

    Page(s): 472 - 475 vol.2

    This paper presents an 8 kbps low delay CELP coding scheme with a one-way delay of less than 5 ms. In the proposed scheme, we classify the analyzed speech as either a voiced or an unvoiced segment. An adaptive pitch excitation codebook is used to generate voiced speech, while a Gaussian stochastic codebook is used to generate unvoiced speech. To eliminate the roughness of synthetic speech without affecting the pitch dynamics, we propose a new long-term prediction algorithm that smooths the pitch contour in the voiced speech segment. Closed-loop excitation gain adaptation is employed to make the coding scheme more robust.

  • Applying N-best keyword search to continuous speech recognition for telecommunication-based applications

    Page(s): 726 - 729 vol.2

    An N-best keyword search algorithm was developed in a continuous speech recognizer that models vocabulary words as well as extraneous sounds and noise, in order to achieve high sentence accuracy. The continuous speech recognizer was developed for telecommunication-based applications, which typically demand high sentence accuracy. Possible approaches for achieving high sentence accuracy include applying complicated speech modeling techniques or employing more knowledge sources when conducting the recognition search. An alternative solution is to first apply an N-best decoding search to obtain N sentence hypotheses using pre-selected knowledge source(s) and then re-score those hypotheses using other knowledge source(s) or models. The proposed N-best keyword search algorithm derives all keyword sentence hypotheses and the corresponding likelihood scores time-synchronously. We show that the algorithm is guaranteed to find all sentence hypotheses. To reduce the exponentially growing number of hypotheses, in the practical implementation we applied empirically derived thresholds to prune the search. Recognition experiments were conducted on two speech corpora, the TI Connected Digit Corpus and the Road Rally Corpus, to show the effectiveness of the proposed method.

  • Text-to-speech system for Brazilian Portuguese using a reduced set of synthesis units

    Page(s): 579 - 582 vol.2

    An unrestricted text-to-speech synthesis system for Brazilian Portuguese is presented. The system is based on rule-based concatenation of basic speech units. A reduced set of 149 speech units is proposed. This set comprises mostly consonant-oral vowel (CV) transitions, which represent crucial acoustic segments in the speech production process. The authors show that with this inventory of units it is possible to synthesize highly intelligible speech for Brazilian Portuguese. A CELP model is used as the compression and synthesis structure.

  • A comparison of multi-layer neural networks and optimized nearest neighbor classifiers for handwritten digit recognition

    Page(s): 312 - 315 vol.1

    The basic nearest neighbor classifier (NNC) is often inefficient in terms of memory space and computing time if all training samples are used as prototypes. These problems can be solved by reducing the number of prototypes using clustering algorithms and optimizing the prototypes using a special neural network model. The author compares the performance of the multi-layer neural network and an optimized nearest neighbor classifier (ONNC). It is shown that an ONNC can match the recognition performance and memory requirement of an equivalent neural network while needing less training and classification time.
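
    The effect of prototype reduction can be seen in a minimal Python sketch. Here the "clustering" is the crudest possible one, a single mean vector per class, and no neural optimization of the prototypes is performed, so this is only an assumption-laden stand-in for the ONNC described above. The function names and the toy data are hypothetical.

```python
def class_mean_prototypes(samples, labels):
    """Reduce the training set to one prototype (the mean) per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for d, v in enumerate(x):
            acc[d] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def classify(x, prototypes):
    """Return the label of the prototype nearest to x (squared Euclidean)."""
    return min(
        prototypes,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, prototypes[y])),
    )


samples = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]]
labels = ["a", "a", "b", "b"]
prototypes = class_mean_prototypes(samples, labels)
```

    Classification now compares a query against two prototypes instead of four training samples; the saving grows with the size of the training set.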
