Neural Networks, IEEE Transactions on

Popular Articles (November 2014)

Includes the top 50 most frequently downloaded documents for this publication according to the most recent monthly usage statistics.
  • 1. Survey of clustering algorithms

    Page(s): 645 - 678

    Data analysis plays an indispensable role in understanding various phenomena. Cluster analysis, primitive exploration with little or no prior knowledge, consists of research developed across a wide variety of communities. This diversity equips us with many tools, but the profusion of options also causes confusion. We survey clustering algorithms for data sets appearing in statistics, computer science, and machine learning, and illustrate their applications on some benchmark data sets, the traveling salesman problem, and bioinformatics, a new field attracting intensive efforts. Several closely related topics, proximity measures and cluster validation, are also discussed.

  • 2. Face recognition: a convolutional neural-network approach

    Page(s): 98 - 113

    We present a hybrid neural-network approach for human face recognition which compares favourably with other methods. The system combines local image sampling, a self-organizing map (SOM) neural network, and a convolutional neural network. The SOM provides a quantization of the image samples into a topological space where inputs that are nearby in the original space are also nearby in the output space, thereby providing dimensionality reduction and invariance to minor changes in the image sample, and the convolutional neural network provides partial invariance to translation, rotation, scale, and deformation. The convolutional network extracts successively larger features in a hierarchical set of layers. We present results using the Karhunen-Loeve transform in place of the SOM, and a multilayer perceptron (MLP) in place of the convolutional network, for comparison. We use a database of 400 images of 40 individuals which contains quite a high degree of variability in expression, pose, and facial details. We analyze the computational complexity and discuss how new classes could be added to the trained recognizer.

  • 3. Identification and control of dynamical systems using neural networks

    Page(s): 4 - 27

    It is demonstrated that neural networks can be used effectively for the identification and control of nonlinear dynamical systems. The emphasis is on models for both identification and control. Static and dynamic backpropagation methods for the adjustment of parameters are discussed. In the models that are introduced, multilayer and recurrent networks are interconnected in novel configurations, and hence there is a real need to study them in a unified fashion. Simulation results reveal that the identification and adaptive control schemes suggested are practically feasible. Basic concepts and definitions are introduced throughout, and theoretical questions that have to be addressed are also described.

  • 4. A comparison of methods for multiclass support vector machines

    Page(s): 415 - 425

    Support vector machines (SVMs) were originally designed for binary classification. How to effectively extend them for multiclass classification is still an ongoing research issue. Several methods have been proposed in which a multiclass classifier is typically constructed by combining several binary classifiers. Some authors have also proposed methods that consider all classes at once. As it is computationally more expensive to solve multiclass problems, comparisons of these methods using large-scale problems have not been seriously conducted. Especially for methods solving multiclass SVM in one step, a much larger optimization problem is required, so experiments have so far been limited to small data sets. In this paper we give decomposition implementations for two such "all-together" methods. We then compare their performance with three methods based on binary classifications: "one-against-all," "one-against-one," and directed acyclic graph SVM (DAGSVM). Our experiments indicate that the "one-against-one" and DAG methods are more suitable for practical use than the other methods. Results also show that for large problems, methods that consider all data at once generally need fewer support vectors.

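    The binary-combination schemes compared here are easy to try in miniature: scikit-learn's SVC is natively one-against-one, and OneVsRestClassifier turns any binary classifier into one-against-all (DAGSVM and the "all-together" formulations have no off-the-shelf equivalent). A quick sketch on a toy dataset:

      from sklearn.datasets import load_digits
      from sklearn.model_selection import train_test_split
      from sklearn.multiclass import OneVsRestClassifier
      from sklearn.svm import SVC

      X, y = load_digits(return_X_y=True)
      Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

      ovo = SVC(kernel="rbf", gamma="scale").fit(Xtr, ytr)            # one-against-one
      ova = OneVsRestClassifier(SVC(kernel="rbf", gamma="scale")).fit(Xtr, ytr)
      print("one-against-one:", ovo.score(Xte, yte))
      print("one-against-all:", ova.score(Xte, yte))
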
  • 5. An introduction to kernel-based learning algorithms

    Page(s): 181 - 201

    This paper provides an introduction to support vector machines, kernel Fisher discriminant analysis, and kernel principal component analysis as examples of successful kernel-based learning methods. We first give a short background on Vapnik-Chervonenkis theory and kernel feature spaces, and then proceed to kernel-based learning in supervised and unsupervised scenarios, including practical and algorithmic considerations. We illustrate the usefulness of kernel algorithms by discussing applications such as optical character recognition and DNA analysis.

  • 6. Simple model of spiking neurons

    Page(s): 1569 - 1572

    A model is presented that reproduces the spiking and bursting behavior of known types of cortical neurons. The model combines the biological plausibility of Hodgkin-Huxley-type dynamics and the computational efficiency of integrate-and-fire neurons. Using this model, one can simulate tens of thousands of spiking cortical neurons in real time (1 ms resolution) on a desktop PC.

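    For reference, the model in question is the two-equation system v' = 0.04v^2 + 5v + 140 - u + I and u' = a(bv - u), with the after-spike reset v <- c, u <- u + d whenever v reaches +30 mV. A minimal simulation sketch (the regular-spiking parameters are the paper's; dt and the input current I are illustrative):

      import numpy as np

      # Izhikevich (2003) two-variable neuron; "regular spiking" parameter set.
      a, b, c, d = 0.02, 0.2, -65.0, 8.0
      dt, T, I = 0.5, 1000.0, 10.0        # time step (ms), duration (ms), input current

      v, u = c, b * c
      spikes = []
      for step in range(int(T / dt)):
          if v >= 30.0:                   # spike: record the time, then reset
              spikes.append(step * dt)
              v, u = c, u + d
          v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + I)
          u += dt * a * (b * v - u)
      print(f"{len(spikes)} spikes in {T:.0f} ms")
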
  • 7. A Discrete-Time Neural Network for Optimization Problems With Hybrid Constraints

    Page(s): 1184 - 1189

    Recurrent neural networks have become a prominent tool for optimization problems, including linear and nonlinear variational inequalities and programming, due to their regular mathematical properties and well-defined parallel structure. This brief presents a general discrete-time recurrent network for linear variational inequalities and related optimization problems with hybrid constraints. In contrast to existing discrete-time networks, this general model can operate not only on bound constraints but also on hybrid constraints composed of inequality, equality, and bound constraints. The model has global, asymptotic, and exponential convergence properties under comparatively weak conditions. Numerical examples demonstrate its efficacy and performance.

  • 8. Training feedforward networks with the Marquardt algorithm

    Page(s): 989 - 993

    The Marquardt algorithm for nonlinear least squares is presented and is incorporated into the backpropagation algorithm for training feedforward neural networks. The algorithm is tested on several function approximation problems, and is compared with a conjugate gradient algorithm and a variable learning rate algorithm. It is found that the Marquardt algorithm is much more efficient than either of the other techniques when the network contains no more than a few hundred weights.

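    The Marquardt step the abstract refers to interpolates between Gauss-Newton and gradient descent: delta_w = -(J^T J + mu I)^(-1) J^T e, where J is the Jacobian of the residuals e(w); mu shrinks after a successful step and grows otherwise. A self-contained sketch fitting a tiny 1-3-1 tanh network (the numerical Jacobian stands in for the paper's backpropagation-style computation; network size and data are illustrative):

      import numpy as np

      rng = np.random.default_rng(0)
      x = np.linspace(-np.pi, np.pi, 40)
      y = np.sin(x) + 0.05 * rng.standard_normal(40)

      def residuals(w):                   # w packs W1(3), b1(3), W2(3), b2
          W1, b1, W2, b2 = w[:3], w[3:6], w[6:9], w[9]
          return np.tanh(np.outer(x, W1) + b1) @ W2 + b2 - y

      def jacobian(w, eps=1e-6):          # forward differences, (n_data, n_weights)
          e0 = residuals(w)
          return np.column_stack([(residuals(w + eps * np.eye(len(w))[i]) - e0) / eps
                                  for i in range(len(w))])

      w, mu = 0.5 * rng.standard_normal(10), 0.01
      for _ in range(50):
          e, J = residuals(w), jacobian(w)
          step = np.linalg.solve(J.T @ J + mu * np.eye(len(w)), -J.T @ e)
          if np.sum(residuals(w + step) ** 2) < np.sum(e ** 2):
              w, mu = w + step, mu / 10   # accept: behave more like Gauss-Newton
          else:
              mu *= 10                    # reject: behave more like gradient descent
      print("final sum of squared errors:", np.sum(residuals(w) ** 2))
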
  • 9. Face recognition by independent component analysis

    Page(s): 1450 - 1464

    A number of current face recognition algorithms use face representations found by unsupervised statistical methods. Typically these methods find a set of basis images and represent faces as a linear combination of those images. Principal component analysis (PCA) is a popular example of such methods. The basis images found by PCA depend only on pairwise relationships between pixels in the image database. In a task such as face recognition, in which important information may be contained in the high-order relationships among pixels, it seems reasonable to expect that better basis images may be found by methods sensitive to these high-order statistics. Independent component analysis (ICA), a generalization of PCA, is one such method. We used a version of ICA derived from the principle of optimal information transfer through sigmoidal neurons. ICA was performed on face images in the FERET database under two different architectures, one which treated the images as random variables and the pixels as outcomes, and a second which treated the pixels as random variables and the images as outcomes. The first architecture found spatially local basis images for the faces. The second architecture produced a factorial face code. Both ICA representations were superior to representations based on PCA for recognizing faces across days and changes in expression. A classifier that combined the two ICA representations gave the best performance.

  • 10. An overview of statistical learning theory

    Page(s): 988 - 999

    Statistical learning theory was introduced in the late 1960s. Until the 1990s it was a purely theoretical analysis of the problem of function estimation from a given collection of data. In the middle of the 1990s, new types of learning algorithms (called support vector machines) based on the developed theory were proposed. This made statistical learning theory not only a tool for theoretical analysis but also a tool for creating practical algorithms for estimating multidimensional functions. This article presents a very general overview of statistical learning theory, including both theoretical and algorithmic aspects. The goal of this overview is to demonstrate how the abstract learning theory established conditions for generalization which are more general than those discussed in classical statistical paradigms, and how the understanding of these conditions inspired new algorithmic approaches to function estimation problems.

  • 11. Support vector machines for histogram-based image classification

    Page(s): 1055 - 1064

    Traditional classification approaches generalize poorly on image classification tasks because of the high dimensionality of the feature space. This paper shows that support vector machines (SVMs) can generalize well on difficult image classification problems where the only features are high-dimensional histograms. Heavy-tailed RBF kernels of the form K(x, y) = e^(-rho * sum_i |x_i^a - y_i^a|^b) with a <= 1 and b <= 2 are evaluated on the classification of images extracted from the Corel stock photo collection and shown to far outperform traditional polynomial or Gaussian radial basis function (RBF) kernels. Moreover, we observed that a simple remapping of the input x_i -> x_i^a improves the performance of linear SVMs to such an extent that it makes them, for this problem, a valid alternative to RBF kernels.

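    A minimal sketch of this kernel family: taking a < 1 compresses large histogram bins and b < 2 gives heavier tails than a Gaussian (the values of a, b, rho, and the toy histograms below are illustrative):

      import numpy as np

      def heavy_tailed_rbf(X, Y, a=0.5, b=1.0, rho=1.0):
          Xa, Ya = X ** a, Y ** a                    # the x_i -> x_i^a remapping
          d = np.abs(Xa[:, None, :] - Ya[None, :, :]) ** b
          return np.exp(-rho * d.sum(axis=-1))       # Gram matrix, (len(X), len(Y))

      hists = np.random.default_rng(0).dirichlet(np.ones(16), size=5)  # toy histograms
      print(heavy_tailed_rbf(hists, hists).round(3))
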
  • 12. Fast and robust fixed-point algorithms for independent component analysis

    Page(s): 626 - 634

    Independent component analysis (ICA) is a statistical method for transforming an observed multidimensional random vector into components that are statistically as independent from each other as possible. We use a combination of two different approaches for linear ICA: Comon's information theoretic approach and the projection pursuit approach. Using maximum entropy approximations of differential entropy, we introduce a family of new contrast functions for ICA. These contrast functions enable both the estimation of the whole decomposition by minimizing mutual information, and estimation of individual independent components as projection pursuit directions. The statistical properties of the estimators based on such contrast functions are analyzed under the assumption of the linear mixture model, and it is shown how to choose contrast functions that are robust and/or of minimum variance. Finally, we introduce simple fixed-point algorithms for practical optimization of the contrast functions.

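    The fixed-point iteration introduced here is, for a single unit on centered and whitened data z: w <- E{z g(w^T z)} - E{g'(w^T z)} w, followed by normalization of w. A sketch with the tanh contrast on a toy two-sensor mixture (the mixing matrix and source signals are illustrative):

      import numpy as np

      rng = np.random.default_rng(0)
      t = np.linspace(0, 8, 2000)
      S = np.vstack([np.sign(np.sin(3 * t)), np.sin(5 * t)])   # two toy sources
      X = np.array([[1.0, 0.5], [0.4, 1.0]]) @ S               # observed mixtures

      X -= X.mean(axis=1, keepdims=True)                       # center
      d, E = np.linalg.eigh(np.cov(X))
      Z = (E / np.sqrt(d)) @ E.T @ X                           # whiten

      w = rng.standard_normal(2)
      w /= np.linalg.norm(w)
      for _ in range(100):
          wz = w @ Z
          w_new = (Z * np.tanh(wz)).mean(axis=1) - (1 - np.tanh(wz) ** 2).mean() * w
          w_new /= np.linalg.norm(w_new)
          if abs(abs(w_new @ w) - 1.0) < 1e-10:                # converged up to sign
              break
          w = w_new
      print("recovered un-mixing direction:", w_new)
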
  • 13. Using recurrent neural networks for adaptive communication channel equalization

    Page(s): 267 - 278

    Nonlinear adaptive filters based on a variety of neural network models have been used successfully for system identification and noise-cancellation in a wide class of applications. An important problem in data communications is that of channel equalization, i.e., the removal of interferences introduced by linear or nonlinear message corrupting mechanisms, so that the originally transmitted symbols can be recovered correctly at the receiver. In this paper we introduce an adaptive recurrent neural network (RNN) based equalizer whose small size and high performance makes it suitable for high-speed channel equalization. We propose RNN based structures for both trained adaptation and blind equalization, and we evaluate their performance via extensive simulations for a variety of signal modulations and communication channel models. It is shown that the RNN equalizers have comparable performance with traditional linear filter based equalizers when the channel interferences are relatively mild, and that they outperform them by several orders of magnitude when either the channel's transfer function has spectral nulls or severe nonlinear distortion is present. In addition, the small-size RNN equalizers, being essentially generalized IIR filters, are shown to outperform multilayer perceptron equalizers of larger computational complexity in linear and nonlinear channel equalization cases.

  • 14. A general regression neural network

    Page(s): 568 - 576

    A memory-based network that provides estimates of continuous variables and converges to the underlying (linear or nonlinear) regression surface is described. The general regression neural network (GRNN) is a one-pass learning algorithm with a highly parallel structure. It is shown that, even with sparse data in a multidimensional measurement space, the algorithm provides smooth transitions from one observed value to another. The algorithmic form can be used for any regression problem in which an assumption of linearity is not justified.

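    The estimate the GRNN computes is a kernel-weighted average of the stored training targets, y_hat(x) = sum_i y_i K_i(x) / sum_i K_i(x) with Gaussian kernels K_i centered on the stored inputs, which is why learning is one-pass: the network simply memorizes the sample. A minimal sketch (sigma, the single smoothing parameter, is set to an illustrative value):

      import numpy as np

      def grnn_predict(X_train, y_train, X_query, sigma=0.3):
          d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
          w = np.exp(-d2 / (2 * sigma ** 2))      # kernel weights, (n_query, n_train)
          return (w @ y_train) / w.sum(axis=1)    # weighted average of stored targets

      rng = np.random.default_rng(0)
      X = rng.uniform(-3, 3, size=(50, 1))        # sparse training sample
      y = np.sinc(X[:, 0]) + 0.05 * rng.standard_normal(50)
      X_query = np.linspace(-3, 3, 7)[:, None]
      print(grnn_predict(X, y, X_query).round(3))
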
  • 15. Output Feedback Control of a Quadrotor UAV Using Neural Networks

    Page(s): 50 - 66

    In this paper, a new nonlinear controller for a quadrotor unmanned aerial vehicle (UAV) is proposed using neural networks (NNs) and output feedback. The assumption on the availability of UAV dynamics is not always practical, especially in an outdoor environment. Therefore, in this work, an NN is introduced to learn the complete dynamics of the UAV online, including uncertain nonlinear terms like aerodynamic friction and blade flapping. Although a quadrotor UAV is underactuated, a novel NN virtual control input scheme is proposed which allows all six degrees of freedom (DOF) of the UAV to be controlled using only four control inputs. Furthermore, an NN observer is introduced to estimate the translational and angular velocities of the UAV, and an output feedback control law is developed in which only the position and the attitude of the UAV are considered measurable. It is shown using Lyapunov theory that the position, orientation, and velocity tracking errors, the virtual control and observer estimation errors, and the NN weight estimation errors for each NN are all semiglobally uniformly ultimately bounded (SGUUB) in the presence of bounded disturbances and NN functional reconstruction errors while simultaneously relaxing the separation principle. The effectiveness of the proposed output feedback control scheme is then demonstrated in the presence of unknown nonlinear dynamics and disturbances, and simulation results are included to demonstrate the theoretical conjecture.

  • 16. Fuzzy support vector machines

    Page(s): 464 - 471

    A support vector machine (SVM) learns the decision surface from two distinct classes of the input points. In many applications, each input point may not be fully assigned to one of these two classes. In this paper, we apply a fuzzy membership to each input point and reformulate the SVMs such that different input points can make different contributions to the learning of decision surface. We call the proposed method fuzzy SVMs (FSVMs).

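    The reformulation weights each training point's slack penalty by its fuzzy membership, so low-membership points influence the decision surface less. scikit-learn's SVC exposes the same mechanism through sample_weight, which lets the idea be sketched without implementing the modified QP (the membership heuristic below, proximity to the class mean, is an illustrative choice, not the paper's):

      import numpy as np
      from sklearn.svm import SVC

      rng = np.random.default_rng(0)
      X = np.vstack([rng.normal(-1.5, 1.0, (50, 2)), rng.normal(+1.5, 1.0, (50, 2))])
      y = np.array([0] * 50 + [1] * 50)

      centers = np.array([X[y == c].mean(axis=0) for c in (0, 1)])
      membership = np.exp(-np.linalg.norm(X - centers[y], axis=1))  # closer => higher

      clf = SVC(kernel="linear").fit(X, y, sample_weight=membership)
      print("training accuracy:", clf.score(X, y))
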
  • 17. A New Formulation for Feedforward Neural Networks

    Page(s): 1588 - 1598

    The feedforward neural network is one of the most commonly used function approximation techniques and has been applied to a wide variety of problems arising from various disciplines. However, neural networks are black-box models with multiple difficulties associated with training and generalization. This paper first looks into the internal behavior of neural networks and develops a detailed interpretation of the neural network functional geometry. Based on this geometrical interpretation, a new set of variables describing neural networks is proposed as a more effective and geometrically interpretable alternative to the traditional set of network weights and biases. The paper then develops a new formulation for neural networks with respect to the newly defined variables; this reformulated neural network (ReNN) is equivalent to the common feedforward neural network but has a less complex error response surface. To demonstrate the learning ability of ReNN, two training methods are employed: a derivative-based algorithm (a variation of backpropagation) and a derivative-free optimization algorithm. Moreover, a new measure of regularization based on the developed geometrical interpretation is proposed to evaluate and improve the generalization ability of neural networks. The value of the proposed geometrical interpretation, the ReNN approach, and the new regularization measure is demonstrated across multiple test problems. Results show that ReNN can be trained more effectively and efficiently than common neural networks, and that the proposed regularization measure is an effective indicator of how a network will perform in terms of generalization.

  • 18. Which model to use for cortical spiking neurons?

    Page(s): 1063 - 1070

    We discuss the biological plausibility and computational efficiency of some of the most useful models of spiking and bursting neurons. We compare their applicability to large-scale simulations of cortical neural networks.

  • 19. Wavelet networks

    Page(s): 889 - 898

    A wavelet network concept, which is based on wavelet transform theory, is proposed as an alternative to feedforward neural networks for approximating arbitrary nonlinear functions. The basic idea is to replace the neurons by `wavelons', i.e., computing units obtained by cascading an affine transform and a multidimensional wavelet. Then these affine transforms and the synaptic weights must be identified from possibly noise corrupted input/output data. An algorithm of backpropagation type is proposed for wavelet network training, and experimental results are reported.

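    Concretely, each wavelon applies an affine transform (a dilation d_i and a translation t_i) before a mother wavelet psi, and the network output is a weighted sum of the wavelon responses. A one-dimensional forward-pass sketch with the Mexican-hat wavelet (the parameter values are illustrative; training would adjust w, d, and t with the backpropagation-type algorithm):

      import numpy as np

      def psi(u):                                   # Mexican-hat mother wavelet
          return (1 - u ** 2) * np.exp(-u ** 2 / 2)

      def wavelet_net(x, w, d, t):
          # x: (n,) inputs; w, d, t: weights, dilations, translations of m wavelons
          return psi(np.subtract.outer(x, t) * d) @ w

      rng = np.random.default_rng(0)
      x = np.linspace(-4, 4, 9)
      w, d, t = rng.standard_normal(3), np.ones(3), np.array([-2.0, 0.0, 2.0])
      print(wavelet_net(x, w, d, t).round(3))
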
  • 20. Input space versus feature space in kernel-based methods

    Page(s): 1000 - 1017

    This paper collects some ideas targeted at advancing our understanding of the feature spaces associated with support vector (SV) kernel functions. We first discuss the geometry of feature space. In particular, we review what is known about the shape of the image of input space under the feature space map, and how this influences the capacity of SV methods. Following this, we describe how the metric governing the intrinsic geometry of the mapped surface can be computed in terms of the kernel, using the example of the class of inhomogeneous polynomial kernels, which are often used in SV pattern recognition. We then discuss the connection between feature space and input space by dealing with the question of how one can, given some vector in feature space, find a preimage (exact or approximate) in input space. We describe algorithms to tackle this issue, and show their utility in two applications of kernel methods. First, we use them to reduce the computational complexity of SV decision functions; second, we combine them with the kernel PCA algorithm, thereby constructing a nonlinear statistical denoising technique which is shown to perform well on real-world data.

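    The second application, denoising with kernel PCA plus preimages, can be sketched with scikit-learn: KernelPCA(fit_inverse_transform=True) learns an approximate preimage map (via ridge regression, which is not the paper's algorithm), so projecting onto a few components and mapping back yields denoised points:

      import numpy as np
      from sklearn.datasets import make_circles
      from sklearn.decomposition import KernelPCA

      X, _ = make_circles(n_samples=400, factor=0.3, noise=0.08, random_state=0)
      kpca = KernelPCA(n_components=4, kernel="rbf", gamma=2.0,
                       fit_inverse_transform=True, alpha=0.1)
      Z = kpca.fit_transform(X)               # truncated feature-space coordinates
      X_denoised = kpca.inverse_transform(Z)  # approximate preimages in input space
      print("mean displacement:", np.abs(X - X_denoised).mean())
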
  • 21. Artificial neural networks for solving ordinary and partial differential equations

    Page(s): 987 - 1000

    We present a method to solve initial and boundary value problems using artificial neural networks. A trial solution of the differential equation is written as a sum of two parts. The first part satisfies the initial/boundary conditions and contains no adjustable parameters. The second part is constructed so as not to affect the initial/boundary conditions; it involves a feedforward neural network containing adjustable parameters (the weights). Hence, by construction, the initial/boundary conditions are satisfied and the network is trained to satisfy the differential equation. The applicability of this approach ranges from single ordinary differential equations (ODEs) to systems of coupled ODEs and to partial differential equations (PDEs). In this article, we illustrate the method by solving a variety of model problems and present comparisons with solutions obtained using the Galerkin finite element method for several cases of partial differential equations. With the advent of neuroprocessors and digital signal processors, the method becomes particularly interesting due to the expected gains in execution speed.

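    As a concrete instance of the construction: for y' = f(x, y) with y(0) = A, the trial solution y_hat(x) = A + x N(x) satisfies the initial condition for any network N, and training minimizes the squared ODE residual over collocation points. A hedged sketch for y' = -y, y(0) = 1 (the network size, optimizer, and finite-difference derivative are illustrative choices, not the paper's setup):

      import numpy as np
      from scipy.optimize import minimize

      X = np.linspace(0.0, 2.0, 21)               # collocation points
      H = 8                                       # hidden units (illustrative)

      def net(x, w):                              # tiny one-hidden-layer tanh network
          W1, b1, W2 = w[:H], w[H:2 * H], w[2 * H:]
          return np.tanh(np.outer(x, W1) + b1) @ W2

      def trial(x, w):                            # satisfies y(0) = 1 by construction
          return 1.0 + x * net(x, w)

      def residual(w, eps=1e-5):                  # squared residual of y' + y = 0
          dy = (trial(X + eps, w) - trial(X - eps, w)) / (2 * eps)
          return np.sum((dy + trial(X, w)) ** 2)

      w0 = 0.1 * np.random.default_rng(0).standard_normal(3 * H)
      w = minimize(residual, w0, method="BFGS").x
      print("max error vs exp(-x):", np.max(np.abs(trial(X, w) - np.exp(-X))))
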
  • 22. Clustering of the self-organizing map

    Page(s): 586 - 600

    The self-organizing map (SOM) is an excellent tool in the exploratory phase of data mining. It projects the input space onto prototypes of a low-dimensional regular grid that can be effectively utilized to visualize and explore properties of the data. When the number of SOM units is large, similar units need to be grouped, i.e., clustered, to facilitate quantitative analysis of the map and the data. In this paper, different approaches to clustering of the SOM are considered. In particular, the use of hierarchical agglomerative clustering and partitive clustering using K-means are investigated. The two-stage procedure, in which the SOM first produces the prototypes that are then clustered in a second stage, is found to perform well when compared with direct clustering of the data, and to reduce the computation time.

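    A sketch of the two-stage procedure, assuming the third-party minisom package (pip install minisom) alongside scikit-learn; the grid size, iteration count, and toy data are illustrative:

      import numpy as np
      from minisom import MiniSom
      from sklearn.cluster import KMeans
      from sklearn.datasets import make_blobs

      X, _ = make_blobs(n_samples=1000, centers=4, random_state=0)

      # Stage 1: train a SOM so its 8x8 prototype grid summarizes the data.
      som = MiniSom(8, 8, X.shape[1], sigma=1.5, learning_rate=0.5, random_seed=0)
      som.train_random(X, 5000)
      protos = som.get_weights().reshape(-1, X.shape[1])

      # Stage 2: cluster the 64 prototypes instead of the 1000 samples.
      km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(protos)

      # Each sample inherits the cluster of its best-matching prototype.
      bmu = ((X[:, None, :] - protos[None, :, :]) ** 2).sum(-1).argmin(axis=1)
      labels = km.labels_[bmu]
      print(np.bincount(labels))
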
  • 23. Robust Adaptive Neural Network Control for a Class of Uncertain MIMO Nonlinear Systems With Input Nonlinearities

    Page(s): 796 - 812

    In this paper, robust adaptive neural network (NN) control is investigated for a general class of uncertain multiple-input-multiple-output (MIMO) nonlinear systems with unknown control coefficient matrices and input nonlinearities. For nonsymmetric input nonlinearities of saturation and deadzone, variable structure control (VSC) in combination with backstepping and Lyapunov synthesis is proposed for adaptive NN control design with guaranteed stability. In the proposed adaptive NN control, the usual nonsingularity assumption on the NN approximation of the unknown control coefficient matrices and the boundedness assumption between the NN approximation error and the control input have been eliminated. Command filters are presented to implement physical constraints on the virtual control laws; the tedious analytic computation of the time derivatives of the virtual control laws is thereby avoided. It is proved that the proposed robust backstepping control guarantees semiglobal uniform ultimate boundedness of all signals in the closed-loop system. Finally, simulation results are presented to illustrate the effectiveness of the proposed adaptive NN control.

  • 24. A Hierarchical Graph Neuron Scheme for Real-Time Pattern Recognition

    Page(s): 212 - 229

    The hierarchical graph neuron (HGN) implements single-cycle memorization and recall through a novel algorithmic design. The HGN improves on the previously published graph neuron (GN) algorithm: it recognizes incomplete/noisy patterns, and it resolves the crosstalk problem, identified in previous publications, within closely matched patterns. To accomplish this, the HGN links multiple GN networks to filter noise and crosstalk out of pattern data inputs. Intrinsically, the HGN is a lightweight in-network processing algorithm which does not require expensive floating-point computations; hence, it is well suited to real-time applications and tiny devices such as wireless sensor networks. This paper shows that the HGN's pattern-matching capability and small response time remain insensitive to increases in the number of stored patterns. Moreover, the HGN requires neither rules nor operator-set thresholds to achieve the desired results, nor heuristics entailing iterative operations for memorization and recall of patterns.

  • 25. A Fast and Accurate Online Sequential Learning Algorithm for Feedforward Networks

    Page(s): 1411 - 1423

    In this paper, we develop an online sequential learning algorithm for single hidden layer feedforward networks (SLFNs) with additive or radial basis function (RBF) hidden nodes in a unified framework. The algorithm is referred to as the online sequential extreme learning machine (OS-ELM) and can learn data one-by-one or chunk-by-chunk (a block of data) with fixed or varying chunk size. The activation functions for additive nodes in OS-ELM can be any bounded nonconstant piecewise continuous functions, and the activation functions for RBF nodes can be any integrable piecewise continuous functions. In OS-ELM, the parameters of the hidden nodes (the input weights and biases of additive nodes or the centers and impact factors of RBF nodes) are randomly selected and the output weights are analytically determined based on the sequentially arriving data. The algorithm builds on the ELM of Huang et al., developed for batch learning, which has been shown to be extremely fast with generalization performance better than other batch training methods. Apart from selecting the number of hidden nodes, no other control parameters have to be manually chosen. A detailed performance comparison of OS-ELM with other popular sequential learning algorithms is carried out on benchmark problems drawn from the regression, classification, and time series prediction areas. The results show that OS-ELM is faster than the other sequential algorithms and produces better generalization performance.

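    The chunk-by-chunk learning is recursive least squares on random hidden-layer features: with H the hidden-layer output for a new chunk, P <- P - P H^T (I + H P H^T)^(-1) H P and beta <- beta + P H^T (Y - H beta). A toy sketch with additive sigmoid nodes (the small ridge term in the initialization is an added numerical-stability assumption, and all sizes are illustrative):

      import numpy as np

      rng = np.random.default_rng(0)
      L = 20                                          # hidden nodes
      W, b = rng.standard_normal((1, L)), rng.standard_normal(L)

      def hidden(X):                                  # random sigmoid feature map
          return 1.0 / (1.0 + np.exp(-(X @ W + b)))

      X = rng.uniform(-3, 3, size=(500, 1))
      Y = np.sin(X) + 0.05 * rng.standard_normal((500, 1))

      H0, Y0 = hidden(X[:50]), Y[:50]                 # initialization chunk
      P = np.linalg.inv(H0.T @ H0 + 1e-6 * np.eye(L))
      beta = P @ H0.T @ Y0

      for i in range(50, 500, 50):                    # sequential chunks of 50
          H, Yc = hidden(X[i:i + 50]), Y[i:i + 50]
          G = np.linalg.inv(np.eye(len(H)) + H @ P @ H.T)
          P = P - P @ H.T @ G @ H @ P
          beta = beta + P @ H.T @ (Yc - H @ beta)

      print("RMSE:", float(np.sqrt(np.mean((hidden(X) @ beta - Y) ** 2))))
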
  • 26. Orthogonal least squares learning algorithm for radial basis function networks

    Page(s): 302 - 309

    The radial basis function network offers a viable alternative to the two-layer neural network in many applications of signal processing. A common learning algorithm for radial basis function networks is based on first choosing randomly some data points as radial basis function centers and then using singular-value decomposition to solve for the weights of the network. Such a procedure has several drawbacks, and, in particular, an arbitrary selection of centers is clearly unsatisfactory. The authors propose an alternative learning procedure based on the orthogonal least-squares method. The procedure chooses radial basis function centers one by one in a rational way until an adequate network has been constructed. In the algorithm, each selected center maximizes the increment to the explained variance or energy of the desired output and does not suffer numerical ill-conditioning problems. The orthogonal least-squares learning strategy provides a simple and efficient means for fitting radial basis function networks. This is illustrated using examples taken from two different signal processing applications.

  • 27. Using mutual information for selecting features in supervised neural net learning

    Page(s): 537 - 550

    This paper investigates the application of the mutual information criterion to evaluate a set of candidate features and to select an informative subset to be used as input data for a neural network classifier. Because the mutual information measures arbitrary dependencies between random variables, it is suitable for assessing the "information content" of features in complex classification tasks, where methods based on linear relations (like correlation) are prone to mistakes. The fact that the mutual information is independent of the coordinates chosen permits a robust estimation. Nonetheless, the use of the mutual information for tasks characterized by high input dimensionality requires suitable approximations because of the prohibitive demands on computation and samples. An algorithm is proposed that is based on a "greedy" selection of the features and that takes into account both the mutual information with respect to the output class and with respect to the already-selected features. Finally, the results of a series of experiments are discussed.

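    The greedy scheme described here scores each candidate feature by relevance minus redundancy, I(f; C) - beta * sum over selected s of I(f; s), and picks the best candidate at each step (Battiti's MIFS criterion). A sketch using scikit-learn's mutual information estimators (beta, k, and the dataset are illustrative):

      import numpy as np
      from sklearn.datasets import load_wine
      from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

      X, y = load_wine(return_X_y=True)
      relevance = mutual_info_classif(X, y, random_state=0)   # I(f; C) per feature

      beta, k = 0.5, 5
      selected, remaining = [], list(range(X.shape[1]))
      while len(selected) < k:
          def score(f):
              redundancy = sum(mutual_info_regression(X[:, [f]], X[:, s],
                                                      random_state=0)[0]
                               for s in selected)
              return relevance[f] - beta * redundancy
          best = max(remaining, key=score)
          selected.append(best)
          remaining.remove(best)
      print("selected feature indices:", selected)
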
  • 28. Color clustering and learning for image segmentation based on neural networks

    Page(s): 925 - 936

    An image segmentation system is proposed for the segmentation of color images based on neural networks. In order to measure color differences properly, image colors are represented in a modified L*u*v* color space. The segmentation system comprises unsupervised segmentation and supervised segmentation. The unsupervised segmentation is achieved by a two-level approach, i.e., color reduction and color clustering. In color reduction, image colors are projected into a small set of prototypes using self-organizing map (SOM) learning. In color clustering, simulated annealing (SA) seeks the optimal clusters from the SOM prototypes. This two-level approach takes advantage of SOM and SA to achieve near-optimal segmentation with a low computational cost. The supervised segmentation involves color learning and pixel classification. In color learning, a color prototype is defined to represent a spherical region in color space. A procedure of hierarchical prototype learning (HPL) is used to generate color prototypes of different sizes from the sample of object colors. These color prototypes provide a good estimate for object colors. The image pixels are classified by matching against the color prototypes. The experimental results show that the system has the desired ability for segmenting color images in a variety of vision tasks.

  • 29. OP-ELM: Optimally Pruned Extreme Learning Machine

    Page(s): 158 - 162

    In this brief, the optimally pruned extreme learning machine (OP-ELM) methodology is presented. It is based on the original extreme learning machine (ELM) algorithm with additional steps to make it more robust and generic. The whole methodology is presented in detail and then applied to several regression and classification problems. Results for both computational time and accuracy (mean square error) are compared to the original ELM and to three other widely used methodologies: multilayer perceptron (MLP), support vector machine (SVM), and Gaussian process (GP). As the experiments for both regression and classification illustrate, the proposed OP-ELM methodology performs several orders of magnitude faster than the other algorithms used in this brief, except the original ELM. Despite the simplicity and fast performance, the OP-ELM is still able to maintain an accuracy that is comparable to the performance of the SVM. A toolbox for the OP-ELM is publicly available online.

  • 30. Deep Learning Regularized Fisher Mappings

    Page(s): 1668 - 1675

    For classification tasks, it is always desirable to extract features that are most effective for preserving class separability. In this brief, we propose a new feature extraction method called regularized deep Fisher mapping (RDFM), which learns an explicit mapping from the sample space to the feature space using a deep neural network to enhance the separability of features according to the Fisher criterion. Compared to kernel methods, the deep neural network is a deep and nonlocal learning architecture, and therefore exhibits more powerful ability to learn the nature of highly variable datasets from fewer samples. To eliminate the side effects of overfitting brought about by the large capacity of powerful learners, regularizers are applied in the learning procedure of RDFM. RDFM is evaluated on various types of datasets, and the results reveal that it is necessary to apply unsupervised regularization in the fine-tuning phase of deep learning. Thus, for very flexible models, the optimal Fisher feature extractor may be a balance between discriminative ability and descriptive ability.

  • 31. Efficient and robust feature extraction by maximum margin criterion

    Page(s): 157 - 165

    In pattern recognition, feature extraction techniques are widely employed to reduce the dimensionality of data and to enhance the discriminatory information. Principal component analysis (PCA) and linear discriminant analysis (LDA) are the two most popular linear dimensionality reduction methods. However, PCA is not very effective for the extraction of the most discriminant features, and LDA is not stable due to the small sample size problem. In this paper, we propose some new (linear and nonlinear) feature extractors based on maximum margin criterion (MMC). Geometrically, feature extractors based on MMC maximize the (average) margin between classes after dimensionality reduction. It is shown that MMC can represent class separability better than PCA. As a connection to LDA, we may also derive LDA from MMC by incorporating some constraints. By using some other constraints, we establish a new linear feature extractor that does not suffer from the small sample size problem, which is known to cause serious stability problems for LDA. The kernelized (nonlinear) counterpart of this linear feature extractor is also established in the paper. Our extensive experiments demonstrate that the new feature extractors are effective, stable, and efficient.

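    In the linear case the MMC extractor is direct to compute: project onto the leading eigenvectors of S_b - S_w, the between-class minus within-class scatter. No inversion of S_w is involved, which is why the small-sample-size problem that destabilizes LDA does not arise. A minimal sketch:

      import numpy as np
      from sklearn.datasets import load_iris

      X, y = load_iris(return_X_y=True)
      mean = X.mean(axis=0)
      Sb = np.zeros((X.shape[1], X.shape[1]))         # between-class scatter
      Sw = np.zeros_like(Sb)                          # within-class scatter
      for c in np.unique(y):
          Xc = X[y == c]
          mc = Xc.mean(axis=0)
          Sb += len(Xc) * np.outer(mc - mean, mc - mean)
          Sw += (Xc - mc).T @ (Xc - mc)

      vals, vecs = np.linalg.eigh(Sb - Sw)            # symmetric eigenproblem
      W = vecs[:, np.argsort(vals)[::-1][:2]]         # top-2 MMC directions
      print((X @ W)[:3].round(3))                     # 2-D MMC features
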
  • 32. Domain Adaptation via Transfer Component Analysis

    Page(s): 199 - 210

    Domain adaptation allows knowledge from a source domain to be transferred to a different but related target domain. Intuitively, discovering a good feature representation across domains is crucial. In this paper, we first propose to find such a representation through a new learning method, transfer component analysis (TCA), for domain adaptation. TCA tries to learn some transfer components across domains in a reproducing kernel Hilbert space using maximum mean discrepancy. In the subspace spanned by these transfer components, data properties are preserved and data distributions in different domains are close to each other. As a result, with the new representations in this subspace, we can apply standard machine learning methods to train classifiers or regression models in the source domain for use in the target domain. Furthermore, in order to uncover the knowledge hidden in the relations between the data labels from the source and target domains, we extend TCA in a semisupervised learning setting, which encodes label information into transfer components learning. We call this extension semisupervised TCA. The main contribution of our work is that we propose a novel dimensionality reduction framework for reducing the distance between domains in a latent space for domain adaptation. We propose both unsupervised and semisupervised feature extraction approaches, which can dramatically reduce the distance between domain distributions by projecting data onto the learned transfer components. Finally, our approach can handle large datasets and naturally lead to out-of-sample generalization. The effectiveness and efficiency of our approach are verified by experiments on five toy datasets and two real-world applications: cross-domain indoor WiFi localization and cross-domain text classification.

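    A hedged sketch of the unsupervised variant: with kernel matrix K over the pooled source and target samples, MMD coefficient matrix L, and centering matrix H, the transfer components are the leading eigenvectors of (K L K + mu I)^(-1) K H K. The kernel, mu, and the toy domains below are illustrative:

      import numpy as np
      from sklearn.metrics.pairwise import rbf_kernel

      rng = np.random.default_rng(0)
      Xs = rng.normal(0.0, 1.0, (60, 3))              # source domain
      Xt = rng.normal(0.5, 1.2, (40, 3))              # shifted target domain
      X = np.vstack([Xs, Xt])
      ns, nt = len(Xs), len(Xt)
      n = ns + nt

      K = rbf_kernel(X, gamma=0.5)
      L = np.zeros((n, n))                            # MMD coefficient matrix
      L[:ns, :ns] = 1.0 / ns ** 2
      L[ns:, ns:] = 1.0 / nt ** 2
      L[:ns, ns:] = L[ns:, :ns] = -1.0 / (ns * nt)
      H = np.eye(n) - np.ones((n, n)) / n             # centering matrix

      mu = 1.0
      M = np.linalg.solve(K @ L @ K + mu * np.eye(n), K @ H @ K)
      vals, vecs = np.linalg.eig(M)
      W = np.real(vecs[:, np.argsort(-np.real(vals))[:2]])
      Z = K @ W                                       # shared 2-D representation
      print(Z[:3].round(3))
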
  • 33. Tuning of the structure and parameters of a neural network using an improved genetic algorithm

    Page(s): 79 - 88

    This paper presents the tuning of the structure and parameters of a neural network using an improved genetic algorithm (GA). It is also shown that the improved GA performs better than the standard GA on some benchmark test functions. A neural network with switches introduced into its links is proposed. In this way, the proposed neural network can learn both the input-output relationships of an application and the network structure using the improved GA. The number of hidden nodes is chosen manually, increasing it from a small number until the learning performance in terms of fitness value is good enough. Application examples on sunspot forecasting and associative memory are given to show the merits of the improved GA and the proposed neural network.

  • 34. Deterministic Learning and Rapid Dynamical Pattern Recognition

    Page(s): 617 - 630

    Recognition of temporal/dynamical patterns is among the most difficult pattern recognition tasks. In this paper, based on a recent result on deterministic learning theory, a deterministic framework is proposed for rapid recognition of dynamical patterns. First, it is shown that a time-varying dynamical pattern can be effectively represented in a time-invariant and spatially distributed manner through deterministic learning. Second, a definition for characterizing similarity of dynamical patterns is given based on system dynamics inherently within dynamical patterns. Third, a mechanism for rapid recognition of dynamical patterns is presented, by which a test dynamical pattern is recognized as similar to a training dynamical pattern if state synchronization is achieved according to a kind of internal and dynamical matching on system dynamics. The synchronization errors can be taken as the measure of similarity between the test and training patterns. The significance of the paper is that a completely dynamical approach is proposed, in which the problem of dynamical pattern recognition is turned into the stability and convergence of a recognition error system. Simulation studies are included to demonstrate the effectiveness of the proposed approach.

  • 35. Input feature selection for classification problems

    Page(s): 143 - 159

    Feature selection plays an important role in classifying systems such as neural networks (NNs). The attributes of a dataset may be relevant, irrelevant, or redundant, and since a dataset can be huge, reducing the number of attributes by selecting only the relevant ones is desirable. In doing so, higher performance with lower computational effort is expected. In this paper, we propose two feature selection algorithms. The limitation of the mutual information feature selector (MIFS) is analyzed and a method to overcome this limitation is studied. One of the proposed algorithms makes more considered use of the mutual information between input attributes and output classes than the MIFS. It is demonstrated that the proposed method can provide the performance of the ideal greedy selection algorithm when information is distributed uniformly. The computational load for this algorithm is nearly the same as that of the MIFS. In addition, another feature selection algorithm using the Taguchi method is proposed, addressing the question of how to identify good features with as few experiments as possible. The proposed algorithms are applied to several classification problems and compared with the MIFS. The two algorithms can be combined to complement each other's limitations. The combined algorithm performed well in several experiments and should prove to be a useful method in selecting features for classification problems.

  • 36. Bankruptcy prediction for credit risk using neural networks: A survey and new results

    Page(s): 929 - 935

    The prediction of corporate bankruptcies is an important and widely studied topic since it can have significant impact on bank lending decisions and profitability. This work presents two contributions. First we review the topic of bankruptcy prediction, with emphasis on neural-network (NN) models. Second, we develop an NN bankruptcy prediction model. Inspired by one of the traditional credit risk models developed by Merton (1974), we propose novel indicators for the NN system. We show that the use of these indicators in addition to traditional financial ratio indicators provides a significant improvement in the (out-of-sample) prediction accuracy (from 81.46% to 85.5% for a three-year-ahead forecast).

  • 37. Multilayer neural-net robot controller with guaranteed tracking performance

    Page(s): 388 - 399

    A multilayer neural-net (NN) controller for a general serial-link rigid robot arm is developed. The structure of the NN controller is derived using a filtered error/passivity approach. No off-line learning phase is needed for the proposed NN controller and the weights are easily initialized. The nonlinear nature of the NN, plus NN functional reconstruction inaccuracies and robot disturbances, mean that the standard delta rule using backpropagation tuning does not suffice for closed-loop dynamic control. Novel online weight tuning algorithms, including correction terms to the delta rule plus an added robust signal, guarantee bounded tracking errors as well as bounded NN weights. Specific bounds are determined, and the tracking error bound can be made arbitrarily small by increasing a certain feedback gain. The correction terms involve a second-order forward-propagated wave in the backpropagation network. New NN properties including the notions of a passive NN, a dissipative NN, and a robust NN are introduced.

  • 39. Recurrent neural networks and robust time series prediction

    Page(s): 240 - 254

    We propose a robust learning algorithm and apply it to recurrent neural networks. This algorithm is based on filtering outliers from the data and then estimating parameters from the filtered data. The filtering removes outliers from both the target function and the inputs of the neural network. The filtering is soft in that some outliers are neither completely rejected nor accepted. To show the need for robust recurrent networks, we compare the predictive ability of least-squares-estimated recurrent networks on synthetic data and on the Puget Power Electric Demand time series. These investigations result in a class of recurrent neural networks, NARMA(p,q), which show advantages over feedforward neural networks for time series with a moving average component. Conventional least squares methods of fitting NARMA(p,q) neural network models are shown to suffer from a lack of robustness towards outliers. This sensitivity to outliers is demonstrated on both the synthetic and real data sets. Filtering the Puget Power Electric Demand time series is shown to automatically remove the outliers due to holidays. Neural networks trained on filtered data are then shown to give better predictions than neural networks trained on unfiltered time series.

  • 40. Adaptive neural control of uncertain MIMO nonlinear systems

    Page(s): 674 - 692

    In this paper, adaptive neural control schemes are proposed for two classes of uncertain multi-input/multi-output (MIMO) nonlinear systems in block-triangular forms. The MIMO systems consist of interconnected subsystems, with couplings in the forms of unknown nonlinearities and/or parametric uncertainties in the input matrices, as well as in the system interconnections without any bounding restrictions. Using the block-triangular structure properties, the stability analyses of the closed-loop MIMO systems are shown in a nested iterative manner for all the states. By exploiting the special properties of the affine terms of the two classes of MIMO systems, the developed neural control schemes avoid the controller singularity problem completely without using projection algorithms. Semiglobal uniform ultimate boundedness (SGUUB) of all the signals in the closed-loop of MIMO nonlinear systems is achieved. The outputs of the systems are proven to converge to a small neighborhood of the desired trajectories. The control performance of the closed-loop system is guaranteed by suitably choosing the design parameters. The proposed schemes offer systematic design procedures for the control of the two classes of uncertain MIMO nonlinear systems. Simulation results are presented to show the effectiveness of the approach.

  • 41. MPCA: Multilinear Principal Component Analysis of Tensor Objects

    Page(s): 18 - 39

    This paper introduces a multilinear principal component analysis (MPCA) framework for tensor object feature extraction. Objects of interest in many computer vision and pattern recognition applications, such as 2D/3D images and video sequences, are naturally described as tensors or multilinear arrays. The proposed framework performs feature extraction by determining a multilinear projection that captures most of the original tensorial input variation. The solution is iterative in nature and it proceeds by decomposing the original problem into a series of multiple projection subproblems. As part of this work, methods for subspace dimensionality determination are proposed and analyzed. It is shown that the MPCA framework discussed in this work supplants existing heterogeneous solutions such as the classical principal component analysis (PCA) and its 2D variant (2D PCA). Finally, a tensor object recognition system is proposed with the introduction of a discriminative tensor feature selection mechanism and a novel classification strategy, and applied to the problem of gait recognition. Results presented here indicate MPCA's utility as a feature extraction tool. It is shown that even without a fully optimized design, an MPCA-based gait recognition module achieves highly competitive performance and compares favorably to the state-of-the-art gait recognizers.

  • 42. BELM: Bayesian Extreme Learning Machine

    Page(s): 505 - 509

    The theory of the extreme learning machine (ELM) has become very popular over the last few years. ELM is a new approach for learning the parameters of the hidden layers of a multilayer neural network (such as the multilayer perceptron or the radial basis function neural network). Its main advantage is the lower computational cost, which is especially relevant when dealing with many patterns defined in a high-dimensional space. This brief proposes a Bayesian approach to the ELM, which presents some advantages over other approaches: it allows the introduction of a priori knowledge; it obtains the confidence intervals (CIs) without the need for computationally intensive methods such as the bootstrap; and it presents high generalization capabilities. The Bayesian ELM is benchmarked against the classical ELM on several artificial and real datasets that are widely used for the evaluation of machine learning algorithms. The achieved results show that the proposed approach produces competitive accuracy with some additional advantages, namely, automatic production of CIs, reduced probability of model overfitting, and use of a priori knowledge.

  • 43. Control of a nonholonomic mobile robot using neural networks

    Page(s): 589 - 600

    A control structure that makes possible the integration of a kinematic controller and a neural network (NN) computed-torque controller for nonholonomic mobile robots is presented. A combined kinematic/torque control law is developed using backstepping, and stability is guaranteed by Lyapunov theory. This control algorithm can be applied to the three basic nonholonomic navigation problems: tracking a reference trajectory, path following, and stabilization about a desired posture. Moreover, the NN controller proposed in this work can deal with unmodeled bounded disturbances and/or unstructured unmodeled dynamics in the vehicle. Online NN weight tuning algorithms that do not require off-line learning yet guarantee small tracking errors and bounded control signals are utilized.
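
    As a concrete illustration of the kinematic half of such a design, the sketch below runs a Kanayama-style trajectory-tracking law for a unicycle robot; the NN torque loop is omitted (perfect velocity tracking is assumed), and the gains and reference trajectory are illustrative.

        import numpy as np

        k1, k2, k3 = 2.0, 8.0, 3.0                 # kinematic gains (assumed)
        dt, q = 1e-2, np.array([0.5, -0.5, 0.0])   # robot pose (x, y, theta)
        for i in range(2000):
            t = i * dt
            # Reference: unit circle traversed at unit speed.
            xr, yr, thr = np.cos(t), np.sin(t), t + np.pi / 2
            vr, wr = 1.0, 1.0
            # Tracking error expressed in the robot frame.
            c, s = np.cos(q[2]), np.sin(q[2])
            ex = c * (xr - q[0]) + s * (yr - q[1])
            ey = -s * (xr - q[0]) + c * (yr - q[1])
            eth = thr - q[2]
            # Kanayama-type kinematic control law (velocity commands).
            v = vr * np.cos(eth) + k1 * ex
            w = wr + vr * (k2 * ey + k3 * np.sin(eth))
            # Unicycle kinematics, Euler-integrated.
            q += dt * np.array([v * c, v * s, w])
        print("final position error:",
              np.hypot(np.cos(20.0) - q[0], np.sin(20.0) - q[1]))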

  • 44. Neural network-based adaptive dynamic surface control for a class of uncertain nonlinear systems in strict-feedback form

    Page(s): 195 - 202

    The dynamic surface control (DSC) technique was developed recently by Swaroop et al. This technique simplified the backstepping design for the control of nonlinear systems in strict-feedback form by overcoming the problem of "explosion of complexity." It was later extended to adaptive backstepping design for nonlinear systems with linearly parameterized uncertainty. In this paper, by incorporating this design technique into a neural-network-based adaptive control design framework, we have developed a backstepping-based control design for a class of nonlinear systems in strict-feedback form with arbitrary uncertainty. Our development eliminates the problem of "explosion of complexity" inherent in the existing method. In addition, a stability analysis is given which shows that our control law can guarantee the uniform ultimate boundedness of the solution of the closed-loop system and make the tracking error arbitrarily small.
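
    The trick at the heart of DSC is worth seeing in isolation: rather than differentiating each virtual control analytically (the source of the "explosion of complexity"), pass it through a first-order filter and reuse the filter's own derivative. A minimal sketch with an assumed virtual-control signal:

        import numpy as np

        tau, dt = 0.05, 1e-3                     # filter time constant, step size
        alpha = lambda t: np.sin(3 * t) + 0.5 * np.sin(7 * t)  # some virtual control

        z = alpha(0.0)                           # filter state
        for i in range(10000):
            t = i * dt
            z_dot = (alpha(t) - z) / tau         # tau * z' + z = alpha
            z += dt * z_dot                      # z tracks alpha; z_dot is its derivative
        # Compare against the analytic derivative at the last evaluated time.
        t_last = 9999 * dt
        true_dot = 3 * np.cos(3 * t_last) + 3.5 * np.cos(7 * t_last)
        print("filtered derivative:", z_dot, "  analytic derivative:", true_dot)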

  • 45. Neuro-fuzzy rule generation: survey in soft computing framework

    Page(s): 748 - 768

    The present article is a novel attempt at providing an exhaustive survey of neuro-fuzzy rule generation algorithms. Rule generation from artificial neural networks has been gaining popularity in recent times due to its capability of providing the user some insight into the symbolic knowledge embedded within the network. Fuzzy sets are an aid in providing this information in a more human-comprehensible or natural form, and can handle uncertainties at various levels. The neuro-fuzzy approach, symbiotically combining the merits of the connectionist and fuzzy approaches, constitutes a key component of soft computing at this stage. To date, there has been no detailed and integrated categorization of the various neuro-fuzzy models used for rule generation. We propose to bring these together under a unified soft computing framework. Moreover, we include both rule extraction and rule refinement in the broader perspective of rule generation. Rules learned and generated for fuzzy reasoning and fuzzy control are also considered from this wider viewpoint. Models are grouped on the basis of their level of neuro-fuzzy synthesis. The use of other soft computing tools, like genetic algorithms and rough sets, is emphasized. Rule generation from fuzzy knowledge-based networks, which initially encode some crude domain knowledge, is found to result in more refined rules. Finally, a real-life application to medical diagnosis is provided.

  • 46. Artificial neural networks for feature extraction and multivariate data projection

    Page(s): 296 - 317

    Classical feature extraction and data projection methods have been well studied in the pattern recognition and exploratory data analysis literature. We propose a number of networks and learning algorithms which provide new or alternative tools for feature extraction and data projection. These networks include a network (SAMANN) for J.W. Sammon's (1969) nonlinear projection, a linear discriminant analysis (LDA) network, a nonlinear discriminant analysis (NDA) network, and a network for nonlinear projection (NP-SOM) based on Kohonen's self-organizing map. A common attribute of these networks is that they all employ adaptive learning algorithms, which makes them suitable for environments where the distribution of patterns in feature space changes with respect to time. The availability of these networks also facilitates hardware implementation of well-known classical feature extraction and projection approaches. Moreover, the SAMANN network offers the ability to generalize the projection to new data, which is not present in the original Sammon's projection algorithm, while the NDA method and NP-SOM network provide new powerful approaches for visualizing high-dimensional data. We evaluate five representative neural networks for feature extraction and data projection based on a visual judgement of the two-dimensional projection maps and three quantitative criteria on eight data sets with various properties.
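
    For reference, Sammon's stress (the quantity a SAMANN-style network is trained to minimize) has a simple closed form: with d*_ij the pairwise distances in the input space and d_ij those in the projection, E = (1 / sum d*_ij) * sum (d*_ij - d_ij)^2 / d*_ij. A small sketch that evaluates it for an assumed projection:

        import numpy as np
        from scipy.spatial.distance import pdist

        def sammon_stress(X, Y):
            # X: (n, D) original data; Y: (n, d) its low-dimensional projection.
            dx, dy = pdist(X), pdist(Y)          # condensed pairwise distances
            keep = dx > 0                        # ignore coincident input points
            dx, dy = dx[keep], dy[keep]
            return np.sum((dx - dy) ** 2 / dx) / np.sum(dx)

        rng = np.random.default_rng(0)
        X = rng.standard_normal((100, 10))
        Y = X[:, :2]                             # naive projection: first two coordinates
        print("Sammon stress of the naive projection:", sammon_stress(X, Y))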

  • 47. Web mining in soft computing framework: relevance, state of the art and future directions

    Page(s): 1163 - 1177

    The paper summarizes the different characteristics of Web data, the basic components of Web mining and its different types, and the current state of the art. The reason for considering Web mining a separate field from data mining is explained. The limitations of some of the existing Web mining methods and tools are enunciated, and the significance of soft computing (comprising fuzzy logic (FL), artificial neural networks (ANNs), genetic algorithms (GAs), and rough sets (RSs)) is highlighted. A survey of the existing literature on "soft Web mining" is provided, along with the commercially available systems. The prospective areas of Web mining where the application of soft computing needs immediate attention are outlined with justification. The scope for future research in developing "soft Web mining" systems is explained. An extensive bibliography is also provided.

  • 48. Support vector machine with adaptive parameters in financial time series forecasting

    Page(s): 1506 - 1518

    A novel type of learning machine called the support vector machine (SVM) has been receiving increasing interest in areas ranging from its original application in pattern recognition to other applications such as regression estimation, due to its remarkable generalization performance. This paper deals with the application of SVM to financial time series forecasting. The feasibility of applying SVM to financial forecasting is first examined by comparing it with the multilayer back-propagation (BP) neural network and the regularized radial basis function (RBF) neural network. The variability in performance of SVM with respect to the free parameters is investigated experimentally. Adaptive parameters are then proposed by incorporating the nonstationarity of financial time series into SVM. Five real futures contracts collated from the Chicago Mercantile Market are used as the data sets. The simulations show that, among the three methods, SVM outperforms the BP neural network in financial forecasting, and the generalization performance of SVM is comparable to that of the regularized RBF neural network. Furthermore, the free parameters of SVM have a great effect on the generalization performance. SVM with adaptive parameters can both achieve higher generalization performance and use fewer support vectors than the standard SVM in financial forecasting.
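
    The adaptive-parameter idea can be approximated with off-the-shelf tools: the paper varies the regularization constant and tube width over time so that recent observations matter more, and one rough surrogate is per-sample weights that grow toward the present. The data, embedding, and weighting schedule below are all illustrative assumptions.

        import numpy as np
        from sklearn.svm import SVR

        rng = np.random.default_rng(0)
        # Toy nonstationary series: a sine wave with slowly drifting amplitude.
        t = np.arange(600)
        s = np.sin(0.05 * t) * (1 + 0.002 * t) + 0.05 * rng.standard_normal(600)

        # Embed with 5 lags: predict s[t] from the previous 5 values.
        lags = 5
        X = np.array([s[i:i + lags] for i in range(len(s) - lags)])
        y = s[lags:]

        # Recency weights: recent samples count more (an ascending-C surrogate).
        w = np.exp(np.linspace(-3.0, 0.0, len(y)))

        model = SVR(kernel='rbf', C=10.0, epsilon=0.01)
        model.fit(X[:-50], y[:-50], sample_weight=w[:-50])
        pred = model.predict(X[-50:])
        print("test RMSE:", np.sqrt(np.mean((pred - y[-50:]) ** 2)))
        print("support vectors used:", len(model.support_))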

  • 49. Neural-network predictive control for nonlinear dynamic systems with time-delay

    Page(s): 377 - 389

    A new recurrent neural-network predictive feedback control structure for a class of uncertain nonlinear dynamic time-delay systems in canonical form is developed and analyzed. The dynamic system has constant input and feedback time delays due to a communications channel. The proposed control structure consists of a linearized subsystem local to the controlled plant and a remote predictive controller located at the master command station. In the local linearized subsystem, a recurrent neural network with an on-line weight tuning algorithm is employed to approximate the dynamics of the time-delay-free nonlinear plant. No linearity in the unknown parameters is required. No preliminary off-line weight learning is needed. The remote controller is a modified Smith predictor that provides prediction and maintains the desired tracking performance; an extra robustifying term is needed to guarantee stability. Rigorous stability proofs are given using Lyapunov analysis. The result is an adaptive neural-net compensation scheme for unknown nonlinear systems with time delays. A simulation example is provided to demonstrate the effectiveness of the proposed control strategy.
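
    The remote half of the structure is a Smith predictor, whose classical discrete form is short enough to show: feed back the measurement plus the undelayed model output minus the delayed model output, so the controller effectively sees a delay-free plant. Below, a known first-order model and a PI controller stand in for the paper's neural-network-identified dynamics; all numbers are assumptions.

        import numpy as np

        a, b, d = 0.95, 0.1, 20          # plant pole, gain, delay in steps
        kp, ki = 2.0, 0.05               # PI gains (illustrative)
        N, r = 400, 1.0                  # horizon, setpoint

        y, ym, u = np.zeros(N), np.zeros(N), np.zeros(N)
        integ = 0.0
        for k in range(1, N):
            # Smith predictor feedback: measurement + undelayed model - delayed model.
            ym_del = ym[k - 1 - d] if k - 1 >= d else 0.0
            y_fb = y[k - 1] + ym[k - 1] - ym_del
            e = r - y_fb
            integ += e
            u[k] = kp * e + ki * integ
            ym[k] = a * ym[k - 1] + b * u[k]                 # delay-free model
            u_del = u[k - d] if k >= d else 0.0
            y[k] = a * y[k - 1] + b * u_del                  # true plant with input delay
        print("final output:", y[-1], "(setpoint", r, ")")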

  • 50. Learning long-term dependencies with gradient descent is difficult

    Page(s): 157 - 166

    Recurrent neural networks can be used to map input sequences to output sequences, such as for recognition, production, or prediction problems. However, practical difficulties have been reported in training recurrent neural networks to perform tasks in which the temporal contingencies present in the input/output sequences span long intervals. We show why gradient-based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases. These results expose a trade-off between efficient learning by gradient descent and latching of information for long periods. Based on an understanding of this problem, alternatives to standard gradient descent are considered.
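
    The mechanism is easy to demonstrate numerically: the gradient through T steps contains a product of T step Jacobians, so its norm shrinks (or grows) roughly geometrically with their largest singular value. A minimal sketch with a random tanh recurrent network, all sizes and scales assumed:

        import numpy as np
        rng = np.random.default_rng(0)

        n, T = 50, 100
        W = 0.9 * rng.standard_normal((n, n)) / np.sqrt(n)   # recurrent weights
        h = np.zeros(n)
        J = np.eye(n)                        # accumulated Jacobian d h_T / d h_0
        norms = []
        for t in range(T):
            h = np.tanh(W @ h + 0.1 * rng.standard_normal(n))  # one RNN step
            J = (np.diag(1 - h ** 2) @ W) @ J                  # chain rule, one step
            norms.append(np.linalg.norm(J, 2))
        print("Jacobian norm at t = 1, 10, 100:", norms[0], norms[9], norms[-1])
        # The norm decays geometrically: long-range gradient signal vanishes.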


Aims & Scope

IEEE Transactions on Neural Networks is devoted to the science and technology of neural networks, disclosing significant technical knowledge, exploratory developments, and applications of neural networks from biology to software to hardware.

 

This Transactions ceased production in 2011. The current retitled publication is IEEE Transactions on Neural Networks and Learning Systems.
