
IEEE Transactions on Neural Networks

Issue 1 • Jan. 2004


Displaying Results 1 - 25 of 32
  • Table of contents

    Page(s): 0_1
    PDF (42 KB)
    Freely Available from IEEE
  • IEEE Transactions on Neural Networks publications information

    Page(s): 0_2
    PDF (37 KB)
    Freely Available from IEEE
  • Editorial: IEEE Transactions on Neural Networks Editorial Report and Passing the Baton (Jacek M. Zurada and Marios M. Polycarpou)

    Page(s): 1 - 5
    PDF (147 KB)
    Freely Available from IEEE
  • Markovian architectural bias of recurrent neural networks

    Page(s): 6 - 15
    PDF (387 KB) | HTML

    In this paper, we elaborate upon the claim that clustering in the recurrent layer of recurrent neural networks (RNNs) reflects meaningful information processing states even prior to training. By concentrating on activation clusters in RNNs, while not throwing away the continuous state space network dynamics, we extract predictive models that we call neural prediction machines (NPMs). When RNNs with sigmoid activation functions are initialized with small weights (a common technique in the RNN community), the clusters of recurrent activations emerging prior to training are indeed meaningful and correspond to Markov prediction contexts. In this case, the extracted NPMs correspond to a class of Markov models, called variable memory length Markov models (VLMMs). In order to appreciate how much information has really been induced during the training, the RNN performance should always be compared with that of VLMMs and NPMs extracted before training as the "null" base models. Our arguments are supported by experiments on a chaotic symbolic sequence and a context-free language with a deep recursive structure.

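The NPM construction described in the abstract above can be sketched roughly as follows: cluster the recurrent activations of an (untrained) RNN and attach smoothed next-symbol statistics to each cluster. This is a simplified, illustrative reading only, with hypothetical names, plain k-means standing in for whichever vector quantizer the paper uses, and Laplace smoothing as the count estimator.

```python
import numpy as np

def extract_npm(states, symbols, n_clusters=8, alpha=1.0, seed=0):
    """Extract a toy neural prediction machine (NPM) from RNN activations.

    states  : (T, d) array of recurrent-layer activations; states[t] is the
              state reached just before symbols[t] is emitted.
    symbols : length-T sequence of integer symbols in {0..K-1}.
    Returns an (n_clusters, K) matrix of smoothed next-symbol probabilities.
    """
    rng = np.random.default_rng(seed)
    # Plain k-means over the activation vectors (illustrative quantizer).
    centers = states[rng.choice(len(states), n_clusters, replace=False)]
    for _ in range(20):
        labels = np.argmin(
            ((states[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = states[labels == k].mean(axis=0)
    # Next-symbol counts per cluster, with Laplace smoothing.
    K = int(max(symbols)) + 1
    counts = np.full((n_clusters, K), alpha)
    for lab, sym in zip(labels, symbols):
        counts[lab, sym] += 1
    return counts / counts.sum(axis=1, keepdims=True)
```

Comparing such an NPM extracted before training against the trained RNN gives the "null" baseline the authors argue for.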
  • The generalized LASSO

    Page(s): 16 - 28
    PDF (351 KB) | HTML

    In the last few years, the support vector machine (SVM) method has motivated new interest in kernel regression techniques. Although the SVM has been shown to exhibit excellent generalization properties in many experiments, it suffers from several drawbacks, both of a theoretical and a technical nature: the absence of probabilistic outputs, the restriction to Mercer kernels, and the steep growth of the number of support vectors with increasing size of the training set. In this paper, we present a different class of kernel regressors that effectively overcome the above problems. We call this approach generalized LASSO regression. It has a clear probabilistic interpretation, can handle learning sets that are corrupted by outliers, produces extremely sparse solutions, and is capable of dealing with large-scale problems. For regression functionals which can be modeled as iteratively reweighted least-squares (IRLS) problems, we present a highly efficient algorithm with guaranteed global convergence. This defines a unique framework for sparse regression models in the very rich class of IRLS models, including various types of robust regression models and logistic regression. Performance studies for many standard benchmark datasets effectively demonstrate the advantages of this model over related approaches.

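The IRLS connection mentioned above can be illustrated with the textbook reweighting trick for an l1 penalty, |b| ≈ b² / (|b| + ε): each pass solves a ridge problem whose per-coefficient weights come from the previous estimate. This sketch shows only that underlying IRLS idea, not the authors' generalized LASSO algorithm.

```python
import numpy as np

def lasso_irls(X, y, lam=1.0, eps=1e-6, n_iter=50):
    """Sparse linear regression via iteratively reweighted least squares.

    Minimizes ||y - X b||^2 + lam * ||b||_1 approximately, by repeatedly
    solving (X'X + lam * diag(1/(|b_i| + eps))) b = X'y.
    """
    n, d = X.shape
    beta = np.zeros(d)
    XtX, Xty = X.T @ X, X.T @ y
    for _ in range(n_iter):
        w = 1.0 / (np.abs(beta) + eps)        # reweighting from current beta
        beta = np.linalg.solve(XtX + lam * np.diag(w), Xty)
    return beta
```

Coefficients that stay small receive ever-larger penalties and are driven toward zero, which is where the sparsity comes from.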
  • Bayesian support vector regression using a unified loss function

    Page(s): 29 - 44
    PDF (614 KB) | HTML

    In this paper, we use a unified loss function, called the soft insensitive loss function, for Bayesian support vector regression. We follow standard Gaussian processes for regression to set up the Bayesian framework, in which the unified loss function is used in the likelihood evaluation. Under this framework, the maximum a posteriori estimate of the function values corresponds to the solution of an extended support vector regression problem. The overall approach has the merits of support vector regression such as convex quadratic programming and sparsity in solution representation. It also has the advantages of Bayesian methods for model adaptation and error bars of its predictions. Experimental results on simulated and real-world data sets indicate that the approach works well even on large data sets.

  • New results on error correcting output codes of kernel machines

    Page(s): 45 - 54
    PDF (398 KB) | HTML

    We study the problem of multiclass classification within the framework of error correcting output codes (ECOC) using margin-based binary classifiers. Specifically, we address two important open problems in this context: decoding and model selection. The decoding problem concerns how to map the outputs of the classifiers into class codewords. In this paper we introduce a new decoding function that combines the margins through an estimate of their class conditional probabilities. Concerning model selection, we present new theoretical results bounding the leave-one-out (LOO) error of ECOC of kernel machines, which can be used to tune kernel hyperparameters. We report experiments using support vector machines as the base binary classifiers, showing the advantage of the proposed decoding function over other functions of the margin commonly used in practice. Moreover, our empirical evaluations on model selection indicate that the bound leads to good estimates of kernel parameters.

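Margin-based ECOC decoding, as discussed in the abstract above, can be sketched in a few lines. This toy version uses the standard exponential loss of the margins as the agreement measure; the paper's own decoding function instead combines the margins through class-conditional probability estimates.

```python
import numpy as np

def ecoc_decode(margins, code_matrix):
    """Loss-based ECOC decoding.

    margins     : (L,) real-valued outputs f_1(x)..f_L(x) of the binary
                  classifiers.
    code_matrix : (n_classes, L) codeword matrix with entries in {-1, +1}.
    Picks the class whose codeword row minimizes the total exponential
    loss sum_j exp(-M[c, j] * f_j(x)).
    """
    losses = np.exp(-code_matrix * margins).sum(axis=1)
    return int(np.argmin(losses))
```

With a one-vs-all code matrix this reduces to picking the class whose classifier output most strongly agrees with its codeword.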
  • Independent component analysis based on nonparametric density estimation

    Page(s): 55 - 65
    PDF (691 KB) | HTML

    In this paper, we introduce a novel independent component analysis (ICA) algorithm, which is truly blind to the particular underlying distribution of the mixed signals. Using a nonparametric kernel density estimation technique, the algorithm performs simultaneously the estimation of the unknown probability density functions of the source signals and the estimation of the unmixing matrix. Following the proposed approach, the blind signal separation framework can be posed as a nonlinear optimization problem, where a closed form expression of the cost function is available, and only the elements of the unmixing matrix appear as unknowns. We conducted a series of Monte Carlo simulations, involving linear mixtures of various source signals with different statistical characteristics and sample sizes. The new algorithm not only consistently outperformed all state-of-the-art ICA methods, but also demonstrated the following properties: 1) Only a flexible model, capable of learning the source statistics, can consistently achieve an accurate separation of all the mixed signals. 2) Adopting a suitably designed optimization framework, it is possible to derive a flexible ICA algorithm that matches the stability and convergence properties of conventional algorithms. 3) A nonparametric approach does not necessarily require large sample sizes in order to outperform methods with fixed or partially adaptive contrast functions.

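The closed-form cost mentioned in the abstract above can be sketched as a negative log-likelihood in which each source density is replaced by a Gaussian kernel density estimate over its own unmixed samples, so that only the unmixing matrix W remains unknown. This is a simplified reading of the approach, with hypothetical names and a fixed bandwidth, not the authors' exact estimator.

```python
import numpy as np

def np_ica_cost(W, X, h=0.1):
    """Kernel-density ICA cost (illustrative).

    Negative log-likelihood of the unmixed signals Y = X @ W.T, with each
    marginal source pdf approximated by a Gaussian KDE over its own samples.
    """
    n = X.shape[0]
    Y = X @ W.T                                  # candidate source signals
    cost = -n * np.log(abs(np.linalg.det(W)))    # Jacobian (volume) term
    for y in Y.T:                                # one KDE per source channel
        diff = (y[:, None] - y[None, :]) / h
        dens = np.exp(-0.5 * diff ** 2).mean(axis=1) / (h * np.sqrt(2 * np.pi))
        cost -= np.log(dens + 1e-300).sum()
    return cost
```

Minimizing this cost over W then amounts to the nonlinear optimization problem the abstract describes: the correct unmixing makes the marginals as "independent-looking" (low joint entropy) as possible.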
  • A "nonnegative PCA" algorithm for independent component analysis

    Page(s): 66 - 76
    PDF (849 KB) | HTML

    We consider the task of independent component analysis when the independent sources are known to be nonnegative and well-grounded, so that they have a nonzero probability density function (pdf) in the region of zero. We propose the use of a "nonnegative principal component analysis (nonnegative PCA)" algorithm, which is a special case of the nonlinear PCA algorithm, but with a rectification nonlinearity, and we conjecture that this algorithm will find such nonnegative well-grounded independent sources, under reasonable initial conditions. While the algorithm has proved difficult to analyze in the general case, we give some analytical results that are consistent with this conjecture and some numerical simulations that illustrate its operation.

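One plausible reading of the rectified nonlinear-PCA update named in the abstract above is the nonlinear PCA subspace rule with g(y) = max(y, 0) applied to prewhitened data. The sketch below is illustrative only and has not been checked against the paper's exact algorithm; names and step sizes are assumptions.

```python
import numpy as np

def nonnegative_pca(Z, n_comp, eta=0.01, n_epochs=50, seed=0):
    """Sample-by-sample sketch of a rectified nonlinear-PCA update.

    Z : (n, d) prewhitened observations (zero mean, identity covariance).
    Applies W += eta * (g(y) z' - g(y) y' W) with y = W z and the
    rectification g(y) = max(y, 0).
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_comp, Z.shape[1]))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    for _ in range(n_epochs):
        for z in Z:
            y = W @ z
            g = np.maximum(y, 0.0)             # rectification nonlinearity
            W += eta * (np.outer(g, z) - np.outer(g, y) @ W)
    return W
```

The conjecture in the paper is that, for nonnegative well-grounded sources, such a rule rotates the whitened data so that the outputs become (approximately) nonnegative, i.e., the sources themselves.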
  • Positive and negative circuits in discrete neural networks

    Page(s): 77 - 83
    PDF (233 KB) | HTML

    We study the relationships between the positive and negative circuits of the connection graph and the fixed points of discrete neural networks (DNNs). As main results, we give necessary conditions and sufficient conditions for the existence of fixed points in a DNN. Moreover, we exhibit an upper bound for the number of fixed points in terms of the structure and number of positive circuits in the connection graph. This allows the determination of the maximum capacity for storing vectors in DNNs as fixed points, depending on the architecture of the network.

  • A noisy self-organizing neural network with bifurcation dynamics for combinatorial optimization

    Page(s): 84 - 98
    PDF (1114 KB)

    The self-organizing neural network (SONN) for solving general "0-1" combinatorial optimization problems (COPs) is studied in this paper, with the aim of overcoming existing limitations in convergence and solution quality. This is achieved by incorporating two main features: an efficient weight normalization process exhibiting bifurcation dynamics, and neurons with additive noise. The SONN is studied both theoretically and experimentally by using the N-queen problem as an example to demonstrate and explain the dependence of optimization performance on annealing schedules and other system parameters. An equilibrium model of the SONN with neuronal weight normalization is derived, which explains observed bands of high feasibility in the normalization parameter space in terms of bifurcation dynamics of the normalization process, and provides insights into the roles of different parameters in the optimization process. Under certain conditions, this dynamical systems view of the SONN reveals cascades of period-doubling bifurcations to chaos occurring in multidimensional space with the annealing temperature as the bifurcation parameter. A strange attractor in the two-dimensional (2-D) case is also presented. Furthermore, by adding random noise to the cost potentials of the network nodes, it is demonstrated that unwanted oscillations between symmetrical and "greedy" nodes can be sufficiently reduced, resulting in higher solution quality and feasibility.

  • Deriving sufficient conditions for global asymptotic stability of delayed neural networks via nonsmooth analysis

    Page(s): 99 - 109
    PDF (292 KB) | HTML

    In this paper, we obtain new sufficient conditions ensuring existence, uniqueness, and global asymptotic stability (GAS) of the equilibrium point for a general class of delayed neural networks (DNNs) via nonsmooth analysis, which makes full use of the Lipschitz property of functions defining DNNs. Based on this new tool of nonsmooth analysis, we first obtain a couple of general results concerning the existence and uniqueness of the equilibrium point. Then those results are applied to show that existence assumptions on the equilibrium point in some existing sufficient conditions ensuring GAS are actually unnecessary; and some strong assumptions such as the boundedness of activation functions in some other existing sufficient conditions can be actually dropped. Finally, we derive some new sufficient conditions which are easy to check. Comparison with some related existing results is conducted and advantages are illustrated with examples. Throughout our paper, spectral properties of the matrix (A + Aτ) play an important role, which is a distinguished feature from previous studies. Here, A and Aτ are, respectively, the feedback and the delayed feedback matrix defining the neural network under consideration.

  • A neuro-fuzzy scheme for simultaneous feature selection and fuzzy rule-based classification

    Page(s): 110 - 123
    PDF (525 KB)

    Most methods of classification either ignore feature analysis or do it in a separate phase, offline prior to the main classification task. This paper proposes a neuro-fuzzy scheme for designing a classifier along with feature selection. It is a four-layered feed-forward network for realizing a fuzzy rule-based classifier. The network is trained by error backpropagation in three phases. In the first phase, the network learns the important features and the classification rules. In the subsequent phases, the network is pruned to an "optimal" architecture that represents an "optimal" set of rules. Pruning is found to drastically reduce the size of the network without degrading the performance. The pruned network is further tuned to improve performance. The rules learned by the network can be easily read from the network. The system is tested on both synthetic and real data sets and found to perform quite well.

  • Neuro-sliding mode control with its applications to seesaw systems

    Page(s): 124 - 134
    PDF (571 KB) | HTML

    This paper proposes a cooperative control approach based on combining neural networks with the methodology of sliding mode control (SMC). The main purpose is to eliminate the chattering phenomenon while the system performance is improved by the SMC method. In the present approach, two parallel neural networks are utilized to realize a neuro-sliding mode control (NSMC), where the equivalent control and the corrective control are the outputs of neural network 1 and neural network 2, respectively. Based on expressions of the SMC, the weight adaptations of the neural networks can be determined. Furthermore, the gradient descent method is used to minimize the control force so that the chattering phenomenon can be eliminated. Finally, experimental results are given to show the effectiveness and feasibility of the approach.

  • A hierarchical self-organizing approach for learning the patterns of motion trajectories

    Page(s): 135 - 144
    PDF (806 KB) | HTML

    The understanding and description of object behaviors is a hot topic in computer vision. Trajectory analysis is one of the basic problems in behavior understanding, and the learning of trajectory patterns that can be used to detect anomalies and predict object trajectories is an interesting and important problem in trajectory analysis. In this paper, we present a hierarchical self-organizing neural network model and its application to the learning of trajectory distribution patterns for event recognition. The distribution patterns of trajectories are learnt using a hierarchical self-organizing neural network. Using the learned patterns, we consider anomaly detection as well as object behavior prediction. Compared with the existing neural network structures that are used to learn patterns of trajectories, our network structure has smaller scale and faster learning speed, and is thus more effective. Experimental results using two different sets of data demonstrate the accuracy and speed of our hierarchical self-organizing neural network in learning the distribution patterns of object trajectories.

  • Multilevel category structure in the ART-2 network

    Page(s): 145 - 158
    PDF (2154 KB) | HTML

    Multilevel categorization is investigated within the context of analog activity patterns on the output layer of an ART 2 network. The ART 2 network parameters are analyzed in terms of stable category formation and in terms of the number of nodes in the output layer that can become most active. The resulting activity patterns on the output layer demonstrate a multilevel category structure based on the relative differences between patterns that exist for many different values of the vigilance parameter. We have shown that the information contained in the analog output patterns can be interpreted in several different ways, which is not possible when the category is represented by a single winning node. Favorable comparisons are also demonstrated between the category structure emerging from the set of category patterns and principles of categorization in psychology and neurobiology.

  • A temporally adaptive classifier for multispectral imagery

    Page(s): 159 - 165
    PDF (400 KB)

    This paper presents a new temporally adaptive classification system for multispectral images. A spatial-temporal adaptation mechanism is devised to account for the changes in the feature space as a result of environmental variations. Classification based upon spatial features is performed using a Bayesian framework or probabilistic neural networks (PNNs), while the temporal updating takes place using a spatial-temporal predictor. A simple iterative updating mechanism is also introduced for adjusting the parameters of these systems. The proposed methodology is used to develop a pixel-based cloud classification system. Experimental results on cloud classification from satellite imagery are provided to show the usefulness of this system.

  • Face recognition by applying wavelet subband representation and kernel associative memory

    Page(s): 166 - 177
    PDF (394 KB) | HTML

    In this paper, we propose an efficient face recognition scheme which has two features: 1) representation of face images by two-dimensional (2D) wavelet subband coefficients and 2) recognition by a modular, personalised classification method based on kernel associative memory models. Compared to PCA projections and low resolution "thumb-nail" image representations, wavelet subband coefficients can efficiently capture substantial facial features while keeping computational complexity low. As there are usually very limited samples, we constructed an associative memory (AM) model for each person and proposed to improve the performance of AM models by kernel methods. Specifically, we first applied kernel transforms to each possible pair of training face samples and then mapped the high-dimensional feature space back to input space. Our scheme using modular autoassociative memory for face recognition is inspired by the same motivation as using autoencoders for optical character recognition (OCR), for which the advantages have been proven. By associative memory, all the prototypical faces of one particular person are used to reconstruct themselves, and the reconstruction error for a probe face image is used to decide if the probe face is from the corresponding person. We carried out extensive experiments on three standard face recognition datasets, the FERET data, the XM2VTS data, and the ORL data. Detailed comparisons with earlier published results are provided and our proposed scheme offers better recognition accuracy on all of the face datasets.

  • Multiscale approximation with hierarchical radial basis functions networks

    Page(s): 178 - 188
    PDF (2327 KB) | HTML

    An approximating neural model, called hierarchical radial basis function (HRBF) network, is presented here. This is a self-organizing (by growing) multiscale version of a radial basis function (RBF) network. It is constituted of hierarchical layers, each containing a Gaussian grid at a decreasing scale. The grids are not completely filled, but units are inserted only where the local error is over threshold. This guarantees a uniform residual error and the allocation of more units with smaller scales where the data contain higher frequencies. Only local operations, which do not require any iteration on the data, are required; this allows the network to be constructed in quasi-real time. Through harmonic analysis, it is demonstrated that, although a HRBF cannot be reduced to a traditional wavelet-based multiresolution analysis (MRA), it does employ Riesz bases and enjoys asymptotic approximation properties for a very large class of functions. HRBF networks have been extensively applied to the reconstruction of three-dimensional (3-D) models from noisy range data. The results illustrate their power in denoising the original data, obtaining an effective multiscale reconstruction of better quality than that obtained by MRA.

  • Comparison of different classification algorithms for underwater target discrimination

    Page(s): 189 - 194
    PDF (358 KB)

    Classification of underwater targets from the acoustic backscattered signals is considered. Several different classification algorithms are tested and benchmarked not only for their performance but also to gain insight to the properties of the feature space. Results on a wideband 80-kHz acoustic backscattered data set collected for six different objects are presented in terms of the receiver operating characteristic (ROC) and robustness of the classifiers with respect to reverberation.

  • A Boolean Hebb rule for binary associative memory design

    Page(s): 195 - 202
    PDF (237 KB)

    A binary associative memory design procedure that gives a Hopfield network with a symmetric binary weight matrix is introduced in this paper. The proposed method is based on introducing the memory vectors as maximal independent sets to an undirected graph, which is constructed by Boolean operations analogous to the conventional Hebb rule. The parameters of the resulting network are then determined via the adjacency matrix of this graph in order to find a maximal independent set whose characteristic vector is close to the given distorted vector. We show that the method provides attractiveness for each memory vector and avoids spurious memories whenever the set of given memory vectors satisfy certain compatibility conditions, which implicitly imply sparsity. The applicability of the design method is finally investigated by a quantitative analysis of the compatibility conditions.

  • Robust global exponential stability of Cohen-Grossberg neural networks with time delays

    Page(s): 203 - 206
    PDF (197 KB)

    The authors discuss delayed Cohen-Grossberg neural network models and investigate the global exponential stability of the equilibrium point of these systems. A set of sufficient conditions ensuring robust global exponential convergence of the Cohen-Grossberg neural networks with time delays is given.

  • Gaussian filters and filter synthesis using a Hermite/Laguerre neural network

    Page(s): 206 - 214
    PDF (549 KB)

    A neural network for calculating the correlation of a signal with a Gaussian function is described. The network behaves as a Gaussian filter and has two outputs: the first approximates the noisy signal and the second represents the filtered signal. The filtered output provides improvement by a factor of ten in the signal-to-noise ratio. A higher order Gaussian filter was synthesized by combining several Hermite functions together.

  • Coupled principal component analysis

    Page(s): 214 - 222
    PDF (485 KB)

    A framework for a class of coupled principal component learning rules is presented. In coupled rules, eigenvectors and eigenvalues of a covariance matrix are simultaneously estimated in coupled equations. Coupled rules can mitigate the stability-speed problem affecting noncoupled learning rules, since the convergence speed in all eigendirections of the Jacobian becomes widely independent of the eigenvalues of the covariance matrix. A number of coupled learning rule systems for principal component analysis, two of them new, are derived by applying Newton's method to an information criterion. The relations to other systems of this class, the adaptive learning algorithm (ALA), the robust recursive least squares algorithm (RRLSA), and a rule with explicit renormalization of the weight vector length, are established.

  • A generalized LMI-based approach to the global asymptotic stability of delayed cellular neural networks

    Page(s): 223 - 225
    PDF (127 KB)

    A novel linear matrix inequality (LMI)-based criterion for the global asymptotic stability and uniqueness of the equilibrium point of a class of delayed cellular neural networks (CNNs) is presented. The criterion turns out to be a generalization and improvement over some previous criteria.


Aims & Scope

IEEE Transactions on Neural Networks is devoted to the science and technology of neural networks, publishing significant technical knowledge, exploratory developments, and applications of neural networks spanning biology, software, and hardware.


This Transactions ceased production in 2011. The current retitled publication is IEEE Transactions on Neural Networks and Learning Systems.
