IEEE Transactions on Neural Networks

Issue 2 • March 1994

  • An analysis of the gamma memory in dynamic neural networks

    Publication Year: 1994, Page(s): 331-337
    Cited by: Papers (23)

    Presents a vector space framework for studying short-term memory filters in dynamic neural networks. The authors define parameters to quantify the function of feedforward and recursive linear memory filters. Using vector space arguments, they show which optimization problem is solved by the processing elements (PEs) of the first hidden layer of the single-input focused network architecture. Due to the special properties of the gamma bases, recursion brings an extra parameter λ (the time constant of the leaky integrator) that displaces the memory manifold towards the desired signal when the mean square error is minimized. In contrast, for the feedforward memory filter the angle between the desired signal and the memory manifold is fixed for a given memory order. The feedback parameter can be adapted by gradient descent, but the optimization is nonconvex.
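
    As an illustration of the recursion described above, here is a minimal sketch of a gamma memory in Python; the function name and the variable lam (standing in for the abstract's λ) are ours, not the paper's:

        import numpy as np

        def gamma_memory(u, K, lam):
            """Taps of a K-th order gamma memory driven by the input u:
            x_0(t) = u(t);  x_k(t) = (1 - lam) x_k(t-1) + lam x_{k-1}(t-1)."""
            T = len(u)
            x = np.zeros((T, K + 1))
            x[:, 0] = u
            for t in range(1, T):
                for k in range(1, K + 1):
                    x[t, k] = (1.0 - lam) * x[t - 1, k] + lam * x[t - 1, k - 1]
            return x  # a linear output layer would combine these taps

    Setting lam = 1 collapses the recursion to an ordinary tapped delay line, i.e., the fixed feedforward memory that the abstract contrasts with the adaptable gamma memory.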

  • Using recurrent neural networks for adaptive communication channel equalization

    Publication Year: 1994, Page(s): 267-278
    Cited by: Papers (82)

    Nonlinear adaptive filters based on a variety of neural network models have been used successfully for system identification and noise cancellation in a wide class of applications. An important problem in data communications is that of channel equalization, i.e., the removal of interferences introduced by linear or nonlinear message-corrupting mechanisms, so that the originally transmitted symbols can be recovered correctly at the receiver. In this paper, we introduce an adaptive recurrent neural network (RNN) based equalizer whose small size and high performance make it suitable for high-speed channel equalization. We propose RNN-based structures for both trained adaptation and blind equalization, and we evaluate their performance via extensive simulations for a variety of signal modulations and communication channel models. It is shown that the RNN equalizers perform comparably to traditional linear-filter-based equalizers when the channel interferences are relatively mild, and that they outperform them by several orders of magnitude when either the channel's transfer function has spectral nulls or severe nonlinear distortion is present. In addition, the small-size RNN equalizers, being essentially generalized IIR filters, are shown to outperform multilayer perceptron equalizers of larger computational complexity in linear and nonlinear channel equalization cases.
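
    To make the setting concrete, here is a hedged sketch of a spectral-null channel and the forward pass of a small fully recurrent equalizer; the channel taps, network size, and all names are illustrative, and the adaptive training of the weights is omitted:

        import numpy as np

        rng = np.random.default_rng(0)
        s = rng.choice([-1.0, 1.0], size=200)          # transmitted BPSK symbols
        h = np.array([0.5, 1.0, 0.5])                  # H(z) = 0.5(1 + z^-1)^2: null on the unit circle
        r = np.convolve(s, h)[:len(s)]                 # linear intersymbol interference
        r += 0.2 * r**2 + 0.05 * rng.standard_normal(len(s))  # mild nonlinearity plus noise

        # Forward pass of a tiny fully recurrent equalizer (untrained weights;
        # trained or blind adaptation would adjust W, v, and c).
        n = 4
        W = 0.1 * rng.standard_normal((n, n))
        v = rng.standard_normal(n)
        c = rng.standard_normal(n)
        x = np.zeros(n)
        s_hat = np.empty(len(r))
        for t in range(len(r)):
            x = np.tanh(W @ x + v * r[t])              # recurrent state update
            s_hat[t] = np.sign(c @ x)                  # symbol decision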

  • Learning long-term dependencies with gradient descent is difficult

    Publication Year: 1994, Page(s): 157-166
    Cited by: Papers (105)

    Recurrent neural networks can be used to map input sequences to output sequences, such as for recognition, production, or prediction problems. However, practical difficulties have been reported in training recurrent neural networks to perform tasks in which the temporal contingencies present in the input/output sequences span long intervals. We show why gradient-based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases. These results expose a trade-off between efficient learning by gradient descent and latching information for long periods. Based on an understanding of this problem, alternatives to standard gradient descent are considered.
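
    A small numerical illustration of the effect analyzed above (our toy construction, not the paper's experiment): through a contracting tanh recurrence, the norm of the accumulated Jacobian product shrinks roughly geometrically with the time span, so error signals from distant outputs carry vanishing weight.

        import numpy as np

        rng = np.random.default_rng(0)
        n, T = 20, 100
        W = 0.9 / np.sqrt(n) * rng.standard_normal((n, n))  # contracting dynamics
        x = rng.standard_normal(n)
        J = np.eye(n)                                 # accumulates dx(t)/dx(0)
        for t in range(T):
            x = np.tanh(W @ x)
            J = ((1.0 - x**2)[:, None] * W) @ J       # chain rule through one step
            if (t + 1) % 20 == 0:
                print(t + 1, np.linalg.norm(J))       # norm decays geometrically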

  • On the problem of local minima in recurrent neural networks

    Publication Year: 1994, Page(s): 167-177
    Cited by: Papers (11)

    Many researchers have recently focused their efforts on devising efficient algorithms, mainly based on optimization schemes, for learning the weights of recurrent neural networks. As in the case of feedforward networks, however, these learning algorithms may get stuck in local minima during gradient descent, thus discovering sub-optimal solutions. This paper analyses the problem of optimal learning in recurrent networks by proposing conditions that guarantee local-minima-free error surfaces. An example is given that also shows the constructive role of the proposed theory in designing networks suitable for solving a given task. Moreover, a formal relationship between recurrent and static feedforward networks is established such that the examples of local minima for feedforward networks already known in the literature can be associated with analogous ones in recurrent networks.

  • Memory neuron networks for identification and control of dynamical systems

    Publication Year: 1994, Page(s): 306-319
    Cited by: Papers (108) | Patents (1)

    This paper discusses memory neuron networks as models for identification and adaptive control of nonlinear dynamical systems. These are a class of recurrent networks obtained by adding trainable temporal elements to feedforward networks, which makes the output history-sensitive. By virtue of this capability, these networks can identify dynamical systems without having to be explicitly fed with past inputs and outputs. Thus, they can identify systems whose order is unknown or systems with unknown delay. It is argued that for satisfactory modeling of dynamical systems, neural networks should be endowed with such internal memory. The paper presents a preliminary analysis of the learning algorithm, providing theoretical justification for the identification method. Methods for adaptive control of nonlinear systems using these networks are presented. Through extensive simulations, these models are shown to be effective both for identification and model reference adaptive control of nonlinear systems.
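
    A minimal sketch of one common form of the trainable temporal element (the names are ours; the memory coefficients alpha are learned along with the ordinary weights):

        import numpy as np

        def memory_neuron_step(z_prev, v_prev, alpha):
            """Update the memory neurons attached to the network neurons:
            v_j(t) = alpha_j * v_j(t-1) + (1 - alpha_j) * z_j(t-1),
            where z_j is the corresponding network neuron's output and each
            alpha_j in (0, 1) is a trainable memory coefficient."""
            return alpha * v_prev + (1.0 - alpha) * z_prev

    Because the memory outputs summarize past activity recursively, past inputs and outputs never have to be fed to the network explicitly, which is how unknown order or delay is accommodated.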

  • Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks

    Publication Year: 1994, Page(s): 279-297
    Cited by: Papers (161) | Patents (8)

    Although the potential of the powerful mapping and representational capabilities of recurrent network architectures is generally recognized by the neural network research community, recurrent neural networks have not been widely used for the control of nonlinear dynamical systems, possibly due to the relative ineffectiveness of simple gradient descent training algorithms. Developments in the use of parameter-based extended Kalman filter algorithms for training recurrent networks may provide a mechanism by which these architectures will prove to be of practical value. This paper presents a decoupled extended Kalman filter (DEKF) algorithm for training of recurrent networks with special emphasis on application to control problems. We demonstrate in simulation the application of the DEKF algorithm to a series of example control problems ranging from the well-known cart-pole and bioreactor benchmark problems to an automotive subsystem, engine idle speed control. These simulations suggest that recurrent controller networks trained by Kalman filter methods can combine the traditional features of state-space controllers and observers in a homogeneous architecture for nonlinear dynamical systems, while exhibiting less sensitivity to changes in plant parameters and measurement noise than purely feedforward controller networks.
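
    For orientation, a hedged sketch of the extended Kalman filter weight update that the decoupled algorithm applies to each weight group; the scalar-output simplification and all names are ours:

        import numpy as np

        def ekf_weight_update(w, P, H, e, R=1.0, Q=1e-4):
            """One EKF step for a weight group w with error covariance P.
            H is the derivative of the network output w.r.t. w, and e is the
            output error; DEKF keeps a separate (w, P) pair per group, i.e.,
            the global covariance is treated as block-diagonal."""
            a = 1.0 / (R + H @ P @ H)                 # inverse innovation variance
            K = (P @ H) * a                           # Kalman gain
            w = w + K * e                             # weight update
            P = P - np.outer(K, P @ H) + Q * np.eye(len(w))
            return w, P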

  • Steepest descent algorithms for neural network controllers and filters

    Publication Year: 1994, Page(s): 198-212
    Cited by: Papers (38) | Patents (14)

    A number of steepest descent algorithms have been developed for adapting discrete-time dynamical systems, including the backpropagation through time and recursive backpropagation algorithms. In this paper, a tutorial on the use of these algorithms for adapting neural network controllers and filters is presented. In order to effectively compare and contrast the algorithms, a unified framework for the algorithms is developed. This framework is based upon a standard representation of a discrete-time dynamical system. Using this framework, the computational and storage requirements of the algorithms are derived. These requirements are used to select the appropriate algorithm for training a neural network controller or filter. Finally, to illustrate the usefulness of the techniques presented in this paper, a neural network control example and a neural network filtering example are presented.
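
    As a concrete instance of one algorithm in this family, a toy backpropagation-through-time gradient for a simple recurrent state update (the example system and cost are ours, not the paper's general framework):

        import numpy as np

        def bptt_gradient(W, v, u, d):
            """Gradient of J = 0.5 * sum_t ||x(t+1) - d(t)||^2 w.r.t. W for
            x(t+1) = tanh(W x(t) + v u(t)): the forward pass stores the state
            trajectory and the backward pass runs the adjoint recursion."""
            T, n = len(u), W.shape[0]
            X = np.zeros((T + 1, n))
            for t in range(T):                        # forward pass
                X[t + 1] = np.tanh(W @ X[t] + v * u[t])
            gW = np.zeros_like(W)
            lam = np.zeros(n)                         # adjoint state
            for t in reversed(range(T)):              # backward pass
                total = lam + (X[t + 1] - d[t])       # future plus local error
                delta = (1.0 - X[t + 1] ** 2) * total # through the tanh derivative
                gW += np.outer(delta, X[t])           # accumulate dJ/dW
                lam = W.T @ delta                     # propagate one step back
            return gW

    Storing the whole trajectory in the forward pass is the storage cost that distinguishes backpropagation through time from the recursive algorithms the tutorial compares it with.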

  • Locally recurrent globally feedforward networks: a critical review of architectures

    Publication Year: 1994, Page(s): 229-239
    Cited by: Papers (68)

    In this paper, we consider a number of locally recurrent globally feedforward (LRGF) networks that have been introduced by several research groups in the past few years. We first analyze the various architectures, with a view to highlighting their differences. Then we introduce a general LRGF network structure that includes most of the network architectures proposed to date. Finally, we indicate some open issues concerning these types of networks.
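
    To fix ideas, a minimal sketch of one LRGF building block, a neuron whose synapse contains a local IIR filter feeding a static nonlinearity; this illustrative form and its names are ours, and the paper surveys several variants:

        import numpy as np

        def lrgf_neuron(x_hist, s_hist, b, a):
            """s(t) = sum_i b[i] x(t-i) + sum_j a[j] s(t-j); y(t) = tanh(s(t)).
            Feedback is confined to the synapse state s, so layers of such
            neurons still form a globally feedforward network."""
            s = b @ x_hist + a @ s_hist
            return np.tanh(s), s   # caller pushes s onto s_hist for the next step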

  • An application of recurrent nets to phone probability estimation

    Publication Year: 1994, Page(s): 298-305
    Cited by: Papers (106) | Patents (4)

    This paper presents an application of recurrent networks to phone probability estimation in large vocabulary speech recognition. The need for efficient exploitation of context information is discussed, a role for which the recurrent net appears suitable. An overview of early developments of recurrent nets for phone recognition is given, along with the more recent improvements that include their integration with Markov models. Recognition results are presented for the DARPA TIMIT and Resource Management tasks, and it is concluded that recurrent nets are competitive with traditional means of performing phone probability estimation.

  • Application of the recurrent multilayer perceptron in modeling complex process dynamics

    Publication Year: 1994, Page(s): 255-266
    Cited by: Papers (42)

    A nonlinear dynamic model is developed for a process system, namely a heat exchanger, using the recurrent multilayer perceptron network as the underlying model structure. The recurrent multilayer perceptron is a dynamic neural network that appears effective in the input-output modeling of complex process systems. Dynamic gradient descent learning is used to train the recurrent multilayer perceptron, resulting in an order of magnitude improvement in convergence speed over a static learning algorithm used to train the same network. In developing the empirical process model, the effects of actuator, process, and sensor noise on the training and testing sets are investigated. Learning and prediction both appear very effective, despite the presence of training and testing set noise, respectively. The recurrent multilayer perceptron appears to learn the deterministic part of a stochastic training set, and it predicts approximately a moving average response of various testing sets. Extensive model validation studies with signals that are encountered in the operation of the process system modeled, that is, steps and ramps, indicate that the empirical model can substantially generalize operational transients, including accurate prediction of instabilities not in the training set. However, the accuracy of the model beyond these operational transients has not been investigated. Furthermore, online learning is necessary during some transients and for tracking slowly varying process dynamics. Neural-network-based empirical models appear in some cases to provide a serious alternative to first-principles models.

  • Recurrent neural network training with feedforward complexity

    Publication Year: 1994, Page(s): 185-197
    Cited by: Papers (12) | Patents (1)

    This paper presents a training method that is of no more than feedforward complexity for fully recurrent networks. The method is not approximate, but rather depends on an exact transformation that reveals an embedded feedforward structure in every recurrent network. It turns out that given any unambiguous training data set, such as samples of the state variables and their derivatives, we need only train this embedded feedforward structure. The necessary recurrent network parameters are then obtained by an inverse transformation that consists only of linear operators. As an example of modeling a representative nonlinear dynamical system, the method is applied to learn Bessel's differential equation, thereby generating Bessel functions within, as well as outside, the training set.
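
    A hedged illustration of the training data the method consumes, namely unambiguous samples of a state variable and its derivative; here we fit the feedforward map from x to xdot by least squares on fixed tanh features, while the paper's exact inverse transformation back to recurrent-network parameters is not reproduced:

        import numpy as np

        rng = np.random.default_rng(0)
        x = np.linspace(-2.0, 2.0, 200)[:, None]      # sampled state values
        xdot = -x + x**3 / 3.0                        # sampled state derivatives
        A = rng.standard_normal((1, 50))              # fixed hidden-layer weights
        b = rng.standard_normal(50)
        Phi = np.tanh(x @ A + b)                      # hidden-layer outputs
        w, *_ = np.linalg.lstsq(Phi, xdot, rcond=None)
        print(float(np.max(np.abs(Phi @ w - xdot))))  # feedforward fit error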

  • Discrete recurrent neural networks for grammatical inference

    Publication Year: 1994, Page(s): 320-330
    Cited by: Papers (11) | Patents (1)

    Describes a novel neural architecture for learning deterministic context-free grammars, or equivalently, deterministic pushdown automata. The unique feature of the proposed network is that it forms stable state representations during learning; previous work has shown that conventional analog recurrent networks can be inherently unstable in that they cannot retain their state memory for long input strings. The authors have previously introduced the discrete recurrent network architecture for learning finite-state automata. Here they extend this model to include a discrete external stack with discrete symbols. A composite error function is described to handle the different situations encountered in learning. The pseudo-gradient learning method (introduced in previous work) is in turn extended for the minimization of these error functions. Empirical trials validating the effectiveness of the pseudo-gradient learning method are presented, for networks both with and without an external stack. Experimental results show that the new networks are successful in learning some simple pushdown automata, though overfitting and non-convergent learning can also occur. Once learned, the internal representation of the network is provably stable; i.e., it classifies unseen strings of arbitrary length with 100% accuracy.

  • Recurrent neural networks and robust time series prediction

    Publication Year: 1994, Page(s): 240-254
    Cited by: Papers (114) | Patents (4)

    We propose a robust learning algorithm and apply it to recurrent neural networks. This algorithm is based on filtering outliers from the data and then estimating parameters from the filtered data. The filtering removes outliers from both the target function and the inputs of the neural network. The filtering is soft in that some outliers are neither completely rejected nor accepted. To show the need for robust recurrent networks, we compare the predictive ability of least-squares-estimated recurrent networks on synthetic data and on the Puget Power Electric Demand time series. These investigations result in a class of recurrent neural networks, NARMA(p,q), which show advantages over feedforward neural networks for time series with a moving average component. Conventional least squares methods of fitting NARMA(p,q) neural network models are shown to suffer a lack of robustness towards outliers. This sensitivity to outliers is demonstrated on both the synthetic and real data sets. Filtering the Puget Power Electric Demand time series is shown to automatically remove the outliers due to holidays. Neural networks trained on filtered data are then shown to give better predictions than neural networks trained on unfiltered time series.
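
    A minimal sketch of the NARMA(p, q) prediction loop (the names are ours, and f stands for any fitted network): past residuals re-enter as inputs, which is the moving-average feedback that makes the model recurrent.

        import numpy as np

        def narma_predict(f, y, p, q):
            """One-step NARMA(p, q) predictions:
            yhat(t) = f(y(t-1), ..., y(t-p), e(t-1), ..., e(t-q)),
            where the residuals e(t) = y(t) - yhat(t) are fed back as inputs."""
            T = len(y)
            yhat = np.zeros(T)
            e = np.zeros(T)
            for t in range(max(p, q), T):
                past_y = y[t - p:t][::-1]             # y(t-1) ... y(t-p)
                past_e = e[t - q:t][::-1]             # e(t-1) ... e(t-q)
                yhat[t] = f(np.concatenate([past_y, past_e]))
                e[t] = y[t] - yhat[t]
            return yhat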

  • Training recurrent neural networks: why and how? An illustration in dynamical process modeling

    Publication Year: 1994, Page(s): 178-184
    Cited by: Papers (28)

    The paper first summarizes a general approach to the training of recurrent neural networks by gradient-based algorithms, which leads to the introduction of four families of training algorithms. Because of the variety of possibilities thus available to the “neural network designer,” the choice of the appropriate algorithm to solve a given problem becomes critical. We show that, in the case of process modeling, this choice depends on how noise interferes with the process to be modeled; this is evidenced by three examples of modeling of dynamical processes, where the detrimental effect of inappropriate training algorithms on the prediction error made by the network is clearly demonstrated.

  • Back propagation through adjoints for the identification of nonlinear dynamic systems using recurrent neural models

    Publication Year: 1994, Page(s): 213-228
    Cited by: Papers (23) | Patents (1)

    In this paper, back propagation is reinvestigated for an efficient evaluation of the gradient in arbitrary interconnections of recurrent subsystems. It is shown that the error has to be back-propagated through the adjoint model of the system and that the gradient can only be obtained after a delay. A faster version, accelerated back propagation, which eliminates this delay, is also developed. Various schemes, including the sensitivity method, are studied for updating the weights of the network using these gradients. Motivated by the Lyapunov approach and the adjoint model, predictive back propagation and its variant, targeted back propagation, are proposed. A further refinement, predictive back propagation with filtering, is then developed, in which the states of the model are also updated. The convergence of this scheme is assured, and it is shown that it is sufficient to back-propagate as many time steps as the order of the system for convergence. As a preamble, the convergence of batch and sample-wise online updates in feedforward models is analyzed using the Lyapunov approach.
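
    For reference, the generic adjoint recursion for a state-space model x(t+1) = f(x(t), u(t), w) with instantaneous error e(t); the notation is the standard one, assumed by us rather than taken from the paper:

        \lambda(t) = \left(\frac{\partial f}{\partial x}\Big|_{t}\right)^{\!\top} \lambda(t+1) + \frac{\partial e}{\partial x}\Big|_{t}, \qquad \lambda(T+1) = 0,
        \frac{\partial J}{\partial w} = \sum_{t=0}^{T} \left(\frac{\partial f}{\partial w}\Big|_{t}\right)^{\!\top} \lambda(t+1).

    The recursion runs backwards in time, which is why the gradient becomes available only after a delay; truncating the backward run to the order of the system is the sufficiency condition mentioned above.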


Aims & Scope

IEEE Transactions on Neural Networks is devoted to the science and technology of neural networks, publishing papers that disclose significant technical knowledge, exploratory developments, and applications of neural networks, from biology to software to hardware.

 

This Transactions ceased production in 2011. The current retitled publication is IEEE Transactions on Neural Networks and Learning Systems.
