IEEE Transactions on Neural Networks

Issue 2 • March 2000

  • Constraint satisfaction adaptive neural network and heuristics combined approaches for generalized job-shop scheduling

    Publication Year: 2000 , Page(s): 474 - 486
    Cited by:  Papers (22)

    This paper presents a constraint satisfaction adaptive neural network, together with several heuristics, to solve the generalized job-shop scheduling problem, an NP-complete constraint satisfaction problem. The proposed neural network can be easily constructed and can adaptively adjust its connection weights and unit biases based on the sequence and resource constraints of the job-shop scheduling problem during processing. Several heuristics that can be combined with the neural network are also presented. In the combined approaches, the neural network is used to obtain feasible solutions, while the heuristic algorithms are used to improve the performance of the neural network and the quality of the obtained solutions. Simulations have shown that the proposed neural network and its combined approaches are efficient with respect to the quality of solutions and the solving speed.

  • Algorithms for accelerated convergence of adaptive PCA

    Publication Year: 2000 , Page(s): 338 - 355
    Cited by:  Papers (32)  |  Patents (3)

    We derive and discuss adaptive algorithms for principal component analysis (PCA) that are shown to converge faster than the traditional PCA algorithms due to Oja and Karhunen (1985), Sanger (1989), and Xu (1993). It is well known that traditional PCA algorithms that are derived by using gradient descent on an objective function are slow to converge. Furthermore, the convergence of these algorithms depends on appropriate choices of the gain sequences. Since online applications demand faster convergence and an automatic selection of gains, we present new adaptive algorithms to solve these problems. We first present an unconstrained objective function, which can be minimized to obtain the principal components. We derive adaptive algorithms from this objective function by using: (1) gradient descent; (2) steepest descent; (3) conjugate direction; and (4) Newton-Raphson methods. Although gradient descent produces Xu's LMSER algorithm, the steepest descent, conjugate direction, and Newton-Raphson methods produce new adaptive algorithms for PCA. We also provide a discussion on the landscape of the objective function, and present a global convergence proof of the adaptive gradient descent PCA algorithm using stochastic approximation theory. Extensive experiments with stationary and nonstationary multidimensional Gaussian sequences show faster convergence of the new algorithms over the traditional gradient descent methods. We also compare the steepest descent adaptive algorithm with state-of-the-art methods on stationary and nonstationary sequences.
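
    As a point of reference, the sketch below (a hedged illustration, not the paper's algorithm) implements the classical gradient-style PCA baseline of the kind attributed above to Sanger, including the hand-tuned decaying gain sequence that the proposed adaptive algorithms aim to remove.

```python
import numpy as np

def gha_step(W, x, eta):
    """One step of Sanger's generalized Hebbian rule; W is (k, d), x is (d,)."""
    y = W @ x  # projections onto the current component estimates
    # the lower-triangular term deflates earlier components from later ones
    return W + eta * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)

rng = np.random.default_rng(0)
d, k = 10, 3
W = 0.1 * rng.standard_normal((k, d))
data = rng.standard_normal((5000, d)) @ rng.standard_normal((d, d))  # correlated samples
for t, x in enumerate(data, start=1):
    W = gha_step(W, x, eta=1.0 / (100.0 + t))  # hand-chosen decaying gain sequence
```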

  • Neurocontroller alternatives for “fuzzy” ball-and-beam systems with nonuniform nonlinear friction

    Publication Year: 2000 , Page(s): 423 - 435
    Cited by:  Papers (29)

    The ball-and-beam problem is a benchmark for testing control algorithms. Zadeh (1994) proposed a twist to the problem which, he suggested, would require a fuzzy logic controller. This experiment uses a beam, partially covered with a sticky substance, increasing the difficulty of predicting the ball's motion. We complicated this problem even more by not using any information concerning the ball's velocity. Although it is common to use the first differences of the ball's consecutive positions as a measure of velocity and explicit input to the controller, we preferred to exploit recurrent neural networks, inputting only consecutive positions instead. We have used truncated backpropagation through time with the node-decoupled extended Kalman filter (NDEKF) algorithm to update the weights in the networks. Our best neurocontroller uses a form of approximate dynamic programming called an adaptive critic design. A hierarchy of such designs exists. Our system uses dual heuristic programming (DHP), an upper-level design. To the best of our knowledge, our results are the first use of DHP to control a physical system. It is also the first system we know of to respond to Zadeh's challenge. We do not claim this neural network control algorithm is the best approach to this problem, nor do we claim it is better than a fuzzy controller. It is instead a contribution to the scientific dialogue about the boundary between the two overlapping disciplines.

  • Bayes-optimality motivated linear and multilayered perceptron-based dimensionality reduction

    Publication Year: 2000 , Page(s): 452 - 463
    Cited by:  Papers (6)

    Dimensionality reduction is the process of mapping high-dimensional patterns to a lower dimensional subspace. When done prior to classification, estimates obtained in the lower dimensional subspace are more reliable. For some classifiers, there is also an improvement in performance due to the removal of the diluting effect of redundant information. A majority of the present approaches to dimensionality reduction are based on scatter matrices or other statistics of the data which do not directly correlate with classification accuracy. The optimality criterion of choice for the purposes of classification is the Bayes error. Usually, however, the Bayes error is difficult to express analytically. We propose an optimality criterion based on an approximation of the Bayes error and use it to formulate a linear and a nonlinear method of dimensionality reduction. The nonlinear method relies on a multilayered perceptron which produces the lower dimensional representation as its output; it thus differs from autoassociative multilayered perceptrons, which have also been proposed and used for dimensionality reduction. Our results show that the nonlinear method is, as anticipated, superior to the linear method in that it can unfold a nonlinear manifold. In addition, the nonlinear method provides a substantially better lower dimensional representation (for classification purposes) than Fisher's linear discriminant (FLD) and two other nonlinear methods of dimensionality reduction that are often used.

  • On equilibria, stability, and instability of Hopfield neural networks

    Publication Year: 2000 , Page(s): 534 - 540
    Cited by:  Papers (40)

    Existence and uniqueness of the equilibrium of a continuous-time Hopfield neural network, as well as its stability and instability, are studied. A set of new and simple sufficient conditions is derived.
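
    For context, the continuous-time Hopfield network referred to here is usually written in the standard form below (generic notation; the particular assumptions on the interconnection matrix T and the activations g under which the new conditions hold are given in the paper):

```latex
C_i \frac{du_i}{dt} = -\frac{u_i}{R_i} + \sum_{j=1}^{n} T_{ij}\, g_j(u_j) + I_i ,
\qquad i = 1, \dots, n ,
```

    and the equilibria studied are the points at which the right-hand side vanishes for all i.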

  • Bayesian nonlinear model selection and neural networks: a conjugate prior approach

    Publication Year: 2000 , Page(s): 265 - 278
    Cited by:  Papers (13)

    In order to select the best predictive neural-network architecture in a set of several candidate networks, we propose a general Bayesian nonlinear regression model comparison procedure, based on the maximization of an expected utility criterion. This criterion selects the model under which the training set achieves the highest level of internal consistency, through the predictive probability distribution of each model. The density of this distribution is computed as the model posterior predictive density and is asymptotically approximated from the assumed Gaussian likelihood of the data set and the related conjugate prior density of the parameters. The use of such a conjugate prior allows the analytic calculation of the parameter posterior and predictive posterior densities, in an empirical Bayes-like approach. This Bayesian selection procedure allows us to compare general nonlinear regression models, and in particular feedforward neural networks, in addition to embedded models, as is usual with asymptotic comparison tests.

  • The layer-wise method and the backpropagation hybrid approach to learning a feedforward neural network

    Publication Year: 2000 , Page(s): 295 - 305
    Cited by:  Papers (11)

    Feedforward neural networks (FNNs) have been proposed to solve complex problems in pattern recognition, classification, and function approximation. Despite the general success of learning methods for FNNs, such as the backpropagation (BP) algorithm, second-order optimization algorithms, and layer-wise learning algorithms, several drawbacks remain to be overcome. In particular, two major drawbacks are convergence to local minima and long learning time. We propose an efficient learning method for an FNN that combines the BP strategy with layer-by-layer optimization. More precisely, we construct the layer-wise optimization method using the Taylor series expansion of the nonlinear operators describing the FNN and propose to update the weights of each layer by a BP-based Kaczmarz iterative procedure. The experimental results show that the new learning algorithm is stable, reduces the learning time, and improves generalization in comparison with other well-known methods.
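
    The Kaczmarz procedure mentioned above is, in its classical form, a row-by-row projection method for linear systems. The hedged sketch below shows only that generic building block; the paper's BP-based, per-layer linearized variant (ordering, linearization, and stopping rule) is not reproduced.

```python
import numpy as np

def kaczmarz(A, b, sweeps=50):
    """Classical Kaczmarz iteration for the linear system A x = b."""
    x = np.zeros(A.shape[1])
    for _ in range(sweeps):
        for a_i, b_i in zip(A, b):
            # project x onto the hyperplane {x : a_i . x = b_i}
            x += (b_i - a_i @ x) / (a_i @ a_i) * a_i
    return x

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
print(kaczmarz(A, b))  # converges toward [0.8, 1.4], the solution of A x = b
```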

  • On the choice of parameters of the cost function in nested modular RNN's

    Publication Year: 2000 , Page(s): 315 - 322
    Cited by:  Papers (11)

    We address the choice of the coefficients in the cost function of a modular nested recurrent neural-network (RNN) architecture, known as the pipelined recurrent neural network (PRNN). Such a network can cope with the problem of vanishing gradient, experienced in prediction with RNN's. Constraints on the coefficients of the cost function, in the form of a vector norm, are considered. Unlike the previous cost function for the PRNN, which included a forgetting factor motivated by the recursive least squares (RLS) strategy, the proposed forms of cost function provide “forgetting” of the outputs of adjacent modules based upon the network architecture. Such an approach takes into account the number of modules in the PRNN, through the unit norm constraint on the coefficients of the cost function of the PRNN. This is shown to be particularly suitable, since due to inherent nesting in the PRNN, every module gives its full contribution to the learning process, whereas the unit norm constrained cost function introduces a sense of forgetting in the memory management of the PRNN. The PRNN based upon a modified cost function outperforms existing PRNN schemes in the time series prediction simulations presented.
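
    One way to read the constrained cost described above (an illustrative rendering based on the abstract, not necessarily the paper's exact notation) is as a weighted sum of the squared instantaneous output errors of the M nested modules, with the coefficient vector held at unit norm rather than generated by an RLS-style forgetting factor:

```latex
J(k) = \sum_{i=1}^{M} \lambda_i \, e_i^2(k),
\qquad \lVert \boldsymbol{\lambda} \rVert = 1 ,
```

    where e_i(k) denotes the prediction error of the i-th module at time k.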

  • On the optimality of neural-network approximation using incremental algorithms

    Publication Year: 2000 , Page(s): 323 - 337
    Cited by:  Papers (9)

    The problem of approximating functions by neural networks using incremental algorithms is studied. For functions belonging to a rather general class, characterized by certain smoothness properties with respect to the L2 norm, we compute upper bounds on the approximation error where error is measured by the Lq norm, 1⩽q⩽∞. These results extend previous work, applicable in the case q=2, and provide an explicit algorithm to achieve the derived approximation error rate. In the range q⩽2 near-optimal rates of convergence are demonstrated. A gap remains, however, with respect to a recently established lower bound in the case q>2, although the rates achieved are provably better than those obtained by optimal linear approximation. Extensions of the results from the L2 norm to Lp are also discussed. A further interesting conclusion from our results is that no loss of generality is suffered using networks with positive hidden-to-output weights. Moreover, explicit bounds on the size of the hidden-to-output weights are established, which are sufficient to guarantee the established convergence rates.

  • k-nearest neighbors directed noise injection in multilayer perceptron training

    Publication Year: 2000 , Page(s): 504 - 511
    Cited by:  Papers (17)  |  Patents (1)

    The relation between classifier complexity and learning-set size is very important in discriminant analysis. One way to overcome the complexity control problem is to add noise to the training objects, thereby increasing the size of the training set. Both the amount and the direction of noise injection are important factors that determine its effectiveness for classifier training. In this paper, the effect of injecting Gaussian spherical noise and k-nearest-neighbors directed noise on the performance of multilayer perceptrons is studied. As an analytical investigation is impossible for multilayer perceptrons, a theoretical analysis is made for statistical classifiers. The goal is to get a better understanding of the effect of noise injection on the accuracy of sample-based classifiers. Both empirical and theoretical studies show that k-nearest-neighbors directed noise injection is preferable to Gaussian spherical noise injection for data with low intrinsic dimensionality.
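
    The sketch below illustrates k-nearest-neighbors directed noise injection as the abstract describes it: new training objects are generated near each original object, displaced toward its nearest neighbors rather than isotropically. The scaling factor lam and the sampling scheme are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def knn_directed_noise(X, k=2, n_new=1, lam=0.5, rng=None):
    """Enlarge training set X by adding points displaced toward k nearest neighbors."""
    rng = np.random.default_rng(rng)
    X_new = []
    for x in X:
        d = np.linalg.norm(X - x, axis=1)
        nn = np.argsort(d)[1:k + 1]            # nearest neighbors, skipping the point itself
        for _ in range(n_new):
            j = rng.choice(nn)
            alpha = lam * rng.random()          # random step toward the chosen neighbor
            X_new.append(x + alpha * (X[j] - x))
    return np.vstack([X, np.array(X_new)])      # original plus injected objects

X = np.random.default_rng(1).normal(size=(20, 5))
print(knn_directed_noise(X, k=3, n_new=2).shape)  # (60, 5)
```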

  • Discrete probability estimation for classification using certainty-factor-based neural networks

    Publication Year: 2000 , Page(s): 415 - 422

    Traditional probability estimation often demands a large amount of data for a problem of industrial scale. Neural networks have been used as an effective alternative for estimating input-output probabilities. In this paper, the certainty-factor-based neural network (CFNet) is explored for probability estimation in discrete domains. A new analysis presented here shows that the basis functions learned by the CFNet can bear precise semantics for dependencies. In the simulation study, the CFNet outperforms both the backpropagation network and the system based on the Rademacher-Walsh expansion. In the real-data experiments on splice junction and breast cancer data sets, the CFNet outperforms other neural networks and symbolic systems.

  • Selecting radial basis function network centers with recursive orthogonal least squares training

    Publication Year: 2000 , Page(s): 306 - 314
    Cited by:  Papers (78)

    Recursive orthogonal least squares (ROLS) is a numerically robust method for solving for the output layer weights of a radial basis function (RBF) network, and requires less computer memory than the batch alternative. In the paper, the use of ROLS is extended to selecting the centers of an RBF network. It is shown that the information available in an ROLS algorithm after network training can be used to sequentially select centers to minimize the network output error. This provides efficient methods for network reduction to achieve smaller architectures with acceptable accuracy and without retraining. Two selection methods are developed, forward and backward. The methods are illustrated in applications of RBF networks to modeling a nonlinear time series and a real multi-input multi-output chemical process. The final network models obtained achieve acceptable accuracy with significant reductions in the number of required centers.
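
    The sketch below illustrates the general idea of forward center selection for an RBF network by greedy error reduction. It is a plain batch least-squares rendering under assumed Gaussian basis functions, not the recursive orthogonal least squares (ROLS) formulation or the backward method developed in the paper.

```python
import numpy as np

def rbf_design(X, centers, width):
    """Gaussian RBF design matrix: one column per center."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.exp(-(d / width) ** 2)

def forward_select_centers(X, y, width, n_centers):
    """Greedily add the candidate center that most reduces the output error."""
    chosen, candidates = [], list(range(len(X)))
    for _ in range(n_centers):
        best, best_err = None, np.inf
        for c in candidates:
            P = rbf_design(X, X[chosen + [c]], width)
            w, *_ = np.linalg.lstsq(P, y, rcond=None)
            err = np.sum((y - P @ w) ** 2)
            if err < best_err:
                best, best_err = c, err
        chosen.append(best)
        candidates.remove(best)
    return X[chosen]
```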

  • The equivalence between fuzzy logic systems and feedforward neural networks

    Publication Year: 2000 , Page(s): 356 - 365
    Cited by:  Papers (19)

    Demonstrates that fuzzy logic systems and feedforward neural networks are equivalent in essence. First, we introduce the concept of interpolation representations of fuzzy logic systems and several important conclusions. We then define mathematical models for rectangular wave neural networks and nonlinear neural networks. With these definitions, we prove that nonlinear neural networks can be represented by rectangular wave neural networks. Based on this result, we prove the equivalence between fuzzy logic systems and feedforward neural networks. This result provides a useful guideline for theoretical research on, and applications of, fuzzy logic systems, neural networks, and neuro-fuzzy systems.

  • Learning neural networks with noisy inputs using the errors-in-variables approach

    Publication Year: 2000 , Page(s): 402 - 414
    Cited by:  Papers (7)  |  Patents (1)

    Currently, most learning algorithms for neural-network modeling are based on the output error approach, using a least squares cost function. This method provides good results when the network is trained with noisy output data and known inputs. Special care must be taken, however, when training the network with noisy input data, or when both inputs and outputs contain noise. This paper proposes a novel cost function for learning neural networks with noisy inputs, based on the errors-in-variables stochastic framework. A learning scheme is presented, and examples are given demonstrating the improved performance in neural-network curve fitting, at the cost of increased computation time.
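
    A generic errors-in-variables cost of the kind the abstract alludes to treats the noise-free inputs as additional unknowns and penalizes both input and output residuals. The form below is only an illustrative sketch under assumed Gaussian noise with variances σx² and σy²; the paper's exact cost function and learning scheme are not reproduced here.

```latex
J\bigl(\theta, \{\hat{x}_i\}\bigr)
  = \sum_{i=1}^{N}
    \left[
      \frac{\lVert y_i - f_\theta(\hat{x}_i) \rVert^2}{\sigma_y^2}
      + \frac{\lVert x_i - \hat{x}_i \rVert^2}{\sigma_x^2}
    \right] ,
```

    minimized jointly over the network parameters θ and the estimated true inputs \hat{x}_i.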

  • Constructive neural-network learning algorithms for pattern classification

    Publication Year: 2000 , Page(s): 436 - 451
    Cited by:  Papers (57)

    Constructive learning algorithms offer an attractive approach for the incremental construction of near-minimal neural-network architectures for pattern classification. They help overcome the need for ad hoc and often inappropriate choices of network topology in algorithms that search for suitable weights in a priori fixed network architectures. Several such algorithms have been proposed in the literature and shown to converge to zero classification errors (under certain assumptions) on tasks that involve learning a binary to binary mapping (i.e., classification problems involving binary-valued input attributes and two output categories). We present two constructive learning algorithms, MPyramid-real and MTiling-real, that extend the pyramid and tiling algorithms, respectively, for learning real to M-ary mappings (i.e., classification problems involving real-valued input attributes and multiple output classes). We prove the convergence of these algorithms and empirically demonstrate their applicability to practical pattern classification problems. Additionally, we show how the incorporation of a local pruning step can eliminate several redundant neurons from MTiling-real networks.

  • A class of learning algorithms for principal component analysis and minor component analysis

    Publication Year: 2000 , Page(s): 529 - 533
    Cited by:  Papers (16)

    Principal component analysis (PCA) and minor component analysis (MCA) are powerful methodologies for a wide variety of applications such as pattern recognition and signal processing. In this paper, we first propose a differential equation for the generalized eigenvalue problem. We prove that the stable points of this differential equation are the eigenvectors corresponding to the largest eigenvalue. Based on this generalized differential equation, a class of PCA and MCA learning algorithms can be obtained. We demonstrate that many existing PCA and MCA learning algorithms are special cases of this class, and that the class includes some new and simpler MCA learning algorithms. Our results show that all learning algorithms in this class have the same order of convergence speed and are robust to implementation error.
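
    For orientation, the best-known differential equation of this kind for the ordinary (not generalized) eigenvalue problem is the Oja flow shown below, whose stable points are the unit eigenvectors of the data covariance C associated with the largest eigenvalue; the paper extends this style of analysis to the generalized eigenvalue problem and derives both PCA and MCA algorithms from it.

```latex
\frac{dw}{dt} = C\,w - \bigl(w^{\top} C\, w\bigr)\, w .
```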

  • Centroid neural network for unsupervised competitive learning

    Publication Year: 2000 , Page(s): 520 - 528
    Cited by:  Papers (18)

    An unsupervised competitive learning algorithm based on the classical k-means clustering algorithm is proposed. The proposed learning algorithm, called the centroid neural network (CNN), estimates centroids of the related cluster groups in the training data. This paper also explains algorithmic relationships among the CNN and some conventional unsupervised competitive learning algorithms, including Kohonen's self-organizing map and Kosko's differential competitive learning algorithm. The CNN algorithm requires neither a predetermined schedule for the learning coefficient nor a total number of iterations for clustering. Simulation results on clustering problems and image compression problems show that the CNN converges much faster than conventional algorithms with comparable clustering quality, while the other algorithms may give unstable results depending on the initial values of the learning coefficient and the total number of iterations.
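
    The running-mean centroid update underlying this family of algorithms is summarized below. This is a minimal k-means-style sketch only; the CNN's own winner/loser bookkeeping and stopping behavior are not reproduced.

```python
import numpy as np

def incremental_kmeans(X, k, sweeps=10, seed=0):
    """Online k-means: winner centroid moves to the running mean of its samples."""
    rng = np.random.default_rng(seed)
    w = X[rng.choice(len(X), size=k, replace=False)].copy()  # initial centroids
    counts = np.zeros(k)
    for _ in range(sweeps):
        for x in X:
            j = np.argmin(np.linalg.norm(w - x, axis=1))      # winning centroid
            counts[j] += 1
            w[j] += (x - w[j]) / counts[j]                    # running-mean update
    return w

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 0.3, size=(50, 2)) for m in (0.0, 3.0, 6.0)])
print(incremental_kmeans(X, k=3))
```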

  • Combination of artificial neural-network forecasters for prediction of natural gas consumption

    Publication Year: 2000 , Page(s): 464 - 473
    Cited by:  Papers (29)

    The focus of this paper is on the combination of artificial neural-network (ANN) forecasters with application to the prediction of daily natural gas consumption needed by gas utilities. ANN forecasters can model the complex relationship between weather parameters, previous gas consumption, and future consumption. A two-stage system is proposed, with the first stage containing two ANN forecasters: a multilayer feedforward ANN and a functional link ANN. These forecasters are initially trained with the error backpropagation algorithm, but an adaptive strategy is employed to adjust their weights during online forecasting. The second stage consists of a combination module that mixes the two individual forecasts produced in the first stage. Eight different combination algorithms are examined; they are based on averaging, recursive least squares, fuzzy logic, feedforward ANN, functional link ANN, a temperature-space approach, Karmarkar's linear programming algorithm (1984), and an adaptive mixture of local experts (modular neural networks). The performance is tested on real data from six different gas utilities. The results indicate that combination strategies based on a single ANN outperform the other approaches.
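
    The two simplest combination schemes named above can be sketched as follows: plain averaging of the individual forecasts, and a least-squares weighting fitted on past forecasts and actuals. This is a hedged illustration only; the paper's remaining six schemes (fuzzy logic, ANN combiners, the temperature-space approach, Karmarkar's algorithm, mixture of experts) are not shown.

```python
import numpy as np

def combine_average(forecasts):
    """forecasts: (n_forecasters, horizon) array -> simple averaged forecast."""
    return forecasts.mean(axis=0)

def combine_least_squares(past_forecasts, past_actuals, new_forecasts):
    """Fit weights w minimizing ||past_actuals - past_forecasts.T @ w||, then combine."""
    w, *_ = np.linalg.lstsq(past_forecasts.T, past_actuals, rcond=None)
    return new_forecasts.T @ w
```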

  • Recursive training of neural networks for classification

    Publication Year: 2000 , Page(s): 496 - 503
    Cited by:  Papers (2)

    A method for recursive training of neural networks for classification is proposed. It searches for the discriminant functions corresponding to several small local minima of the error function. The novelty of the proposed method lies in the transformation of the data into new training data with a deflated minimum of the error function and iteration to obtain the next solution. A simulation study and a character recognition application indicate that the proposed method has the potential to escape from local minima and to direct the local optimizer to new solutions.

  • Nonlinear adaptive control using networks of piecewise linear approximators

    Publication Year: 2000 , Page(s): 390 - 401
    Cited by:  Papers (30)

    Presents a stable nonparametric adaptive control approach using a piecewise local linear approximator. The continuous piecewise linear approximator is developed and its universal approximation capability is proved. The controller architecture is based on adaptive feedback linearization plus sliding mode control. A time varying activation region is introduced for efficient self-organization of the approximator during operation. We modify the adaptive control approach for piecewise linear approximation and self-organizing structures. In addition, we provide analyses of asymptotic stability of the tracking error and parameter convergence for the proposed adaptive control scheme with the online self-organizing structure. The method with a deadzone is also discussed to prevent a high-frequency input which might excite the unmodeled dynamics in practical applications. The application of the piecewise linear adaptive control method is demonstrated by a computational simulation.
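
    The sketch below shows a continuous piecewise linear approximator in one dimension, built from triangular "hat" basis functions on a uniform knot grid; the approximator's parameters are simply its values at the knots. The paper's approximator is the multivariable, self-organizing version embedded in the adaptive feedback-linearizing controller, which is not reproduced here.

```python
import numpy as np

def hat_basis(x, knots):
    """Triangular basis functions on a uniform grid; returns (len(x), len(knots))."""
    h = knots[1] - knots[0]
    return np.clip(1.0 - np.abs(x[:, None] - knots[None, :]) / h, 0.0, None)

knots = np.linspace(0.0, 1.0, 11)
x = np.linspace(0.0, 1.0, 200)
theta = np.sin(2 * np.pi * knots)          # parameters = function values at the knots
y_approx = hat_basis(x, knots) @ theta     # continuous piecewise linear approximation of sin
```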

  • Unsupervised feature evaluation: a neuro-fuzzy approach

    Publication Year: 2000 , Page(s): 366 - 376
    Cited by:  Papers (44)

    Demonstrates a way of formulating neuro-fuzzy approaches for both feature selection and extraction under unsupervised learning. A fuzzy feature evaluation index for a set of features is defined in terms of the degree of similarity between two patterns in both the original and transformed feature spaces. A concept of a flexible membership function incorporating weighted distance is introduced for computing membership values in the transformed space. Two new layered networks are designed. The tasks of membership computation and minimization of the evaluation index, through an unsupervised learning process, are embedded into them without requiring information on the number of clusters in the feature space. The network for feature selection results in an optimal order of individual importance of the features. The other network extracts a set of optimum transformed features, by projecting the n-dimensional original space directly onto an n'-dimensional (n'<n) transformed space, along with their relative importance. The superiority of the networks to some related ones is established experimentally.

  • Extracting M-of-N rules from trained neural networks

    Publication Year: 2000 , Page(s): 512 - 519
    Cited by:  Papers (37)

    An effective algorithm for extracting M-of-N rules from trained feedforward neural networks is proposed. First, we train a network where each input of the data can only take one of the two possible values, -1 or 1. Next, we apply the hyperbolic tangent function to each connection from the input layer to the hidden layer of the network. By applying this squashing function, the activation values at the hidden units are effectively computed as the hyperbolic tangent (or the sigmoid) of the weighted inputs, where the weights have magnitudes equal to one. By restricting the inputs and the weights to the binary values -1 or 1, the extraction of M-of-N rules from the networks becomes trivial. We demonstrate the effectiveness of the proposed algorithm on several widely tested datasets. For datasets consisting of thousands of patterns with many attributes, the rules extracted by the algorithm are simple and accurate.
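
    As a small illustration of what an extracted M-of-N rule computes, the rule fires when at least M of its N listed conditions hold. The condition encoding below (a literal as a pair of attribute index and required value in {-1, +1}) is an illustrative assumption, not the paper's exact rule format.

```python
def m_of_n_fires(x, literals, m):
    """x: sequence of +/-1 attribute values; literals: [(index, required_value), ...]."""
    satisfied = sum(1 for i, v in literals if x[i] == v)
    return satisfied >= m

# Example rule: "at least 2 of {x0 = 1, x2 = -1, x3 = 1}"
print(m_of_n_fires([1, -1, -1, -1], [(0, 1), (2, -1), (3, 1)], m=2))  # True
```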

  • A local linearized least squares algorithm for training feedforward neural networks

    Publication Year: 2000 , Page(s): 487 - 495
    Cited by:  Papers (5)  |  Patents (1)

    In training the weights of a feedforward neural network, it is well known that the global extended Kalman filter (GEKF) algorithm has much better performance than the popular gradient descent with error backpropagation in terms of convergence and quality of solution. However, the GEKF is very computationally intensive, which has led to the development of efficient algorithms such as the multiple extended Kalman algorithm (MEKA) and the decoupled extended Kalman filter algorithm (DEKF), that are based on dimensional reduction and/or partitioning of the global problem. In this paper we present a new training algorithm, called local linearized least squares (LLLS), that is based on viewing the local system identification subproblems at the neuron level as recursive linearized least squares problems. The objective function of the least squares problems for each neuron is the sum of the squares of the linearized backpropagated error signals. The new algorithm is shown to give better convergence results for three benchmark problems in comparison to MEKA, and in comparison to DEKF for highly coupled applications. The performance of the LLLS algorithm approaches that of the GEKF algorithm in the experiments.

  • Extracting rules from trained neural networks

    Publication Year: 2000 , Page(s): 377 - 389
    Cited by:  Papers (45)

    Presents an algorithm for extracting rules from trained neural networks. The algorithm is a decompositional approach which can be applied to any neural network whose output function is monotone, such as a sigmoid function. Therefore, the algorithm can be applied to multilayer neural networks, recurrent neural networks and so on. It does not depend on training algorithms, and its computational complexity is polynomial. The basic idea is that the units of neural networks are approximated by Boolean functions. But the computational complexity of the approximation is exponential, and so a polynomial algorithm is presented. The author has applied the algorithm to several problems to extract understandable and accurate rules. The paper shows the results for the votes data, mushroom data, and others. The algorithm is extended to the continuous domain, where extracted rules are continuous Boolean functions. Roughly speaking, the representation by continuous Boolean functions means the representation using conjunction, disjunction, direct proportion, and reverse proportion. This paper shows the results for iris data.

  • Quad-Q-learning

    Publication Year: 2000 , Page(s): 279 - 294
    Cited by:  Papers (8)

    Develops the theory of quad-Q-learning, a learning algorithm that evolved from Q-learning. Quad-Q-learning is applicable to problems that can be solved by “divide and conquer” techniques. Quad-Q-learning concerns an autonomous agent that learns without supervision to act optimally to achieve specified goals. The learning agent acts in an environment that can be characterized by a state. In the Q-learning environment, when an action is taken, a reward is received and a single new state results. The objective of Q-learning is to learn a policy function that maps states to actions so as to maximize a function of the rewards, such as the sum of rewards. In quad-Q-learning, by contrast, when an action is taken from a state, either an immediate reward is received and no new state results, or no reward is received and four new states result from taking that action. The environment in which quad-Q-learning operates can thus be viewed as a hierarchy of states where lower-level states are the children of higher-level states. The hierarchical aspect of quad-Q-learning leads to a bottom-up view of learning that improves the efficiency of learning at higher levels in the hierarchy. The objective of quad-Q-learning is to maximize the sum of rewards obtained from each of the environments that result as actions are taken. Two versions of quad-Q-learning are discussed: discrete state, and mixed discrete and continuous state quad-Q-learning. The discrete state version is only applicable to problems with small numbers of states; scaling up to problems with practical numbers of states requires a continuous state learning method, which can be accomplished using functional approximation. The application of quad-Q-learning to image compression is briefly described.
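
    The tabular update suggested by the abstract can be sketched as follows: an action either yields an immediate reward and terminates that subproblem, or yields four child states whose values are summed (in contrast to the single discounted successor of ordinary Q-learning). The learning rate and the exact target form are illustrative assumptions, not the paper's formulation.

```python
from collections import defaultdict

Q = defaultdict(float)   # Q[(state, action)] value table
ALPHA = 0.1              # assumed learning rate

def quad_q_update(state, action, reward, children, actions):
    """One hedged quad-Q-style backup for taking `action` in `state`."""
    if children:          # the action produced four child subproblems
        target = sum(max(Q[(c, a)] for a in actions) for c in children)
    else:                 # the action solved this subproblem directly
        target = reward
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])
```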


Aims & Scope

IEEE Transactions on Neural Networks is devoted to the science and technology of neural networks; it publishes articles that disclose significant technical knowledge, exploratory developments, and applications of neural networks, from biology to software to hardware.

 

This Transactions ceased production in 2011. The current retitled publication is IEEE Transactions on Neural Networks and Learning Systems.
