A probabilistic approach which provides a modular and adaptive neural network architecture for discrimination

Author:

C. Monrocq; Thomson-CSF LCR, Paris, France

Concerns the supervised discrimination of a vector x ∈ R^p among K classes C_i, i = 1, ..., K. The discrimination consists in learning a discriminant function from a training set of N examples. In a Bayesian context, the discriminant function is the posterior probability of class C_i given the pattern x to classify, denoted P(C_i|x).

It is well known that multilayer perceptrons (MLPs) with a single hidden layer are universal classifiers, in the sense that they can approximate decision surfaces of arbitrary complexity provided the number of hidden neurons is large enough. Sometimes it is possible to decompose a classification problem that would require a large network into subproblems which are solved efficiently by simple modules (with few or no hidden neurons). To each subproblem corresponds a cluster within the data set on which a module acts as an expert. If back-propagation is used to train a single MLP to solve the global discrimination, and thus to perform all these subproblems at once, there will generally be strong interference effects, which can lead to slow learning and poor generalization; for these reasons the modular approach seems preferable.

A number of authors have suggested using a system composed of several different 'experts', one per subproblem. The author gives a theoretical justification for this approach by constructing the global discriminant functions P(C_i|x) from the outputs of the experts, which perform local discriminations within the clusters mentioned above. In a Bayesian context, this means that the global discriminant functions P(C_i|x) can be constructed from the discriminant functions of the individual subproblems. Two main hypotheses are made: the experts output probabilities, and information about the clusters is available.
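Under the two hypotheses above, the global posterior can be obtained from the local ones by the law of total probability over clusters: P(C_i|x) = sum_j P(C_i|x, cluster_j) P(cluster_j|x). A minimal NumPy sketch of this combination step (the function name, expert outputs, and gating values are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def combine_experts(expert_posteriors, cluster_posteriors):
    """Combine local expert posteriors into a global discriminant.

    expert_posteriors : (J, K) array, row j = P(C_i | x, cluster_j)
    cluster_posteriors: (J,)   array,         P(cluster_j | x)
    Returns a (K,) array with the global P(C_i | x).
    """
    expert_posteriors = np.asarray(expert_posteriors, dtype=float)
    cluster_posteriors = np.asarray(cluster_posteriors, dtype=float)
    # Law of total probability: weighted mixture of the expert posteriors.
    return cluster_posteriors @ expert_posteriors

# Hypothetical example: J = 2 experts over K = 3 classes.
experts = np.array([[0.8, 0.1, 0.1],
                    [0.2, 0.3, 0.5]])
gates = np.array([0.6, 0.4])  # P(cluster_j | x)
p = combine_experts(experts, gates)
# Each expert row and the gate vector sum to 1, so p also sums to 1.
```

Because the mixture weights and each expert's outputs are probability distributions, the combined output is automatically a valid probability distribution over the K classes.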
The author looks for appropriate output nonlinearities and for an appropriate criterion for updating the parameters of the neural networks. Two approaches are studied: with and without cooperation between modules during learning.
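A standard choice of output nonlinearity that makes a module's outputs probability-valued is the softmax, typically paired with a cross-entropy criterion. A minimal sketch of this pairing, offered as a common illustration rather than the specific design the paper settles on:

```python
import numpy as np

def softmax(z):
    """Map raw network outputs z to a probability distribution."""
    z = z - np.max(z)          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(p, target_index):
    """Negative log-likelihood of the true class under outputs p."""
    return -np.log(p[target_index])

# Hypothetical raw outputs of one module for K = 3 classes.
p = softmax(np.array([2.0, 1.0, 0.1]))
loss = cross_entropy(p, 0)  # criterion value if class 0 is the target
```

With this pairing, minimizing the cross-entropy over the training set drives the outputs toward estimates of the posterior probabilities P(C_i|x), which is what the Bayesian construction above requires of each expert.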

Published in:

Third International Conference on Artificial Neural Networks, 1993

Date of Conference:

25-27 May 1993