Stochastic Natural Gradient Descent by estimation of empirical covariances

3 Author(s)
Luigi Malagò (Politecnico di Milano, Milano, Italy); Matteo Matteucci; Giovanni Pistone

Stochastic relaxation aims at finding the minimum of a fitness function by identifying a proper sequence of distributions, in a given model, that minimize the expected value of the fitness function. Different algorithms fit this framework, and they differ in the policy they implement to identify the next distribution in the model. In this paper we present two algorithms, in the stochastic relaxation framework, for the optimization of real-valued functions defined over binary variables: Stochastic Gradient Descent (SGD) and Stochastic Natural Gradient Descent (SNGD). These algorithms sample from a stochastic model, as Estimation of Distribution Algorithms (EDAs) do, but the estimation of the model from the population is replaced by a direct update of the model parameters through stochastic gradient descent. Both SGD and SNGD use statistical models in the exponential family, but they differ in the use of the natural gradient, first proposed by Amari in the context of Information Geometry. Due to the properties of the exponential family, both the gradient and the natural gradient can be evaluated in terms of covariances between the fitness function and the sufficient statistics of the exponential family. As computing the exact gradient is infeasible, we approximate it by evaluating empirical covariances. We test the performance of our algorithms on different standard benchmarks, and we compare the results with other well-known meta-heuristics in the framework of EDAs.
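
The covariance-based update described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the simplest exponential-family model over binary variables, the independence model p(x; theta) proportional to exp(theta · x) with sufficient statistics T_i(x) = x_i; the paper may use richer models. For an exponential family, the gradient of the expected fitness with respect to theta_i is Cov(f, T_i), and the natural gradient is obtained by multiplying by the inverse of the Fisher information matrix, which equals Cov(T_i, T_j). Both are estimated from a sample here. The function names, the step size, the population size, and the small ridge term that keeps the linear system solvable are all assumptions made for illustration.

```python
import numpy as np

def sample_independent(theta, size, rng):
    """Draw `size` points from p(x; theta) ∝ exp(theta · x), x in {0,1}^n."""
    p = 1.0 / (1.0 + np.exp(-theta))          # P(x_i = 1) under the model
    return (rng.random((size, theta.size)) < p).astype(float)

def empirical_gradients(X, f_vals, ridge=1e-3):
    """Estimate the gradient and natural gradient of E[f] from a sample.

    X      : (N, n) array of sampled points; here T(x) = x.
    f_vals : (N,) array of fitness values.
    ridge  : hypothetical regularizer added to the covariance matrix.
    """
    N, n = X.shape
    Tc = X - X.mean(axis=0)                   # centered sufficient statistics
    fc = f_vals - f_vals.mean()               # centered fitness
    cov_fT = Tc.T @ fc / N                    # empirical Cov(f, T_i)
    cov_TT = Tc.T @ Tc / N                    # empirical Fisher information
    nat = np.linalg.solve(cov_TT + ridge * np.eye(n), cov_fT)
    return cov_fT, nat

def sngd_minimize(f, n, steps=50, pop=200, lr=0.5, seed=0):
    """Run natural-gradient steps on theta to lower the expected fitness."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(n)                       # start from the uniform model
    for _ in range(steps):
        X = sample_independent(theta, pop, rng)
        f_vals = np.array([f(x) for x in X])
        _, nat = empirical_gradients(X, f_vals)
        theta -= lr * nat                     # descent: we are minimizing
    return theta

if __name__ == "__main__":
    # Toy fitness: number of ones, minimized at the all-zeros string.
    theta = sngd_minimize(lambda x: x.sum(), n=4, steps=20)
    print(theta)
```

Replacing `nat` with `cov_fT` in the update gives the plain stochastic gradient variant, so the two algorithms differ only in whether the empirical covariance matrix of the sufficient statistics is inverted.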

Note: As originally published the authors' names are given with surnames first as: Malago Luigi, Matteucci Matteo, Pistone Giovanni. This is corrected in the metadata so that the names read as: Malago, L., Matteo, M., and Pistone, G. The original PDF remains unchanged.  

Published in:

2011 IEEE Congress on Evolutionary Computation (CEC)

Date of Conference:

5-8 June 2011