Stochastic approximation provides a simple and effective approach for finding roots and minima of functions whose evaluations are contaminated with noise. We investigate variants of the random direction stochastic approximation (RDSA) algorithm for optimizing noisy loss functions in high-dimensional spaces. The most popular variant is random selection from a Bernoulli distribution, which also goes by the name simultaneous perturbation stochastic approximation (SPSA). Viable alternatives include an axis-aligned distribution, normal distribution, and uniform distribution on a spherical shell. Although there are special cases where the Bernoulli distribution is optimal, we identify other cases where it performs worse than the alternatives. We show that performance depends on the orientation of the loss function with respect to its coordinate axes, and consider averages over all orientations. We find that the average asymptotic performance depends only on the radial fourth moment of the distribution of random directions, and is identical for the Bernoulli, axis-aligned, and spherical shell distributions. Of these variants, the spherical shell is optimal in the sense of minimum variance over random orientations of the loss function with respect to the coordinate axes. We also show that for unaligned loss functions, the performance of the Kiefer-Wolfowitz-Blum finite difference stochastic approximation (FDSA) is asymptotically equivalent to that of the RDSA algorithms, and we observe numerically that the pre-asymptotic performance of FDSA is often superior.
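To make the four direction distributions concrete, the following is a minimal sketch of a single RDSA gradient estimate, not the authors' implementation. It assumes the common two-sided form g = [(L(x + c ξ) − L(x − c ξ)) / (2c)] ξ and a normalization E[ξ ξᵀ] = I, which is why the axis-aligned and spherical shell directions are scaled to length √d. The function names `sample_direction` and `rdsa_gradient`, the string labels for the variants, and the perturbation size `c` are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_direction(kind, d, rng):
    """Draw a random direction xi in R^d, scaled so that E[xi xi^T] = I (assumed normalization)."""
    if kind == "bernoulli":
        # SPSA-style: every coordinate is independently +1 or -1.
        return [rng.choice((-1.0, 1.0)) for _ in range(d)]
    if kind == "axis":
        # One randomly chosen coordinate axis with a random sign, length sqrt(d).
        xi = [0.0] * d
        xi[rng.randrange(d)] = rng.choice((-1.0, 1.0)) * math.sqrt(d)
        return xi
    if kind == "normal":
        # Independent standard normal coordinates.
        return [rng.gauss(0.0, 1.0) for _ in range(d)]
    if kind == "shell":
        # Uniform on the sphere of radius sqrt(d): normalize a Gaussian vector.
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in g))
        return [math.sqrt(d) * v / norm for v in g]
    raise ValueError(f"unknown direction distribution: {kind}")


def rdsa_gradient(loss, x, c, kind, rng):
    """Two-sided random-direction estimate of the gradient of `loss` at `x`."""
    xi = sample_direction(kind, len(x), rng)
    x_plus = [xj + c * vj for xj, vj in zip(x, xi)]
    x_minus = [xj - c * vj for xj, vj in zip(x, xi)]
    # Directional slope along xi from two (possibly noisy) loss evaluations.
    slope = (loss(x_plus) - loss(x_minus)) / (2.0 * c)
    return [slope * vj for vj in xi]
```

Under this assumed normalization, the Bernoulli, axis-aligned, and spherical shell directions all have squared length exactly d, so their radial fourth moment is d², while the normal distribution gives d(d + 2). Each call to `rdsa_gradient` costs two loss evaluations regardless of the dimension d, and a full optimizer would feed the estimate into a gradient step with decaying gain sequences, which are left out of this sketch.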