Variational Nested Dropout

Abstract:

Nested dropout is a variant of the dropout operation that can order network parameters or features according to a pre-defined importance during training. It has been explored for: I. Constructing nested nets (Cui et al. 2020; Cui et al. 2021): nested nets are neural networks whose architectures can be adjusted instantly at test time, e.g., to meet computational constraints. Nested dropout implicitly ranks the network parameters, generating a set of sub-networks such that any smaller sub-network forms the basis of a larger one. II. Learning ordered representations (Rippel et al. 2014): nested dropout applied to the latent representation of a generative model (e.g., an auto-encoder) ranks the features, enforcing an explicit order over the dimensions of the dense representation. However, the dropout rate is fixed as a hyper-parameter throughout training. For nested nets, this means that when network parameters are removed, performance decays along a human-specified trajectory rather than a trajectory learned from data. For generative models, the importance of features is specified as a constant vector, restricting the flexibility of representation learning. To address these problems, we focus on the probabilistic counterpart of nested dropout. We propose a variational nested dropout (VND) operation that draws samples of multi-dimensional ordered masks at a low cost, providing useful gradients to the parameters of nested dropout. Based on this approach, we design a Bayesian nested neural network that learns the order knowledge of the parameter distributions. We further exploit VND under different generative models to learn ordered latent distributions. In experiments, we show that the proposed approach outperforms nested networks in terms of accuracy, calibration, and out-of-domain detection on classification tasks. It also outperforms related generative models on data generation tasks.
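To make the ordering mechanism concrete, below is a minimal sketch of the original (non-variational) nested dropout of Rippel et al. 2014, written in PyTorch. The function name, the geometric prior, and the parameter p are illustrative choices, not the paper's implementation:

```python
import torch

def nested_dropout(x: torch.Tensor, p: float = 0.1) -> torch.Tensor:
    """Nested (ordered) dropout over the last dimension of x.

    A unit is kept only if all lower-indexed, "more important" units are
    also kept: sample a truncation index b from a geometric prior and
    zero out every unit after it.
    """
    batch, K = x.shape
    # b ~ Geometric(p), clamped so the index stays inside {0, ..., K-1}.
    b = torch.distributions.Geometric(probs=torch.tensor(p)).sample((batch,))
    b = b.long().clamp(max=K - 1)                          # (batch,)
    # Ordered mask: 1 for indices <= b, 0 for all later indices.
    idx = torch.arange(K, device=x.device).unsqueeze(0)    # (1, K)
    mask = (idx <= b.unsqueeze(1)).to(x.dtype)             # (batch, K)
    return x * mask
```

Applied to an auto-encoder latent code, e.g. z = nested_dropout(encoder(x)), low-indexed units are dropped least often and therefore absorb the most important information. Note that b is sampled discretely, so no gradient reaches p; the rate stays a fixed hyper-parameter, which is exactly the limitation described above.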
Page(s): 10519 - 10534
Date of Publication: 20 February 2023

PubMed ID: 37027650

I. Introduction

Modern deep neural networks (DNNs) have achieved great success in supervised learning and representation learning. At the same time, there is a growing demand for models that learn ordered information from data, in both the model architecture and the learned representations.
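The abstract states that VND instead draws multi-dimensional ordered masks in a way that provides useful gradients to the nested-dropout parameters. As a hypothetical illustration of how an ordered mask can be made differentiable, the sketch below uses a standard Gumbel-softmax relaxation; this is an assumption for illustration, not the paper's actual construction:

```python
import torch
import torch.nn.functional as F

def relaxed_ordered_mask(logits: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """Differentiable ordered mask (hypothetical sketch, not VND itself).

    logits: (batch, K) unnormalized log-probabilities over the truncation
    index. A Gumbel-softmax sample gives a soft one-hot index; a reversed
    cumulative sum turns it into a soft downward-step mask, so gradients
    flow back to the logits that parameterize the dropout.
    """
    soft_onehot = F.gumbel_softmax(logits, tau=tau, hard=False)  # (batch, K)
    # mask[i] = sum over j >= i of soft_onehot[j]: ~1 before the sampled
    # index, ~0 after it; recovers the hard ordered mask as tau -> 0.
    mask = torch.flip(
        torch.cumsum(torch.flip(soft_onehot, dims=[-1]), dim=-1), dims=[-1]
    )
    return mask
```

Learning the logits from data, rather than fixing a dropout rate, is what allows the performance-decay trajectory of a nested net, or the feature-importance ordering of a generative model, to be learned instead of hand-specified.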
