Projective Fisher Information for Natural Gradient Descent


Impact Statement:
Deep Neural Networks achieve state of the art in various deep learning problems like computer vision and speech recognition. However, they are challenging to train, require extensive hyperparameter tuning, and take substantial training time running into many weeks. Hence it is important to ease the training by reducing the training time and amount of parameter tuning required. Natural gradient-based optimization methods reduce the training time and need for extensive tuning but are too complex for many networks. The training and analysis algorithm introduced here reduces the amount of time and effort required in performing the training significantly while not impacting the performance metric achieved at the given task.

Abstract:

Improvements in neural network optimization algorithms have enabled shorter training times and the ability to reach state-of-the-art performance on various machine learning tasks. Fisher-information-based natural gradient descent is one such second-order method that improves the convergence speed and the final performance metric achieved for many machine learning algorithms. Fisher information matrices are also helpful for analyzing the properties and expected behavior of neural networks. However, natural gradient descent is a high-complexity method due to the need to maintain and invert covariance matrices. This is especially the case with modern deep neural networks, which have a very high number of parameters, and for which the problem often becomes computationally infeasible. We suggest using the Fisher information for analysis of the parameter space of fully connected and convolutional neural networks without calculating the matrix itself. We also propose a lower complexity natural gradi...
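For orientation, the generic natural gradient update that the abstract refers to can be sketched as follows. This is a minimal illustration of standard damped natural gradient descent on a small logistic regression problem, not the projective method proposed in the paper. The function names, the synthetic data, and the `lr` and `damping` values are all assumptions chosen for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nll(w, X, y):
    """Mean negative log-likelihood of a logistic regression model."""
    p = sigmoid(X @ w)
    eps = 1e-12  # avoids log(0)
    return -np.mean(y * np.log(p + eps) + (1.0 - y) * np.log(1.0 - p + eps))

def natural_gradient_step(w, X, y, lr=0.5, damping=1e-3):
    """One damped natural gradient step (illustrative, hypothetical helper)."""
    n = X.shape[0]
    p = sigmoid(X @ w)
    # Gradient of the mean negative log-likelihood.
    grad = X.T @ (p - y) / n
    # Fisher information under the model's own predictive distribution:
    # F = mean over samples of p(1 - p) * x x^T, a d x d matrix.
    fisher = (X * (p * (1.0 - p))[:, None]).T @ X / n
    # Solve (F + damping * I) d = grad instead of forming an explicit inverse.
    direction = np.linalg.solve(fisher + damping * np.eye(w.size), grad)
    return w - lr * direction

# Synthetic data (assumed purely for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = (rng.uniform(size=200) < sigmoid(X @ true_w)).astype(float)

w = np.zeros(3)
loss_before = nll(w, X, y)
for _ in range(20):
    w = natural_gradient_step(w, X, y)
loss_after = nll(w, X, y)
```

The linear solve against the d x d Fisher matrix costs on the order of d^3 operations and d^2 memory, which is manageable for three parameters but is the step that becomes impractical for networks with millions of parameters.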
Published in: IEEE Transactions on Artificial Intelligence (Volume: 4, Issue: 2, April 2023)
Page(s): 304 - 314
Date of Publication: 25 February 2022
Electronic ISSN: 2691-4581

Department of Electrical Engineering, Indian Institute of Technology - Delhi, New Delhi, India
Piyush Kaul (Graduate Student Member, IEEE) received the B.E. degree in electronics and communication engineering from the Delhi College of Engineering, Delhi University, Delhi, India, in 1999. He is currently working toward the Ph.D. degree in electrical engineering with the Indian Institute of Technology Delhi, New Delhi, India.
Since 2013, he has been working as an Architect with the System Engineering Team (IPG), Cadence Design Systems, San Jose, CA, USA, for developing machine learning and artificial intelligence systems based on deep neural networks with a specific focus on computer vision. His primary research interest has been in the areas of mathematical characterization and understanding of deep neural networks.
Department of Electrical Engineering, Indian Institute of Technology - Delhi, New Delhi, India
Brejesh Lall (Member, IEEE) received the B.Tech. degree in electronics and communication engineering and the master’s degree in signal processing from the Delhi College of Engineering, New Delhi, India, in 1991 and 1992, and the Ph.D. degree in signal processing from the Indian Institute of Technology (IIT) Delhi, New Delhi, India, in 1999.
He is currently a Professor with the Department of Electrical Engineering, IIT Delhi. In 1997, he joined Aricent, Santa Clara, CA, USA. In 2005, he joined IIT Delhi as an Assistant Professor. He is currently a Professor with IIT Delhi and is actively working in the area of image processing, signal processing, and communication. His research interests include object representation, tracking and classification, odometry, depth map generation, representation and rendering, vector sensor-based underwater acoustic communications, characterization, analysis of neural networks, and performance issues in molecular communications.