Abstract:
The Long Short-Term Memory (LSTM) architecture is one of the most successful types of Recurrent Neural Networks (RNNs). However, the number of parameters that LSTMs need to achieve acceptable performance may be larger than desired for standard devices. In this work, an Extended LSTM (E-LSTM) architecture is proposed to reduce the number of parameters needed to achieve performance similar to LSTMs. The E-LSTM adds explicit connections between distant past cell states and the current cell state, making it more likely that information is transmitted between data points separated by long lags. An analysis of possible nonlinear relations in the data sets, performed using the Distance Correlation (DC) method, guides where these added connections are placed. The proposed E-LSTM reduces the number of parameters needed, in some cases by an order of magnitude, at the cost of acceptable increases in the CPU time and memory required for training.
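A minimal NumPy sketch of the two ideas summarized above. The abstract does not give the E-LSTM equations, so the extra gate `k` feeding a lagged cell state back into the update, the lag parameter `tau`, and all function and weight names here are illustrative assumptions, not the authors' formulation; only the Distance Correlation formula itself (Székely et al.) is standard.

```python
import numpy as np

def distance_correlation(x, y):
    """Sample Distance Correlation between two 1-D samples (Szekely et al.)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    a = np.abs(x[:, None] - x[None, :])  # pairwise distance matrices
    b = np.abs(y[:, None] - y[None, :])
    # Double centering: subtract row means, column means, add grand mean.
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return np.sqrt(dcov2 / denom) if denom > 0 else 0.0

def pick_lag(series, max_lag):
    """Pick the lag with the strongest (possibly nonlinear) dependence,
    as one plausible way to place the added long-lag connection."""
    scores = [distance_correlation(series[:-k], series[k:])
              for k in range(1, max_lag + 1)]
    return 1 + int(np.argmax(scores))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def e_lstm_step(x_t, h_prev, cells, tau, W, U, b):
    """One cell update. `cells` holds past cell states, newest last.
    Gates i, f, o, g are the standard LSTM gates; the extra gate `k`
    scaling the distant cell state c_{t-tau} is this sketch's reading
    of the abstract's 'explicit connectivity' between distant past and
    current cell states."""
    z = {name: W[name] @ x_t + U[name] @ h_prev + b[name] for name in W}
    i, f, o, k = (sigmoid(z[name]) for name in ("i", "f", "o", "k"))
    g = np.tanh(z["g"])
    c_lag = cells[-tau] if len(cells) >= tau else np.zeros_like(cells[-1])
    c_t = f * cells[-1] + i * g + k * c_lag  # explicit long-lag path
    h_t = o * np.tanh(c_t)
    return h_t, c_t
```

Under this reading, the added cost is one extra gate's worth of weights plus a buffer of `tau` past cell states per layer, which is consistent with the abstract's note that training time and memory increase while the overall parameter budget can shrink if smaller hidden sizes suffice.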
Date of Conference: 18-23 July 2022
Date Added to IEEE Xplore: 30 September 2022