RNNs evolving on an equilibrium manifold: a panacea for vanishing and exploding gradients?
Date
2020
Authors
Kag, Anil
Zhang, Ziming
Saligrama, Venkatesh
OA Version
Published version
Citation
Anil Kag, Ziming Zhang, Venkatesh Saligrama. 2020. "RNNs Evolving on an Equilibrium Manifold: A Panacea for Vanishing and Exploding Gradients?" International Conference on Learning Representations (ICLR).
Abstract
Recurrent neural networks (RNNs) are particularly well-suited for modeling long-term dependencies in sequential data, but are notoriously hard to train because the error backpropagated in time either vanishes or explodes at an exponential rate. While a number of works attempt to mitigate this effect through gated recurrent units, skip connections, parametric constraints, and design choices, we propose a novel incremental RNN (iRNN), in which hidden state vectors keep track of incremental changes and, as such, approximate the state-vector increments of Rosenblatt's (1962) continuous-time RNNs. iRNN exhibits identity gradients and is able to account for long-term dependencies (LTD). We show that our method is computationally efficient, overcoming the overhead of many existing methods that attempt to improve RNN training, while suffering no performance degradation. We demonstrate the utility of our approach with extensive experiments and show competitive performance against standard LSTMs on LTD and other non-LTD tasks.
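To make the incremental-update idea in the abstract concrete, the sketch below implements a toy recurrent cell whose hidden state is driven toward an equilibrium anchored at the previous state via damped fixed-point iteration. This is a minimal illustration assuming a simple tanh recurrence; the update map, step size, and names (irnn_step, U, W, b, n_iters, step) are illustrative assumptions based only on the abstract, not the authors' exact formulation.

```python
import numpy as np

def irnn_step(h_prev, x, U, W, b, n_iters=30, step=0.5):
    """Illustrative incremental update: starting from the previous hidden
    state, repeatedly apply a damped fixed-point map until the increment
    settles. The specific map below is an assumption for illustration."""
    h = h_prev.copy()
    for _ in range(n_iters):
        # Candidate equilibrium: previous state plus a nonlinear increment.
        target = h_prev + np.tanh(U @ h + W @ x + b)
        # Damped move toward the candidate equilibrium.
        h = (1.0 - step) * h + step * target
    return h

# Toy usage: scan a short random sequence with small random parameters.
rng = np.random.default_rng(0)
d_h, d_x, T = 4, 3, 5
U = rng.normal(scale=0.1, size=(d_h, d_h))
W = rng.normal(scale=0.1, size=(d_h, d_x))
b = np.zeros(d_h)
h = np.zeros(d_h)
for x in rng.normal(size=(T, d_x)):
    h = irnn_step(h, x, U, W, b)
print(h)
```

Because each step only adds an increment to the previous state, the Jacobian of the new state with respect to the old one stays close to the identity in this toy setting with small recurrent weights, which is the intuition behind the abstract's claim of identity gradients.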