COGNITIVE 2017, The Ninth International Conference on Advanced Cognitive Technologies and Applications
An Intrinsic Difference Between Vanilla RNNs and GRU Models
Authors:
Tristan Sterin
Nicolas Farrugia
Vincent Gripon
Keywords: Recurrent Neural Networks, Gradient Backpropagation, Grammatical Inference, Dynamical Systems
Abstract:
In order to perform well in practice, Recurrent Neural Networks (RNNs) require computationally heavy architectures, such as the Gated Recurrent Unit (GRU) or Long Short-Term Memory (LSTM). Indeed, the original Vanilla model fails to capture medium- and long-term sequential dependencies. The aim of this paper is to show that gradient training issues, which motivated the introduction of LSTM and GRU models, are not sufficient to explain the failure of the simplest RNN. Using the example of Reber’s grammar, we propose an experimental measure of both Vanilla and GRU models, which suggests an intrinsic difference in their dynamics. A better mathematical understanding of this difference could lead to more efficient models without compromising performance.
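For readers unfamiliar with Reber's grammar, the sketch below generates and validates strings from the standard (non-embedded) Reber grammar as it is commonly described in the literature. The abstract does not say which variant the paper uses, so the transition table, the state numbering, and the names `REBER_TRANSITIONS`, `generate_reber_string`, and `is_reber_string` are illustrative assumptions, not the authors' code.

```python
import random

# Assumed transition table for the standard Reber grammar.
# Each state maps to the (symbol, next_state) choices available from it.
# The state numbering is arbitrary and chosen for this sketch only.
REBER_TRANSITIONS = {
    0: [("T", 1), ("P", 2)],
    1: [("S", 1), ("X", 3)],
    2: [("T", 2), ("V", 4)],
    3: [("X", 2), ("S", 5)],
    4: [("P", 3), ("V", 5)],
}
FINAL_STATE = 5  # once reached, the string is closed with "E"


def generate_reber_string(rng=random):
    """Random walk through the grammar, from the opening "B" to the closing "E"."""
    state, symbols = 0, ["B"]
    while state != FINAL_STATE:
        symbol, state = rng.choice(REBER_TRANSITIONS[state])
        symbols.append(symbol)
    symbols.append("E")
    return "".join(symbols)


def is_reber_string(s):
    """Return True if s is accepted by the transition table above."""
    if len(s) < 2 or s[0] != "B" or s[-1] != "E":
        return False
    state = 0
    for ch in s[1:-1]:
        # The final state has no outgoing symbols, so .get() yields no choices.
        next_state = dict(REBER_TRANSITIONS.get(state, [])).get(ch)
        if next_state is None:
            return False
        state = next_state
    return state == FINAL_STATE
```

Strings produced this way are typically used for next-symbol prediction: at each position the model must output the set of symbols the grammar allows next, which requires tracking the current state across the self-loops on `S` and `T`.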
Pages: 76 to 81
Copyright: Copyright (c) IARIA, 2017
Publication date: February 19, 2017
Published in: conference
ISSN: 2308-4197
ISBN: 978-1-61208-531-9
Location: Athens, Greece
Dates: from February 19, 2017 to February 23, 2017