Udacity Deep Learning Nanodegree Notes and Thoughts [Lesson 4: RNNs and LSTMs]
Course Content:
The primary focus of this lesson was recurrent neural networks (RNNs), a class of neural networks that take not only the current input but also information carried over from previous time steps. This makes RNNs great for speech recognition and time-series prediction. A key building block covered in the lesson is the long short-term memory (LSTM) cell, a type of recurrent unit designed to help the network decide how much of the past to remember and how to use it. LSTMs are currently used in Amazon's Alexa, Apple's Siri and the QuickType function on the iPhone keyboard.
LSTMs and RNNs are widely used with numerous applications, but they are tricky concepts to understand. A plain RNN combines the current input with its hidden state from the previous step. An LSTM cell goes further: learned gates decide how much of the old state to forget, how much of the new input to write, and how much of the stored state to expose as output. It keeps two kinds of state (hence the name): a cell state that acts as long-term memory, carrying information across many time steps, and a hidden state that acts as short-term memory, reflecting the most recent steps and feeding directly into the current output.
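To make the gating idea above concrete for myself, here's a minimal NumPy sketch of a single LSTM step. The function name, the stacked weight layout, and the toy sizes are my own choices for illustration, not the course's code:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.
    W: (4H, D) input weights, U: (4H, H) recurrent weights, b: (4H,) bias.
    The four gates are stacked in order [forget, input, output, candidate]."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    f = sigmoid(z[0:H])          # forget gate: how much old cell state to keep
    i = sigmoid(z[H:2 * H])      # input gate: how much new candidate to write
    o = sigmoid(z[2 * H:3 * H])  # output gate: how much cell state to expose
    g = np.tanh(z[3 * H:4 * H])  # candidate values proposed from current input
    c = f * c_prev + i * g       # new cell state (long-term memory)
    h = o * np.tanh(c)           # new hidden state (short-term memory)
    return h, c

# Toy usage: input size D=3, hidden size H=4, random weights (arbitrary)
rng = np.random.default_rng(0)
D, H = 3, 4
W = rng.standard_normal((4 * H, D))
U = rng.standard_normal((4 * H, H))
b = np.zeros(4 * H)
h, c = lstm_step(rng.standard_normal(D), np.zeros(H), np.zeros(H), W, U, b)
```

The point of the two return values is exactly the long-term/short-term split: `c` can carry information across many steps largely untouched when the forget gate stays near 1, while `h` is what the next layer actually sees.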
Our first implementation of an RNN/LSTM network used the text of the novel Anna Karenina to generate new text character by character. The implementation was more involved than that of a feedforward or convolutional neural network. The hardest part for me was understanding the hyper-parameters. This was difficult in Lesson 3 as well, but the reason I found it tricky this time was that I wasn't sure what outcome to expect when modifying them. However, there was an entire lesson on how each hyper-parameter works, so it was great to have that as a reference while implementing the network.
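One part of the exercise where two of those hyper-parameters clicked for me was data batching: the batch size and sequence length together determine how the encoded text is carved up for training. Here's a rough NumPy sketch of that step; the function name and details are my own, not the course's solution code:

```python
import numpy as np

def get_batches(encoded, batch_size, seq_length):
    """Yield (inputs, targets) batches from an integer-encoded text.
    Targets are the inputs shifted one character ahead, so the network
    learns to predict the next character."""
    chars_per_batch = batch_size * seq_length
    n_batches = len(encoded) // chars_per_batch
    # Trim the text to full batches, then lay it out as one long row per
    # batch-sequence so hidden state can carry across windows.
    arr = encoded[:n_batches * chars_per_batch].reshape(batch_size, -1)
    shifted = np.roll(arr, -1, axis=1)  # next-character targets (wraps at the end)
    for i in range(0, arr.shape[1], seq_length):
        x = arr[:, i:i + seq_length]
        y = shifted[:, i:i + seq_length]
        yield x, y

# Toy usage: encode a short string and pull one batch
text = "anna karenina"
char2int = {c: k for k, c in enumerate(sorted(set(text)))}
encoded = np.array([char2int[c] for c in text])
x, y = next(get_batches(encoded, batch_size=2, seq_length=3))
```

Seen this way, `batch_size` controls how many sequences train in parallel, while `seq_length` controls how many steps of history each gradient update can see.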
Thoughts:
The lesson was tough to understand, and they didn't do as good of a job with their explanations as they did in previous lessons. I think this may be because I wasn't as fond of the instructor's teaching style and how she tried to explain new concepts. The diagrams helped, but I had to rewatch a lot of the videos and do extra research of my own to gain enough understanding to complete the mini-projects. The way they taught coding implementation in this lesson should have been different from the previous ones because of the added difficulty.
I feel that simply giving us the notebook and then providing a solution isn't an effective enough way to learn such a difficult concept. Since some of the tasks were tricky to implement, I looked at the solution code and wrote a line-by-line explanation of how it worked. Then I would go back and code it myself. I think they should change their coding exercises by providing a description of the general framework used for the mini-project. That would make it easier to understand how to code each task, because some of the exercises aren't something a student can code up by themselves without prior knowledge.
Overall, this was a challenging lesson, but an equally, if not more, rewarding one. I'm looking forward to Lesson 5!