I just started the deeplearning.ai specialization on Coursera to see if it lives up to the hype. It consists of 5 courses, each spanning around 4-5 weeks, and covers topics ranging from optimizing deep neural networks and hyperparameter optimization, through the challenges of structuring ML projects, to CNNs and sequence models.
In this post, I'll try to sum up my thoughts on the course as I go through the modules. It is Tuesday evening; let's see where the journey goes.
Course 1 - Deep Learning
Course 1 - Week 1
The first week is a rather soft introduction to the background of Machine Learning, where its biggest successes have been observed and why it is taking off now. I agree with the instructor's point that - while unstructured data processing is more newsworthy - most profits can and will be made by processing structured data, at least in the medium term.
It definitely makes sense that the performance of a model is pretty much capped by whichever is the tighter limit: the amount of available data or the complexity of the model. In the regime of small training sets, handcrafted features and careful model design matter more.
Also, the interview with Geoffrey Hinton was quite interesting.
Course 1 - Week 2
Useful bit of information: It is possible to speed up the videos - a speed of 1.25x is perfectly fine to watch, maybe even better, as the information keeps flowing a bit faster and the urge to distract yourself with a browser on the side is reduced. If you enable subtitles and disable the sound, following the content at speed 1.5x is still comfortably possible, even though you might pause sometimes to think.
In this class, binary classification and logistic regression are introduced and the first practical exercises take place. With a science / programming background, it might be absolutely okay to skip the Python introduction and basics of derivatives videos, even though it is helpful to get to know the coding conventions of the course.
This week also contains the first quiz and take-home assignment. The assignments are actually done in a Jupyter notebook hosted by Coursera - as I learned from comments on the course, it is recommended to store your notebooks locally to ensure you can access them after finishing the course.
The question of the week: Why is it common to use fixed learning rates? Hasn't the problem of adaptive step sizes in gradient descent already been addressed? (cf. the Barzilai-Borwein (BB) algorithm)
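To make the week's topic concrete, here is a minimal NumPy sketch of logistic regression trained with plain gradient descent. It is my own illustration, not the course's assignment code: the function names, the toy data and the "one column per example" layout are assumptions chosen for this example.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, y, learning_rate=0.1, num_iterations=1000):
    """Fit weights w and bias b by gradient descent on the cross-entropy loss.

    X has shape (n_features, m) and y has shape (1, m); this column-wise
    layout is an assumption of this sketch.
    """
    n_features, m = X.shape
    w = np.zeros((n_features, 1))
    b = 0.0
    for _ in range(num_iterations):
        a = sigmoid(w.T @ X + b)          # predicted probabilities, shape (1, m)
        dz = a - y                        # gradient of the loss w.r.t. the logits
        dw = (X @ dz.T) / m               # gradient w.r.t. the weights
        db = float(np.sum(dz)) / m        # gradient w.r.t. the bias
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b

def predict(w, b, X):
    """Return hard 0/1 predictions using a 0.5 probability threshold."""
    return (sigmoid(w.T @ X + b) > 0.5).astype(int)

# Hypothetical toy data: the label is 1 when the two features sum to > 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 200))
y = (X[0:1] + X[1:2] > 0).astype(float)

w, b = train_logistic_regression(X, y)
accuracy = float(np.mean(predict(w, b, X) == y))
print(f"training accuracy: {accuracy:.2f}")
```

The whole training loop is vectorized over the examples, which is the main coding habit this week tries to build.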
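For reference, here is a minimal sketch of what a BB (Barzilai-Borwein) step size looks like next to a fixed learning rate. It runs on a small deterministic quadratic that I made up for illustration, so it says nothing about how such a rule behaves with the noisy mini-batch gradients of neural network training. The function names and the test problem are assumptions of this sketch.

```python
import numpy as np

def gradient_descent_fixed(grad, x0, learning_rate, num_iterations):
    """Plain gradient descent with a constant step size."""
    x = x0.copy()
    for _ in range(num_iterations):
        x = x - learning_rate * grad(x)
    return x

def gradient_descent_bb(grad, x0, initial_step, num_iterations):
    """Gradient descent with the Barzilai-Borwein step s.s / s.y,
    where s is the last change in x and y the last change in the gradient."""
    x_prev = x0.copy()
    g_prev = grad(x_prev)
    x = x_prev - initial_step * g_prev    # first step needs a hand-picked size
    for _ in range(num_iterations - 1):
        g = grad(x)
        s = x - x_prev
        y = g - g_prev
        denom = float(s @ y)
        if abs(denom) < 1e-16:            # iterates stopped moving: converged
            break
        step = float(s @ s) / denom
        x_prev, g_prev = x, g
        x = x - step * g
    return x

# Hypothetical test problem: minimize 0.5 * x^T A x - b^T x (ill-conditioned).
A = np.diag([1.0, 10.0, 100.0])
b = np.array([1.0, 1.0, 1.0])
grad = lambda x: A @ x - b
x_star = np.linalg.solve(A, b)            # exact minimizer for comparison
x0 = np.zeros(3)

err_fixed = np.linalg.norm(gradient_descent_fixed(grad, x0, 0.01, 50) - x_star)
err_bb = np.linalg.norm(gradient_descent_bb(grad, x0, 0.01, 50) - x_star)
print(f"fixed step error: {err_fixed:.3g}, BB step error: {err_bb:.3g}")
```

On this toy problem the fixed step has to be small enough for the steepest direction, which makes progress along the flat direction slow; the BB rule adapts the step from the last two iterates instead.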
Course 1 - Week 3
In this week, multi-layered neural networks are introduced. I very much appreciated the discussion of why a non-linear activation function is required. Interesting: One exception to the rule "always use a non-linear activation function" is regression problems, where it is legitimate to use a linear activation function in the output layer.
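The argument for non-linear activations can be checked numerically: two layers without an activation in between are exactly equivalent to a single linear layer. The following is a small sketch with made-up random weights (all names and shapes are assumptions for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-layer network: 3 inputs -> 4 hidden units -> 1 output.
W1 = rng.normal(size=(4, 3))
b1 = rng.normal(size=(4, 1))
W2 = rng.normal(size=(1, 4))
b2 = rng.normal(size=(1, 1))
X = rng.normal(size=(3, 5))               # 5 examples, one per column

# Two layers with a linear (identity) activation in the hidden layer ...
two_layer_linear = W2 @ (W1 @ X + b1) + b2

# ... equal one linear layer with combined weights and bias.
W_combined = W2 @ W1
b_combined = W2 @ b1 + b2
one_layer = W_combined @ X + b_combined
collapses = np.allclose(two_layer_linear, one_layer)

# With a tanh in between, the network is no longer that same linear map.
two_layer_tanh = W2 @ np.tanh(W1 @ X + b1) + b2
differs = not np.allclose(two_layer_tanh, one_layer)

print("linear hidden layer collapses:", collapses)
print("tanh hidden layer differs:", differs)
```

So however many linear layers are stacked, the model stays as expressive as plain linear regression, which is why the hidden layers need a non-linearity.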
One point I was wondering about: Is it actually possible to mix activation functions in one layer of a neural network?
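Mechanically at least, nothing seems to prevent it. A rough sketch, assuming a hypothetical layer whose first units use tanh and whose remaining units use ReLU (the function name and the split are my own invention; whether this helps in practice is a separate question):

```python
import numpy as np

def mixed_activation(Z, num_tanh_units):
    """Apply tanh to the first `num_tanh_units` rows of Z and ReLU to the rest.

    Z holds the pre-activations of one layer, one row per unit.
    """
    A = np.empty_like(Z)
    A[:num_tanh_units] = np.tanh(Z[:num_tanh_units])
    A[num_tanh_units:] = np.maximum(0.0, Z[num_tanh_units:])
    return A

# Hypothetical pre-activations for a layer with 4 units and 2 examples.
Z = np.array([[-2.0, 0.5],
              [ 1.0, -1.0],
              [-3.0, 2.0],
              [ 0.0, 4.0]])
A = mixed_activation(Z, num_tanh_units=2)
print(A)
```

For backpropagation, each row would simply use the derivative of its own activation function.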
The exercises are quite manageable, and easy to follow. My only issue with them is the boilerplate code that is already supplied. While it teaches you to convert mathematical formulas into code, lessons to be learned about writing clean code are limited.
Update: After checking the platform, I noticed that pretty much every test did not pass. Waiting for feedback in the forum - let's see how responsive they are, given the monthly fee.
Update 2: Finally, the grader has been fixed. It required opening the Jupyter notebook and submitting it again directly from the notebook, instead of uploading the ipynb-file as proposed in the forum. Time to continue the course.
Course 1 - Week 4
In week 4, we are going "deep" - setting up a DNN and discussing the choice of hyperparameters. At the end of the day, it is mainly an extension of the concepts from the previous weeks.
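To illustrate how little changes when going "deep", here is a sketch of a forward pass where the depth and width of the network are just a list of layer sizes. The names, the ReLU/sigmoid choice and the small random initialization are assumptions of this example, not the assignment's code.

```python
import numpy as np

def initialize_parameters(layer_dims, seed=0):
    """Create one (W, b) pair per layer from a list of layer sizes."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_dims[:-1], layer_dims[1:]):
        W = rng.normal(size=(n_out, n_in)) * 0.01   # small random weights
        b = np.zeros((n_out, 1))
        params.append((W, b))
    return params

def forward(X, params):
    """ReLU in all hidden layers, sigmoid in the output layer."""
    A = X
    for W, b in params[:-1]:
        A = np.maximum(0.0, W @ A + b)
    W, b = params[-1]
    return 1.0 / (1.0 + np.exp(-(W @ A + b)))

# Hyperparameter: 5 inputs, two hidden layers (4 and 3 units), 1 output.
layer_dims = [5, 4, 3, 1]
params = initialize_parameters(layer_dims)
X = np.random.default_rng(1).normal(size=(5, 10))   # 10 examples as columns
Y_hat = forward(X, params)
print(Y_hat.shape)
```

Changing the architecture only means changing `layer_dims`; the loop over layers stays the same as in the shallow case.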
After two programming assignments, this week's work is also done - time to look into the next course.
Course 5 - Sequence models
As exciting as it is to classify cat vs. non-cat images and hand-written numbers, sequence models (and time series prediction) are an area of machine learning that has attracted more interest from people in the finance community. Given that much of the work in the field is about predicting the future from time series (What is going to be the interest rate next year? Will the price of Deutsche Bank stock hit the hard support at 0 EUR?), it would only make sense to explore potential applications of Machine Learning in this sector.
This course is actually course 5 of the deeplearning.ai specialization. While the course seems to be more focused on sequence models for problems such as text and audio, I am sure it is a promising starting point.
Here is the link to the course, if you are interested in checking it out: https://www.coursera.org/learn/nlp-sequence-models
Course 5 - Week 1: Recurrent Neural Network Model (RNN)
During the first week, RNNs are introduced. The fundamental idea is to feed the activations that the network computed for the previous input back in as additional inputs when processing the current input. This course is definitely getting up to speed faster than the previous ones.
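The recurrence can be sketched in a few lines of NumPy. This is a bare-bones forward pass under my own assumptions (tanh hidden state, linear output, made-up weight names and sizes), not the course's implementation.

```python
import numpy as np

def rnn_forward(X, W_ax, W_aa, W_ya, b_a, b_y):
    """Run a simple RNN over a sequence X of shape (n_x, T).

    The hidden state `a` from the previous time step is fed back in
    together with the current input x_t.
    """
    n_a = W_aa.shape[0]
    a = np.zeros((n_a, 1))                # initial hidden state
    outputs = []
    for t in range(X.shape[1]):
        x_t = X[:, t:t + 1]               # keep x_t as a column vector
        a = np.tanh(W_aa @ a + W_ax @ x_t + b_a)
        outputs.append(W_ya @ a + b_y)
    return np.hstack(outputs), a

# Hypothetical sizes: 3 input features, 4 hidden units, 2 outputs, 6 steps.
rng = np.random.default_rng(0)
n_x, n_a, n_y, T = 3, 4, 2, 6
W_ax = 0.5 * rng.normal(size=(n_a, n_x))
W_aa = 0.5 * rng.normal(size=(n_a, n_a))
W_ya = 0.5 * rng.normal(size=(n_y, n_a))
b_a = np.zeros((n_a, 1))
b_y = np.zeros((n_y, 1))

# Feed the *same* input vector at every time step.
X = np.repeat(rng.normal(size=(n_x, 1)), T, axis=1)
Y, a_last = rnn_forward(X, W_ax, W_aa, W_ya, b_a, b_y)
print(Y.shape, a_last.shape)
```

Even though every time step sees the same input here, the outputs at different steps differ, because the hidden state carries information from the earlier steps.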