Happy Thursday, y’all! For the next two weeks we’ll be watching and discussing “How Convolutional Neural Networks Work”, a short, informational video from Brandon Rohrer.
We can use this thread for conversation, so as you watch the video, share your thoughts, questions, and ideas below! I can’t wait to see what we learn!
As someone who is more used to tabular data, I can say that the video was very clear and very instructive (but maybe that is just the Dunning–Kruger effect talking).
One thing I am now wondering about is exactly how useful ReLU layers are. I don’t think my math is good enough yet to have an intuition for when the math stops working. Definitely something I want to learn more about!
In any case, the video made me want to play with this. Any experts with opinions on where to start? Torch, maybe?
Anyway, great video. Thanks @jesse and the people who voted!
I’ll echo what Cedric said - it was so succinct! I especially liked the explanations on backpropagation and ReLU.
In terms of ReLU, the main benefit seems to be speed, given that you’re essentially zeroing out the negative values in the matrix and removing that noise cheaply. Additionally, when I’ve tried tanh or logistic transformations in the past instead of ReLU, I’ve gotten errors and all hell breaks loose. So I guess ReLU is just simpler to implement.
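For anyone who wants to see the difference concretely, here’s a minimal NumPy sketch (my own toy example, not from the video) of what ReLU does next to tanh:

```python
import numpy as np

# A toy "feature map" with positive and negative activations
x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

# ReLU: just clip negatives to zero -- cheap to compute,
# and positive values pass through unchanged
relu = np.maximum(0.0, x)

# tanh: squashes everything into (-1, 1); gradients shrink
# toward the extremes, which can slow down deep networks
tanh = np.tanh(x)

print(relu)  # all the negatives become zero
print(tanh)  # everything squashed between -1 and 1
```

Nothing deep going on, but it shows why ReLU is so fast: it’s one comparison per value, no exponentials.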
I had this on my calendar to watch later in the week but feel like I need to bump it up to ASAP! y’all are getting me excited to watch this!
I studied CNNs in a lecture just a couple of months ago. I think the video provided a very good recap, and the visual explanations make the algorithm clearer.
Thanks to @jesse and everyone who shared their opinions.
yes, of course! the video was such a good refresher and helped me solidify my understanding. I’m glad you enjoyed it!
Data science has been kind of a retirement job for me. I am an electrical engineer by training (I finished school in the mid-80s). This video has a Back to the Future vibe for me. There are a number of equivalences between electrical circuits and neural networks.
- ReLU => simple diode circuit
- Weights and summing nodes => operational amplifier circuit in analog computer
- Sigmoid/Tanh function => functions that behave a lot like the response of differential amplifiers
- Convolutions => filter circuits
As I think about it, this makes a lot of sense. CNNs are emulating analog functions in biological organisms.
Electrical engineers often approximate all sorts of functions (e.g. logarithms) using linear combinations of inputs followed by an “activation function” – EEs call them limit amplifiers. Convolutions are key to these systems because they reduce complex data down to its essential components. In the case of circuits, the convolutions are filters that separate the desired signal from the noise – just like in CNNs.
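To make the filter analogy concrete, here’s a toy NumPy sketch (my own example, not from the video): a 1-D convolution with a moving-average kernel acting as a simple low-pass filter that pulls a signal out of noise.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 200)
clean = np.sin(t)
noisy = clean + rng.normal(scale=0.3, size=t.size)

# A 5-tap moving-average kernel: a basic low-pass filter.
# Convolving with it suppresses the high-frequency noise.
kernel = np.ones(5) / 5
smoothed = np.convolve(noisy, kernel, mode="same")

# The filtered signal sits closer to the clean one than the raw input did
print(np.mean((noisy - clean) ** 2), np.mean((smoothed - clean) ** 2))
```

A CNN does the same convolution operation, except it learns the kernel weights instead of having an engineer design them.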
These circuits are orders of magnitude simpler than the CNNs being implemented today, but the concepts are similar.
A great video and it shows that different fields often do the same things with ENTIRELY different jargon.
this is such a cool thing to learn! I don’t know much about electrical engineering beyond a couple of physics classes in college, so it was super interesting to see the parallels between the two!
this would make such an interesting blog post series, too - comparing data science to your electrical engineering work!
All fields stand on the shoulders of those that came before. For example, EEs use adaptive filters to estimate system parameters that can change over time – your car’s MPG estimator is an example. These filters dynamically tune themselves as new data come in. Some of the key early work was done back in the 1940s (WW2) by Norbert Wiener, who was working on analog computers for fire control. He developed the Wiener filter, which is the basis for all the different adaptive filters we see today (e.g. the Kalman filter). You may not know that you are using adaptive filters, but they are used to control cars, aircraft, etc.
These adaptive filters architecturally look a lot like the algorithms used in data science. By architecturally, I mean that they are yet another episode in the long-running theme of minimizing loss functions.
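As a toy illustration of that shared structure (my own sketch, not anything from the video): a one-weight LMS adaptive filter tracking a drifting parameter by stepping down the squared-error gradient – the same loop as stochastic gradient descent on a loss function.

```python
import numpy as np

rng = np.random.default_rng(1)

# Unknown "system": y = w_true * x + noise, with w_true drifting over time
n = 500
x = rng.normal(size=n)
w_true = np.linspace(1.0, 3.0, n)   # the parameter slowly changes
y = w_true * x + rng.normal(scale=0.1, size=n)

# LMS: gradient descent on the squared error, one sample at a time
w_hat, mu = 0.0, 0.05
for i in range(n):
    err = y[i] - w_hat * x[i]        # prediction error on this sample
    w_hat += mu * err * x[i]         # step down the loss gradient

print(w_hat)  # tracks w_true as it drifts toward 3.0
```

Swap the scalar weight for a tensor of millions of weights and you essentially have the training loop of a neural network.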
A key difference is that hardware is a physical thing and must be MUCH simpler than software algorithms – otherwise, we couldn’t build it.
Sorry to run on. This subject is a passion for me.
Recently, I watched a seminar by an EE professor. It started with an introduction to machine learning and continued through the history of scientific development, the industrial revolution, and so on. The topic is pretty broad yet intellectually rich.
It is very interesting to see how different his understanding of data science is from mine, as an economist who learned everything by building on econometrics.