Hello :) Today is Day 279!
A quick summary of today:
- passed the Neo4j GDS certification exam
- read The 100-page ML book
Neo4j Graph Data Science Certification ✔️
Technically this should count for yesterday: after I published yesterday’s post and rested a bit, I decided to take the certification exam, which was around midnight.
And I got it 🥳
The 100-page ML book
Wow! This book goes into my hall of fame 😆
While it is only ~150 pages (short), it covers the material very well, and if I were a beginner, I would use this book as a guide for what to study next. I am not sure if this book is a precursor to the MLE book by the same author or a complementary piece, but it is so nice and easy to read (compared to what I read next, below).
I got the MLE book as well and I will keep it on my to-read list.
Before ML Vol 2 - Chapter 5 - A Free Upgrade: More Dimensions
Scalar Functions
A scalar function takes inputs from a two-dimensional space (represented by R^2) and maps them to a single-dimensional output: f : R^2 → R.
When dealing with functions of two variables, such as f(x, y), we employ the concept of partial derivatives. This involves differentiating with respect to one variable while holding the other constant.
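To make the partial-derivative idea concrete, here is a tiny sympy check on a toy function of my own (not one from the book):

```python
# f(x, y) = x**2 * y + y**3
import sympy as sp

x, y = sp.symbols("x y")
f = x**2 * y + y**3

# Differentiate w.r.t. x while holding y constant, and vice versa
print(sp.diff(f, x))  # 2*x*y
print(sp.diff(f, y))  # x**2 + 3*y**2
```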
Gradient Descent
The update rule is Xₙ₊₁ = Xₙ − η∇f(Xₙ), where (a quick sketch in code follows the list):
- X = (x₁, x₂, …, xₙ) represents a vector with n dimensions
- η is the learning rate
- ∇f(Xₙ) is the gradient of the function f with respect to the parameters Xₙ - the GPS that indicates the direction of steepest descent
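To see the rule in action, here is a minimal gradient descent sketch (my own toy example, not the book’s), minimizing the bowl-shaped f(x₁, x₂) = x₁² + x₂²:

```python
# Minimize f(x1, x2) = x1**2 + x2**2, whose gradient is (2*x1, 2*x2).
import numpy as np

def grad_f(X):
    return 2 * X  # gradient of f at X

X = np.array([3.0, -4.0])  # starting point
eta = 0.1                  # learning rate

for _ in range(100):
    X = X - eta * grad_f(X)  # X_{n+1} = X_n - eta * grad f(X_n)

print(X)  # both coordinates end up very close to 0, the global minimum
```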
The Hessian
The Hessian matrix is crucial for understanding the curvature of scalar functions and determining convexity. It’s a square matrix of second-order partial derivatives. To prove convexity, the Hessian H must be positive semi-definite, i.e. vᵀHv ≥ 0 for every vector v.
By using a second-order Taylor expansion, f(X + Δ) ≈ f(X) + ∇f(X)ᵀΔ + ½ΔᵀHΔ, we relate the Hessian to the curvature properties of the function. If the Hessian is positive semi-definite (its eigenvalues are non-negative), the function is convex. This guarantees that any identified minimum is a global minimum, which is vital in optimization processes.
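Here is a quick numerical check of that criterion on a toy quadratic of my own choosing (not the book’s example):

```python
# f(x, y) = x**2 + x*y + y**2 has the constant Hessian [[2, 1], [1, 2]].
import numpy as np

H = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues = np.linalg.eigvalsh(H)  # eigvalsh is for symmetric matrices
print(eigenvalues)                   # [1. 3.] -> all non-negative

# All eigenvalues >= 0 means H is positive semi-definite,
# so f is convex and any minimum we find is the global one.
print(np.all(eigenvalues >= 0))      # True
```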
Neural networks
This part introduced some of the basic concepts behind NNs and went through a sample NN calculation:
- activation functions
- loss functions
- backprop and chain rule
The author shares a notebook where a neural net is written from scratch and a comparison is shown for different activation functions as well.
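The notebook itself is the author’s; below is just a minimal single-neuron sketch of my own, showing how the forward pass, a sigmoid activation, a squared-error loss, and the chain rule fit together:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.0])  # one training example with 2 features
y = 1.0                    # its target
w = np.array([0.1, 0.2])   # weights
b = 0.0                    # bias
lr = 0.5                   # learning rate

for _ in range(50):
    # forward pass
    z = w @ x + b
    a = sigmoid(z)
    loss = 0.5 * (a - y) ** 2

    # backward pass via the chain rule: dL/dw = dL/da * da/dz * dz/dw
    dL_da = a - y
    da_dz = a * (1.0 - a)  # derivative of the sigmoid
    w -= lr * dL_da * da_dz * x
    b -= lr * dL_da * da_dz

print(loss)  # much smaller than at the start
```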
Vector Functions
A vector function takes multiple inputs and produces multiple outputs (f : R^n → R^m).
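For instance (my own toy sketch, echoing the book’s bee-dance example that comes up below), a function mapping a food source position to both a distance and a dance angle:

```python
# Map a food source position (x, y) to two outputs, distance d and angle theta,
# so f : R^2 -> R^2.
import numpy as np

def dance(x, y):
    d = np.hypot(x, y)        # distance to the food source
    theta = np.arctan2(y, x)  # dance angle
    return np.array([d, theta])

print(dance(3.0, 4.0))  # [5.0, 0.9273]
```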
An example curvature: it’s definitely not convex!
We must understand how to classify critical points other than minima (a quick check in code follows the list):
- Positive Eigenvalues (Local Minimum):
- If all the eigenvalues of the Hessian matrix at a critical point are positive, then the function has a local minimum at that point. This is because the positive eigenvalues indicate that the function is concave up along all directions in the vicinity of that point, forming a bowl shape or an upward-facing paraboloid.
- Negative Eigenvalues (Local Maximum):
- If all the eigenvalues of the Hessian matrix at a critical point are negative, then the function has a local maximum at that point. This represents a situation where the function is concave down in all directions at that point, resembling an inverted bowl or a downward-facing paraboloid.
- Mixed Sign Eigenvalues (Saddle Point):
- If the Hessian matrix at a critical point has eigenvalues with mixed signs (some positive and some negative), then the function has a saddle point at that location. A saddle point represents a situation where the function is concave up in some directions and concave down in others, resembling a saddle shape on the graph.
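Here is a quick sympy check of the saddle case on a classic toy function of my own choosing (not from the book):

```python
# f(x, y) = x**2 - y**2 has a critical point at the origin.
import sympy as sp

x, y = sp.symbols("x y")
f = x**2 - y**2

H = sp.hessian(f, (x, y))  # Matrix([[2, 0], [0, -2]])
print(H.eigenvals())       # {2: 1, -2: 1} -> mixed signs

# One positive and one negative eigenvalue: the origin is a saddle point,
# concave up along x and concave down along y.
```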
The global or local minima concept here is slightly different than with a scalar function. Two component functions dictate the output of f: d represents the distance to the food source, while the dance angle is depicted by that θ guy. Therefore, selecting one or more of these critical points depends on the context of the problem we are dealing with.
And again, the chapter walked through an example neural net calculation.
This is the 2nd book of the Before ML series, and I still feel it has too many abstractions and side stories. Maybe it deserves a 2nd read to change my opinion.
I found that the 3rd book, on probability and stats, is only available on Amazon’s Kindle, but when I try to buy the Kindle edition it says it is not available right now :( I guess I have to wait.
That is all for today!
See you tomorrow :)