Hello :) Today is Day 16!
A quick summary of today:
- Finished Stanford/DeepLearning.AI’s Machine Learning Specialization
Firstly, let me share the final specialization certificate
As for the learned content
Today I learned about reinforcement learning for the first time, and thanks to Andrew Ng it was absolutely enjoyable.
State
describes the situation the agent is currently in; an action is what the agent chooses to do from that state
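Return
just like in finance, is an important word: it is the sum of the rewards the agent collects, with later rewards weighted down by the discount factor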
Policy
tells us which action to take given a state
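Depending on the discount factor
the model’s impatience changes: the smaller the discount factor, the less future rewards count and the more impatient the agent becomes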
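To make the return and the discount factor concrete for myself, here is a minimal sketch. The reward list and the two gamma values are made-up numbers for illustration, and `discounted_return` is just a helper name I picked, not something from the course.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards, where the reward at step t is weighted by gamma**t."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


# Hypothetical episode: nothing for three steps, then a reward of 100.
rewards = [0, 0, 0, 100]

patient = discounted_return(rewards, gamma=0.9)    # 100 * 0.9**3
impatient = discounted_return(rewards, gamma=0.5)  # 100 * 0.5**3

print(patient, impatient)
```

With the smaller discount factor the same delayed reward is worth much less, which is what "impatient" means here.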
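Bellman equation - Q(s, a) is the return if you start from state s, take action a (once), then behave optimally after that. The Bellman equation writes it as the immediate reward plus the discounted best value of the next state s': Q(s, a) = R(s) + γ · max over a' of Q(s', a')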
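Here is a small sketch of how the Bellman equation, Q(s, a) = R(s) + γ · max over a' of Q(s', a'), can be applied over and over until the values settle. The six-state line world, its rewards, and the discount factor of 0.5 are assumptions I chose for illustration, and all the names (`REWARDS`, `next_state`, `compute_q`) are hypothetical.

```python
GAMMA = 0.5                       # assumed discount factor
REWARDS = [100, 0, 0, 0, 0, 40]   # assumed reward R(s) for each state
TERMINAL = {0, 5}                 # the episode ends in these states
ACTIONS = {"left": -1, "right": +1}


def next_state(s, a):
    """Deterministic move along the line, clipped at both ends."""
    return min(max(s + ACTIONS[a], 0), len(REWARDS) - 1)


def compute_q(n_sweeps=50):
    """Apply the Bellman equation repeatedly until the values settle."""
    q = {(s, a): 0.0 for s in range(len(REWARDS)) for a in ACTIONS}
    for _ in range(n_sweeps):
        for s in range(len(REWARDS)):
            for a in ACTIONS:
                if s in TERMINAL:
                    # Nothing comes after a terminal state.
                    q[(s, a)] = float(REWARDS[s])
                else:
                    s2 = next_state(s, a)
                    best_next = max(q[(s2, a2)] for a2 in ACTIONS)
                    q[(s, a)] = REWARDS[s] + GAMMA * best_next
    return q


q = compute_q()

# The policy picks, in each state, the action with the highest Q value.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(len(REWARDS))}
print(policy)
```

In this toy world the policy falls out of Q directly: in each state, take the action whose Q value is largest.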
How should I choose actions while training? Andrew Ng suggested an epsilon-greedy policy: start with a high probability of picking a random action, then gradually lower it as training goes on, so the agent explores a lot early and relies more on what it has learned later
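A minimal sketch of that idea, picking a random action with probability epsilon and lowering epsilon over time. The linear decay schedule and its numbers (start at 1.0, end at 0.01, over 1000 steps) are my own assumptions for illustration, not necessarily what the course uses.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the best one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decayed_epsilon(step, start=1.0, end=0.01, decay_steps=1000):
    """Linearly lower epsilon from start to end, then keep it at end."""
    fraction = min(step / decay_steps, 1.0)
    return start + fraction * (end - start)


# Hypothetical Q values for three actions in some state.
q_values = [1.0, 5.0, 2.0]

for step in (0, 500, 1000):
    eps = decayed_epsilon(step)
    action = epsilon_greedy(q_values, eps)
    print(step, eps, action)
```

Early on almost every choice is random (exploration); by the end the agent almost always takes the action with the highest estimated value (exploitation).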
Starting tomorrow, I will prepare for the TensorFlow Developer Certificate, which I found out about today
That is all for today!
See you tomorrow :)