learning-rate
Parent: Hyperparameters
Source: google-ml-course
Hyperparameter *learning rate* $\alpha$: scales the size of each gradient-descent step.
- Small learning rate: small steps, long time to reach the minimum.
- Large learning rate: large steps, shorter computation time but with the potential to overshoot (or even diverge from) the minimum.
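A minimal sketch of the trade-off above, using gradient descent on $f(x) = x^2$ (minimum at $x = 0$). The function, starting point, and learning rates are illustrative assumptions, not from the course:

```python
# Gradient descent on f(x) = x^2, whose gradient is f'(x) = 2x.
# Compares a too-small, a workable, and a too-large learning rate.

def gradient_descent(alpha, x0=10.0, steps=20):
    """Run `steps` iterations of x <- x - alpha * f'(x) for f(x) = x^2."""
    x = x0
    for _ in range(steps):
        x -= alpha * 2 * x  # f'(x) = 2x
    return x

small   = gradient_descent(alpha=0.01)  # tiny steps: still far from 0
good    = gradient_descent(alpha=0.9)   # large but stable: near 0 quickly
too_big = gradient_descent(alpha=1.1)   # overshoots: |x| grows every step

print(small, good, too_big)
```

With $\alpha = 1.1$ each step multiplies $x$ by $1 - 2\alpha = -1.2$, so the iterate oscillates with growing magnitude instead of converging.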
Ideal learning rate
In 1D: the inverse of the second derivative of the model $$ \begin{align} \alpha = \frac{1}{f''(x)} \end{align} $$
In $\geq$ 2D: the inverse of the Hessian (the matrix of second partial derivatives).
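A sketch of why $\alpha = 1/f''(x)$ is ideal in 1D: for a quadratic, a single gradient step with that rate lands exactly on the minimum. The coefficients below are illustrative assumptions:

```python
# For f(x) = a*x^2 + b*x + c, f''(x) = 2a is constant, and one gradient
# step with alpha = 1/f''(x) jumps straight to the minimizer -b/(2a).

a, b, c = 3.0, -6.0, 1.0           # f(x) = 3x^2 - 6x + 1, minimum at x = 1
f_prime = lambda x: 2 * a * x + b  # f'(x)
f_second = 2 * a                   # f''(x), constant for a quadratic

x = 5.0                    # arbitrary starting point
alpha = 1 / f_second       # the "ideal" learning rate
x = x - alpha * f_prime(x) # single step reaches the minimum

print(x)  # 1.0, the exact minimizer -b/(2a)
```

This is exactly one step of Newton's method; for non-quadratic $f$, the second derivative varies with $x$, so the step only approximately reaches the minimum.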