learning-rate
Parent: Hyperparameters
Source: google-ml-course
Hyperparameter *learning rate* $\alpha$: scales the size of each gradient-descent step.
- Small learning rate: small steps, long time to reach the minimum.
- Large learning rate: large steps, shorter computation time but with the potential to overshoot (or even diverge from) the minimum.
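A minimal sketch of the trade-off above, using gradient descent on $f(x) = x^2$ (minimum at $x = 0$). The function, starting point, and learning rates are illustrative assumptions, not from the course:

```python
# Gradient descent on f(x) = x^2, whose gradient is f'(x) = 2x.
# Compares a too-small, a workable, and a too-large learning rate.

def gradient_descent(alpha, x0=10.0, steps=20):
    """Run `steps` iterations of x <- x - alpha * f'(x) for f(x) = x^2."""
    x = x0
    for _ in range(steps):
        x -= alpha * 2 * x  # f'(x) = 2x
    return x

small   = gradient_descent(alpha=0.01)  # tiny steps: still far from 0
good    = gradient_descent(alpha=0.9)   # large but stable: near 0 quickly
too_big = gradient_descent(alpha=1.1)   # overshoots: |x| grows every step

print(small, good, too_big)
```

With $\alpha = 1.1$ each step multiplies $x$ by $1 - 2\alpha = -1.2$, so the iterate oscillates with growing magnitude instead of converging.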
Ideal learning rate
In 1D: the inverse of the second derivative of the model $$ \begin{align} \alpha = \frac{1}{f''(x)} \end{align} $$
In $\geq$ 2D: the inverse of the Hessian (the matrix of second partial derivatives).
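A sketch of why $\alpha = 1/f''(x)$ is ideal in 1D: for a quadratic, a single gradient step with that rate lands exactly on the minimum. The coefficients below are illustrative assumptions:

```python
# For f(x) = a*x^2 + b*x + c, f''(x) = 2a is constant, and one gradient
# step with alpha = 1/f''(x) jumps straight to the minimizer -b/(2a).

a, b, c = 3.0, -6.0, 1.0           # f(x) = 3x^2 - 6x + 1, minimum at x = 1
f_prime = lambda x: 2 * a * x + b  # f'(x)
f_second = 2 * a                   # f''(x), constant for a quadratic

x = 5.0                    # arbitrary starting point
alpha = 1 / f_second       # the "ideal" learning rate
x = x - alpha * f_prime(x) # single step reaches the minimum

print(x)  # 1.0, the exact minimizer -b/(2a)
```

This is exactly one step of Newton's method; for non-quadratic $f$, the second derivative varies with $x$, so the step only approximately reaches the minimum.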