gradient-descent
Parent: Variations of gradient descent
Source: google-ml-course
Gradient descent
An iterative approach
- Labeled data arrives
- Gradient of the loss function is computed
- The negative gradient gives the direction in which to update the model parameters $\mathbf{w}$.
- A step is taken in that direction in parameter space; the step size is the gradient scaled by the learning rate.
- Repeat
This process tunes all model parameters simultaneously (minimal sketch below).
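A minimal NumPy sketch of this loop, assuming a linear model with squared loss; the data, parameter names, learning rate, and step count are all illustrative:

```python
import numpy as np

# Toy labeled data: y = 3x + 2 plus noise (illustrative values).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=100)
y = 3.0 * X + 2.0 + rng.normal(0, 0.1, size=100)

w = np.zeros(2)      # parameters [slope, intercept]
learning_rate = 0.1

for step in range(200):
    pred = w[0] * X + w[1]            # model predictions
    err = pred - y
    # Gradient of the mean squared loss w.r.t. each parameter.
    grad = np.array([2 * np.mean(err * X), 2 * np.mean(err)])
    w -= learning_rate * grad         # step in the negative-gradient direction

print(w)  # approaches [3.0, 2.0]
```

Note that both entries of $\mathbf{w}$ are updated in the same step, from the same gradient.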
Notes
- Works well for convex problems (the loss as a function of the parameters is convex): on a convex loss, gradient descent converges to the global minimum
- But many ML problems are not convex, e.g. neural networks, so gradient descent may only reach a local minimum, depending on where it starts (see the sketch after this list)
- Variations of gradient descent
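A tiny illustration of the convexity caveat, using a made-up 1-D quartic loss: the same update rule ends in different minima depending on the starting point.

```python
def loss(w):
    # Non-convex toy loss: local minimum near w ≈ +0.93,
    # global minimum near w ≈ -1.05 (illustrative function).
    return w**4 - 2 * w**2 + 0.5 * w

def grad(w):
    return 4 * w**3 - 4 * w + 0.5

for w0 in (1.5, -1.5):
    w = w0
    for _ in range(500):
        w -= 0.01 * grad(w)   # same gradient-descent update in both runs
    print(f"start={w0:+.1f} -> w={w:+.3f}, loss={loss(w):+.3f}")

# Starting at +1.5 gets stuck in the local minimum (~+0.93);
# starting at -1.5 reaches the global minimum (~-1.05).
```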