If the first few iterations of gradient descent increase the loss rather than decrease it, the most likely cause is that the learning rate has been set too large.
(Moed B, 2022)
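The claim can be illustrated with a minimal sketch (not from the source): gradient descent on f(x) = x², whose gradient is 2x. The update x ← x − lr·2x scales x by (1 − 2·lr), so for lr > 1 the factor has magnitude greater than 1 and the loss grows every step, while a small lr shrinks it.

```python
def gd_losses(lr, x0=1.0, steps=5):
    """Run gradient descent on f(x) = x**2 and record the loss after each step."""
    x = x0
    losses = []
    for _ in range(steps):
        x -= lr * 2 * x          # gradient step: f'(x) = 2x
        losses.append(x * x)     # loss after the update
    return losses

diverging = gd_losses(lr=1.1)    # too large: loss increases every iteration
converging = gd_losses(lr=0.1)   # small enough: loss decreases every iteration
```

With lr = 1.1 each update multiplies x by −1.2, so the loss grows by a factor of 1.44 per step; with lr = 0.1 it shrinks by a factor of 0.64 per step.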
* Question added on: 14-07-2025