```python
# Train for 100 epochs
for i in range(100):
    preds = model(inputs)
    loss = mse(preds, targets)
    loss.backward()
```

In this loop, shouldn't we calculate the loss *after* adjusting w and b? I'm also confused: since we zero the gradients each time we adjust w, how are we calculating new gradients, given that we are not picking new random weights?

If I’m understanding your question correctly:

• `loss.backward()` calculates new gradients
• `w -= w.grad * 1e-5` & `b -= b.grad * 1e-5` adjust your weights
• `w.grad.zero_()` & `b.grad.zero_()` reset the gradients to zero, because otherwise the next iteration of the for loop would add the newly calculated gradients to the existing gradient values.
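The accumulation behavior in that last bullet is easy to verify directly. A minimal sketch (the toy data, the hand-rolled `mse`, and the linear model `w * x + b` are my own stand-ins, not from the original loop):

```python
import torch

torch.manual_seed(0)
inputs = torch.randn(10, 1)
targets = 3 * inputs + 1

w = torch.randn(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

def mse(preds, targets):
    return ((preds - targets) ** 2).mean()

loss = mse(inputs * w + b, targets)
loss.backward()
first = w.grad.clone()

# Without zeroing, a second backward() ADDS to the stored gradient
# rather than replacing it.
loss = mse(inputs * w + b, targets)
loss.backward()
assert torch.allclose(w.grad, 2 * first)

# zero_() resets the buffer so the next backward() starts fresh.
w.grad.zero_()
b.grad.zero_()
assert w.grad.abs().sum().item() == 0
```

This is why the reset has to happen before (or at the end of) every iteration: `.backward()` accumulates into `.grad` by design.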

As you can see from the above, we are:

• getting predictions, then
• calculating loss, then
• adjusting our weights, then
• resetting gradients for our next calculation.

The reason you do it in that order, instead of re-calculating the loss after adjusting the weights and biases, is that your predictions have not changed yet, and so your loss has not changed either. Does this clear up your confusion?
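That point can be demonstrated concretely: the loss is a function of the *predictions*, so updating `w` changes nothing until you run a fresh forward pass. A small sketch (single-parameter toy model of my own, not the original code):

```python
import torch

torch.manual_seed(0)
inputs = torch.randn(5, 1)
targets = 2 * inputs
w = torch.randn(1, requires_grad=True)

def mse(p, t):
    return ((p - t) ** 2).mean()

preds = inputs * w          # forward pass with the OLD weight
loss = mse(preds, targets)
loss.backward()

with torch.no_grad():
    w -= w.grad * 1e-2      # adjust the weight

# Same predictions -> same loss: updating w did not touch `preds`.
assert mse(preds, targets).item() == loss.item()

# Only a fresh forward pass reflects the new weight.
new_loss = mse(inputs * w, targets)
assert new_loss.item() != loss.item()
```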


Nah, we have to calculate the loss first in order to find d(loss)/d(w) and d(loss)/d(b).
After calculating the loss (and calling `loss.backward()`), the gradients are multiplied by the learning rate to update the weights.
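Putting the whole order together, the loop would look something like this. This is a sketch under my own assumptions (toy data, a hand-written linear model, learning rate `1e-1`), not the exact code from the thread:

```python
import torch

torch.manual_seed(0)
inputs = torch.randn(20, 1)
targets = 3 * inputs + 1

w = torch.randn(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

def mse(p, t):
    return ((p - t) ** 2).mean()

lr = 1e-1
start = mse(inputs * w + b, targets).item()

for _ in range(100):
    preds = inputs * w + b        # 1. predictions
    loss = mse(preds, targets)    # 2. loss (needed before gradients exist)
    loss.backward()               # 3. compute d(loss)/dw and d(loss)/db
    with torch.no_grad():
        w -= w.grad * lr          # 4. gradients scaled by the learning rate
        b -= b.grad * lr
    w.grad.zero_()                # 5. reset for the next iteration
    b.grad.zero_()

# After training, the loss has dropped from its starting value.
assert mse(inputs * w + b, targets).item() < start
```

Note the `torch.no_grad()` block around the update: the weight adjustment itself should not be tracked by autograd.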