What to do when we have both -ve and +ve grads

Compute gradients


Gradients for weights


tensor([[ 2.2130, -0.9320,  0.6896],
        [-0.7745, -0.5719, -1.5609]], requires_grad=True)
tensor([[  6278.2095,   3847.4817,   3027.4102],
        [-24301.8887, -26762.2812, -16539.3965]])

What should I do now that my grads are both -ve and +ve? Should I increase the weights slightly or decrease them?

What’s -ve and +ve?
If you mean that the numbers are both negative and positive, then that’s how the model learns - some features contribute positively, some negatively.

Without actual info about what you were trying to do/achieve, the model used, and the task, I have no idea how to answer it more thoroughly.

Whenever a gradient element is +ve, we decrease the weight to decrease the loss, and when a gradient element is -ve, we increase the weight to decrease the loss. But what should I do when my tensor contains both -ve and +ve gradient elements?

Still not sure about the nomenclature here.

Anyway, you always want to decrease loss (true for most loss functions you might encounter).

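If the gradient for a weight is positive, then increasing that weight would increase the loss, so the update decreases that weight. If it is negative, the update increases that weight. Each element is updated according to its own sign, so a tensor with mixed signs is not a problem.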
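
A minimal sketch of one plain gradient descent step on the tensors from the question. The learning rate `lr` is a made-up value, and the hand-written update stands in for whatever optimizer is actually used, so treat this as an illustration only:

```python
import torch

# Weights and gradients copied from the question.
w = torch.tensor([[ 2.2130, -0.9320,  0.6896],
                  [-0.7745, -0.5719, -1.5609]], requires_grad=True)
grad = torch.tensor([[  6278.2095,   3847.4817,   3027.4102],
                     [-24301.8887, -26762.2812, -16539.3965]])

lr = 1e-5  # hypothetical learning rate, kept small because the gradients are large

with torch.no_grad():
    # The same rule is applied to every element: w_new = w - lr * grad.
    w_new = w - lr * grad

# Elements with a positive gradient went down, elements with a negative gradient went up.
moved_down = w_new < w
moved_up = w_new > w
print(moved_down)
print(moved_up)
```

There is no per-sign decision to make by hand: subtracting `lr * grad` already moves each weight in the right direction.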

Quick example:
1/x → for x > 0, if x becomes larger, then the overall result becomes closer to 0.
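
The 1/x example can be checked numerically. This is a rough sketch: the starting point `x = 2.0` and the step size `lr` are arbitrary choices, and the derivative is written out by hand rather than taken from autograd.

```python
def f(x):
    # The toy "loss" from the example.
    return 1.0 / x

def df(x):
    # Derivative of 1/x, which is negative for every x > 0.
    return -1.0 / x**2

x = 2.0    # arbitrary starting point
lr = 0.1   # hypothetical step size

grad = df(x)
x_new = x - lr * grad  # negative gradient, so x increases

print(grad)
print(x, "->", x_new)
print(f(x), "->", f(x_new))
```

The gradient is negative, the update makes x larger, and 1/x gets closer to 0.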

The functions in machine learning usually operate in spaces with many dimensions. You can’t predict what the loss function topography looks like there, because it would probably blow anyone’s mind to imagine even a 1000-dimensional space with valleys/hills of the loss function. An increase there (of a single parameter) might mean that the overall result really becomes lower.
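
A tiny two-parameter sketch of the same idea. The loss function, starting point, and step size here are invented for illustration; they are not taken from the question.

```python
def loss(a, b):
    # Hypothetical loss with its minimum at a = 3, b = -1.
    return (a - 3.0) ** 2 + (b + 1.0) ** 2

def grads(a, b):
    # Partial derivatives of the loss above, written out by hand.
    return 2.0 * (a - 3.0), 2.0 * (b + 1.0)

a, b = 0.0, 0.0
lr = 0.1  # hypothetical step size

ga, gb = grads(a, b)  # ga is negative, gb is positive
a_new = a - lr * ga   # a increases
b_new = b - lr * gb   # b decreases

loss_before = loss(a, b)
loss_after = loss(a_new, b_new)
print(ga, gb)
print(loss_before, "->", loss_after)
```

One parameter goes up and the other goes down in the same step, and the loss still drops.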
