Practice dataset problem

I was working on the third weather dataset. While analysing the data, I found that temperature has its strongest correlation with visibility.

But after plotting the two against each other, the graph doesn't make any sense to me:

[Image: temp vs vis]

So is this even a problem for linear regression? It doesn't look like one to me. Or am I making some mistake?

link to the notebook:

Hey, you can see that visibility and temperature have a positive correlation of 0.39, which is decent but not strong. The graph shows a positive trend too: it tilts slightly toward the top right. Visibility tends to increase with temperature, but that is not always the case. Try to imagine a line drawn through the graph; you should be able to see a straight line, even though the points are dispersed around it.
As for the second question: yes, linear regression can be applied to this data, but I am not sure it is the best algorithm for it. Try other models too and see which one works best.
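To make the point about a 0.39 correlation concrete, here is a minimal sketch on synthetic data. The numbers are generated, not taken from the weather dataset, and the names `simulate_weather`, `temperature` and `visibility` are only illustrative. It shows that a correlation of this size gives a clearly positive slope, while the fitted line explains only about r squared (roughly 15%) of the variation, which is why the scatter looks so dispersed.

```python
import numpy as np


def simulate_weather(n=2000, target_r=0.39, seed=0):
    """Return synthetic (temperature, visibility) arrays correlated at about target_r."""
    rng = np.random.default_rng(seed)
    temperature = rng.normal(12.0, 9.0, n)  # made-up scale, degrees C
    z = (temperature - temperature.mean()) / temperature.std()
    noise = rng.normal(0.0, 1.0, n)
    # Mix standardized temperature with independent noise so the
    # resulting correlation is close to target_r.
    visibility = 10.0 + 4.0 * (target_r * z + np.sqrt(1 - target_r**2) * noise)
    return temperature, visibility


temperature, visibility = simulate_weather()

# Pearson correlation between the two columns.
r = np.corrcoef(temperature, visibility)[0, 1]

# Least-squares straight line: polyfit returns [slope, intercept] for degree 1.
slope, intercept = np.polyfit(temperature, visibility, 1)

# R^2 of that line, computed from residuals.
predicted = slope * temperature + intercept
ss_res = np.sum((visibility - predicted) ** 2)
ss_tot = np.sum((visibility - visibility.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"Pearson r   = {r:.2f}")
print(f"slope       = {slope:.3f}")
print(f"R^2 of line = {r_squared:.2f} (r**2 = {r**2:.2f})")
```

For a single-feature least-squares line, R^2 equals the squared Pearson correlation, so the two values printed on the last line agree.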


Thanks, I will try it with other models.

I tried fitting lines using:

  1. Linear Regression:

  2. SGDRegressor:
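For anyone who wants to reproduce this kind of comparison, here is a rough sketch using scikit-learn on synthetic data. It is not the code from the notebook; the data and the variable names are assumptions. `SGDRegressor` is sensitive to feature scale, so this sketch puts a `StandardScaler` in front of it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the temperature and visibility columns (assumed values).
rng = np.random.default_rng(0)
n = 2000
temperature = rng.normal(12.0, 9.0, n)
visibility = 8.0 + 0.17 * temperature + rng.normal(0.0, 3.7, n)

X = temperature.reshape(-1, 1)  # scikit-learn expects a 2-D feature matrix

# Closed-form ordinary least squares.
ols = LinearRegression().fit(X, visibility)

# Stochastic gradient descent on the same squared-error objective,
# with the feature standardized first.
sgd = make_pipeline(
    StandardScaler(),
    SGDRegressor(max_iter=1000, tol=1e-3, random_state=0),
).fit(X, visibility)

# Evaluate both fitted lines at a few temperatures.
grid = np.array([[-10.0], [0.0], [10.0], [20.0], [30.0]])
ols_line = ols.predict(grid)
sgd_line = sgd.predict(grid)

for t, a, b in zip(grid.ravel(), ols_line, sgd_line):
    print(f"temp={t:6.1f}  LinearRegression={a:6.2f}  SGDRegressor={b:6.2f}")
```

Both models fit the same kind of straight line, so the two sets of predictions should be close; SGD only approximates the least-squares solution, so small differences are expected.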


Here you can easily see a positive linear trend between the two columns.

I watched the second lecture yesterday. I will try applying logistic regression to this dataset to predict rain or snow, because that seems like a more suitable problem for this data.
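A minimal sketch of what that could look like with scikit-learn, again on synthetic data. The features (`temperature`, `humidity`), the `is_snow` label and the rule used to generate it are all made up for illustration; the real dataset's columns may differ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features (assumed): temperature in degrees C, humidity as a fraction.
rng = np.random.default_rng(0)
n = 3000
temperature = rng.normal(8.0, 9.0, n)
humidity = rng.uniform(0.3, 1.0, n)

# Made-up labelling rule: snow (1) when a noisy temperature is below freezing,
# rain (0) otherwise.
is_snow = (temperature + rng.normal(0.0, 2.0, n) < 0.0).astype(int)

X = np.column_stack([temperature, humidity])
X_train, X_test, y_train, y_test = train_test_split(
    X, is_snow, test_size=0.25, random_state=0, stratify=is_snow
)

# Standardize the features, then fit a binary logistic regression.
clf = make_pipeline(StandardScaler(), LogisticRegression())
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)  # mean accuracy on held-out rows
temp_coef = clf.named_steps["logisticregression"].coef_[0][0]

print(f"test accuracy           = {accuracy:.3f}")
print(f"temperature coefficient = {temp_coef:.2f}")
```

In this synthetic setup the temperature coefficient comes out negative, meaning higher temperatures push the prediction toward rain. On the real data it would be worth checking the class balance before reading much into accuracy.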
