Replacing NaN values in a dataframe

Hi all,

I’m trying to replace NaN values with mean values. Specifically, I want to group by country, compute the mean for each group, and use those means to replace the NaN values.

I have gotten as far as getting the mean values, as seen in the screenshot below. My problem is how to replace the NaN values all at once, without having to do one column or even one row at a time. I have looked at and tried df.fillna(), but I don’t get the desired results.

Any help is appreciated. Thank you.

Hey @jmwas, welcome to the community.
Do you want to replace them, or just drop the rows that contain NaN values? If only a few rows have NaN values, you can drop those rows directly. If you want to replace them, don’t use 0, because filling with 0 changes the mean of the column. Filling with the mean value of the column gives a better result. How do you fill with the mean value? Make a temporary copy of the original df, drop the NaN values in the temporary df, and take the mean of the required column. Then fill the NaN values in the original df using df.fillna() with the mean value of the respective column.
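To illustrate why filling with 0 distorts the mean while filling with the mean does not, here is a minimal sketch. The series and its name `score` are made up for illustration. Note that pandas' `.mean()` skips NaN values by default.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one numeric column with a single missing value.
s = pd.Series([10.0, 20.0, np.nan, 30.0], name="score")

# .mean() ignores NaN by default, so this is (10 + 20 + 30) / 3.
mean_before = s.mean()

# Filling with 0 adds a real 0 to the data and pulls the mean down.
zero_filled = s.fillna(0)

# Filling with the mean leaves the column mean unchanged.
mean_filled = s.fillna(mean_before)

print(mean_before, zero_filled.mean(), mean_filled.mean())
```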

Hi @birajde, thank you for taking the time to reply. I do appreciate it.
I do want to replace the values with the mean. By that, do you mean fill the temporary df with the respective mean and then merge the original df with the dropped NaN values? I hope my question makes sense.

No, here is what I mean in steps.

  • Make a temporary copy of the df.
  • Drop the NaN values in the temp df using the df.dropna() method.
  • In the temp df, find the mean of the column whose NaN values you want to fill, using the .mean() method.
  • Now that you have the mean of the respective column, use df.fillna(mean_value) on that column of the original df.
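The steps above can be sketched roughly as follows. The data and the column names `country` and `gdp` are hypothetical, chosen only to make the example runnable.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in the "gdp" column.
df = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "gdp": [1.0, np.nan, 3.0, 5.0],
})

# Step 1: make a temporary copy of the original df.
temp = df.copy()

# Step 2: drop the rows where the column of interest is NaN.
temp = temp.dropna(subset=["gdp"])

# Step 3: take the mean of that column in the temp df.
mean_value = temp["gdp"].mean()

# Step 4: fill the NaN values of that column in the original df.
df["gdp"] = df["gdp"].fillna(mean_value)

print(df)
```

Since `.mean()` skips NaN values by default, `df["gdp"].mean()` on the original df would give the same number, so the temporary copy is optional.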

Hope you understand, let me know if I need to clarify more.

Thank you for explaining further. There are a few columns in the df with NaN values. Does this mean I have to do each column separately, or can I fill all the columns at once with their different mean values?

You can either do one column at a time or write a loop that iterates over all the columns and fills the NaN values for each one. (How you write the code is up to you.)
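As a rough sketch of both routes, the example below loops over the columns and also fills every column in a single call, since `df.fillna()` accepts a Series that maps column names to fill values. The third option, per-country means via `groupby(...).transform("mean")`, is not something suggested in this thread; it is included only as one possible way to do the group-by-country fill from the original question. The data and the column names `country`, `gdp`, and `population` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data with NaN values in two numeric columns.
df = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B", "B"],
    "gdp": [1.0, np.nan, 3.0, 10.0, 20.0, np.nan],
    "population": [np.nan, 4.0, 6.0, 8.0, np.nan, 12.0],
})
numeric_cols = ["gdp", "population"]

# Option 1: loop over the columns, filling each with its own mean.
looped = df.copy()
for col in numeric_cols:
    looped[col] = looped[col].fillna(looped[col].mean())

# Option 2: one call. df[numeric_cols].mean() is a Series indexed by
# column name, and fillna matches each value to its column.
at_once = df.fillna(df[numeric_cols].mean())

# Option 3 (assumption, not from the thread): fill with per-country means.
# transform("mean") returns a frame aligned with the original rows.
by_country = df.copy()
group_means = by_country.groupby("country")[numeric_cols].transform("mean")
by_country[numeric_cols] = by_country[numeric_cols].fillna(group_means)

print(at_once)
print(by_country)
```

Options 1 and 2 produce the same result. Option 3 differs because each NaN is replaced by the mean of its own country rather than the mean of the whole column.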

Thank you very much for your help and patience with my questions.
