Course Project on Exploratory Data Analysis - Discuss and Share Your Work

My first draft is done. I used my own Twitter account’s data.

BTW, I do not know how to upload the font data without uploading it repeatedly. Perhaps I should download it and then unzip it on the local disk before using it?


I am using Jupyter Notebook and am getting a dead kernel that will not restart automatically. Does anyone have any ideas on what to do?


Hi, I did something that could help you.

First, it’s necessary to split all the countries of each row; for this, I created a dictionary:

You can work with this dictionary in many ways. One way is to convert it into a data frame and then sum the numerical values (column 1):
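
The code for this step did not survive in the post, so here is only a minimal sketch of the idea. It assumes each row holds a comma-separated string of countries next to a numeric value; the column names `countries` and `value` and the sample data are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: each row lists several countries and one number.
df = pd.DataFrame({
    "countries": ["India, Brazil", "Brazil", "India, Japan, Brazil"],
    "value": [10, 5, 3],
})

# Build a dictionary mapping each country to the total of the values
# from every row in which it appears.
totals = {}
for countries, value in zip(df["countries"], df["value"]):
    for country in countries.split(","):
        country = country.strip()
        totals[country] = totals.get(country, 0) + value

# Convert the dictionary into a data frame, largest total first.
totals_df = (
    pd.DataFrame(list(totals.items()), columns=["country", "total"])
    .sort_values("total", ascending=False)
    .reset_index(drop=True)
)
print(totals_df)
```

pandas can also do this without an explicit loop, for example with `Series.str.split` followed by `DataFrame.explode` and a `groupby`, but the dictionary version stays closest to the approach described here.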


Sharing my project:


I also have the same question on uploading the file.



Please read the whole message in the pink region. There will be a warning somewhere in it. It normally appears when a method is going to be deprecated in the future, or for other kinds of warnings.

If you add a semi-colon ‘;’ to the end of the last line of your code like this

sns.countplot(patients_copy_df.diagnosis) ;

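then the text echoed for the last expression (for example the Axes object) will not show. Note that a trailing semicolon only hides that echoed return value; an actual warning in the pink region may still appear.

Try this; it may tidy up the output.
This normally happens while using the matplotlib and seaborn libraries.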


You could add the following lines at the top of your notebook, where you import the libraries:

import warnings
warnings.filterwarnings('ignore')

The above lines will suppress not just this warning but every warning in the notebook.


I realize your post is 4 days old now and you most likely have an answer. But if not, give a try; they have quite a few datasets of manageable size.

Hi, even though I have written the following code, I still have to upload my CSV file every time I run Binder.

jovian.commit(project=project_name, environment=None, files=['Power generation India.csv'])
How can I save it with my notebook permanently?


I’ve just tested the files argument.


It seems this function replaces spaces with underscores. Users are really persistent about using spaces in their filenames, and that causes many problems and misunderstandings.


As mentioned above:

  1. Your file name shouldn’t contain spaces. Replace them with underscores (the _ symbol).
  2. Make sure that you pass the files in the last call to jovian.commit(). Example:
jovian.commit(project='important', files=['file.csv'])
# some cells
# calculations
# graphs, plots, histograms
jovian.commit(project='important') # LAST COMMIT in the notebook

This is wrong because the second commit is made without the files argument, so that notebook version gets created without the file. To be safe, I would suggest doing this:

jovian.commit(project='important', files=['file.csv'])
# some cells
# calculations
# graphs, plots, histograms
jovian.commit(project='important', files=['file.csv']) # LAST COMMIT in the notebook
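
For point 1, the rename can also be done from Python rather than by hand. This is a minimal sketch using only the standard library; `underscore_name` is a made-up helper name, and the demo runs in a temporary folder so no real files are touched.

```python
from pathlib import Path
import tempfile


def underscore_name(path):
    """Rename `path` so spaces in the file name become underscores; return the new Path."""
    path = Path(path)
    new_path = path.with_name(path.name.replace(" ", "_"))
    if new_path != path:
        path.rename(new_path)
    return new_path


# Demonstration in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "Power generation India.csv"
    original.write_text("state,generation\n")
    renamed = underscore_name(original)
    print(renamed.name)  # Power_generation_India.csv
```

After renaming, remember to use the new name everywhere the file is referenced, both where the notebook reads it and in the files list passed to jovian.commit().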

Watch this video; it will help you.


Yes, exactly. I found that out after googling it. Anyway, thank you for your explanation.

Great work! I would recommend writing some explanations about the project in markdown cells.

Sharing is caring… 🙂

I took a look at the project. I think there is still room to improve and to add more visualizations. Also, if I am not mistaken, the project should have a minimum of 5 visualizations?


Hi guys,

Please find my course project on automobile datasets! I have tried to implement most of the data analysis steps here.

Feel free to ask questions or suggest changes. Your feedback is appreciated.


Hi Folks,
I am trying to replace the values in a pandas DataFrame column based on the value of another column in the same row. I am using a for loop for this, as shown below. The DataFrame is called ‘df_new_employmenttype’. The column “Employment” already exists. I have added another column named “EmploymentType”, in which I am trying to put the value “Enthusiast” or “Professional” based on the value in the existing column “Employment”.
Is there any shortcut for this task?
My code with the for loop is below:

for i in range(0, df_new_employmenttype.shape[0]):  # go through each row
    emp = df_new_employmenttype.iloc[i, 0]  # variable for the existing column value
    if emp == 'Independent contractor, freelancer, or self-employed' or emp == 'Employed full-time' or emp == 'Employed part-time':
        df_new_employmenttype.iloc[i, 1] = 'Professional'
    if emp == 'Student' or emp == 'Not employed, but looking for work' or emp == 'Not employed, and not looking for work' or emp == 'Retired':
        df_new_employmenttype.iloc[i, 1] = 'Enthusiast'
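
One possible shortcut is to build a lookup dictionary and apply it with `Series.map`, which labels the whole column in one step. This is a minimal sketch that reuses the column names from the question; the sample rows are made up.

```python
import pandas as pd

# Hypothetical sample with the same column names as in the question.
df_new_employmenttype = pd.DataFrame({
    "Employment": [
        "Employed full-time",
        "Student",
        "Retired",
        "Independent contractor, freelancer, or self-employed",
    ]
})

professional = [
    "Independent contractor, freelancer, or self-employed",
    "Employed full-time",
    "Employed part-time",
]
enthusiast = [
    "Student",
    "Not employed, but looking for work",
    "Not employed, and not looking for work",
    "Retired",
]

# Map every known Employment value to its label; unlisted values become NaN.
labels = {value: "Professional" for value in professional}
labels.update({value: "Enthusiast" for value in enthusiast})
df_new_employmenttype["EmploymentType"] = df_new_employmenttype["Employment"].map(labels)

print(df_new_employmenttype)
```

One difference from the loop: rows whose Employment value is in neither list end up as NaN here, whereas the loop leaves whatever was already in the column.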

To the global zerotopandas braintrust:

Does anyone have any tips on plotting a fixed categorical variable as a bar chart?
I just want to illustrate the count for each of the 2-3 possibilities that this variable stores. The variables are currently in string format.
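
One way that may work is to count the categories with `value_counts` and plot the result as bars. This is a minimal sketch, assuming pandas and matplotlib are installed; the column name `status` and its values are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs as a script; drop this line in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical categorical column stored as strings.
df = pd.DataFrame({"status": ["yes", "no", "yes", "maybe", "yes", "no"]})

# Count how often each category occurs.
counts = df["status"].value_counts()

# Draw one bar per category.
fig, ax = plt.subplots()
counts.plot(kind="bar", ax=ax)
ax.set_xlabel("status")
ax.set_ylabel("count")
```

Because `value_counts` works directly on strings, there is no need to convert the variable to numbers first.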



I’m interested too. I don’t know for sure, but my intuition is that it may work the same way as uploading a data file. Perhaps some of you have tried this.


I may not have the answer for you, but I’m interested in your question.

How about replacing sns.countplot(patients_copy_df.diagnosis) with sns.countplot(x='diagnosis', data=patients_copy_df)? Will this resolve the issue?

Thanks in advance for testing this out for me.