Lecture 4: Analyzing Tabular Data with Pandas

Got it! A trailing ; keeps the object name out of the output. Thank you again!

Thank you for answering! Here @PrajwalPrashanth has pointed out the difference. You can check that too if you prefer.

How can I export the images and plots?
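
One common way to do this is Matplotlib's savefig. Here is a minimal sketch, assuming the plot comes from Matplotlib (which pandas plotting uses under the hood). The numbers and the file name new_cases.png are made up for illustration:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs outside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up numbers, only for illustration
df = pd.DataFrame({"day": [1, 2, 3, 4], "new_cases": [10, 25, 18, 30]})

# DataFrame.plot returns a Matplotlib Axes object
ax = df.plot(x="day", y="new_cases", title="New cases per day")

# Hypothetical output location; any writable path works
out_path = os.path.join(tempfile.gettempdir(), "new_cases.png")
ax.get_figure().savefig(out_path, dpi=150, bbox_inches="tight")
plt.close(ax.get_figure())
```

Inside a Jupyter notebook the matplotlib.use("Agg") line is not needed. The file extension decides the format, so a name ending in .pdf or .svg also works.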

Some basic DataFrame operations were missing, such as removing specific rows, joining new data, and removing non-standard values like the -148 in the example data (rather than just assuming how to handle it).
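
For anyone looking for these, here is a rough sketch of the three operations mentioned above. The data is made up and only loosely shaped like the lecture's table, and tests_df is a hypothetical second table:

```python
import pandas as pd

# Made-up data, loosely shaped like the lecture's COVID table
covid_df = pd.DataFrame({
    "date": ["2020-06-18", "2020-06-19", "2020-06-20", "2020-06-21"],
    "new_cases": [329.0, 331.0, -148.0, 262.0],
})

# 1. Remove specific rows by index label
without_first = covid_df.drop(index=[0])

# 2. Remove non-standard values: keep only rows with non-negative new_cases
cleaned = covid_df[covid_df["new_cases"] >= 0]

# 3. Join new data: a second (hypothetical) table that shares the "date" column
tests_df = pd.DataFrame({
    "date": ["2020-06-18", "2020-06-19", "2020-06-21"],
    "new_tests": [58154, 77701, 40545],
})
merged = cleaned.merge(tests_df, on="date", how="left")

print(merged)
```

None of these calls change covid_df itself. Each one returns a new DataFrame, so the original stays available for comparison.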

Who’s here for the lecture?

Excited for the lecture! :smiley:

Pandas… always fun playing around with analytics!

Such a great library for working with tabular data!

How do I use a dataset from Kaggle as a CSV file?

How can the min of new_cases and new_deaths be negative?
min -148.000000 -31.000000
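
The data itself does not say why. One plausible explanation (an assumption, not confirmed in this thread) is that the source published a downward correction of earlier counts. A quick sketch for finding such rows and, if you decide they are errors, blanking them out; the numbers are made up apart from the two minimums quoted above:

```python
import pandas as pd

# Made-up rows that reproduce the two negative minimums from describe()
covid_df = pd.DataFrame({
    "new_cases": [329.0, -148.0, 262.0],
    "new_deaths": [66.0, 47.0, -31.0],
})

# Rows where either column is negative
negative_rows = covid_df[(covid_df["new_cases"] < 0) | (covid_df["new_deaths"] < 0)]
print(negative_rows)

# Replace negative entries with NaN; all other values are left untouched
masked = covid_df.mask(covid_df < 0)
print(masked.min())
```

Masking with NaN keeps the row (and its date) in place, which can be preferable to dropping it when the other columns are still valid.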

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

Copy the path and pass it to pd.read_csv, for example:
data = pd.read_csv("/kaggle/input/aviation-accident-database-synopses/AviationData.csv")

I'd like to share something: Ditto, a tool I use actively both to avoid losing work and to make copy-pasting multiple pieces of data easier (it also works on pictures etc.). You can download it from the Microsoft Store, and I'm told it also has iOS support. It extends the built-in clipboard to hold as many entries as you set in the app's settings. Say you want to fall back on one of the past 20 copied strings: you open Ditto and select the one you want to paste.
Since it keeps a local clipboard history, individual copied strings can be retrieved later, so be careful if you use it to copy-paste passwords. Other than that you will be fine. :wink:

So the output of pd.read_csv('italy-covid-daywise.csv') differs from that of read_csv('italy-covid-daywise.csv'): the former is a data frame and the latter is a list. (Maybe that is the wrong way to use read_csv; my point is about calling it without pd.)
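
For comparison, here is a small sketch showing that read_csv imported directly from pandas is the same function as pd.read_csv and returns a DataFrame. If a bare read_csv returned a list, it was probably a different function with the same name (for example, one defined earlier in the notebook); that is a guess, since the thread does not show where it came from. The CSV text is made up:

```python
import io

import pandas as pd
from pandas import read_csv  # same function, imported under its bare name

# Made-up CSV text standing in for italy-covid-daywise.csv
csv_text = "date,new_cases\n2020-01-01,0\n2020-01-02,3\n"

df_a = pd.read_csv(io.StringIO(csv_text))
df_b = read_csv(io.StringIO(csv_text))

print(type(df_a).__name__, type(df_b).__name__)
print(df_a.equals(df_b))
```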

So you are basically creating a “View” with the alias “cases_df”?
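
Whether a selection like this shares memory with the original or is an independent copy depends on how it was created and on the pandas version, so it is safer not to rely on either. If you need an independent data frame, calling .copy() makes that explicit. A minimal sketch with made-up data:

```python
import pandas as pd

covid_df = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-02"],
    "new_cases": [0.0, 3.0],
    "new_deaths": [0.0, 1.0],
})

# Explicit copy: changes to cases_df can never reach covid_df
cases_df = covid_df[["date", "new_cases"]].copy()
cases_df.loc[0, "new_cases"] = 99.0

print(covid_df.loc[0, "new_cases"])  # original value is unchanged
print(cases_df.loc[0, "new_cases"])  # only the copy was modified
```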

Instead of using covid_df.at[240, 'new_tests'], can I look up the row by its date (for example, by filtering on the value) when the df is too large to find the index by hand?

yes

df[df['date']=={date}]
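
Spelled out a bit more, here is a sketch of two ways to do this, assuming the date column holds strings. The data and the target date are made up:

```python
import pandas as pd

covid_df = pd.DataFrame({
    "date": ["2020-08-29", "2020-08-30", "2020-08-31"],
    "new_tests": [64294.0, 53541.0, 42583.0],
})

target = "2020-08-30"  # hypothetical date of interest

# Option 1: boolean filter on the date column (returns a DataFrame)
rows = covid_df[covid_df["date"] == target]

# Option 2: make the date the index, then look up a single cell by label
by_date = covid_df.set_index("date")
value = by_date.at[target, "new_tests"]

print(rows)
print(value)
```

Option 2 reads much like the original .at call, only with the date in place of the row number. It expects each date to appear once in the index.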

@aakashns Does NaN stand for Not a Number?

Yes, NaN stands for Not a Number.
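
A few lines to see how NaN behaves in practice (a small sketch with a made-up Series):

```python
import math

import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

print(np.nan == np.nan)        # False: NaN is not equal to itself
print(math.isnan(s.iloc[1]))   # True
print(s.isna().tolist())       # [False, True, False]
print(s.sum())                 # 4.0: sum() skips NaN by default
```

Because NaN never compares equal to anything, use isna() (or math.isnan for a single float) rather than == to detect it.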

Why do I have to install NumPy first and then import it in a new notebook, when the course notebooks have no install command for NumPy?
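
A likely reason (an assumption; the thread does not confirm it) is that the course notebooks run in an environment where NumPy is already installed, while a fresh environment may not have it. Installing is needed once per environment, importing once per notebook. Here is a small check using only the standard library; is_installed is a helper name made up for this sketch:

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the top-level package can be found by the import system."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("numpy"):
    print("numpy is available; 'import numpy as np' will work")
else:
    print("numpy is missing; install it first, e.g. with 'pip install numpy'")
```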