Question on code

Hi guys,
In the lecture video, we have the following code:

year = pd.to_datetime(raw_df.Date).dt.year
train_df = raw_df[year < 2015]

However, I don’t understand how the Series ‘year’ is linked to raw_df.
There is no column named ‘year’, so how can we select the rows with year < 2015 in the second line of code?

Can anyone explain this? Thanks a lot.

You will understand this if you run year < 2015 on its own. year is a Series computed from raw_df's Date column, so it has the same index as raw_df, with one value per row. Comparing it with year < 2015 gives a boolean Series of True and False values, again with that same index. When you pass that boolean Series to raw_df[...], pandas keeps only the rows where the value is True. No column named year is needed; the rows are matched up through the index.
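Here is a small sketch you can run to see each step. The tiny raw_df below is made up for illustration (the dates and the Rainfall column are my own assumptions, not the lecture's data), and mask is just a name I picked for the intermediate result.

```python
import pandas as pd

# Hypothetical stand-in for the lecture's raw_df (made-up values)
raw_df = pd.DataFrame({
    "Date": ["2013-05-01", "2014-12-31", "2015-01-01", "2016-07-04"],
    "Rainfall": [0.0, 3.2, 1.1, 0.4],
})

# A standalone Series, but it carries the same index as raw_df
year = pd.to_datetime(raw_df.Date).dt.year
print(year.index.equals(raw_df.index))

# The comparison yields a boolean Series with that same index
mask = year < 2015
print(mask)

# Boolean indexing keeps only the rows where mask is True
train_df = raw_df[mask]
print(train_df)
```

Printing mask on its own is probably the quickest way to see what raw_df[...] receives in the second line.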

OK, so the link between year and raw_df comes from the first line. I thought it created a separate Series with no connection to raw_df. Thanks a lot.
