In Lesson 5 - Gradient Boosting with XGBoost, I want to split the “PromoInterval” column and create a separate column for each of the values Feb, May, Aug and Nov. After that, I want each value to go into its matching column. For example, the Feb column should show “Yes” when the row contains Feb and “No” otherwise. How can I do this?
There are a few ways to do it. To begin with, I would suggest splitting the column into its components (e.g. “Feb” or “May”) using .str.split(), for instance:
df['new_column_name'] = df['old_column_name'].str.split(',')
Start with that and see how far you can get. Good luck.
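To make the suggestion concrete, here is a minimal sketch of what .str.split() produces. The sample rows and the "PromoMonths" column name are made up for illustration; only "PromoInterval" comes from the question.

```python
import pandas as pd

# Hypothetical sample rows; the real PromoInterval column comes from the lesson's dataset.
df = pd.DataFrame({"PromoInterval": ["Jan,Apr,Jul,Oct", "Feb,May,Aug,Nov", None]})

# Each row becomes a list of month abbreviations; missing values stay missing.
df["PromoMonths"] = df["PromoInterval"].str.split(",")

# With expand=True the pieces are spread across separate positional columns instead.
parts = df["PromoInterval"].str.split(",", expand=True)
```

Note that expand=True gives columns by position (first month, second month, and so on), not one column per month name, so it is only a starting point for the "Feb column" idea.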
Thanks for your valuable answer. It worked. But I would like to know how to split into a specific column for a specific value. For example, I want to create a “Feb” column that all the “Feb” values move into. Is there a way to do this?
Hi. You can always create a new column Feb (e.g. df['Feb'] = pd.Series()). Then you can create a function (e.g. a lambda function) that populates the new column using str.contains(), e.g. df['promo'].str.contains("Feb"), which returns True or False for each row.
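A minimal sketch of the str.contains() approach, assuming the column is called "PromoInterval" as in the question and using made-up sample rows. Here np.where stands in for the lambda function to turn the True/False result into “Yes”/“No”:

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows for illustration.
df = pd.DataFrame({"PromoInterval": ["Jan,Apr,Jul,Oct", "Feb,May,Aug,Nov", None]})

# True where the string contains "Feb"; na=False treats missing values as no match.
has_feb = df["PromoInterval"].str.contains("Feb", na=False)

# Map the boolean mask to the "Yes"/"No" labels asked for in the question.
df["Feb"] = np.where(has_feb, "Yes", "No")
```

Assigning the result directly creates the Feb column, so the empty pd.Series() step can be skipped.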
Better still: you can create a function to find all unique elements within Promo (e.g. by extracting the string and splitting its elements using .split()). Then, for each unique element, you can create a new DataFrame column (or Series) and populate it using the technique mentioned above.
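One way this could look, again assuming the column is "PromoInterval" and using made-up sample rows. This sketch compares whole split elements rather than calling str.contains(), which is a small variation on the technique above:

```python
import pandas as pd

# Hypothetical sample rows for illustration.
df = pd.DataFrame({"PromoInterval": ["Jan,Apr,Jul,Oct", "Feb,May,Aug,Nov", None]})

# One row per (original row, month) pair; the original index is repeated.
exploded = df["PromoInterval"].str.split(",").explode()

# Every distinct month that appears anywhere in the column.
unique_months = exploded.dropna().unique()

for month in unique_months:
    # True for an original row if any of its split elements equals this month.
    in_row = (exploded == month).groupby(level=0).any()
    df[month] = in_row.map({True: "Yes", False: "No"})
```

Comparing whole elements avoids accidental substring matches. If 0/1 indicator columns are acceptable instead of “Yes”/“No”, df["PromoInterval"].str.get_dummies(sep=",") builds them in a single call.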