Day 21: May the ML columns be with you
Moving from categories to values.
- Create one or more additional columns that convert the income ranges to numbers.
- Create one or more additional columns that convert the age ranges to numbers.
- Create one or more additional columns that convert the school groupings to numbers.
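One possible approach to the three conversions above is sketched here. The column names, range labels, and school ordering are assumptions made for illustration, not the actual survey values, and using the midpoint of each range is only one of several reasonable choices.

```python
import pandas as pd

# Hypothetical survey data; column names and labels are assumptions.
df = pd.DataFrame({
    "age": ["18-24", "25-34", "35-44", "25-34"],
    "income": ["$0 - $24,999", "$25,000 - $49,999", "$50,000 - $99,999", None],
    "school": ["High school", "Bachelor", "Graduate", "Bachelor"],
})

def range_midpoint(label):
    """Return the midpoint of a 'low-high' range label, or NaN if missing."""
    if pd.isna(label):
        return float("nan")
    # Keep only digits and the separator, e.g. "$25,000 - $49,999" -> "25000-49999".
    cleaned = "".join(ch for ch in label if ch.isdigit() or ch == "-")
    low, high = cleaned.split("-")
    return (int(low) + int(high)) / 2

# New columns sit next to the originals, so nothing is overwritten.
df["age_mid"] = df["age"].map(range_midpoint)
df["income_mid"] = df["income"].map(range_midpoint)

# School groupings have a natural order, so an explicit ordinal mapping is used.
# The ordering below is an assumed example.
school_order = {"High school": 1, "Bachelor": 2, "Graduate": 3}
df["school_num"] = df["school"].map(school_order)

print(df[["age_mid", "income_mid", "school_num"]])
```

Open-ended labels such as "60+" would need their own rule, since they have no upper bound to average.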
Why are we converting the columns to numerical values?
One-hot encoding
- One-hot encode all columns that have categories.
- Convert all yes/no responses to 1/0 numeric.
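The two steps above might look like the following minimal sketch. The column names `gender` and `seen_any` are hypothetical stand-ins for the real survey columns.

```python
import pandas as pd

# Hypothetical data; the column names are assumptions.
df = pd.DataFrame({
    "gender": ["Male", "Female", "Female"],
    "seen_any": ["Yes", "No", "Yes"],
})

# Map yes/no to 1/0 first, so the column is already numeric
# and does not need to be one-hot encoded.
df["seen_any"] = df["seen_any"].map({"Yes": 1, "No": 0})

# One-hot encode only the listed categorical columns.
encoded = pd.get_dummies(df, columns=["gender"])

print(encoded.columns.tolist())
```

Passing `columns=` explicitly keeps control over which columns are expanded, rather than letting pandas pick every text column.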
Which columns are going to be problematic for the function pd.get_dummies()?
What is the default behavior of pd.get_dummies() for the columns that are created? Should we change that behavior?
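A small experiment can help when exploring these questions. The sketch below uses a made-up `location` column (an assumption, not a survey column) and compares a default call with one that sets `drop_first` and `dtype`, two real parameters of `pd.get_dummies()`. Whether to change the defaults depends on the model you plan to train.

```python
import pandas as pd

# Hypothetical column with one missing response.
df = pd.DataFrame({"location": ["West", "East", None, "West"]})

# Default call: one column per category, named "<column>_<category>".
default = pd.get_dummies(df)
print(default.dtypes)

# Adjusted call: drop the first category and request integer columns.
adjusted = pd.get_dummies(df, drop_first=True, dtype=int)
print(adjusted)
```

Note how the row with the missing response is handled in each result; the `dummy_na` parameter controls whether missing values get their own column.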
Updated on 12 Oct 2020