From the course: Applied Machine Learning: Ensemble Learning (2022)
Cleaning up categorical features - Python Tutorial
- [Instructor] In this video, we'll continue the cleaning we started in the last video, but now we'll focus on the categorical features. Just a reminder: if you're picking this up as a new notebook, you'll need to rerun the prior cells to ensure that you have the appropriate data and packages for the code that we'll be covering. Let's start by creating an indicator for the cabin feature. As a quick reminder, running the isnull and sum methods to count the missing values, we see that cabin is missing for 687 people in this data set. Recall that for age, we simply replaced the missing values with the average age. We were able to take that approach because age was missing at random, so the fact that a value was missing didn't really carry any information. It's not the same in this case, so let's see why. Let's take a quick look at survival rate based on whether cabin is missing in this data set or not. We'll…
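The steps described so far (counting missing values, comparing survival rate by whether cabin is missing, and building an indicator) might look roughly like the sketch below. This is not the instructor's notebook code: the toy DataFrame, the column names `Survived`, `Age` and `Cabin`, and the new column name `Cabin_ind` are all assumptions made for illustration, and the numbers will not match the 687 missing values in the course data set.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the course data set.
# Column names and values are assumptions, not the real data.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 0, 1],
    "Age": [22.0, 38.0, np.nan, 35.0, np.nan, 54.0],
    "Cabin": [np.nan, "C85", "C123", np.nan, np.nan, "E46"],
})

# Count missing values per column (the isnull + sum step).
missing_counts = df.isnull().sum()
print(missing_counts)

# Compare the mean survival rate for rows where Cabin is missing
# against rows where it is present.
cabin_missing = df["Cabin"].isnull()
survival_by_cabin_missing = df.groupby(cabin_missing)["Survived"].mean()
print(survival_by_cabin_missing)

# If the two rates differ, the missingness itself is informative,
# so keep it as a binary indicator: 0 = cabin missing, 1 = cabin present.
df["Cabin_ind"] = np.where(df["Cabin"].isnull(), 0, 1)
print(df[["Cabin", "Cabin_ind"]])
```

Encoding the indicator as 0 for missing and 1 for present is one reasonable choice; the opposite encoding would work equally well as long as it is used consistently.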