From the course: Data Preparation, Feature Engineering, and Augmentation for AI Models
Detecting and managing outliers
- [Instructor] One of the steps in data preparation is understanding outliers. Outliers are atypical records that deviate from normal patterns. If we don't address them, they can skew our analysis and lead to poor-quality models and, ultimately, to poor decisions. So what we want to do is employ a comprehensive approach that captures the different types of outliers that any single method might miss. There are different kinds of outliers that we want to detect. There could be data errors; for example, we might have data quality issues in our point-of-sale systems. There's also the potential for fraud. And of course, sometimes outliers are legitimate; they just happen to be unusual transactions. So we need to be able to identify all of these scenarios. Now, there are a variety of detection methods. One is known as the Z-score method, which identifies values that are beyond three standard…
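The transcript cuts off before the Z-score method is fully described, so the following is only a minimal sketch of the general technique, not the instructor's code. It assumes the conventional cutoff of three standard deviations from the mean, uses the population standard deviation, and works on made-up transaction amounts; the function name `zscore_outliers` and the sample data are hypothetical.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=3.0):
    """Return (index, value, z) for points whose |z-score| exceeds threshold.

    Assumption: threshold=3.0 is the common rule of thumb, not a value
    confirmed by the course material.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values are identical, so nothing deviates from the mean.
        return []
    flagged = []
    for i, v in enumerate(values):
        z = (v - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, v, z))
    return flagged


# Hypothetical point-of-sale amounts: 19 ordinary sales and one extreme value.
amounts = [22, 25, 19, 30, 27, 24, 21, 26, 23, 28,
           20, 29, 25, 22, 24, 27, 23, 26, 21, 500]

for index, value, z in zscore_outliers(amounts):
    print(f"row {index}: amount={value}, z={z:.2f}")
```

A flagged row is only a candidate. As the transcript notes, it could be a data error, fraud, or a legitimate but unusual transaction, so the result usually needs review before anything is dropped. Note also that a single extreme value inflates the mean and standard deviation it is measured against, which is one reason to combine this check with other methods.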
Contents
- Data exploration and initial quality assessment (4m 49s)
- Detecting and managing missing data (5m 13s)
- Detecting and managing outliers (3m)
- Challenge: Assess data quality of a dataset (18s)
- Solution: Assess data quality of a dataset (23s)
- Feature engineering: Scaling and normalizing data (4m 47s)
- Feature engineering: Categorical encodings (4m 8s)
- Challenge: Apply feature engineering to a dataset (18s)
- Solution: Apply feature engineering to a dataset (16s)