From the course: Data Preparation, Feature Engineering, and Augmentation for AI Models

Detecting and managing outliers

- [Instructor] One of the steps in data preparation is understanding outliers. Outliers are atypical records that deviate from normal patterns. If we don't address them, we run the risk that they skew our analysis and lead to poor-quality models and, ultimately, to poor decisions. So what we want to do is employ a comprehensive approach that captures the different types of outliers that any single method might miss. For example, there are different kinds of outliers that we want to detect. There could be data errors, such as data quality issues in our point-of-sale systems. There's also the potential for fraud. And of course, sometimes outliers are legitimate; they just happen to be unusual transactions. So we need to be able to identify all of these different scenarios. Now, there are a variety of detection methods. One is known as the Z-score method, and that basically identifies values that are beyond three standard…
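To make the Z-score idea concrete, here is a minimal sketch. The transcript is cut off at this point, so the sketch assumes the conventional form of the rule: flag any value whose distance from the mean exceeds three standard deviations. The function name `zscore_outliers`, the `threshold` parameter, and the sample transaction amounts are hypothetical and do not come from the course.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=3.0):
    """Return (index, value, z) for each value whose |z| exceeds threshold.

    Assumption: z = (value - mean) / population standard deviation,
    with the conventional cutoff of 3.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return []
    flagged = []
    for i, v in enumerate(values):
        z = (v - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, v, z))
    return flagged


# Hypothetical transaction amounts: 20 ordinary sales and one very large one.
amounts = [20.0] * 10 + [21.0] * 10 + [500.0]
flagged = zscore_outliers(amounts)
for index, value, z in flagged:
    print(f"row {index}: amount={value} z={z:.2f}")
```

One caveat with this approach: a single extreme value inflates both the mean and the standard deviation, and with the population standard deviation no z-score in a sample of n points can exceed the square root of n - 1. With 10 or fewer points, nothing can ever pass a cutoff of 3, so the rule is only meaningful on reasonably large samples.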
