This document provides an overview of data mining and the CRISP-DM methodology. It introduces key terminology and potential applications, and uses a Venn diagram to compare data mining, knowledge discovery, big data analytics, statistics, and data science. CRISP-DM is explained in six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. The document also covers techniques for data exploration, cleaning, transformation, and dimensionality reduction, and summarizes common machine learning algorithms, model selection factors, and assessment metrics.