Decision trees are a non-parametric hierarchical classification technique represented as a structure of nodes and edges: internal nodes test attributes, and leaf nodes assign class labels. They are built by a greedy recursive algorithm that repeatedly partitions the training records into purer subsets, choosing each split with a metric such as information gain or Gini impurity. Overfitting is controlled by pre-pruning (stopping growth early, e.g. with minimum-size or minimum-gain thresholds) or post-pruning (growing the tree fully and then simplifying parts of it). Decision trees are valued for their interpretability, but the greedy construction finds only a locally optimal tree, and unpruned trees are prone to overfitting.
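The greedy split step can be sketched in plain Python (a minimal illustration on hypothetical toy data, not a full tree builder): each candidate threshold on one feature is scored by the size-weighted Gini impurity of the two resulting subsets, and the purest split is kept.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over the class frequencies p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(xs, ys):
    """One greedy step: try each value of the feature as a threshold and
    return the (threshold, impurity) pair minimizing the size-weighted
    Gini impurity of the left/right subsets."""
    best = (None, float("inf"))
    for t in sorted(set(xs))[:-1]:  # last value cannot split the data
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        n = len(ys)
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical toy data: one numeric feature, two classes.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = ["a", "a", "a", "b", "b", "b"]
threshold, impurity = best_split(xs, ys)  # splits at 3.0 with impurity 0.0
```

A full builder would apply `best_split` recursively to each subset until a stopping (pre-pruning) threshold is met, which is exactly where the greedy, locally optimal character of the algorithm comes from.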