Probabilistic methods are widely used in machine learning. They estimate probabilities from training data using approaches such as maximum likelihood estimation (MLE) and maximum a posteriori (MAP) estimation. MLE chooses the parameter values that make the observed data most likely, while MAP additionally incorporates a prior distribution over the parameters and maximizes the resulting posterior. These methods were illustrated with the example of estimating the probability that a coin comes up heads from coin flip data. Parametric methods assume the data come from a known family of distributions defined by a small set of parameters estimated from the data, such as the mean and variance of a Gaussian. They can be used for classification by estimating a class-conditional density and a class prior for each class and combining them via Bayes' rule.
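To make these ideas concrete, the following is a minimal sketch in Python. It assumes a small synthetic coin-flip dataset and a Beta(a, b) prior (the hyperparameter values here are illustrative, not from the original examples), shows the closed-form MLE and MAP estimates of the heads probability, and then fits 1-D Gaussians to two synthetic classes to classify a point by the largest posterior.

```python
import numpy as np

# --- MLE and MAP for a coin's probability of heads (illustrative data) ---
flips = np.array([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])  # 1 = heads, 0 = tails
n_heads, n = flips.sum(), len(flips)

# MLE: the theta that maximizes the likelihood is the empirical heads frequency.
theta_mle = n_heads / n

# MAP with an assumed Beta(a, b) prior: the prior acts like (a - 1) pseudo-heads
# and (b - 1) pseudo-tails, pulling the estimate toward the prior mean.
a, b = 2.0, 2.0  # hypothetical hyperparameters chosen for illustration
theta_map = (n_heads + a - 1) / (n + a + b - 2)

print(f"MLE estimate of P(heads): {theta_mle:.3f}")
print(f"MAP estimate of P(heads): {theta_map:.3f}")

# --- Parametric (Gaussian) classification sketch ---
# Assume each class generates 1-D data from a Gaussian; estimate each class's
# mean, variance, and prior from labeled data, then classify a new point by
# the largest value of p(x | c) * P(c).
rng = np.random.default_rng(0)
x0 = rng.normal(loc=-1.0, scale=1.0, size=100)  # synthetic class-0 samples
x1 = rng.normal(loc=2.0, scale=1.5, size=50)    # synthetic class-1 samples

def fit_gaussian(x):
    # MLE for a Gaussian: sample mean and (biased) sample variance.
    return x.mean(), x.var()

def log_gaussian_pdf(x, mean, var):
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

params = [fit_gaussian(x0), fit_gaussian(x1)]
priors = [len(x0) / (len(x0) + len(x1)), len(x1) / (len(x0) + len(x1))]

def classify(x):
    # Pick the class with the largest log posterior (up to a shared constant).
    scores = [log_gaussian_pdf(x, m, v) + np.log(p)
              for (m, v), p in zip(params, priors)]
    return int(np.argmax(scores))

print("Predicted class for x = 0.5:", classify(0.5))
```

With a symmetric Beta(2, 2) prior, the MAP estimate is shrunk toward 0.5 relative to the MLE, which matches the intuition that the prior encodes a mild belief that the coin is fair.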