These lecture notes cover machine learning and deep learning, with a focus on regression, classification, and neural networks. For regression and classification, they derive the optimal functions that minimize the expected error and express them in terms of conditional expectations. They also bound the generalization error of functions learned by empirical risk minimization. For neural networks, they discuss the ability of networks to approximate functions and bound the VC-dimension of networks with multiple hidden layers.
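
As a brief illustration of how an optimal predictor relates to a conditional expectation, the standard textbook statements are sketched below. The summary does not name the loss functions used in the notes, so the squared loss for regression, the 0-1 loss for classification, binary labels in {0, 1}, and the symbols f^*, g^*, and \eta are all assumptions made here for illustration.

```latex
% Assumptions (not stated in the summary): squared loss for regression,
% 0-1 loss and labels Y in {0,1} for classification.
\begin{align*}
f^*(x) &= \mathbb{E}[Y \mid X = x]
  && \text{minimizes } \mathbb{E}\big[(Y - f(X))^2\big], \\
\eta(x) &= \mathbb{E}[Y \mid X = x] = \mathbb{P}(Y = 1 \mid X = x), \\
g^*(x) &= \mathbf{1}\{\eta(x) \ge 1/2\}
  && \text{minimizes } \mathbb{P}\big(g(X) \ne Y\big).
\end{align*}
```

Under these assumptions, both optimal functions are determined by the same conditional expectation: the regression function is that expectation itself, and the classifier thresholds it at 1/2.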