This document discusses several normalization techniques used in deep learning models, including batch normalization, layer normalization, recurrent batch normalization, and group normalization. It provides an overview of how each technique works along with its advantages and disadvantages. For example, it explains that batch normalization stabilizes the distribution of inputs to hidden layers during training, which permits higher learning rates and faster training. Later sections discuss research investigating whether batch normalization's effectiveness is truly due to reducing internal covariate shift during training or instead because it smooths the optimization landscape.
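As a rough illustration of the normalization step batch normalization applies to a layer's inputs, here is a minimal NumPy sketch of the forward pass in training mode. The function name, shapes, and parameter names are illustrative assumptions, not drawn from the document; it omits the running statistics used at inference time.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Sketch of a batch normalization forward pass (training mode).

    x:     (batch_size, num_features) activations entering a hidden layer
    gamma: (num_features,) learned scale
    beta:  (num_features,) learned shift
    """
    # Statistics are computed per feature, across the mini-batch.
    mean = x.mean(axis=0)
    var = x.var(axis=0)

    # Normalize each feature to roughly zero mean and unit variance...
    x_hat = (x - mean) / np.sqrt(var + eps)

    # ...then restore representational capacity with a learned affine map.
    return gamma * x_hat + beta

# Example: a mini-batch of 4 samples with 3 features each.
x = np.random.randn(4, 3) * 5.0 + 2.0
out = batch_norm_forward(x, gamma=np.ones(3), beta=np.zeros(3))
print(out.mean(axis=0), out.var(axis=0))  # ~0 mean, ~1 variance per feature
```

The stabilized per-feature distribution of the outputs is what the summary refers to: because downstream layers see inputs with a consistent scale, larger learning rates become usable and training tends to converge faster.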