This document summarizes improved training methods for Wasserstein GANs (WGANs). It begins with an overview of GANs and their limitations, such as vanishing gradients. It then introduces WGANs, which replace the Jensen-Shannon divergence with the Wasserstein distance to provide meaningful gradients throughout training. However, the weight clipping WGANs use to enforce the Lipschitz constraint restricts the critic's function space and can cause optimization difficulties. The document proposes replacing weight clipping with a gradient penalty that pushes the norm of the critic's gradient toward 1, enforcing the Lipschitz constraint more softly. It also suggests sampling the penalty points from an estimate of the optimal coupling, rather than from independently drawn real and generated samples, to better match the theory. Experimental results show that the gradient-penalty approach improves the stability and performance of WGANs on image generation tasks.
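To make the gradient-penalty idea concrete, here is a minimal sketch in PyTorch (an assumption, since the document shows no code). The hypothetical helper `gradient_penalty` penalizes the critic's gradient norm away from 1 at points interpolated between real and generated batches; the coefficient `lambda_gp` defaults to 10, a commonly used value.

```python
import torch

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    """WGAN-GP term: push the critic's gradient norm toward 1
    at random points interpolated between real and fake samples."""
    batch_size = real.size(0)
    # One interpolation coefficient per sample, broadcast over remaining dims.
    eps = torch.rand(batch_size, *([1] * (real.dim() - 1)), device=real.device)
    # Detach so the interpolates are leaves we can differentiate with respect to.
    interpolates = (eps * real + (1 - eps) * fake).detach().requires_grad_(True)
    scores = critic(interpolates)
    grads = torch.autograd.grad(
        outputs=scores,
        inputs=interpolates,
        grad_outputs=torch.ones_like(scores),
        create_graph=True,  # keep the graph so the penalty itself is differentiable
    )[0]
    grad_norms = grads.reshape(batch_size, -1).norm(2, dim=1)
    return lambda_gp * ((grad_norms - 1) ** 2).mean()
```

In a training loop, this term would simply be added to the critic's loss, e.g. `loss = fake_scores.mean() - real_scores.mean() + gradient_penalty(critic, real, fake)`.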