This document discusses improving language modeling using densely connected recurrent neural networks. It motivates the use of densely connected layers rather than plain stacked layers in LSTM language models. In the densely connected architecture, the output of each layer is passed, via skip connections, to the input of every subsequent layer instead of only to the next layer. Experiments show that densely connecting all layers substantially improves performance, matching the results of stacked LSTMs with six times fewer parameters.
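The sketch below illustrates one way such dense skip connections could be wired into a small LSTM language model, written here in PyTorch. The class name, hyperparameters, and the choice to feed the concatenation of all layer outputs into the final decoder are illustrative assumptions, not details taken from the document.

```python
import torch
import torch.nn as nn

class DenselyConnectedLSTM(nn.Module):
    """Minimal sketch of a densely connected LSTM stack for language modeling.

    Assumption: each layer receives the concatenation of the word embeddings
    and the outputs of all preceding layers, and the decoder sees the
    concatenation of everything. Hyperparameter values are placeholders.
    """

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128, num_layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.layers = nn.ModuleList()
        input_dim = embed_dim
        for _ in range(num_layers):
            # Each layer's input width grows by hidden_dim because it also
            # receives the outputs of all earlier layers via skip connections.
            self.layers.append(nn.LSTM(input_dim, hidden_dim, batch_first=True))
            input_dim += hidden_dim
        # Decoder over the embeddings plus every layer's output.
        self.decoder = nn.Linear(input_dim, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer word ids
        features = self.embed(tokens)          # (batch, seq_len, embed_dim)
        for lstm in self.layers:
            out, _ = lstm(features)            # (batch, seq_len, hidden_dim)
            # Dense connection: append this layer's output to the running
            # feature map that feeds all later layers and the decoder.
            features = torch.cat([features, out], dim=-1)
        return self.decoder(features)          # (batch, seq_len, vocab_size)
```

Because every layer sees the embeddings plus all earlier outputs, the hidden layers can be kept narrow while still exposing rich features to the decoder, which is one intuition for why a densely connected stack can match a wider plain stack with far fewer parameters.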