The document presents an audio tagging approach based on densely connected convolutional networks (DenseNet). It addresses three challenges posed by the DCASE 2018 challenge data: variable input lengths, insufficient training data, and unreliable annotations. The proposed solutions are segment-wise learning, batch-wise loss masking, and mixup data augmentation, which together improve model performance. The architecture is an end-to-end DenseNet framework with a multi-head softmax, intended to give more stable classification and better convergence.
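
As a rough illustration of two of the techniques named above, the following NumPy sketch shows mixup and one possible form of loss masking. It is not the authors' implementation. The function names `mixup` and `masked_mean_loss` and the parameters `alpha` and `drop_fraction` are hypothetical, and the reading of "batch-wise loss masking" as "ignore the highest-loss examples in each mini-batch, since they are the most likely to carry wrong labels" is an assumption that the summary does not spell out.

```python
import numpy as np


def mixup(x, y, alpha=1.0, rng=None):
    """Blend each example with a randomly chosen partner from the same batch.

    x: array of shape (batch, ...) holding input features.
    y: array of shape (batch, classes) holding one-hot or soft labels.
    alpha: Beta distribution parameter (hypothetical default).
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)      # single mixing weight in [0, 1]
    perm = rng.permutation(len(x))    # partner index for every example
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y + (1.0 - lam) * y[perm]
    return x_mix, y_mix


def masked_mean_loss(per_example_loss, drop_fraction=0.25):
    """Average the batch loss after ignoring the highest-loss examples.

    ASSUMPTION: this is one plausible reading of batch-wise loss masking;
    drop_fraction is an illustrative value, not one taken from the source.
    """
    losses = np.asarray(per_example_loss, dtype=float)
    n_drop = int(len(losses) * drop_fraction)
    keep = np.ones(len(losses), dtype=bool)
    if n_drop > 0:
        # argsort is ascending, so the last n_drop indices are the largest losses
        keep[np.argsort(losses)[-n_drop:]] = False
    return losses[keep].mean()
```

In a training loop, `mixup` would be applied to each mini-batch before the forward pass, and `masked_mean_loss` would replace the plain mean over per-example losses. Because mixup produces soft labels, the loss function has to accept label vectors that are not one-hot.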