This document discusses methods for compressing deep neural network (DNN) models for embedded vision applications, focusing on pruning, distillation, and neural architecture search. It introduces a new pruning method, Greedy Inter-layer Order with Random Selection of Intra-layer Units (GRS), and presents NeuroGRS, a software tool that automates the method. Experimental results indicate that GRS, which integrates structured pruning approaches, improves model performance and efficiency.