The document presents a dynamic semantic-based graph convolutional network (DS-GCN) for skeleton-based human action recognition. It addresses limitations of traditional methods by capturing both low-level skeleton representations and high-level action representations. It introduces a temporal-causal SFD network (TC-SFDN) architecture and emphasizes the importance of encoding dynamic semantic information of joints and edges. Experimental results show that DS-GCN significantly outperforms state-of-the-art methods on the NTU RGB+D and Kinetics-400 benchmark datasets.