The document presents Gated-VIGAT, an efficient bottom-up method for event recognition and explanation that uses a frame selection policy and a gating mechanism to reduce computational complexity while maintaining recognition performance. The approach achieves competitive recognition accuracy on two datasets, outperforms top-down methods, and provides object-level explanations. Future work aims to improve efficiency further, including through improvements to object detection and feature extraction.