
BEVHeight++: Toward Robust Visual Centric 3D Object Detection


L Yang, T Tang, J Li, K Yuan, K Wu, P Chen, L Wang, Y Huang, L Li, X Zhang, K Yu

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025 • ieeexplore.ieee.org

While most recent autonomous driving systems focus on developing perception methods for ego-vehicle sensors, an alternative approach is often overlooked: using intelligent roadside cameras to extend perception beyond the visual range. We find that state-of-the-art vision-centric detection methods perform poorly on roadside cameras. This is because these methods mainly focus on recovering depth with respect to the camera center, where the depth difference between a car and the ground shrinks quickly as the distance increases. In this paper, we propose a simple yet effective approach, dubbed BEVHeight++, to address this issue. In essence, we regress the height above the ground to obtain a distance-agnostic formulation that eases the optimization of camera-only perception methods. By incorporating both height and depth encoding techniques, we achieve a more accurate and robust projection from 2D to BEV space. On popular 3D detection benchmarks for roadside cameras, our method surpasses all previous vision-centric methods by a significant margin. In the ego-vehicle scenario, BEVHeight++ surpasses depth-only methods by +2.8% NDS and +1.7% mAP on the nuScenes test set, with even higher gains of +9.3% NDS and +8.8% mAP on the nuScenes-C benchmark with object-level distortion. Consistent and substantial performance improvements are also achieved on the KITTI, KITTI-360, and Waymo datasets.
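The geometric argument in the abstract can be illustrated with a toy calculation. The following is a minimal sketch, not the authors' method: it assumes flat ground, a camera mounted at a hypothetical height of 6 m, a car roof at a hypothetical 1.5 m, and it measures depth as the straight-line range from the camera center. The function names and all numeric values are illustrative assumptions. Under these assumptions, the depth gap between the roof and the ground point beneath it shrinks with distance, while the height gap stays constant.

```python
import math

# Hypothetical setup (not from the paper): camera 6 m above flat ground,
# car roof 1.5 m above the ground.
CAMERA_HEIGHT_M = 6.0
ROOF_HEIGHT_M = 1.5


def depth_gap(distance_m, camera_height_m=CAMERA_HEIGHT_M, roof_height_m=ROOF_HEIGHT_M):
    """Range to the ground point minus range to the roof point,
    both at the same horizontal distance from the camera."""
    ground_range = math.hypot(distance_m, camera_height_m)
    roof_range = math.hypot(distance_m, camera_height_m - roof_height_m)
    return ground_range - roof_range


def height_gap(distance_m, roof_height_m=ROOF_HEIGHT_M):
    """Height of the roof above the ground; independent of distance."""
    return roof_height_m


distances_m = [10.0, 50.0, 100.0, 200.0]
depth_gaps = [depth_gap(d) for d in distances_m]
height_gaps = [height_gap(d) for d in distances_m]

for d, dg, hg in zip(distances_m, depth_gaps, height_gaps):
    print(f"distance={d:6.1f} m  depth gap={dg:.3f} m  height gap={hg:.3f} m")
```

In this toy model the depth gap falls roughly in proportion to 1/distance, so a depth regressor has less and less signal separating a far car from the road, whereas a height target keeps the same separation at any range.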


