[BMVC 2022] DA-CIL: Towards Domain Adaptive Class-Incremental 3D Object Detection

DA-CIL: Towards Domain Adaptive Class-Incremental
3D Object Detection
Ziyuan Zhao1,2, Mingxi Xu1,3, Peisheng Qian1, Ramanpreet Singh Pahwa1,2, and Richard Chang1
Presenter: Zhao Ziyuan
1 Institute for Infocomm Research (I2R), A*STAR, Singapore
2 Artificial Intelligence Analytics & Informatics (AI3), A*STAR, Singapore
3 Nanyang Technological University, Singapore
Paper ID: 916

Introduction – Catastrophic Forgetting & Domain Shift
- Deep learning-based 3D object detection on point clouds has received considerable attention.
- Catastrophic forgetting: the performance of DL models on old classes tends to decrease substantially
when the models are trained on novel classes.
- Domain shift: DL models trained on one domain (i.e., the source domain) often suffer severe
performance degradation when evaluated on another domain (i.e., the target domain).

Motivation – Domain Adaptive Class-incremental Learning
- Existing class-incremental learning (CIL) and unsupervised domain adaptation (UDA) methods address these
challenges separately, ignoring that catastrophic forgetting and domain shift occur concurrently in 3D
object detection on point clouds of real-world environments.
- In DA-CIL, we aim to leverage labeled old classes in the source domain and labeled new classes in the
target domain to bridge the domain gap and adapt to new classes without forgetting old classes in the
target domain.

Method – Transformation-consistent Meta-hallucination
- We propose a new CIL paradigm to enable Domain Adaptive Class-Incremental Learning for 3D object
detection.
- Dual-domain copy-paste (CP) addresses both in-domain data scarcity and cross-domain distribution
shift.
- Dual-teacher knowledge distillation (KD) enforces multi-level consistency regularization (MLC).

Method (1) – Dual-Domain Copy-Paste
- To relieve data scarcity and reduce the domain gap at the data level, we extensively leverage
copy-paste (CP) augmentation techniques for creating cross-domain and in-domain point clouds.
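
A minimal sketch of the copy-paste idea, assuming an object is given as its points plus an axis-aligned box (center xyz, size xyz). The function name `copy_paste`, the random placement rule, and the box format are illustrative assumptions, not the authors' implementation. Pasting an object from a source-domain scene into a target-domain scene gives a cross-domain sample, and pasting within the same domain gives an in-domain sample.

```python
import numpy as np

def copy_paste(scene_points, object_points, object_box, rng):
    """Paste one object into a scene at a random horizontal position.

    scene_points:  (N, 3) xyz points of the receiving scene.
    object_points: (M, 3) xyz points of the object to paste.
    object_box:    (6,) hypothetical box format: center xyz followed by size xyz.
    """
    # Pick a random xy location inside the scene's horizontal extent (assumed rule).
    lo = scene_points[:, :2].min(axis=0)
    hi = scene_points[:, :2].max(axis=0)
    target_xy = rng.uniform(lo, hi)

    # Translate the object horizontally; its height and size are left unchanged.
    shift = np.zeros(3)
    shift[:2] = target_xy - object_box[:2]
    pasted_points = object_points + shift
    pasted_box = object_box.copy()
    pasted_box[:3] += shift

    # The augmented scene is the original scene plus the pasted object.
    mixed_points = np.concatenate([scene_points, pasted_points], axis=0)
    return mixed_points, pasted_box

rng = np.random.default_rng(0)
target_scene = rng.uniform(0.0, 5.0, size=(100, 3))   # stand-in for a target-domain scene
source_object = rng.uniform(0.0, 1.0, size=(20, 3))   # stand-in for a source-domain object
source_box = np.array([0.5, 0.5, 0.5, 1.0, 1.0, 1.0])
mixed, new_box = copy_paste(target_scene, source_object, source_box, rng)
```

A fuller version would likely also check the pasted box for collisions with existing objects; that step is omitted here.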

Method (2) – Multi-Level Consistency
- To facilitate the dual-teacher knowledge transfer, we propose multi-level consistency regularization from
two aspects.
- Statistics-level consistency (SC): alignment of batch normalization parameters.
- Bounding box-level consistency: center-, class-, and size-level consistency losses.
Na Zhao, Tat-Seng Chua, and Gim Hee Lee. SESS: Self-ensembling semi-supervised 3D object detection. CVPR, pp. 11079–11087, 2020.
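
As a rough illustration of the center-level part of the bounding box-level consistency, the sketch below matches each student-predicted center to its nearest teacher-predicted center and penalizes the squared distance in both directions. This nearest-neighbor formulation is an assumption in the spirit of the cited SESS work, and the function name `center_consistency` is hypothetical. The returned alignment could then be reused to compare classes and sizes of matched boxes.

```python
import numpy as np

def center_consistency(student_centers, teacher_centers):
    """Symmetric nearest-neighbor squared distance between two sets of box centers.

    student_centers: (S, 3), teacher_centers: (T, 3).
    Returns the loss and, for each student box, the index of its nearest teacher box.
    """
    # Pairwise squared Euclidean distances, shape (S, T).
    diff = student_centers[:, None, :] - teacher_centers[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)

    student_to_teacher = d2.min(axis=1)   # nearest teacher for every student box
    teacher_to_student = d2.min(axis=0)   # nearest student for every teacher box
    alignment = d2.argmin(axis=1)

    loss = student_to_teacher.mean() + teacher_to_student.mean()
    return float(loss), alignment

student = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [0.0, 10.0, 0.0]])
teacher = student + np.array([1.0, 0.0, 0.0])   # teacher centers shifted by 1 along x
loss, alignment = center_consistency(student, teacher)
```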

Method (3) – Dual-Teacher Training
- Cross-domain teacher: the student model learns the underlying knowledge of base classes from a
cross-domain teacher via a distillation loss (the squared Euclidean distance between the classification
logits of different bounding boxes).
- In-domain teacher: meanwhile, the exponential moving average (EMA) in-domain teacher helps the student
model capture structure- and semantics-invariant information in objects via a consistency loss.
- The mixed labels (pseudo base + real novel) are transformed by the same augmentation step that is
applied to the augmented source domain, and are used to compute a supervised loss with the VoteNet backbone.
Charles R. Qi, Or Litany, Kaiming He, and Leonidas J. Guibas. Deep Hough voting for 3D object detection in point clouds. CVPR, pp. 9277–9286, 2019.
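
The two teacher signals can be sketched as follows. This is a simplified illustration under stated assumptions: the logits are taken to be per-box base-class logits of shape (boxes, classes), the student and teacher logits are assumed to be already paired box by box, and the names `distillation_loss` and `ema_update` as well as the decay value are hypothetical.

```python
import numpy as np

def distillation_loss(student_logits, teacher_logits):
    """Squared Euclidean distance between classification logits, averaged over boxes.

    Both inputs have shape (num_boxes, num_base_classes) and are assumed to be aligned.
    """
    per_box = ((student_logits - teacher_logits) ** 2).sum(axis=1)
    return float(per_box.mean())

def ema_update(teacher_params, student_params, decay=0.99):
    """Exponential moving average: teacher <- decay * teacher + (1 - decay) * student."""
    return {
        name: decay * teacher_params[name] + (1.0 - decay) * student_params[name]
        for name in teacher_params
    }

# Toy example: two boxes, three base classes.
kd = distillation_loss(np.zeros((2, 3)), np.ones((2, 3)))

# Toy example: one parameter tensor named "w" (assumed name).
teacher = {"w": np.array([1.0, 1.0])}
student = {"w": np.array([0.0, 2.0])}
teacher = ema_update(teacher, student, decay=0.9)
```

The EMA teacher is never updated by gradients in this sketch; it only tracks a smoothed copy of the student weights.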

Experimental Results
- Dataset
- We evaluate our method on two 3D object detection datasets: ScanNet (source) and SUN RGB-D
(target).
- Five categories (bathtub, bed, bookshelf, chair, desk) were selected as base classes.
- Five additional categories (dresser, nightstand, sofa, table, toilet) in SUN RGB-D were selected as
novel classes in the target domain.
[Figure: example scenes from the ScanNet dataset and the SUN RGB-D dataset]
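
For reference, the class split above can be written down as a small configuration. The class names come from the slide; the index ordering (base classes first) and the helper name `is_base_class` are assumptions made for illustration.

```python
# Base classes: labeled in the source domain (ScanNet).
BASE_CLASSES = ["bathtub", "bed", "bookshelf", "chair", "desk"]
# Novel classes: labeled in the target domain (SUN RGB-D).
NOVEL_CLASSES = ["dresser", "nightstand", "sofa", "table", "toilet"]

# Assumed ordering: base classes take indices 0-4, novel classes 5-9.
ALL_CLASSES = BASE_CLASSES + NOVEL_CLASSES
CLASS_TO_INDEX = {name: index for index, name in enumerate(ALL_CLASSES)}

def is_base_class(name):
    """Return True if the class belongs to the base (old) classes."""
    return CLASS_TO_INDEX[name] < len(BASE_CLASSES)
```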

Experimental Results (1) – Comparison with different methods
- For class-incremental learning, we evaluated two naive transfer learning baselines, Freeze-and-add
and Fine-tune. We also used SDCoT as a strong CIL baseline.
- For unsupervised domain adaptation, we implemented a popular recent self-ensembling method,
mean teacher (MT).
- Our method outperforms the compared methods by a clear margin in CIL under domain shift.
Na Zhao, Tat-Seng Chua, and Gim Hee Lee. SESS: Self-ensembling semi-supervised 3D object detection. CVPR, pp. 11079–11087, 2020.

Experimental Results (2) – Comparison with different augmentation techniques
- We implemented two popular augmentation techniques for comparison, Mix3D and CutMix.
- Mix3D: combines two point clouds into a mixed point cloud via concatenation, which results in
excessive overlaps in the mixed point clouds.
- CutMix: randomly replaces patches of point clouds with patches from other point clouds, which
unnecessarily introduces extra context from the source domain.
- Our method outperforms the two augmentation techniques.
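
The two baseline augmentations described above can be sketched in a few lines. These are simplified illustrations rather than the reference implementations: `mix3d` is plain concatenation as stated on the slide, and `cutmix` assumes the replaced patch is an axis-aligned box given by hypothetical `box_min` and `box_max` corners.

```python
import numpy as np

def mix3d(points_a, points_b):
    """Mix3D-style mixing: concatenate two point clouds into one scene."""
    return np.concatenate([points_a, points_b], axis=0)

def cutmix(points_a, points_b, box_min, box_max):
    """CutMix-style mixing: replace the patch of A inside a box with the patch of B."""
    inside_a = np.all((points_a >= box_min) & (points_a <= box_max), axis=1)
    inside_b = np.all((points_b >= box_min) & (points_b <= box_max), axis=1)
    # Keep A outside the box, and take B inside the box (objects and context alike).
    return np.concatenate([points_a[~inside_a], points_b[inside_b]], axis=0)

cloud_a = np.array([[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]])
cloud_b = np.array([[0.5, 0.5, 0.5], [3.0, 3.0, 3.0]])
mixed_all = mix3d(cloud_a, cloud_b)
mixed_patch = cutmix(cloud_a, cloud_b, np.zeros(3), np.ones(3))
```

Unlike these two operations, object-level copy-paste moves only the points belonging to a labeled object, so it avoids both the full-scene overlap and the extra transferred context.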

Experimental Results (3) – Ablation Analysis
- Without cross-domain copy-paste, the model achieves a lower result on base classes, which supports
the effectiveness of cross-domain copy-paste augmentation for base class recognition.
- Similarly, the results without in-domain copy-paste indicate that in-domain copy-paste
improves model performance on novel classes.
- Moreover, statistics-level consistency improves performance by around 0.8% across all classes.

Experimental Results (4) – Qualitative Comparison (good cases)
- Our method can accurately detect both old and novel classes in the target domain, overcoming the
domain shift in old classes.

Experimental Results (4) – Qualitative Comparison (failure cases)
- The model fails to detect the bookshelf because the geometric structure of the bookshelf is lost
in the point cloud data.
- The desk is misclassified as a table, likely due to geometric similarities between the two classes.
- The predicted bounding box of the desk deviates from the ground truth, which can be attributed to
the large size and irregular shape of the object.

Conclusions
- We identify and explore a novel domain adaptive class-incremental learning paradigm for 3D object
detection.
- We propose a novel 3D object detection framework, in which we design a novel dual-domain
copy-paste augmentation method to address both in-domain data scarcity and cross-domain distribution shift.
- We further enhance the dual-teacher knowledge distillation with multi-level consistency between
different domains.
- We achieve superior performance in extensive experiments.

Thanks
Zhao_Ziyuan@i2r.a-star.edu.sg
https://jacobzhaoziyuan.github.io/
