Cluster-based Pseudo-labeling for Semi-Supervised LiDAR Semantic Segmentation
Abstract
The costly annotation process has driven the development of semi-supervised learning (SSL) approaches. Existing semi-supervised LiDAR segmentation methods typically process entire point clouds directly, aiming to assign labels to all points at the scene scale. However, the large number of points, combined with their sparse and irregular nature, makes it challenging to learn scene-level optimization objectives, especially in SSL settings where labeled data are insufficient. This paper presents a Cluster-based pseudo-LAbeling Semi-Supervised technique, called CLASS. CLASS is designed to divide point clouds into several small, pure clusters, thereby decomposing challenging scene-scale segmentation task into more manageable cluster-scale classification and segmentation tasks, enabling the generation of high-quality pseudo labels for unlabeled data. CLASS possesses three key properties. i) Task simplicity: our pseudo-labeling process is based on simpler cluster-scale classification and segmentation tasks, resulting in ease of learning. ii) Labeling effectiveness: CLASS can generate pseudo-labels comparable to ground truth using only approximately 10% labeled data. iii) Universal versatility: CLASS exhibits flexibility regarding LiDAR representations (e.g., BEV, voxel, and range view). Comprehensive experiments on popular LiDAR segmentation benchmarks demonstrate its superiority.