Real-Time 3-D Semantic Scene Parsing With LiDAR Sensors

Abstract

This article proposes a novel deep-learning framework, called RSSP, for real-time 3-D scene understanding with LiDAR sensors. To this end, we introduce new sparse strided operations based on a sparse tensor representation of point clouds. Compared with conventional convolution operations, the time and space complexity of our sparse strided operations is proportional to the number of occupied voxels N rather than the input spatial size r³ (often N ≪ r³ for LiDAR data). This enables our method to process point clouds at high resolutions (e.g., 2048³) at high speed (130 ms to classify a single frame from a Velodyne HDL-64). The main structure comprises a CNN model built upon our sparse strided operations and a conditional random field (CRF) model that imposes spatial consistency on the final predictions. A highly parallel implementation of our system is presented for both CPU-GPU and CPU-only environments. The efficiency and effectiveness of our approach are demonstrated on two public datasets (Semantic3D.net and KITTI). The experimental results and benchmark tests show that our system can be applied effectively to online 3-D data analysis, with accuracy comparable to or better than state-of-the-art methods.
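The key complexity claim above — that sparse operations scale with the number of occupied voxels N rather than the dense grid size r³ — can be illustrated with a minimal sketch. This is not the paper's implementation; it only shows the sparse representation the argument rests on: a LiDAR sweep voxelized into a set of occupied integer voxel keys, where N stays bounded by the point count even at a 2048³ nominal resolution.

```python
import random

def voxelize(points, voxel_size):
    """Map (x, y, z) points to the set of occupied integer voxel indices.

    Storage is proportional to the number of occupied voxels N, not to
    the dense grid size r^3: empty voxels are simply never materialized.
    """
    return {tuple(int(c // voxel_size) for c in p) for p in points}

random.seed(0)
# Synthetic stand-in for a single LiDAR sweep: 100k points in a 100 m cube.
points = [(random.uniform(-50.0, 50.0),
           random.uniform(-50.0, 50.0),
           random.uniform(-50.0, 50.0)) for _ in range(100_000)]

r = 2048                                   # dense resolution per axis
occupied = voxelize(points, voxel_size=100.0 / r)
print(f"N = {len(occupied)} occupied voxels vs r^3 = {r**3} dense cells")
```

A convolution-like operation over this structure visits only the keys in `occupied` (and their neighbor offsets), so its cost tracks N; a dense implementation would touch all r³ cells regardless of occupancy.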
