With the prevalence of 3D sensors and the growth of computing power, 3D computer vision has attracted increasing research attention. Compared with 2D images, 3D data makes it easy to decouple objects from the background. Research in 3D computer vision therefore has significant potential and market value in practical applications such as autonomous driving, virtual reality, robotics, and remote sensing and mapping.
The point cloud is the most intuitive representation of 3D data. How to effectively extract features from 3D point clouds and classify them has long been a research hotspot in 3D computer vision.
In order to use convolutional neural networks (CNNs) to extract the internal features of point clouds, some methods transform the point cloud into multi-view images or 3D voxels. However, these transformations usually discard inherent geometric information and incur high algorithmic complexity. In recent years, more and more researchers have proposed recognition algorithms that operate directly on the raw point cloud without any conversion, thereby avoiding explicit information loss. Due to the complexity of point cloud data, raw point cloud recognition based on deep learning still faces several challenges:
(1) Because point clouds are unordered and unstructured, standard CNNs cannot be applied directly to raw point cloud feature extraction. Some methods use multi-layer perceptrons to extract per-point features, but their extraction capacity is limited and the relational information between points is ignored;
(2) In practical application scenarios, point clouds are not necessarily aligned, so the network must be robust to rotation, translation, and scale changes of the point cloud;
(3) Noise is inevitable when sensors collect 3D point clouds, so the algorithm needs a certain degree of noise robustness. This thesis makes an in-depth study of how to solve these problems:
This thesis proposes a new network, named DSCNet, for extracting 3D point cloud context information.
In this method, the shape context algorithm is used to partition the spherical local neighborhood of each point, and the feature information of the surrounding points in the spherical neighborhood is then blurred in a manner similar to geometric blur, forming a coarse-to-fine feature. Finally, a hierarchical network is designed to learn shape features from this context information. Compared with other strong methods, DSCNet not only has advantages in context information extraction, but is also more robust to point cloud noise and density variation. In addition, DSCNet is invariant to scale and translation of the point cloud, and rotation invariant under certain conditions. This thesis demonstrates its effectiveness and competitiveness through extensive experiments.
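The spherical shape-context partitioning described above can be sketched as follows. This is a minimal illustrative example, not the thesis's implementation: the bin counts `n_r`, `n_az`, `n_el` and the log-radial edges are assumptions chosen for clarity.

```python
import numpy as np

def shape_context_bins(center, neighbors, n_r=3, n_az=4, n_el=2, r_max=1.0):
    """Assign each neighbor of `center` to a spherical shape-context bin.

    The spherical neighborhood is partitioned along radius, azimuth, and
    elevation; the bin counts here are illustrative assumptions, not the
    configuration used in DSCNet.
    """
    rel = neighbors - center                      # coordinates relative to the center point
    r = np.linalg.norm(rel, axis=1)               # radial distance
    az = np.arctan2(rel[:, 1], rel[:, 0])         # azimuth in [-pi, pi]
    el = np.arcsin(np.clip(rel[:, 2] / np.maximum(r, 1e-12), -1.0, 1.0))  # elevation

    # Log-spaced radial edges emphasize nearby points, as in 2D shape context.
    r_edges = np.logspace(-1, 0, n_r + 1) * r_max
    r_bin = np.clip(np.searchsorted(r_edges, r) - 1, 0, n_r - 1)
    az_bin = ((az + np.pi) / (2 * np.pi) * n_az).astype(int) % n_az
    el_bin = np.clip(((el + np.pi / 2) / np.pi * n_el).astype(int), 0, n_el - 1)

    # Flatten the (radius, azimuth, elevation) indices into one bin id per neighbor.
    return (r_bin * n_az + az_bin) * n_el + el_bin
```

A histogram over these bin ids gives a coarse local shape descriptor; the network described in the text learns features from such partitioned neighborhoods rather than from a fixed histogram.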
This thesis also proposes a new point cloud learning network, named DRINet, which is rotation invariant. The method uses distance information to represent the spatial position of each point in the point cloud. The core idea is to use the distances between points to describe the internal rigid structure of the local point cloud. Because this representation considers only the relative relationships between points, it is invariant to any rotation in 3D space. To improve feature extraction within this rotation-invariant representation framework, this thesis further proposes an attention-based adaptive feature fusion module (AFM) for point cloud feature extraction in DRINet. The performance of the method is evaluated on multiple public datasets, and the results demonstrate its effectiveness.
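The rotation invariance of the distance-based representation can be verified directly: pairwise distances depend only on relative positions, so they are unchanged by any rigid rotation or translation. The sketch below illustrates this property; it is a minimal example of the distance idea, not DRINet's actual feature encoding.

```python
import numpy as np

def pairwise_distance_features(points):
    """Represent a local point cloud by its pairwise distance matrix.

    Distances depend only on relative positions, so this representation is
    invariant to any rigid rotation or translation of the point set.
    """
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def random_rotation(rng):
    """Sample a random proper 3D rotation matrix via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q *= np.sign(np.diag(r))          # make the factorization unique
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]            # ensure det(q) == +1 (rotation, not reflection)
    return q
```

Applying an arbitrary rotation `R` and translation `t` to the points leaves `pairwise_distance_features` unchanged, which is exactly the invariance the distance-based representation relies on.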