**Part 5: PointNet: Pointcloud Classification and Segmentation**
**Goals:** Understand and implement a PointNet-based architecture for classification and segmentation of point clouds.

# PointNet

[PointNet](https://arxiv.org/pdf/1612.00593.pdf) is a deep learning architecture for processing point clouds. Its two building blocks are:

- **Pooling**: A pooling operation (e.g. max, average) is a simple permutation-invariant operator.
- **Pointwise processing**: Applying the same function to each point is trivially permutation equivariant.

Suppose we have a permutation-invariant operator **g**. Then **f(g([h(x1), h(x2), ..., h(xn)]))** is also permutation invariant for any **h** and **f**: reordering the points only reorders the values **h(xi)**, and **g** ignores that order. Hence, we can process point clouds with the following architecture, which is the core of PointNet:

1. **h**: Process points independently (e.g. MLP)
2. **g**: Pooling operation (e.g. max, average)
3. **f**: Process the global feature (e.g. MLP)

A final remark: the pointwise outputs prior to the pooling operation can be used as local embeddings, whereas the output of the pooling operation can be used as a global embedding, as shown in the diagram below.
[Figures omitted: example rows labelled **Vase** and **Lamp**.]
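The **h**, **g**, **f** decomposition described above can be sketched as a forward pass in NumPy. This is a minimal illustration, not the model trained for this report: the layer widths, the three output classes, the random (untrained) weights, and the helper names `init_mlp`, `mlp`, and `pointnet_cls` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random (untrained) weights for a small MLP, e.g. sizes = [3, 64, 128]."""
    return [(rng.normal(0.0, 0.1, (i, o)), np.zeros(o))
            for i, o in zip(sizes[:-1], sizes[1:])]

def mlp(x, params):
    """Apply the MLP along the last axis, with ReLU between layers."""
    for k, (W, b) in enumerate(params):
        x = x @ W + b
        if k < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

# Hypothetical sizes: 3-D input points, 128-D embeddings, 3 classes.
h_params = init_mlp([3, 64, 128])   # h: shared per-point MLP
f_params = init_mlp([128, 64, 3])   # f: MLP on the pooled feature

def pointnet_cls(points):
    """points: (N, 3) array. Returns class logits plus both embeddings."""
    local = mlp(points, h_params)        # (N, 128) local embeddings
    global_feat = local.max(axis=0)      # (128,)   g = max pooling
    logits = mlp(global_feat, f_params)  # (3,)     class scores
    return logits, local, global_feat

points = rng.normal(size=(1000, 3))
logits, local, global_feat = pointnet_cls(points)

# Shuffling the points should leave the logits unchanged
# (up to floating-point tolerance).
shuffled = points[rng.permutation(len(points))]
logits_shuffled, _, _ = pointnet_cls(shuffled)
print(np.allclose(logits, logits_shuffled))
```

Because the same MLP is applied to each row and the max is taken over the point axis, the order of the rows in `points` does not affect `logits`.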
# Pointcloud Segmentation
PointNet performs point cloud segmentation with a simple trick:
concatenate each point's local embedding with the global embedding, and pass the result through fully connected layers to obtain class probabilities for each point.
This is illustrated in the diagram below:
[Figures omitted: segmentation visualizations for five examples, including a **Prediction** row.]

**Accuracy** | 94.65% | 88.90% | 91.79% | 47.53% | 40.85%
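The concatenation trick used for segmentation can be sketched as follows. As before, this is a hedged NumPy illustration with random weights rather than the trained model; `NUM_PARTS`, the layer widths, and the helper names are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random (untrained) weights for a small MLP."""
    return [(rng.normal(0.0, 0.1, (i, o)), np.zeros(o))
            for i, o in zip(sizes[:-1], sizes[1:])]

def mlp(x, params):
    """Apply the MLP along the last axis, with ReLU between layers."""
    for k, (W, b) in enumerate(params):
        x = x @ W + b
        if k < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

NUM_PARTS = 6  # hypothetical number of segmentation classes

h_params = init_mlp([3, 64, 128])                    # per-point MLP
seg_params = init_mlp([128 + 128, 128, NUM_PARTS])   # per-point head

def pointnet_seg(points):
    """points: (N, 3) array. Returns (N, NUM_PARTS) class probabilities."""
    local = mlp(points, h_params)                      # (N, 128)
    global_feat = local.max(axis=0)                    # (128,)
    tiled = np.broadcast_to(global_feat, local.shape)  # (N, 128)
    combined = np.concatenate([local, tiled], axis=1)  # (N, 256)
    return softmax(mlp(combined, seg_params))

points = rng.normal(size=(2048, 3))
probs = pointnet_seg(points)
labels = probs.argmax(axis=1)  # predicted class per point
print(probs.shape, labels.shape)
```

Each point sees its own local feature together with the same global feature, so the per-point output is permutation equivariant: reordering the input points reorders the rows of `probs` in the same way.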
# Robustness Analysis
Two experiments were conducted to assess the robustness of the learned models.
The first rotated the input point clouds by varying angles,
and the second varied the number of points in the input point clouds.
Model accuracy was measured in both experiments,
and the results are presented quantitatively and qualitatively in the figures below.
Both the classification and segmentation models were similarly sensitive to rotation,
with accuracy decreasing linearly as the rotation angle increased.
However, the classification model was more robust to the number of points than the segmentation model:
its accuracy remained relatively consistent until the number of points dropped below 100,
whereas the segmentation model's accuracy stayed consistent only until the number of points dropped below 1000.
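The two perturbations used in these experiments can be sketched as below. The report does not state the rotation axis or the sampling scheme, so rotation about the z-axis and uniform sampling without replacement are assumptions, as are the helper names `rotate_z`, `subsample`, and `accuracy`.

```python
import numpy as np

def rotate_z(points, angle_deg):
    """Rotate an (N, 3) point cloud about the z-axis (assumed axis)."""
    t = np.deg2rad(angle_deg)
    c, s = np.cos(t), np.sin(t)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    return points @ R.T

def subsample(points, num_points, rng):
    """Keep num_points points, sampled uniformly without replacement.

    Also returns the chosen indices so per-point labels can be
    subsampled consistently for segmentation.
    """
    idx = rng.choice(len(points), size=num_points, replace=False)
    return points[idx], idx

def accuracy(pred, target):
    """Fraction of predictions that match the targets."""
    return float(np.mean(pred == target))

rng = np.random.default_rng(0)
cloud = rng.normal(size=(10000, 3))  # stand-in for a real point cloud

# Settings taken from the figure labels in this section.
rotated = {angle: rotate_z(cloud, angle) for angle in (120, 180)}
reduced = {n: subsample(cloud, n, rng)[0] for n in (10000, 1000, 500, 100)}
print({n: pts.shape for n, pts in reduced.items()})
```

Each perturbed cloud would then be fed to the trained model, and `accuracy` compared against the unperturbed baseline.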

[Figures omitted: qualitative results for Case 1, Case 2, and Case 3 at rotation angles of 120 and 180 degrees, and at 10k, 1000, 500, and 100 input points.]