Data Preparation

Overall Structure

└── data 
    └── sets
        │── nuscenes
        │── semantickitti
        │── scribblekitti
        └── waymo_open

nuScenes

To prepare the nuScenes-lidarseg dataset, download the data, annotations, and other files from https://www.nuscenes.org/download. Unpack the compressed file(s) into /data/sets/nuscenes. The resulting folder structure should look like this:

└── nuscenes  
    ├── Usual nuscenes folders (e.g. samples, sweeps)
    │
    ├── lidarseg
    │   └── v1.0-{mini, test, trainval} <- contains the .bin files; a .bin file 
    │                                      contains the labels of the points in a 
    │                                      point cloud (note that v1.0-test does not 
    │                                      have any .bin files associated with it)
    │
    └── v1.0-{mini, test, trainval}
        ├── Usual files (e.g. attribute.json, calibrated_sensor.json etc.)
        ├── lidarseg.json  <- contains the mapping of each .bin file to the token   
        └── category.json  <- contains the categories of the labels (note that the 
                              category.json from nuScenes v1.0 is overwritten)
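
Before generating the info files, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of this repository: the function name `missing_nuscenes_paths` and the default root path are assumptions chosen for illustration, and it only checks the entries shown in the tree above.

```python
from pathlib import Path


def missing_nuscenes_paths(root, version="v1.0-trainval"):
    """Return the expected entries (relative to root) that do not exist."""
    root = Path(root)
    expected = [
        Path("samples"),
        Path("sweeps"),
        Path("lidarseg") / version,
        Path(version) / "lidarseg.json",
        Path(version) / "category.json",
    ]
    return [p.as_posix() for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    # The root path is an assumption; point it at your own copy of the data.
    for path in missing_nuscenes_paths("./data/sets/nuscenes"):
        print("missing:", path)
```

An empty result means every entry listed in the tree was found for the chosen version.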

📝 Create nuScenes Dataset

For fully-supervised training and evaluation:
- The provided script generates the dataset information used for training and validation. Create the .pkl info files by running:
```
python ./tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes
```

For semi-supervised training and evaluation:

Download the pre-processed .pkl files from here and put them under the nuscenes/ folder.

└── nuscenes
    ├── Usual nuscenes folders (e.g. samples, sweeps)
    ├── ...   
    ├── nuscenes_infos_train.pkl
    ├── nuscenes_infos_val.pkl
    ├── ...
    ├── nuscenes_infos_train.10.pkl
    ├── nuscenes_infos_train.10-unlabeled.pkl
    └── ...
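
To check that a generated or downloaded .pkl file is readable, you can inspect its top-level container. This is a minimal sketch: the helper `summarize_info_file` is hypothetical, and the internal layout of the info files is not described in this document, so the sketch reports only the container type and its keys or length.

```python
import pickle
from pathlib import Path


def summarize_info_file(path):
    """Load a pickled info file and describe its top-level container."""
    with Path(path).open("rb") as f:
        # Only unpickle files from a source you trust.
        info = pickle.load(f)
    summary = {"type": type(info).__name__}
    if isinstance(info, dict):
        summary["keys"] = sorted(str(k) for k in info)
    elif isinstance(info, (list, tuple)):
        summary["length"] = len(info)
    return summary
```

For example, `summarize_info_file("./data/sets/nuscenes/nuscenes_infos_train.pkl")` would describe the training info file, assuming it sits at that path.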

SemanticKITTI

To prepare the SemanticKITTI dataset, download the data, annotations, and other files from http://semantic-kitti.org/dataset. Unpack the compressed file(s) into /data/sets/semantickitti and re-organize the data structure. The resulting folder structure should look like this:

└── semantickitti  
    └── sequences
        ├── velodyne <- contains the .bin files; a .bin file contains the points in a point cloud
        │    │── 00
        │    │── ···
        │    └── 21
        ├── labels   <- contains the .label files; a .label file contains the labels of the points in a point cloud
        │    │── 00
        │    │── ···
        │    └── 10
        ├── calib
        │    │── 00
        │    │── ···
        │    └── 21
        └── semantic-kitti.yaml
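
The two file types above can be read as follows. This is a minimal sketch using only the standard library, assuming the usual SemanticKITTI encoding: each point in a .bin file is four little-endian float32 values (x, y, z, intensity), and each entry in a .label file is one little-endian uint32 whose lower 16 bits hold the semantic label and upper 16 bits the instance id. The function names are hypothetical; in practice `numpy.fromfile` is the more common choice for large scans.

```python
import struct


def read_points(bin_path):
    """Read a scan as a list of (x, y, z, intensity) tuples."""
    with open(bin_path, "rb") as f:
        data = f.read()
    return list(struct.iter_unpack("<4f", data))


def read_labels(label_path):
    """Read per-point labels as (semantic_id, instance_id) pairs."""
    with open(label_path, "rb") as f:
        data = f.read()
    # Lower 16 bits: semantic label. Upper 16 bits: instance id.
    return [(v & 0xFFFF, v >> 16) for (v,) in struct.iter_unpack("<I", data)]
```

A scan and its label file should yield lists of the same length, which is a quick consistency check after re-organizing the folders.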

📝 Create SemanticKITTI Dataset

For fully-supervised training and evaluation:
- The provided script generates the dataset information used for training and validation. Create the .pkl info files by running:
```
python ./tools/create_data.py semantickitti --root-path ./data/semantickitti --out-dir ./data/semantickitti --extra-tag semantickitti
```

For semi-supervised training and evaluation:

Download the pre-processed .pkl files from here and put them under the semantickitti/ folder.

└── semantickitti
    ├── sequences
    ├── semantickitti_infos_train.pkl
    ├── semantickitti_infos_val.pkl
    ├── ...
    ├── semantickitti_infos_train.10.pkl
    ├── semantickitti_infos_train.10-unlabeled.pkl
    └── ...

ScribbleKITTI

To prepare the ScribbleKITTI dataset, download the annotations from https://data.vision.ee.ethz.ch/ouenal/scribblekitti.zip. You only need to download these annotation files (~118.2 MB); the point cloud data is the same as in SemanticKITTI. Unpack the compressed file(s) into /data/sets/scribblekitti and re-organize the data structure. The resulting folder structure should look like this:

└── scribblekitti 
    └── sequences
        └── scribbles <- contains the .label files; a .label file contains the scribble labels of the points in a point cloud
             │── 00
             │── ···
             └── 10
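
Because the scribble annotations are paired with the SemanticKITTI scans, it is worth checking that every scan in a sequence has a scribble label file. The sketch below assumes that a .label file shares its file stem with the corresponding .bin scan (for example `000000.label` and `000000.bin`); the helper name `missing_scribble_labels` and the paths in the usage note are illustrative assumptions.

```python
from pathlib import Path


def missing_scribble_labels(scribble_seq_dir, velodyne_seq_dir):
    """Return stems of scans that have no matching scribble .label file."""
    scans = {p.stem for p in Path(velodyne_seq_dir).glob("*.bin")}
    labels = {p.stem for p in Path(scribble_seq_dir).glob("*.label")}
    return sorted(scans - labels)
```

For sequence 00 under the layouts shown in this document, the call would be `missing_scribble_labels("./data/sets/scribblekitti/sequences/scribbles/00", "./data/sets/semantickitti/sequences/velodyne/00")`.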

📝 Create ScribbleKITTI Dataset

Since ScribbleKITTI uses the same training data as SemanticKITTI, follow the SemanticKITTI procedure above.

Waymo Open

Coming soon.

References

Please cite the corresponding paper(s) if you use these datasets.

nuScenes

@article{fong2022panopticnuscenes,
    author = {W. K. Fong and R. Mohan and J. V. Hurtado and L. Zhou and H. Caesar and O. Beijbom and A. Valada},
    title = {Panoptic nuScenes: A Large-Scale Benchmark for LiDAR Panoptic Segmentation and Tracking},
    journal = {IEEE Robotics and Automation Letters},
    volume = {7},
    number = {2},
    pages = {3795--3802},
    year = {2022}
}

@inproceedings{caesar2020nuscenes,
    author = {H. Caesar and V. Bankiti and A. H. Lang and S. Vora and V. E. Liong and Q. Xu and A. Krishnan and Y. Pan and G. Baldan and O. Beijbom},
    title = {nuScenes: A Multimodal Dataset for Autonomous Driving},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    pages = {11621--11631},
    year = {2020}
}

SemanticKITTI

@inproceedings{behley2019semantickitti,
    author = {J. Behley and M. Garbade and A. Milioto and J. Quenzel and S. Behnke and C. Stachniss and J. Gall},
    title = {SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
    pages = {9297--9307},
    year = {2019}
}

@inproceedings{geiger2012kitti,
    author = {A. Geiger and P. Lenz and R. Urtasun},
    title = {Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    pages = {3354--3361},
    year = {2012}
}

ScribbleKITTI

@inproceedings{unal2022scribble,
    author = {O. Unal and D. Dai and L. Van Gool},
    title = {Scribble-Supervised LiDAR Semantic Segmentation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    pages = {2697--2707},
    year = {2022}
}

Waymo Open

@inproceedings{sun2020waymoopen,
    author = {P. Sun and H. Kretzschmar and X. Dotiwalla and A. Chouard and V. Patnaik and P. Tsui and J. Guo and Y. Zhou and Y. Chai and B. Caine and V. Vasudevan and W. Han and J. Ngiam and H. Zhao and A. Timofeev and S. Ettinger and M. Krivokon and A. Gao and A. Joshi and Y. Zhang and J. Shlens and Z. Chen and D. Anguelov},
    title = {Scalability in Perception for Autonomous Driving: Waymo Open Dataset},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    pages = {2446--2454},
    year = {2020}
}
