Occupancy Generation

Overview

The occupancy_generation/ module is used for processing the raw collected data collected from CARLA to generate high-quality derived modalities. This includes rectifying sensor artifacts to produce refined semantic depth maps, generating surface normals, and constructing dense and resampled panoptic occupancy grids.

Running

We provide a set of bash scripts to automate the generation and visualization pipelines. You can run these scripts directly from the repository root.

Step 1: Modality Generation

To generate all modalities (refined semantic depth, normals, and panoptic occupancy at 0.1m resolution) once for all, run:

bash occupancy_generation/gen_modalities.sh

You can also generate required modalities by separately running the following commands:

# Semantic Depth Generation
python occupancy_generation/generators/gen_sem_depth.py
# Normal Map Generation
python occupancy_generation/generators/gen_normal.py
# Panoptic Occupancy Generation
python occupancy_generation/generators/gen_pano_occ.py

Step 2: Occupancy Resampling

The raw occupancy grids (vs_0_1) can be resampled into different spatial extents and resolutions. Use following command to resample the occupancy grids for forward-view or surround-view tasks:

bash occupancy_generation/resample_occ.sh

You can also customize your required occ GT format

This script uses generators/resample_occ.py to downsample the occupancy into predefined configurations like vs_0_2_forward_view and vs_0_4_surround_view.

Step 3: Visualization

To visualize the generated scene meshes and occupancy grids, run:

bash occupancy_generation/vis_modalities.sh

Occupancy Resolution Settings

Definitions

Voxel Size (m): Physical size of each voxel along the (x, y, z) axes.
Voxel Origin (m): The coordinate of the minimum corner of the voxel grid in LiDAR Coordinate System.
Voxel Range (m): The spatial coverage of the voxel grid along each axis in LiDAR Coordinate System.
Volume Size: The resolution of the occupancy grid.

Configuration Table

Occupancy grid configurations under different resolutions
Config Type	Voxel Size	Voxel Origin	Voxel Range	Volume Size
vs_0_1 (Raw)	0.1	[-25.6, -25.6, -3]	[[-25.6, 51.2], [-25.6, 25.6], [-3, 10]]	[768, 512, 130]
vs_0_2_forward_view	0.2	[0, -25.6, -2.4]	[[0, 51.2], [-25.6, 25.6], [-2.4, 4]]	[256, 256, 32]
vs_0_4_surround_view	0.4	[-25.6, -25.6, -2.4]	[[-25.6, 25.6], [-25.6, 25.6], [-2.4, 4]]	[128, 128, 16]

Notes

The Volume Size can be computed as the spatial extent divided by the voxel size.
The configuration vs_0_2_forward_view follows the spatial setup of KITTI-360-SSCBench benchmark, focusing on the single-view occupancy prediction task. The configuration vs_0_4_surround_view follows the spatial setup of Occ3D-nuScenes benchmark, focusing on the surround-view occupancy prediction task. Please note that the Voxel Range is slightly different with these two benchmarks.

To generate your customized occupancy ground truth with a different voxel size / voxel origin / volume size, run the following command:

cd data_collection
# e.g., to produce vs_0_2_forward_view
python occ_downsample.py \
  --dataset_dir ../data/CarlaOccV1 \
  --voxel_size 0.2 \
  --save_dir_name vs_0_2_forward_view \
  --voxel_origin 0 -25.6 -2.4 \
  --volume_size 256 256 32

Panoptic Occupancy Labels

Label Format

To reduce spatial redundancy, we adopt a sparse format for the occupancy labels and use the .npz format to store them. The ground truth labels contain the following fields:

occupancy: Sparse occupancy labels in shape of [N, 4], where each row is [x, y, z, encoded_label] representing the voxel indices and its label.
voxel_size: See above.
voxel_origin: See above.
volume_size: See above. It is used to decode the occupancy labels back to the dense format.

Label Encoding and Decoding

To unify the storage format of the occupancy labels, we encode the semantic labels and instance labels into a single combined label of uint16. The encoding and decoding are defined as follows:

encoded_label = semantic_label * 1000 + instance_label % 1000
semantic_label, instance_label = encoded_label // 1000, encoded_label % 1000

Notes

For non-instance classes (i.e., stuff classes), the instance label is set to 0.
The semantic classes should be numbered in range of [0, 64].
The instance labels should be numbered in range of [0, 999]. Instance labels should be unique within each semantic class, and can be reused in different scenes.