BDD100K 10K

Overview

This BDD100K_10K package contains 10K images for semantic segmentation, instance segmentation, and panoptic segmentation. For legacy reasons, not all of these images have corresponding videos in the BDD100K dataset. The package is therefore not a subset of the BDD100K images, even though the two overlap significantly.

Data Annotation

There are three annotation types: semantic segmentation, instance segmentation, and panoptic segmentation. Each annotation type is provided in both JSON and bitmask formats.

Data Format

Folder Structure

BDD100K_10K
├── images
│   └── 10k
│       ├── test
│       │   ├── <image_name.jpg>
│       │   └── ...
│       ├── train
│       │   ├── <image_name.jpg>
│       │   └── ...
│       └── val
│           ├── <image_name.jpg>
│           └── ...
└── labels
    ├── ins_seg
    │   ├── bitmasks
    │   │   ├── train
    │   │   │   ├── <image_name.png>
    │   │   │   └── ...
    │   │   └── val
    │   │       ├── <image_name.png>
    │   │       └── ...
    │   └── polygons
    │       ├── ins_seg_train.json
    │       └── ins_seg_val.json
    ├── pan_seg
    └── sem_seg
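The layout above can be turned into a small path helper. The following is a minimal sketch, not part of any official toolkit: the function names `image_path` and `ins_seg_bitmask_path` are hypothetical, and it assumes that a bitmask shares its file stem with the corresponding image, as the `<image_name.jpg>` / `<image_name.png>` placeholders in the tree suggest. Only the `train` and `val` splits are listed under `bitmasks`, so the sketch rejects other splits.

```python
from pathlib import PurePosixPath

# Splits that appear under labels/ins_seg/bitmasks in the tree above.
LABELED_SPLITS = ("train", "val")


def image_path(root: str, split: str, image_name: str) -> PurePosixPath:
    """Path of an image, e.g. BDD100K_10K/images/10k/train/<image_name.jpg>."""
    return PurePosixPath(root) / "images" / "10k" / split / image_name


def ins_seg_bitmask_path(root: str, split: str, image_name: str) -> PurePosixPath:
    """Path of the instance segmentation bitmask for an image.

    Assumption: the mask has the same stem as the image, with a .png suffix.
    """
    if split not in LABELED_SPLITS:
        raise ValueError(f"no bitmasks listed for split {split!r}")
    stem = PurePosixPath(image_name).stem
    return (
        PurePosixPath(root) / "labels" / "ins_seg" / "bitmasks" / split / f"{stem}.png"
    )
```

For example, `ins_seg_bitmask_path("BDD100K_10K", "val", "example.jpg")` points at `BDD100K_10K/labels/ins_seg/bitmasks/val/example.png`, where `example.jpg` is a made-up file name.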

Label Format

We provide labels in both JSON and bitmask formats. In JSON, the masks are stored as poly2ds.

The semantic masks store the ground truth of each image in a one-channel PNG (8 bits per pixel). The value of each pixel represents its category; 255 usually means “ignore”.

For instance masks and panoptic masks, the labels for each image are stored in an RGBA PNG file. The first byte, R, holds the category ID, which starts from 1 (0 is used for the background). G holds the instance attributes. Four attributes are currently used: “truncated”, “occluded”, “crowd”, and “ignore”. Note that boxes with “crowd” or “ignore” labels will not be considered during testing. These four attributes are stored in the least significant bits of G, so that G = (truncated << 3) + (occluded << 2) + (crowd << 1) + ignore. Finally, the B and A channels store the instance ID, which can be computed as (B << 8) + A.
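The RGBA channel layout described above can be illustrated with a per-pixel decoder. This is a minimal sketch in plain Python, not an official reader: the names `PixelLabel`, `decode_pixel`, and `encode_pixel` are hypothetical, and the four channel values are assumed to be integers in the range 0 to 255, already read from the PNG by an image library of your choice.

```python
from typing import NamedTuple, Tuple


class PixelLabel(NamedTuple):
    category_id: int   # R channel; 0 is the background
    truncated: bool    # bit 3 of G
    occluded: bool     # bit 2 of G
    crowd: bool        # bit 1 of G
    ignore: bool       # bit 0 of G
    instance_id: int   # (B << 8) + A


def decode_pixel(r: int, g: int, b: int, a: int) -> PixelLabel:
    """Unpack one RGBA pixel of an instance or panoptic bitmask."""
    return PixelLabel(
        category_id=r,
        truncated=bool((g >> 3) & 1),
        occluded=bool((g >> 2) & 1),
        crowd=bool((g >> 1) & 1),
        ignore=bool(g & 1),
        instance_id=(b << 8) + a,
    )


def encode_pixel(label: PixelLabel) -> Tuple[int, int, int, int]:
    """Inverse of decode_pixel: pack a label back into (R, G, B, A)."""
    g = (
        (int(label.truncated) << 3)
        + (int(label.occluded) << 2)
        + (int(label.crowd) << 1)
        + int(label.ignore)
    )
    # High byte of the instance ID goes to B, low byte to A.
    return (label.category_id, g, label.instance_id >> 8, label.instance_id & 0xFF)
```

As a usage example, a pixel with values (3, 6, 1, 2) decodes to category 3, with “occluded” and “crowd” set (6 = 0b0110) and instance ID (1 << 8) + 2 = 258. The same bit operations carry over to whole-image arrays if the mask is loaded into an array library.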

Citation

@InProceedings{bdd100k,
    author = {Yu, Fisher and Chen, Haofeng and Wang, Xin and Xian, Wenqi and Chen, Yingying and Liu, Fangchen and Madhavan, Vashisht and Darrell, Trevor},
    title = {BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning},
    booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2020}
}

Many thanks to Hello Dataset for contributing the dataset.
Basic Information

Application Scenarios: Autonomous Driving
Annotations: PanopticMask, Polygon, SemanticMask, InstanceMask
Tasks: Not Available
License: Custom
Updated on: 2022-02-11 08:12:51

Metadata

Data Type: Image
Data Volume: 10,000
Annotation Amount: 8,000
File Size: 1.05GB
Copyright Owner: UC Berkeley
Annotator: Unknown