Crowd Instance-level Human Parsing


The Crowd Instance-level Human Parsing (CIHP) dataset contains 38,280 multi-person images with detailed annotations and high variability and complexity in appearance. The dataset can be used for the human part segmentation task.


The images in CIHP are collected from unconstrained sources such as Google and Bing. We manually specify several keywords (e.g. family, couple, party, meeting) to obtain a diverse set of multi-person images. The crawled images are annotated in detail by a professional labeling organization with good quality control. We supervise the whole annotation process and conduct a second-round check for each annotated image. We remove unusable images that have low resolution or poor image quality, or that contain one or no person instance. In total, 38,280 images are kept to construct the CIHP dataset. After random selection, the dataset is divided into a single fixed split of 28,280 training and 5,000 validation images with publicly available annotations, plus 5,000 test images whose annotations are withheld for benchmarking purposes.
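The split sizes quoted above can be checked with a few lines of arithmetic. This is only an illustrative sketch; the dictionary name `SPLITS` and its keys are assumptions, while the numbers are taken from the description.

```python
# Split sizes as stated in the dataset description.
SPLITS = {"train": 28_280, "val": 5_000, "test": 5_000}

total = sum(SPLITS.values())
# Only training and validation annotations are public; test ones are withheld.
annotated = SPLITS["train"] + SPLITS["val"]

print(total)      # 38280
print(annotated)  # 33280
```

The annotated count agrees with the "Annotation Amount" listed under Basic Information.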


All images of the CIHP dataset contain two or more person instances, with an average of 3.4 per image. The distribution of the number of persons per image is illustrated below.

Generally, we follow LIP to define and annotate the semantic part labels. However, we find that the Jumpsuit label defined in LIP is infrequent compared to other labels. To parse the human body more completely and precisely, we replace it with a more common body part label, Torso-skin. The 19 semantic part labels in CIHP are Hat, Hair, Sunglasses, Upper-clothes, Dress, Coat, Socks, Pants, Gloves, Scarf, Skirt, Torso-skin, Face, Right/Left arm, Right/Left leg, and Right/Left shoe. The number of images for each semantic part label is presented below.
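As a quick sanity check on the label count, the list above can be expanded into individual class names. This is a minimal sketch: the names come from the text, but the ordering and the helper `expand_labels` are assumptions for illustration, and nothing here is meant to reflect the numeric IDs stored in the mask files.

```python
# Part labels as listed in the dataset description. The order follows the
# text only; it is NOT assumed to match the numeric IDs used in the masks.
SINGLE_LABELS = [
    "Hat", "Hair", "Sunglasses", "Upper-clothes", "Dress", "Coat", "Socks",
    "Pants", "Gloves", "Scarf", "Skirt", "Torso-skin", "Face",
]
PAIRED_LABELS = ["arm", "leg", "shoe"]  # each has a Right and a Left variant


def expand_labels(single, paired):
    """Return the flat list of part labels (hypothetical helper)."""
    labels = list(single)
    for part in paired:
        labels.append(f"Right {part}")
        labels.append(f"Left {part}")
    return labels


CIHP_LABELS = expand_labels(SINGLE_LABELS, PAIRED_LABELS)
print(len(CIHP_LABELS))  # 19
```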


Folder Structure

├── Testing
│   ├── test_id.txt
│   └── Images
│       ├── <image_name.jpg>
│       └── ...
├── Training
│   ├── Images
│   │   ├── <image_name.jpg>
│   │   └── ...
│   ├── Category_ids
│   │   ├── <mask_name.jpg>
│   │   └── ...
│   ├── Instance_ids
│   │   ├── <mask_name.jpg>
│   │   └── ...
│   └── train_id.txt
└── Validation
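
Given the layout above, the training files could be paired up roughly as follows. This is a sketch under stated assumptions, not the dataset's official loader: it assumes each line of train_id.txt is a file stem shared by the image and both masks, and that the extensions match the listing. The function names `read_ids` and `list_training_samples` and the `image_ext`/`mask_ext` parameters are hypothetical.

```python
from pathlib import Path


def read_ids(id_file):
    """Read one ID per line, skipping blank lines."""
    with open(id_file, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def list_training_samples(root, image_ext=".jpg", mask_ext=".jpg"):
    """Pair each training image with its category and instance masks.

    Assumption: IDs in train_id.txt are file stems shared by all three
    folders. Adjust the extensions if your copy stores files differently.
    """
    train_dir = Path(root) / "Training"
    samples = []
    for image_id in read_ids(train_dir / "train_id.txt"):
        samples.append({
            "image": train_dir / "Images" / f"{image_id}{image_ext}",
            "category": train_dir / "Category_ids" / f"{image_id}{mask_ext}",
            "instance": train_dir / "Instance_ids" / f"{image_id}{mask_ext}",
        })
    return samples
```

The function only builds paths and does not open any image, so it can be combined with whichever image library is preferred for the actual decoding.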


@InProceedings{Gong_2018_ECCV,
  author    = {Gong, Ke and Liang, Xiaodan and Li, Yicheng and Chen, Yimin and Yang, Ming and Lin, Liang},
  title     = {Instance-level Human Parsing via Part Grouping Network},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  month     = {September},
  year      = {2018}
}

🎉Many thanks to Hello Dataset for contributing the dataset
Basic Information
Application Scenarios: Person
Tasks: Not Available
Updated on: 2022-02-11 08:13:47
Data Type: Image
Data Volume: 38,280
Annotation Amount: 33,280
File Size: 1.11 GB
Copyright Owner: Xiaodan Liang