The Crowd Instance-level Human Parsing (CIHP) dataset contains 38,280 multi-person images with elaborate annotations, high appearance variability, and high complexity. The dataset can be used for the human part segmentation task.
The images in CIHP are collected from unconstrained sources such as Google and Bing. We manually specify several keywords (e.g., family, couple, party, meeting) to obtain a diverse set of multi-person images. The crawled images are carefully annotated by a professional labeling organization under strict quality control. We supervise the whole annotation process and conduct a second-round check of each annotated image. We remove unusable images, namely those with low resolution or poor image quality and those containing one or no person instance. In total, 38,280 images are kept to construct the CIHP dataset. Through random selection, we arrive at a single split consisting of 28,280 training and 5,000 validation images with publicly available annotations, plus 5,000 test images whose annotations are withheld for benchmarking purposes.
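As a rough illustration of how a random selection can yield the split sizes stated above, the following sketch shuffles a list of image ids and cuts it into three parts. The function name `split_ids`, the seed, and the use of integer ids are assumptions for illustration only; this is not the procedure used to produce the official split, which should always be taken from the released id files.

```python
import random


def split_ids(image_ids, n_val=5000, n_test=5000, seed=0):
    """Shuffle ids with a fixed seed and cut them into train/val/test lists.

    Hypothetical helper: the seed and the cutting order are assumptions,
    not the procedure behind the official CIHP split.
    """
    ids = sorted(image_ids)    # sort first so the result does not depend on input order
    rng = random.Random(seed)  # local generator, leaves the global random state untouched
    rng.shuffle(ids)
    test = ids[:n_test]
    val = ids[n_test:n_test + n_val]
    train = ids[n_test + n_val:]
    return train, val, test


# 38,280 placeholder ids stand in for the real image names.
train, val, test = split_ids(range(38280))
print(len(train), len(val), len(test))  # prints: 28280 5000 5000
```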
Every image in the CIHP dataset contains two or more person instances, with an average of 3.4 per image. The distribution of the number of persons per image is illustrated below.
Generally, we follow LIP to define and annotate the semantic part labels. However, we find that the Jumpsuit label defined in LIP is infrequent compared to other labels. To parse the human body more completely and precisely, we replace it with a more common body part label, Torso-skin. The 19 semantic part labels in CIHP are Hat, Hair, Sunglasses, Upper-clothes, Dress, Coat, Socks, Pants, Gloves, Scarf, Skirt, Torso-skin, Face, Right/Left arm, Right/Left leg, and Right/Left shoe. The number of images for each semantic part label is presented below.
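For convenience, the label list can be written out in code. The sketch below expands the Right/Left pairs and confirms that the total is 19. The names `SINGLE_LABELS`, `PAIRED_PARTS`, and `cihp_part_labels` are hypothetical, and the order simply follows the sentence above; it is an assumption that should not be read as the integer ids stored in the annotation masks.

```python
# Labels listed individually in the text (order follows the text, not mask ids).
SINGLE_LABELS = [
    "Hat", "Hair", "Sunglasses", "Upper-clothes", "Dress", "Coat", "Socks",
    "Pants", "Gloves", "Scarf", "Skirt", "Torso-skin", "Face",
]

# Parts that the text lists as Right/Left pairs.
PAIRED_PARTS = ["arm", "leg", "shoe"]


def cihp_part_labels():
    """Return the semantic part labels with Right/Left pairs expanded."""
    labels = list(SINGLE_LABELS)
    for part in PAIRED_PARTS:
        labels.append(f"Right {part}")
        labels.append(f"Left {part}")
    return labels


print(len(cihp_part_labels()))  # prints: 19
```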
CIHP
├── Testing
│   ├── test_id.txt
│   └── Images
│       ├── <image_name.jpg>
│       └── ...
├── Training
│   ├── Images
│   │   ├── <image_name.jpg>
│   │   └── ...
│   ├── Category_ids
│   │   ├── <mask_name.jpg>
│   │   └── ...
│   ├── Instance_ids
│   │   ├── <mask_name.jpg>
│   │   └── ...
│   └── train_id.txt
└── Validation
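A minimal sketch of how the training part of this layout could be enumerated is shown below. It assumes that `train_id.txt` holds one image id per line without a file extension, and that each image and its two masks share that id as their file name stem; neither detail is stated above, so both should be checked against the downloaded files. The helper names `find_by_stem` and `list_training_samples` are hypothetical. Files are matched by stem only, so no file extension is assumed.

```python
import glob
from pathlib import Path


def find_by_stem(folder, stem):
    """Return the first file in `folder` whose name without extension equals `stem`, or None."""
    pattern = glob.escape(stem) + ".*"  # escape so ids with glob characters match literally
    matches = sorted(p for p in Path(folder).glob(pattern) if p.stem == stem)
    return matches[0] if matches else None


def list_training_samples(root):
    """Pair every id in Training/train_id.txt with its image and mask paths.

    Assumes one id per line in train_id.txt (an assumption, see lead-in).
    Missing files are reported as None rather than raising an error.
    """
    train_dir = Path(root) / "Training"
    id_lines = (train_dir / "train_id.txt").read_text().splitlines()
    ids = [line.strip() for line in id_lines if line.strip()]
    samples = []
    for image_id in ids:
        samples.append({
            "id": image_id,
            "image": find_by_stem(train_dir / "Images", image_id),
            "category_mask": find_by_stem(train_dir / "Category_ids", image_id),
            "instance_mask": find_by_stem(train_dir / "Instance_ids", image_id),
        })
    return samples
```

Calling `list_training_samples("CIHP")` on an extracted copy of the dataset would then return one dictionary per training id, with `None` for any file that could not be located under these assumptions.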
@InProceedings{Gong_2018_ECCV,
  author    = {Gong, Ke and Liang, Xiaodan and Li, Yicheng and Chen, Yimin and Yang, Ming and Lin, Liang},
  title     = {Instance-level Human Parsing via Part Grouping Network},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  month     = {September},
  year      = {2018}
}