We have created a dataset consisting of 102 flower categories. The flowers were chosen to be ones commonly occurring in the United Kingdom. Each class consists of between 40 and 258 images. The details of the categories and the number of images for each class can be found on this category statistics page.
The images have large scale, pose and light variations. In addition, there are categories that have large variations within the category and several very similar categories. The dataset is visualized using isomap with shape and colour features.
We visualize the categories in the dataset using SIFT features as shape descriptors and HSV as the colour descriptor. The images shown are randomly sampled from each category.
The images are contained in the file 102flowers.tgz and the image labels in imagelabels.mat.
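As a rough illustration of how these two files might be read, here is a minimal Python sketch. It rests on several assumptions that are not stated above: that SciPy is available for reading the .mat file, that the labels are stored under a variable named `labels`, and that the archive members of interest end in `.jpg`. The function names are hypothetical helpers, not part of the dataset.

```python
import tarfile
from collections import Counter


def list_image_names(archive_path="102flowers.tgz"):
    """Return the sorted names of the .jpg members in the image archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.lower().endswith(".jpg")
        )


def load_labels(mat_path="imagelabels.mat", variable="labels"):
    """Read the per-image class labels from the MATLAB file.

    Assumption: the labels live under the variable name `labels`;
    pass a different name if the file is organised otherwise.
    """
    # Third-party dependency, imported here so the other helpers
    # can be used without SciPy installed.
    from scipy.io import loadmat

    return [int(value) for value in loadmat(mat_path)[variable].ravel()]


def images_per_class(labels):
    """Count how many images carry each class label."""
    return dict(Counter(labels))
```

If the sorted file names follow the same order as the labels (again an assumption), `zip(list_image_names(), load_labels())` would pair each image with its class, and `images_per_class(load_labels())` could be compared against the 40 to 258 images per class quoted above.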
We provide four distance matrices: D_hsv, D_hog, D_siftint and D_siftbdy. These are the chi^2 distance matrices used in the publication: Automated flower classification over a large number of classes.
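For readers unfamiliar with the chi^2 distance, the sketch below shows one common definition for two non-negative histograms, half the sum of (a - b)^2 / (a + b) over the bins, and how a pairwise distance matrix can be built from it. The factor of one half and the handling of empty bins are assumptions; the exact normalisation used to produce the provided matrices is not stated here, and the function names are hypothetical.

```python
def chi2_distance(p, q):
    """Chi-squared distance between two non-negative histograms.

    Uses 0.5 * sum((a - b)^2 / (a + b)); this normalisation is an
    assumption and may differ from the one used for the provided matrices.
    """
    total = 0.0
    for a, b in zip(p, q):
        denom = a + b
        if denom > 0:  # bins that are empty in both histograms contribute nothing
            total += (a - b) ** 2 / denom
    return 0.5 * total


def chi2_distance_matrix(histograms):
    """Symmetric matrix of pairwise chi-squared distances."""
    n = len(histograms)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = chi2_distance(histograms[i], histograms[j])
            dist[i][j] = dist[j][i] = d
    return dist
```

With this definition, two normalised histograms with no overlapping mass are at distance 1 and identical histograms are at distance 0.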
The data splits used in this paper are specified in setid.mat.
The results in the paper were produced on a 103-category database. The two categories labeled Petunia have since been merged because they are the same.
The splits comprise a training set (trnid), a validation set (valid) and a test set (tstid).
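A minimal sketch of reading the three splits in Python follows. It assumes that SciPy is available, that `trnid`, `valid` and `tstid` are stored as variables inside setid.mat, and that the ids are 1-based image numbers as is usual for MATLAB data. The helper names are hypothetical.

```python
SPLIT_NAMES = ("trnid", "valid", "tstid")


def load_splits(mat_path="setid.mat"):
    """Read the three id lists from the MATLAB file.

    Assumption: each split is stored under its own variable name.
    """
    # Third-party dependency, imported here so the other helpers
    # can be used without SciPy installed.
    from scipy.io import loadmat

    data = loadmat(mat_path)
    return {name: [int(i) for i in data[name].ravel()] for name in SPLIT_NAMES}


def to_zero_based(ids):
    """Convert 1-based ids (assumed MATLAB convention) to 0-based indices."""
    return [i - 1 for i in ids]


def splits_are_disjoint(splits):
    """Return True if no image id appears in more than one split."""
    seen = set()
    for ids in splits.values():
        current = set(ids)
        if seen & current:
            return False
        seen |= current
    return True
```

A quick sanity check after loading would be `splits_are_disjoint(load_splits())`, followed by `to_zero_based` on each list before indexing into Python sequences of images or labels.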
We provide the segmentations for the images in the file 102segmentations.tgz.
More details can be found in: Delving into the whorl of flower segmentation.
Please use the following citation when referencing the dataset:
@InProceedings{Nilsback08,
author = "Maria-Elena Nilsback and Andrew Zisserman",
title = "Automated Flower Classification over a Large Number of Classes",
booktitle = "Indian Conference on Computer Vision, Graphics and Image Processing",
month = "Dec",
year = "2008",
}