Introduction to the Pascal VOC Dataset

爱音乐的程序员小新人 08-16 133


This post introduces the Pascal VOC dataset:

* Challenge and tasks (only the Detection and Segmentation parts are covered)

* Data format

* Evaluation metrics

* voc2007, voc2012

Challenge and tasks

Given natural images, the goal is to recognize specific objects in them.

There are 20 object classes to recognize:

* person

* bird, cat, cow, dog, horse, sheep

* aeroplane, bicycle, boat, bus, car, motorbike, train

* bottle, chair, dining table, potted plant, sofa, tv/monitor

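The challenge consists of the following tasks:

* Classification (skipped here)

* Detection: mark every target object in the image with a bounding box (bbox)

* Segmentation: segment out every target object in the image

* Person Layout (skipped here)

The rest of this post covers only the Detection and Segmentation tasks.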

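Dataset

* All annotated images have the labels needed for Detection, but only a subset have Segmentation labels.

* VOC2007 contains 9963 annotated images, split into train/val/test, with 24,640 annotated objects in total.

* The labels for the VOC2007 test set have been released. For later years they have not (only the images are available, without labels).

* For the detection task, the VOC2012 trainval/test sets contain all corresponding images from 2008-2011. trainval has 11540 images with 27450 objects.

* For the segmentation task, the VOC2012 trainval set contains all corresponding images from 2007-2011, while test contains only 2008-2011. trainval has 2913 images with 6929 objects.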

Detection Ground Truth and Evaluation

Ground truth

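The ground truth for each image is stored in an XML annotation file. An excerpt, with omitted lines shown as ...:

<annotation>
    ...
    <source>
        <database>The VOC2007 Database</database>
        <annotation>PASCAL VOC2007</annotation>
        <image>flickr</image>
        ...
    </source>
    <owner>
        <flickrid>dictioncanary</flickrid>
        ...
    </owner>
    <size>
        ...
    </size>
    ...
    <object>
        ...
        <pose>Unspecified</pose>
        ...
        <bndbox>
            ...
        </bndbox>
    </object>
</annotation>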
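As a rough illustration of how such an annotation could be read, the sketch below parses a small hand-written annotation with Python's standard `xml.etree.ElementTree`. The sample XML string, its values, and the helper name `parse_voc_objects` are made up for illustration; the tag names (`object`, `name`, `bndbox`, `xmin`, `ymin`, `xmax`, `ymax`) are assumed to follow the usual VOC annotation layout.

```python
import xml.etree.ElementTree as ET

# Hand-written sample for illustration only; the values are not from the dataset.
SAMPLE_XML = """
<annotation>
    <size>
        <width>500</width>
        <height>375</height>
        <depth>3</depth>
    </size>
    <object>
        <name>car</name>
        <pose>Unspecified</pose>
        <bndbox>
            <xmin>89</xmin>
            <ymin>112</ymin>
            <xmax>416</xmax>
            <ymax>366</ymax>
        </bndbox>
    </object>
</annotation>
"""


def parse_voc_objects(xml_text):
    """Return a list of (class_name, (xmin, ymin, xmax, ymax)) tuples."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.findall("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        coords = tuple(
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append((name, coords))
    return objects


print(parse_voc_objects(SAMPLE_XML))
```

For a real annotation file, `ET.parse(path).getroot()` could replace `ET.fromstring`.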


Evaluation

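Submitted results are stored in a text file, one detection per line, in the following format:

<image identifier> <confidence> <left> <top> <right> <bottom>

For example: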

comp3_det_test_car.txt:

000004 0.702732 89 112 516 466

000006 0.870849 373 168 488 229

000006 0.852346 407 157 500 213

000006 0.914587 2 161 55 221

000008 0.532489 175 184 232 201
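To make the line format concrete, here is a minimal sketch that splits such lines into typed fields. It assumes the fields are whitespace-separated in the order image id, confidence, left, top, right, bottom, as the example rows suggest; `parse_detection_line` is a hypothetical helper name.

```python
def parse_detection_line(line):
    """Split one result line into (image_id, confidence, (left, top, right, bottom))."""
    image_id, confidence, *box = line.split()
    return image_id, float(confidence), tuple(float(v) for v in box)


lines = [
    "000004 0.702732 89 112 516 466",
    "000006 0.870849 373 168 488 229",
]
detections = [parse_detection_line(text) for text in lines]

# Sort by descending confidence, the order needed for the precision/recall computation.
detections.sort(key=lambda d: d[1], reverse=True)
print(detections[0])
```

The image id is kept as a string so that leading zeros such as `000004` survive.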

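The confidence is used to compute the average precision (AP) of each class, and the mean of these over all classes is the mean average precision (mAP). The procedure is roughly as follows:

* Sort the results by confidence and compute the precision and recall for the top-1, 2, …, N results.

* Divide recall into n thresholds t in [t1, ..., tn].

* For each t, find the maximum precision among the points with recall >= t.

* This gives n maximum precisions; their average is the AP.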
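The sorting and the cumulative precision/recall computation described above can be sketched in plain Python. The inputs are assumptions for illustration: `confidences` holds the detection scores, `is_tp` says whether each detection was judged a true positive, and `num_gt` is the number of ground-truth objects of the class. The function name is hypothetical.

```python
def precision_recall_curve(confidences, is_tp, num_gt):
    """Precision and recall after each of the top-1, 2, ..., N detections."""
    # Indices of the detections, highest confidence first.
    order = sorted(range(len(confidences)), key=lambda i: confidences[i], reverse=True)
    precision, recall = [], []
    tp = 0
    for rank, i in enumerate(order, start=1):
        if is_tp[i]:
            tp += 1
        precision.append(tp / rank)   # fraction of the top-`rank` detections that are correct
        recall.append(tp / num_gt)    # fraction of ground-truth objects found so far
    return precision, recall


p, r = precision_recall_curve(
    confidences=[0.9, 0.6, 0.8, 0.3],
    is_tp=[True, False, False, True],
    num_gt=2,
)
print(p)
print(r)
```

The two returned lists have one entry per detection and can be fed directly into an AP computation.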

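import numpy as np
import tensorflow as tf

# The function name and signature are assumed; the original snippet omitted the def line.
def average_precision_voc07(precision, recall):
    # Append a sentinel (precision 0 at recall +inf) so that no mask below is empty.
    precision = tf.concat([tf.cast(precision, tf.float64), tf.constant([0.], tf.float64)], axis=0)
    recall = tf.concat([tf.cast(recall, tf.float64), tf.constant([np.inf], tf.float64)], axis=0)
    aps = []
    for t in np.arange(0., 1.1, 0.1):  # 11 recall thresholds: 0, 0.1, ..., 1.0
        # Maximum precision among all points with recall >= t
        mask = tf.greater_equal(recall, t)
        v = tf.reduce_max(tf.boolean_mask(precision, mask))
        aps.append(v / 11.)
    # Each term is already divided by 11, so the sum is the average
    ap = tf.add_n(aps)
    return ap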


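The code above implements the VOC07 method. VOC2010 changed how the recall levels are chosen: if there are M positive examples, recall is evaluated at [1/M, 2/M, ..., M/M] instead of at 11 fixed thresholds. The remaining steps are unchanged.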
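As a sketch of the VOC2010-style variant, the pure-Python function below takes the maximum precision at each recall level k/M and averages over the M levels. It expects the `precision` and `recall` lists at ranks 1..N as input. The function name and the toy inputs are assumptions for illustration and are not taken from the official development kit.

```python
def average_precision_all_points(precision, recall, num_positives):
    """Mean over recall levels k/M of the maximum precision at recall >= k/M."""
    total = 0.0
    for k in range(1, num_positives + 1):
        t = k / num_positives
        candidates = [p for p, r in zip(precision, recall) if r >= t]
        # A recall level that is never reached contributes a precision of 0.
        total += max(candidates) if candidates else 0.0
    return total / num_positives


# Toy curve: 4 detections, 2 positives (M = 2).
precision = [1.0, 0.5, 1 / 3, 0.5]
recall = [0.5, 0.5, 0.5, 1.0]
ap = average_precision_all_points(precision, recall, num_positives=2)
print(ap)
```

Here the level 1/2 sees a best precision of 1.0 and the level 2/2 sees 0.5, so the result is their mean.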

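If a predicted bbox has an IoU greater than 0.5 with a ground truth bbox and the class is the same, it is a True Positive; otherwise it is a False Positive. Each ground truth bbox can have only one true positive; all other detections matched to it count as false positives.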
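The matching rule can be illustrated with a short sketch. Boxes are assumed to be `(left, top, right, bottom)` tuples, all detections are assumed to belong to a single class and to be sorted by descending confidence, and the helper names are hypothetical. Areas are computed as plain width times height; some implementations add 1 to each side to treat coordinates as inclusive pixel indices.

```python
def iou(a, b):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0  # the boxes do not overlap
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def label_detections(detections, gt_boxes, threshold=0.5):
    """Mark each detection as a true positive (True) or false positive (False)."""
    matched = [False] * len(gt_boxes)
    labels = []
    for det in detections:
        overlaps = [iou(det, gt) for gt in gt_boxes]
        best = max(range(len(gt_boxes)), key=lambda j: overlaps[j], default=None)
        if best is not None and overlaps[best] > threshold and not matched[best]:
            matched[best] = True   # first detection to claim this ground truth
            labels.append(True)
        else:
            labels.append(False)   # low overlap, or ground truth already claimed
    return labels


gt = [(0, 0, 10, 10)]
dets = [(0, 0, 10, 9), (1, 0, 10, 10), (20, 20, 30, 30)]
print(label_detections(dets, gt))
```

The second detection overlaps the ground truth well but arrives after it has been claimed, so it is counted as a false positive, as is the third, which does not overlap at all.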