【Detectron】(1) A First Look at the Models Detectron Uses (the configs directory)


For installing Detectron with Docker, see my earlier article: FAIR Detectron (the official Mask R-CNN release) Docker installation



1. Common Settings and Notes

1. All baselines were run on Big Basin servers with 8 NVIDIA Tesla P100 GPU accelerators (with 16GB GPU memory, CUDA 8.0, and cuDNN 6.0.21).
2. All baselines were trained using 8 GPU data parallel sync SGD with a minibatch size of either 8 or 16 images (see the im/gpu column).
3. For training, only horizontal flipping data augmentation was used.
4. For inference, no test-time augmentations (e.g., multiple scales, flipping) were used.
5. All models were trained on the union of coco_2014_train and coco_2014_valminusminival, which is exactly equivalent to the recently defined coco_2017_train dataset.
6. All models were tested on the coco_2014_minival dataset, which is exactly equivalent to the recently defined coco_2017_val dataset.
7. Inference times are often expressed as "X + Y", in which X is time taken in reasonably well-optimized GPU code and Y is time taken in unoptimized CPU code. (The CPU code time could be reduced substantially with additional engineering.)
8. Inference results for boxes, masks, and keypoints ("kps") are provided in the COCO json format.
9. The model id column is provided for ease of reference.
10. To check downloaded file integrity: for any download URL on this page, simply append .md5sum to the URL to download the file's md5 hash.
11. All models and results below are on the COCO dataset.
12. Baseline models and results for the Cityscapes dataset are coming soon!
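The integrity check in point 10 is easy to script. A minimal sketch in Python, assuming only that the server publishes the digest at `<url>.md5sum` as stated (the helper names are mine, not part of Detectron):

```python
import hashlib
import urllib.request

def file_md5(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a local file, streaming in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_download(url, local_path):
    """Fetch `<url>.md5sum` and compare it against the local file's digest."""
    expected = urllib.request.urlopen(url + ".md5sum").read().decode().split()[0]
    return file_md5(local_path) == expected
```

For example, after downloading a model checkpoint, call `verify_download(model_url, "model_final.pkl")` with the same URL used for the download.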


The configs directory contains the model configurations used to train Faster R-CNN, Mask R-CNN, and the other baselines. Taking the 12_2017_baselines folder as an example, its files are shown in the figure below:

2. Training Schedules


1x: For minibatch size 16, this schedule starts at a LR of 0.02, which is multiplied by 0.1 after 60k and 80k iterations, and finally terminates at 90k iterations. This schedule results in 12.17 epochs over the 118,287 images in coco_2014_train union coco_2014_valminusminival (or, equivalently, coco_2017_train).
2x: Twice as long as the 1x schedule, with the LR change points scaled proportionally.
s1x ("stretched 1x"): This schedule scales the 1x schedule by roughly 1.44x, but also extends the duration of the first learning rate. With a minibatch size of 16, it multiplies the LR by 0.1 at 100k and 120k iterations, finally ending after 130k iterations.


All training schedules also use a 500-iteration linear learning-rate warm-up. When the minibatch size is changed between 8 and 16 images, the SGD iteration schedule and base learning rate are adjusted according to the linear scaling rule from the paper Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour.
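The 1x schedule and the linear scaling rule can be sketched as below. The function names are illustrative rather than Detectron's actual API, and the warm-up starting factor of 1/3 is my assumption, not a value quoted above:

```python
def lr_at_iter(it, base_lr=0.02, steps=(60000, 80000), gamma=0.1,
               warmup_iters=500, warmup_start_factor=1.0 / 3):
    """Return the LR at iteration `it` under the 1x schedule (minibatch 16)."""
    # Linear warm-up over the first `warmup_iters` iterations.
    if it < warmup_iters:
        alpha = it / warmup_iters
        return base_lr * (warmup_start_factor * (1 - alpha) + alpha)
    # Step decay: multiply by gamma after each milestone.
    lr = base_lr
    for step in steps:
        if it >= step:
            lr *= gamma
    return lr

def scale_schedule(base_lr, steps, max_iter, ims_per_batch, ref_batch=16):
    """Linear scaling rule (Goyal et al.): halving the minibatch from 16 to 8
    halves the base LR and doubles all iteration counts."""
    scale = ims_per_batch / ref_batch
    return (base_lr * scale,
            tuple(int(s / scale) for s in steps),
            int(max_iter / scale))
```

The 12.17-epoch figure quoted above follows from the same numbers: 90,000 iterations × 16 images per minibatch = 1,440,000 images seen, and 1,440,000 / 118,287 ≈ 12.17 passes over the training set.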

3. Detectron ImageNet Pretrained Models


  • R-50.pkl: converted copy of MSRA’s original ResNet-50 model
  • R-101.pkl: converted copy of MSRA’s original ResNet-101 model
  • X-101-64x4d.pkl: converted copy of FB’s original ResNeXt-101-64x4d model trained with Torch7
  • X-101-32x8d.pkl: ResNeXt-101-32x8d model trained with Caffe2 at FB
  • X-152-32x8d-IN5k.pkl: ResNeXt-152-32x8d model trained on ImageNet-5k with Caffe2 at FB (see our ResNeXt paper for details on ImageNet-5k)
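These .pkl files are plain Python pickles mapping parameter (blob) names to numpy arrays. A small sketch for inspecting one, assuming the common layout (either a flat dict, or a dict wrapped under a "blobs" key; the exact top-level layout can vary between converted models):

```python
import pickle

def load_blobs(path):
    """Load a converted checkpoint and return its blob-name -> array dict."""
    with open(path, "rb") as f:
        # These are Python-2-era pickles; latin1 decoding avoids str errors.
        data = pickle.load(f, encoding="latin1")
    # Unwrap a {"blobs": {...}} container if present, else use the dict as-is.
    return data.get("blobs", data) if isinstance(data, dict) else data

# Example usage: list a few parameter names and shapes.
# blobs = load_blobs("R-50.pkl")
# for name, arr in list(blobs.items())[:5]:
#     print(name, getattr(arr, "shape", None))
```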

4. Detection Baselines for Proposals, Boxes, and Masks

RPN Proposal Baselines


  • The models in the table, along with the proposal links "1", "2", and "3", must be downloaded from the official model zoo page;
  • Inference time is the time taken to generate the RPN proposals;
  • prop. AR is the proposal average recall (AR) at 1000 proposals per image;
  • Proposal download links ("props"): "1" is coco_2014_train; "2" is coco_2014_valminusminival; and "3" is coco_2014_minival.

5. Fast & Mask R-CNN Baselines Using Precomputed RPN Proposals


  • The precomputed RPN proposals used in each row correspond to the entries in the RPN Proposal Baselines table above;
  • Inference time does not include proposal generation.

6. End-to-End Faster & Mask R-CNN Baselines


  • In these models, the RPN and the detector are trained jointly and end to end;
  • Inference time is measured from image to final detections, and includes proposal generation.


Edited 2018-05-25