@YimianDai
Created July 30, 2019 00:42
SNIPER

Object Detection - SNIPER: Efficient Multi-Scale Training - Paper Notes

SNIPER:

  1. adaptively samples chips from multiple scales of an image pyramid, conditioned on the image content.

  2. We sample positive chips conditioned on the ground-truth instances and negative chips based on proposals generated by a region proposal network.

    • Surprisingly, the negative chips are generated with an RPN, which is a neat trick.
  3. R-CNN is scale invariant (with the assumption that CNNs can classify images of a fixed resolution).

    • Presumably because every proposal is resized to a canonical 224x224 image.
  4. Fast-RCNN, by contrast, is no longer scale invariant.

    • However, convolution for objects of different sizes is performed at a single scale, which destroys the scale invariance properties of R-CNN
    • R-CNN is scale invariant because it resizes every object proposal to the same 224x224 size regardless of its original size, forcing all objects to a single resolution and scale. Fast-RCNN does not do this.
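The resizing argument above can be made concrete with a little arithmetic. The following is a minimal sketch, not code from the paper: `warp_factors` and the two example proposals are hypothetical names and values chosen for illustration. It only computes the scale factor each proposal would undergo when warped to the canonical 224x224 size.

```python
CANONICAL = 224  # canonical side length mentioned in the notes


def warp_factors(box, canonical=CANONICAL):
    """Return the (x, y) scale factors needed to warp a box to canonical x canonical.

    box is (x1, y1, x2, y2) in pixels. Hypothetical helper for illustration only.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    if w <= 0 or h <= 0:
        raise ValueError("box must have positive width and height")
    return canonical / w, canonical / h


# Hypothetical proposals: one small (56 px) and one large (448 px).
proposals = [(0, 0, 56, 56), (100, 100, 548, 548)]
for box in proposals:
    print(box, "->", warp_factors(box))
```

The small proposal is upsampled and the large one is downsampled, so both reach the classifier at the same resolution. That per-proposal normalization is what the notes mean by scale invariance.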

Drawbacks of Fast-RCNN:

  1. In multi-scale training, Fast-RCNN upsamples and downsamples every proposal (whether small or big) in the image. As a result, objects that are already large get upsampled into extremely large objects, and objects that are already small get downsampled into extremely small objects.
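To illustrate the drawback, here is a small sketch of what happens when the whole image is rescaled. The scale factors and object sizes are made-up values, and `object_sizes_across_scales` is a hypothetical helper, not something from the paper or any library.

```python
def object_sizes_across_scales(object_side, scales):
    """Map each image scale factor to the resulting object side length in pixels."""
    return {s: object_side * s for s in scales}


scales = [0.5, 1.0, 2.0]  # assumed image pyramid scales, for illustration only

# A small (16 px) and a large (400 px) object in the same image.
for side in (16, 400):
    print(side, object_sizes_across_scales(side, scales))
```

Because every object in the image shares the same scale factor, the 400 px object becomes 800 px at the upsampled scale and the 16 px object becomes 8 px at the downsampled scale, which are exactly the extreme cases described above.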

Advantages of R-CNN:

  1. each proposal is resized to a canonical size of 224x224 pixels. Large objects are not upsampled and small objects are not downsampled in R-CNN.

SNIPER wants both benefits: we propose SNIPER, which retains the benefits of both these approaches by generating scale specific context-regions (chips) that cover maximum proposals at a particular scale.
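One way to read "chips that cover maximum proposals" is as a greedy covering problem. The sketch below is an illustration under assumptions, not the authors' implementation: the function names, the 512 px chip size, the 32 px stride, and the greedy strategy are all choices made here for the example. It also assumes the boxes passed in have already been filtered to those valid for the current scale.

```python
def boxes_inside(chip, boxes):
    """Return indices of boxes fully contained in chip (all as x1, y1, x2, y2)."""
    cx1, cy1, cx2, cy2 = chip
    return {
        i
        for i, (x1, y1, x2, y2) in enumerate(boxes)
        if x1 >= cx1 and y1 >= cy1 and x2 <= cx2 and y2 <= cy2
    }


def greedy_chips(boxes, image_w, image_h, chip=512, stride=32):
    """Greedily pick grid-aligned chips until every coverable box is covered."""
    xs = range(0, max(image_w - chip, 0) + 1, stride)
    ys = range(0, max(image_h - chip, 0) + 1, stride)
    candidates = [(x, y, x + chip, y + chip) for y in ys for x in xs]
    coverage = [boxes_inside(c, boxes) for c in candidates]

    uncovered = set(range(len(boxes)))
    chosen = []
    while uncovered:
        best, best_cov = None, set()
        for cand, cov in zip(candidates, coverage):
            new = cov & uncovered
            if len(new) > len(best_cov):
                best, best_cov = cand, new
        if best is None:
            break  # the remaining boxes fit inside no candidate chip
        chosen.append(best)
        uncovered -= best_cov
    return chosen


# Hypothetical boxes in a 1024x1024 image at one scale.
boxes = [(10, 10, 50, 50), (60, 60, 100, 100), (600, 600, 700, 700)]
print(greedy_chips(boxes, 1024, 1024))
```

Only the selected chips would then be passed through the network, so nearby boxes share one forward pass while regions without any box are never processed.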

R-CNN more appropriately does not up/downsample every pixel in the image but only in those regions which are likely to contain objects to an appropriate resolution. However, R-CNN does not share the convolutional features for nearby proposals like Fast-RCNN, which makes it slow.

The two benefits are R-CNN's resampling of only the regions likely to contain objects to an appropriate resolution, and Fast R-CNN's sharing of convolutional features across nearby proposals, which makes it fast.
