Anchor-based object detection methods [26, 38] detect objects by classifying and regressing a series of pre-set anchors.
Why does the performance of anchor-based detection methods decrease sharply as objects become smaller?
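- Anchor-based detection frameworks tend to miss small and medium faces. (Either too few features survive, so the prediction is poor; or the face is never labeled as a positive in the first place, so during training it never even gets the chance to be predicted as positive.)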
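- Firstly, the stride of the lowest anchor-associated layer is too large: small and medium faces are highly squeezed on these layers and have few features left for detection. This can be addressed by moving the prediction layer down to shallower layers, but that introduces a new problem: the shallower layers do not carry enough semantic information.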
- Secondly, the small face, the anchor scale and the receptive field are mutually mismatched: the anchor scale does not match the receptive field, and both are too large to fit a small face.
- How should "anchor scale mismatches receptive field" be understood?
- In the anchor-based detection frameworks, anchor scales are discrete (i.e., 16, 32, 64, 128, 256, 512 in our method) but face scale is continuous.
- Faces whose scales lie far from the anchor scales, such as tiny and outer faces, cannot match enough anchors. Isn't this just the mismatch between the anchor scale and the small-face scale? How does it differ from the first point?
- A two-stage strategy is used for matching. I think Fabian's approach of assigning positive labels based on the GT is excellent.
- Background from small anchors
- There are many negatives, which easily leads to false positives.
- Focal Loss could be added.
- A max-out background label seems quite good.
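
A minimal sketch of one way to read the max-out background label, under assumptions not stated in these notes: each small anchor is given several background scores instead of one, the largest of them is kept as the background score, and a two-way softmax is then taken against the face score. The function name and the number of background scores are hypothetical.

```python
import math


def maxout_background_probs(bg_scores, face_score):
    """Return (p_background, p_face) for a single anchor.

    bg_scores:  several raw background scores (assumed layout).
    face_score: one raw face score.
    """
    # Max-out step: keep only the strongest background response.
    bg = max(bg_scores)
    # Numerically stable two-class softmax over [background, face].
    m = max(bg, face_score)
    e_bg = math.exp(bg - m)
    e_face = math.exp(face_score - m)
    z = e_bg + e_face
    return e_bg / z, e_face / z


if __name__ == "__main__":
    # Any one confident background score is enough to suppress the face.
    print(maxout_background_probs([1.0, 3.0, 2.0], 1.0))
    # With a single background score of 1.0 the anchor would be a coin flip.
    print(maxout_background_probs([1.0], 1.0))
```

Because any one of the background scores can veto the anchor, this biases small anchors toward the background class, which is the intended effect when negatives vastly outnumber positives.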
Anchors at different layers match their corresponding effective receptive fields, and anchors of different scales are evenly distributed over the image.
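
A rough sketch of how discrete anchor scales interact with continuous face sizes. The anchor scales are the ones listed in the notes; the stride of one quarter of the anchor scale, the 640-pixel image, the half-stride anchor centres, the square anchors and the IoU thresholds are all assumptions made for illustration.

```python
ANCHOR_SCALES = [16, 32, 64, 128, 256, 512]  # anchor scales listed in the notes
STRIDES = [s // 4 for s in ANCHOR_SCALES]    # assumption: stride = scale / 4


def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_matches(face_size, image_size=640, threshold=0.5):
    """Count square anchors whose IoU with a centred square face meets the threshold."""
    c = image_size / 2
    h = face_size / 2
    face = (c - h, c - h, c + h, c + h)
    matches = 0
    for scale, stride in zip(ANCHOR_SCALES, STRIDES):
        r = scale / 2
        cells = image_size // stride
        # One anchor per feature-map cell, centred in the cell (assumed convention).
        for i in range(cells):
            for j in range(cells):
                cx, cy = (j + 0.5) * stride, (i + 0.5) * stride
                if iou((cx - r, cy - r, cx + r, cy + r), face) >= threshold:
                    matches += 1
    return matches


if __name__ == "__main__":
    for size in (10, 16, 24, 40):
        print(size, count_matches(size, threshold=0.5), count_matches(size, threshold=0.35))
```

Under these assumptions every layer places anchors at the same density relative to their size, which is one reading of "evenly distributed". A centred 10-pixel face matches no anchor at a threshold of 0.5 and only picks up matches once the threshold is lowered, which is the situation the matching strategy has to compensate for.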