S3FD

Anchor-based object detection methods [26, 38] detect objects by classifying and regressing a series of pre-set anchors.

Why does the performance of anchor-based detection methods decrease sharply as the objects become smaller?

The anchor-based detection frameworks tend to miss small and medium faces. (Two causes: either too few features survive, so prediction works poorly; or the face is never matched to an anchor at all, so it never even gets the chance to be a positive during training.)
- Firstly, the stride size of the lowest anchor-associated layer is too large: small and medium faces have been highly squeezed on these layers and have few features for detection. This can be addressed by moving the prediction layer down to shallower layers, but that introduces a new problem: the shallower layers do not carry enough semantic information.
- Secondly, small face, anchor scale and receptive field are mutually mismatched: anchor scale mismatches receptive field, and both are too large to fit small faces.
  - How should "anchor scale mismatches receptive field" be understood?
In the anchor-based detection frameworks, anchor scales are discrete (i.e., 16, 32, 64, 128, 256, 512 in our method) but face scale is continuous.
- Faces whose scales fall far from the anchor scales cannot match enough anchors, such as tiny and outer faces. Isn't this just the mismatch between anchor scale and small face scale again? How does it differ from point 1?
- A two-stage strategy is used for matching. I think Fabian's approach of assigning positive labels based on the GT is excellent.
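
To make the matching problem concrete, here is a minimal sketch of why faces that fall between discrete anchor scales collect few matches, and how a two-stage rule can compensate. The thresholds (`t1=0.35`, `t2=0.1`), the `top_n` value, the anchor strides, and all function names are assumptions for illustration based on my reading of the paper, not the authors' code.

```python
def iou(a, b):
    """IoU of two non-degenerate boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def square_anchors(scale, stride, image_size):
    """Square anchors of side `scale`, centred on a grid with spacing `stride`."""
    half = scale / 2
    centres = [stride / 2 + i * stride for i in range(image_size // stride)]
    return [(cx - half, cy - half, cx + half, cy + half)
            for cy in centres for cx in centres]


def two_stage_match(face, anchors, t1=0.35, t2=0.1, top_n=3):
    """Return indices of anchors matched to `face`.

    Stage 1: keep every anchor with IoU >= t1 (a relaxed threshold).
    Stage 2: if fewer than top_n anchors matched, fall back to the
             top_n anchors with IoU > t2, sorted by IoU.
    All three parameters are assumed values, not taken from the notes.
    """
    ious = [iou(face, a) for a in anchors]
    matched = [i for i, v in enumerate(ious) if v >= t1]
    if len(matched) < top_n:
        candidates = [i for i, v in enumerate(ious) if v > t2]
        candidates.sort(key=lambda i: ious[i], reverse=True)
        matched = candidates[:top_n]
    return matched


if __name__ == "__main__":
    # Two anchor scales on a 128 x 128 image; the strides are assumed.
    anchors = square_anchors(16, 4, 128) + square_anchors(32, 8, 128)
    for side in (10, 16, 23, 32):
        face = (64 - side / 2, 64 - side / 2, 64 + side / 2, 64 + side / 2)
        strict = sum(iou(face, a) >= 0.5 for a in anchors)
        relaxed = len(two_stage_match(face, anchors))
        print(f"face side {side:>2}: {strict} matches at IoU>=0.5, "
              f"{relaxed} after two-stage matching")
```

Running the script prints, for each face size, how many anchors pass a strict 0.5 threshold versus the relaxed two-stage rule, which is one way to see the effect of face scales that sit away from the anchor scales.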
Background from small anchors
- There are many negatives, so false positives come easily.
- Focal Loss could be added.
- The max-out background label seems like a rather nice idea.
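
As a rough illustration of the max-out background label, the sketch below assumes the layer predicts several background logits per anchor, keeps only the largest one, and then applies a two-way softmax against the face logit. The function name and the number of background logits are hypothetical; this is my reading of the idea, not the reference implementation.

```python
import math


def maxout_background_probs(bg_logits, face_logit):
    """Collapse several background logits into one via max, then
    softmax over (background, face). Returns (p_background, p_face)."""
    bg = max(bg_logits)                 # max-out over background scores
    shift = max(bg, face_logit)         # subtract the max for numerical stability
    e_bg = math.exp(bg - shift)
    e_face = math.exp(face_logit - shift)
    total = e_bg + e_face
    return e_bg / total, e_face / total


if __name__ == "__main__":
    # Three background logits is an assumed setting for illustration.
    p_bg, p_face = maxout_background_probs([0.2, 1.5, -0.3], face_logit=1.0)
    print(f"p(background)={p_bg:.3f}  p(face)={p_face:.3f}")
```

Because the background score is the maximum of several predictions, an anchor has to beat the strongest background vote to be called a face, which is intended to suppress false positives from the many small anchors.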

Anchors at different layers match their corresponding effective receptive field, and anchors of different scales are evenly distributed on the image.
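
The "evenly distributed" part can be checked with simple arithmetic. The sketch below uses the anchor scales listed earlier in these notes and assumes each detection layer's stride is one quarter of its anchor scale; the stride values and the helper name are assumptions, and the effective receptive field half of the claim is not modelled here.

```python
ANCHOR_SCALES = (16, 32, 64, 128, 256, 512)   # scales quoted in the notes
ASSUMED_STRIDES = (4, 8, 16, 32, 64, 128)     # assumption: stride = scale / 4


def scale_to_stride_ratio(scale, stride):
    """How many anchor centres fall within one anchor side length."""
    return scale / stride


if __name__ == "__main__":
    for scale, stride in zip(ANCHOR_SCALES, ASSUMED_STRIDES):
        ratio = scale_to_stride_ratio(scale, stride)
        print(f"scale {scale:>3}, stride {stride:>3}: ratio {ratio}")
```

Under this assumption the ratio is the same for every layer, so each anchor scale tiles the image with the same relative density, and small anchors are not sparser than large ones.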

YimianDai/S3FD.md