MultiBox

These notes rely on the longer arXiv version:

Szegedy, Christian, et al. "Scalable, high-quality object detection." arXiv preprint arXiv:1412.1441 (2014).

Conference version:

Dumitru Erhan, Christian Szegedy, Alexander Toshev, Dragomir Anguelov. "Scalable Object Detection using Deep Neural Networks." The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 2147-2154.

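MultiBox is only a bounding-box generator: the boxes it outputs are the ones judged likely to contain an object. In effect it plays the same role as a Region Proposal Network, i.e. it learns to propose regions.

It is a proposal generation model.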
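To make the "bbox generator" role concrete, here is a minimal sketch of how a proposal generator's output could be consumed: keep the K boxes with the highest objectness confidence and pass them to a downstream classifier. The function name `top_k_proposals`, the box format, and the value of K are assumptions for illustration, not taken from the paper.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max), normalized to [0, 1].
Box = Tuple[float, float, float, float]


def top_k_proposals(
    boxes: List[Box], confidences: List[float], k: int
) -> List[Tuple[Box, float]]:
    """Return the k boxes with the highest class-agnostic confidence."""
    if len(boxes) != len(confidences):
        raise ValueError("boxes and confidences must have the same length")
    # Sort (box, confidence) pairs by confidence, highest first.
    ranked = sorted(zip(boxes, confidences), key=lambda bc: bc[1], reverse=True)
    return ranked[:k]


# Hypothetical network output: three candidate boxes with objectness scores.
example_boxes = [(0.0, 0.0, 0.5, 0.5), (0.1, 0.1, 0.9, 0.9), (0.4, 0.4, 0.6, 0.6)]
example_scores = [0.2, 0.9, 0.6]
proposals = top_k_proposals(example_boxes, example_scores, k=2)
```

The scores here say only "there is some object in this box"; deciding which class it is would be left to a separate classifier applied to each proposal.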

The authors themselves call it a proposal generation model in the paper:

Our work builds upon the MultiBox approach presented in [4], which was an earlier attempt to learn a proposal generation model but was never directly competitive with the best expert-engineered alternatives.

So the proposal generation in the earlier conference version apparently did not work that well. Why not?

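In the new version, the authors switch to the latest Inception-style architecture and add multi-scale convolutional predictors of bounding-box shape and confidence, and that was enough to make it work. No wonder the abstract calls the method multi-scale convolutional MultiBox (MSC-MultiBox): the name puts the emphasis on "multi-scale".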
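A rough way to picture the multi-scale part is to lay out prior boxes on grids of several resolutions, so that coarse grids cover large objects and fine grids cover small ones. The sketch below is illustrative only: the grid sizes and the single square prior per cell are assumptions, not the configuration used in the paper.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max), normalized to [0, 1].
Box = Tuple[float, float, float, float]


def make_multiscale_priors(grid_sizes: Sequence[int]) -> List[Box]:
    """Place one square prior on every cell of each g x g grid."""
    priors: List[Box] = []
    for g in grid_sizes:
        side = 1.0 / g  # each prior spans exactly one grid cell
        for row in range(g):
            for col in range(g):
                priors.append(
                    (col * side, row * side, (col + 1) * side, (row + 1) * side)
                )
    return priors


# Hypothetical grid resolutions, from fine (small objects) to coarse (large objects).
priors = make_multiscale_priors([8, 4, 2, 1])
print(len(priors))  # 64 + 16 + 4 + 1 = 85
```

In a real model, each prior would come with a predicted confidence and a predicted offset that refines its shape; the sketch only shows where the priors sit.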

RPN uses an anchor mechanism, and so does MultiBox, so how do the two differ? The paper addresses this itself: The biggest similarity is the usage of priors (called “anchors” in the Fast R-CNN work [8]). The differences are:

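MultiBox uses multiple tapering layers (this is what Fig. 2 shows), whereas RPN predicts boxes of many scales from a single feature map.
MultiBox's confidences are class-agnostic. But does that mean RPN predicts class labels? Probably not: RPN's classification branch only scores object versus background, so it looks class-agnostic as well.
The regression and classification losses differ. But do they really?
The network architecture differs.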
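On the question of whether the regression losses really differ, one concrete point of comparison is the penalty applied to the box-coordinate residual. The sketch below assumes an L2 location loss on the MultiBox side (as in the original MultiBox formulation) and the smooth L1 loss used by RPN; it ignores how each method parameterizes box offsets relative to its priors or anchors, and the function names are hypothetical.

```python
from typing import Sequence


def l2_location_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Half the squared L2 distance between predicted and target coordinates."""
    return 0.5 * sum((p - t) ** 2 for p, t in zip(pred, target))


def smooth_l1_location_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Smooth L1: quadratic for residuals below 1, linear beyond that."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


target = (0.0, 0.0, 0.0, 0.0)
small_error = (0.5, 0.0, 0.0, 0.0)  # residual below 1: both losses agree
large_error = (2.0, 0.0, 0.0, 0.0)  # residual above 1: smooth L1 grows linearly

print(l2_location_loss(small_error, target), smooth_l1_location_loss(small_error, target))
print(l2_location_loss(large_error, target), smooth_l1_location_loss(large_error, target))
```

Under these assumptions the two losses coincide for small residuals and diverge only for large ones, where smooth L1 is less sensitive to outliers, which is one reason to doubt that the loss is the decisive difference.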

class agnostic
