@YimianDai
YimianDai / Qi2013ARD.md
Created July 30, 2019 04:47
GRSL-2013-A Robust Directional Saliency-Based Method for Infrared Small-Target Detection Under Various Complex Backgrounds

GRSL - 2013 - A Robust Directional Saliency-Based Method for Infrared Small-Target Detection Under Various Complex Backgrounds

Assumptions about the problem

  • The paper casts the problem as salient region detection, assuming the target has an isotropic Gaussian-like shape while background clutters are generally locally orientational (a minimal sketch of this target model follows the list).
    • It formulates the problem as salient region detection, inspired by the fact that a small target can often attract the attention of human eyes in infrared images.
    • This visual effect arises from the discrepancy that a small target resembles an isotropic Gaussian-like shape, due to the point spread function of the thermal imaging optics at long distance, whereas background clutters are generally locally orientational.
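
To make the shape assumption concrete, here is a minimal sketch (my own illustration, not code from the paper) of the isotropic Gaussian-like target model implied by the point spread function; the patch size and `sigma` are assumed values for illustration.

```python
import numpy as np

def gaussian_target(size=15, sigma=2.0, amplitude=1.0):
    """Isotropic 2-D Gaussian patch: a toy model of a small infrared target
    blurred by the point spread function of the imaging optics."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    return amplitude * np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))

target = gaussian_target()
print(target.shape)                # (15, 15)
print(target[7, 7], target[0, 0])  # peak at the center, near zero at the corner
```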

Further assumptions:

@YimianDai
YimianDai / ECCV-2012-Diagnosing-Error in-Object-Detectors.md
Created July 30, 2019 04:43
ECCV-2012-Diagnosing Error in Object Detectors

ECCV-2012-Diagnosing Error in Object Detectors

This paper mainly analyzes where the errors in object detection come from. For false positives, the main sources are localization error, i.e. the classification is correct but the IoU is too small, and confusion with similar objects.
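
As a concrete reminder of the IoU criterion behind "localization error", here is a minimal sketch (my own illustration; the 0.5 threshold is the common PASCAL VOC convention, not necessarily the exact thresholds used in the paper):

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

# Right class, but the overlap falls below the usual 0.5 threshold:
# counted as a localization error rather than a correct detection.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ~0.14
```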

For false negatives, the main causes are small object size, heavy occlusion, and uncommon viewpoints.

Finally, it gives suggestions for handling these difficult cases. For small objects, for example, the advice is to use a low-resolution template and to add context for disambiguation.

This is quite a good paper: when our results are poor, only by working out why they are poor and what caused the failures can we prescribe the right remedy.

Anchor-based object detection methods [26, 38] detect objects by classifying and regressing a series of pre-set anchors

Why does the performance of anchor-based detection methods drop sharply as objects become smaller?

  1. The anchor-based detection frameworks tend to miss small and medium faces. (On one hand, few of their features survive in the feature maps, so the predictions on them are poor; on the other hand, they may never be matched to any anchor at all, so they never even get the chance to be predicted as positives during training. See the sketch after this list.)
    • Firstly, the stride of the lowest anchor-associated layer is too large: small and medium faces are highly squeezed on these layers and leave few features for detection. This can be mitigated by moving the prediction layers down to shallower feature maps, but that introduces a new problem: after moving down, the semantic information is insufficient.
    • Secondly, small faces, anchor scales, and receptive fields are mutually mismatched: the anchor scale does not match the receptive field, and both are too large to fit a small face.
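
To see why small objects may never become positives, here is a minimal sketch under assumed numbers (image size 256, anchor stride 8, square anchors of size 32/64/128, IoU matching threshold 0.5; none of these come from the quoted paper). It counts how many grid anchors exceed the matching threshold for a small versus a medium ground-truth box.

```python
import itertools

def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / float(union)

def matched_anchors(gt, stride=8, sizes=(32, 64, 128), img=256, thr=0.5):
    """Count anchors on a regular grid whose IoU with gt exceeds thr."""
    count = 0
    for cy, cx, s in itertools.product(range(0, img, stride),
                                       range(0, img, stride), sizes):
        anchor = (cx - s / 2.0, cy - s / 2.0, cx + s / 2.0, cy + s / 2.0)
        if iou(anchor, gt) > thr:
            count += 1
    return count

# A 16x16 box cannot reach IoU 0.5 with even the smallest (32x32) anchor,
# so it is never assigned a positive anchor; a 64x64 box matches several.
print(matched_anchors((120, 120, 136, 136)))  # 0
print(matched_anchors((100, 100, 164, 164)))  # > 0
```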
@YimianDai
YimianDai / Focal-Loss.md
Created July 30, 2019 04:19
Focal Loss

Life is just like Focal Loss.

If easy samples account for the vast majority of your loss, then however hard you optimize, you are only improving the predictions on those easy samples. All that optimization is wasted effort, because none of it goes into the hard samples; in the end you can handle the easy cases and none of the hard ones.

Life is the same, so apply Focal Loss to life as well.

The crux of my own life is likewise that the gradient is dominated by the loss of overwhelmingly many easy samples, so no matter how much I optimize, the results will not improve. I hope that from now on I can follow Focal Loss: face the difficulties head-on and step out of my comfort zone.
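
For reference, a minimal NumPy sketch of the binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), with the paper's default gamma = 2 and alpha = 0.25 (the sample values below are made up for illustration):

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: well-classified samples (p_t close to 1) are
    down-weighted by (1 - p_t)**gamma, so hard samples dominate the loss."""
    p_t = np.where(y == 1, p, 1.0 - p)
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

p = np.array([0.95, 0.6, 0.1])  # predicted probability of the positive class
y = np.array([1, 1, 1])         # all three are ground-truth positives
print(focal_loss(p, y))         # the easy sample contributes almost nothing
```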

@YimianDai
YimianDai / MultiBox.md
Last active July 30, 2019 04:23
MultiBox

These notes rely on the longer arXiv version:

Szegedy, Christian, et al. "Scalable, high-quality object detection." arXiv preprint arXiv:1412.1441 (2014).

Conference version:

Dumitru Erhan, Christian Szegedy, Alexander Toshev, Dragomir Anguelov; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 2147-2154

MultiBox is just a bbox generator: the bboxes it produces are those believed to contain an object. In essence it is a Region Proposal Network, i.e., learning to propose regions.

@YimianDai
YimianDai / Linux.md
Last active November 6, 2019 23:19
Notes on Linux
@YimianDai
YimianDai / OverFeat.md
Last active July 30, 2019 04:42
OverFeat

The basis for extending CNNs from their original use in classification to tasks like detection and segmentation is the belief that most of the features learned in the convolutional layers are general purpose.

Bounding-box regression (BBR) was first introduced in DPM and then adopted in R-CNN.

ICLR-2014-OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

This is one of the earliest papers to explore how to adapt CNNs originally designed for classification to object detection. It has many inspiring points, but personally I did not find it easy to read, so here I only note down a few things I took from it.

Object Detection - SNIPER: Efficient Multi-Scale Training - Paper Notes

SNIPER:

  1. adaptively samples chips from multiple scales of an image pyramid, conditioned on the image content.

  2. We sample positive chips conditioned on the ground-truth instances and negative chips based on proposals generated by a region proposal network.

    • Surprisingly, the negative chips are generated with an RPN. Quite remarkable.
  3. R-CNN is scale invariant (under the assumption that CNNs can classify images of a fixed resolution).

@YimianDai
YimianDai / Data-Flow-in-SSD.md
Last active August 21, 2019 05:32
Data Flow in SSD

We usually speak of the three elements of machine learning: model, loss, and optimization, and most computer vision papers focus mainly on the model and the loss. In today's deep-learning-dominated era, the mainstream paradigm is typically to design a new loss, or to propose a new network architecture that hard-codes a traditional heuristic into the network structure to enable end-to-end learning [1].

With the popularity of methods like SSD, which place particular emphasis on data augmentation, and the growing acceptance of arguments like SNIP's about data scale, data itself has become a major element that computer vision has to care about. The reality, however, is that once we dig into the code, data pre-processing turns out to be a long and tedious process. Moreover, because data is the most important input to the model, the loss, and the metric, if the data differs from what a function expects we will not get the results we want, especially when writing an implementation from scratch. This post takes the data in GluonCV's SSD implementation as an example and dissects the form the data must take, and the operations it goes through, at each stage.

1. Training

Following the order of first creating train_dataset, then applying SSDDefaultTrainTransform to it, then passing it to a DataLoader where batchify_fn assembles a batch, then feeding the batch into the SSD net instance, and finally feeding the outputs into the SSDMultiBoxLoss loss function to compute the loss, let us look at how the data flows and changes at each stage during training.
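
A minimal sketch of that sequence, following the call pattern in the GluonCV SSD training tutorial (the VOC split, batch size, and the single-iteration loop are assumptions for illustration; it also assumes the Pascal VOC data is available locally):

```python
import mxnet as mx
from mxnet import autograd
from mxnet.gluon.data import DataLoader
from gluoncv import data as gdata, model_zoo
from gluoncv.data.batchify import Tuple, Stack
from gluoncv.data.transforms.presets.ssd import SSDDefaultTrainTransform
from gluoncv.loss import SSDMultiBoxLoss

width, height, batch_size = 300, 300, 4

# 1. dataset: raw (image, label) pairs from Pascal VOC
train_dataset = gdata.VOCDetection(splits=[(2007, 'trainval')])

# 2. the transform needs the anchors, which depend only on the network,
#    so run one dummy forward pass in train mode to obtain them
net = model_zoo.get_model('ssd_300_vgg16_atrous_voc', pretrained_base=False)
net.initialize()
with autograd.train_mode():
    _, _, anchors = net(mx.nd.zeros((1, 3, height, width)))
train_transform = SSDDefaultTrainTransform(width, height, anchors)

# 3. DataLoader: batchify_fn stacks images, cls_targets, and box_targets
batchify_fn = Tuple(Stack(), Stack(), Stack())
train_loader = DataLoader(train_dataset.transform(train_transform), batch_size,
                          shuffle=True, batchify_fn=batchify_fn,
                          last_batch='rollover')

# 4 & 5. forward through the net and compute the multibox loss
mbox_loss = SSDMultiBoxLoss()
for image, cls_targets, box_targets in train_loader:
    with autograd.record():
        cls_preds, box_preds, _ = net(image)
        sum_loss, cls_loss, box_loss = mbox_loss(
            cls_preds, box_preds, cls_targets, box_targets)
        autograd.backward(sum_loss)
    break  # one iteration is enough for this sketch
```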

@YimianDai
YimianDai / SSD.md
Last active August 21, 2019 05:26
SSD
# In GluonCV's vgg_atrous SSD backbone, each inner tuple appears to be
# (filters, kernel, stride, padding) for one conv layer in the extra stages.
extra_spec = {
    300: [((256, 1, 1, 0), (512, 3, 2, 1)),
          ((128, 1, 1, 0), (256, 3, 2, 1)),
          ((128, 1, 1, 0), (256, 3, 1, 0)),
          ((128, 1, 1, 0), (256, 3, 1, 0))],

    512: [((256, 1, 1, 0), (512, 3, 2, 1)),