-
-
Save quantombone/1144423 to your computer and use it in GitHub Desktop.
function top = nms(boxes, overlap) | |
% top = nms_fast(boxes, overlap) | |
% Non-maximum suppression. (FAST VERSION) | |
% Greedily select high-scoring detections and skip detections | |
% that are significantly covered by a previously selected | |
% detection. | |
% NOTE: This is adapted from Pedro Felzenszwalb's version (nms.m), | |
% but an inner loop has been eliminated to significantly speed it | |
% up in the case of a large number of boxes | |
% Tomasz Malisiewicz ([email protected]) | |
if isempty(boxes) | |
top = []; | |
return; | |
end | |
x1 = boxes(:,1); | |
y1 = boxes(:,2); | |
x2 = boxes(:,3); | |
y2 = boxes(:,4); | |
s = boxes(:,end); | |
area = (x2-x1+1) .* (y2-y1+1); | |
[vals, I] = sort(s); | |
pick = s*0; | |
counter = 1; | |
while ~isempty(I) | |
last = length(I); | |
i = I(last); | |
pick(counter) = i; | |
counter = counter + 1; | |
xx1 = max(x1(i), x1(I(1:last-1))); | |
yy1 = max(y1(i), y1(I(1:last-1))); | |
xx2 = min(x2(i), x2(I(1:last-1))); | |
yy2 = min(y2(i), y2(I(1:last-1))); | |
w = max(0.0, xx2-xx1+1); | |
h = max(0.0, yy2-yy1+1); | |
o = w.*h ./ area(I(1:last-1)); | |
I([last; find(o>overlap)]) = []; | |
end | |
pick = pick(1:(counter-1)); | |
top = boxes(pick,:); |
@quantombone, I have a quick question w.r.t. the overlap formulation (i.e. o = w.*h ./ area(I(1:last-1));
). Isn't it overlap = intersection_area(...) / union_area(...)
instead of overlap = intersection_area(...) / area(lower_confidence_bbox only)
?
See more details here:
npinto/felzenszwalb_cvpr2008@df4860c#L53R35
I'm probably very confused so I'm looking for some light here ;-)
There have been many late-nights hour discussion with my and my fellow vision hackers at CMU about this strange asymmetry in Felzenszwalb et al's nms.m (which doesn't actually use the standard overlap score criterion). At first I thought this was a bug, but I found in my experiments it doesn't really matter whether you set the divisor as the "correct value" or Pedro's value.
I think this was done based on observing how objects of the same category occlude each other. Think of group of people. If a smaller object region occludes a larger object region, then we expect the smaller object to be on the foreground (with a high recognition score) and the larger object to the behind it (and have a lower score). nms.m will reject the configuration two detections where one is "background/larger/high-scoring" and the other is "foreground/smaller/low-scoring", but glady keeps "background/larger/low-scoring" with "foreground/smaller/high-scoring"
You can confirm:
b = [1 1 100 100 1.0; 40 40 70 70 0.8]
nms(b)
ans = [1 1 100 100 1.0]
"here the code doesn't want to keep this configuration of two boxes, so it removes the low scoring detection"
b = [1 1 100 100 0.8; 40 40 70 70 1.0]
nms(b,0.5)
ans = [40 40 70 70 1.0; 1 1 100 100 0.8]
"here the code is willing to keep this configuration of boxes, so it returns both (sorted)"
Thanks for the feedback!
No problem, I'm already hacking on this kind of stuff most days (and nights). Always willing to help out a fellow vision hacker.
Thanks for the script, many years later, it's 200x faster than the one I had hacked together in numpy
Anybody know how to embed code from a github repository directly on a blog, without have to make a gist out of it?