Kirill888 · November 23, 2017 12:52
diff --git a/cv2-resize-nn.ipynb b/cv2-resize-nn.ipynb
 {
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Given a pixel coordinate in the detination image $x_d$, code computes coordinate in the source image $x_s$ as following:\n",
    "\n",
    "$$\n",
    "x_{s} = \\lfloor x_{d}s_{x} \\rfloor\n",
    "$$\n",
    "\n",
    "Where $s_x$ is a scaling factor ($s_x= \\frac{w_s}{w_d}$), in the code it's called `ifx`. When shrinking $s_x \\gt 1$, when expanding $s_x \\lt 1$. This computation is wrong as it aligns pixel `0` of the destination image to pixel `0` of the source image no matter what scaling is applied. To convince yourself that this incorrect imagine shrinking an odd sized image (say 5x5) all the way down to a single pixel, should you pick top-left pixel of the 5x5 or center pixel of the 5x5 in this case? Current implementation picks top-left pixel instead of the center pixel because of the wrong equation above.\n",
    "\n",
    "Instead you should use this:\n",
    "\n",
    "$$\n",
    "x_{s} = \\lfloor x_{d}s_{x} + \\frac{s_{x} - 1}{2} \\rfloor\n",
    "$$\n",
    "\n",
    "This is how translation factor is derived:\n",
    "\n",
    "$$\n",
    "\\begin{array}{rcl}\n",
    "x_{s} &=& x_d s_{x} + t_x \\\\\n",
    "-\\frac{1}{2} &=& -\\frac{1}{2} s_x + t_x\\\\\n",
    "t_x &=& \\frac{1}{2} s_x - \\frac{1}{2}\\\\\n",
    "t_x &=& \\frac{s_x-1}{2}\n",
    "\\end{array}\n",
    "$$\n",
    "\n",
    "We pick translation factor such that source and destination left edges are aligned, i.e. when we map $-0.5$ from destination image coordinate system into source coordinate system we still end up at $-0.5$.\n",
    "\n",
    "\n",
    "Incorrect code is here:\n",
    "\n",
    "https://github.com/opencv/opencv/blob/981009ac1f06106244ac52b16a20a4dc337ad816/modules/imgproc/src/resize.cpp#L235\n",
    "\n",
    "```\n",
    "   //int sx = cvFloor(x*ifx);\n",
    "   int sx = cvFloor(x*ifx + (ifx-1)*0.5);\n",
    "```\n",
    "\n",
    "https://github.com/opencv/opencv/blob/981009ac1f06106244ac52b16a20a4dc337ad816/modules/imgproc/src/resize.cpp#L144\n",
    "\n",
    "```\n",
    "   //int sy = std::min(cvFloor(y*ify), ssize.height-1);\n",
    "   int sy = std::min(cvFloor(y*ify + (ify-1)*0.5), ssize.height-1);\n",
    "```\n",
    "\n",
    "There are probably more optimized versions (maybe AVX version) that are also using wrong math, I didn't look for them.\n",
    "\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
 }
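
To make the difference between the two equations in the notebook concrete, here is a minimal Python sketch that evaluates both mappings for one row of pixels. The function names `src_index_current` and `src_index_proposed` are hypothetical and do not come from OpenCV. The upper clamp mirrors the `std::min(..., ssize.height-1)` in the snippet above; the lower clamp to `0` is an assumption of this sketch, added because for $s_x \lt 1$ the proposed expression goes negative at $x_d = 0$ (for example `floor(-0.25) = -1` when expanding 2 pixels to 4).

```python
import math


def src_index_current(x_d, w_s, w_d):
    """Mapping from the first equation: floor(x_d * s_x)."""
    s_x = w_s / w_d  # scaling factor, called ifx in the C++ code
    return min(math.floor(x_d * s_x), w_s - 1)


def src_index_proposed(x_d, w_s, w_d):
    """Mapping from the second equation: floor(x_d * s_x + (s_x - 1) / 2)."""
    s_x = w_s / w_d
    x_s = math.floor(x_d * s_x + (s_x - 1) * 0.5)
    # Lower clamp is an assumption of this sketch, not part of the proposal.
    return max(0, min(x_s, w_s - 1))


if __name__ == "__main__":
    # (source width, destination width) pairs, both shrinking by an odd factor
    for w_s, w_d in [(5, 1), (9, 3)]:
        current = [src_index_current(x, w_s, w_d) for x in range(w_d)]
        proposed = [src_index_proposed(x, w_s, w_d) for x in range(w_d)]
        print(f"{w_s} -> {w_d}: current {current}, proposed {proposed}")
```

For 5 -> 1 the current mapping selects source index `0` (the left edge) while the proposed one selects `2` (the center). For 9 -> 3 the current mapping selects `[0, 3, 6]` and the proposed one `[1, 4, 7]`, i.e. the center of each 3-pixel block. A round-to-nearest variant, `floor((x_d + 0.5) * s_x)`, would avoid the negative index when expanding, but that variant is not part of the proposal above.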