@KuRRe8
Last active June 22, 2025 09:09
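Python-related tutorials, organized into separate files by category

Python Tutorials

Python is a beginner-friendly language, and the machine learning community now depends heavily on it (together with C++, CUDA C, R, and similar languages), which keeps Python's popularity firmly at the top. This Gist provides Python tutorials that can be run directly in Jupyter Notebook.

  1. Language-level tutorials, which generally skip beginner topics;
  2. Standard library tutorials, covering basic usage of the most common modules;
  3. Third-party library tutorials, mainly for common libraries such as numpy and pytorch; they cover basic usage only and do not track new features.

Nothing else will go into this Gist. Note that a Gist is still version-controlled by git, so you can git clone it locally, or open the corresponding ipynb files directly in Google Colab or Kaggle.

When browsing on the web there is no file list, so press Ctrl + F to search for the section you want, or click the hyperlinks below.

If you would like to contribute, leave a comment; if you have questions, ask them in the comments too ^.^

Contents: Language

Contents: Libraries

Contents: Domain-specific libraries (this tutorial focuses mainly on machine learning and deep learning)

Contents: Appendix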


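Python Descriptors

A descriptor is an object whose type implements at least one of __get__, __set__, or __delete__. A descriptor that defines __set__ or __delete__ is a data descriptor; one that defines only __get__ is a non-data descriptor.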

class DataDesc:
    def __get__(self, obj, objtype=None):
        return 'data'
    def __set__(self, obj, value):
        pass

class NonDataDesc:
    def __get__(self, obj, objtype=None):
        return 'nondata'

class C:
    a = DataDesc()
    b = NonDataDesc()
    c = 42
    d = NonDataDesc()

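obj = C()
obj.a = 'inst_a'              # routed to DataDesc.__set__, which discards the value
obj.__dict__['a'] = 'inst_a'  # write to the instance dict directly to test the precedence
obj.b = 'inst_b'              # NonDataDesc has no __set__, so this lands in obj.__dict__
obj.c = 'inst_c'
# d has no instance attribute; it is accessed directly

print(obj.a)  # 'data' (the data descriptor wins over the instance dict)
print(obj.b)  # 'inst_b' (the instance attribute wins)
print(obj.c)  # 'inst_c' (the instance attribute wins)
print(obj.d)  # 'nondata' (the non-data descriptor is triggered)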

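From this we can summarize:

  1. A descriptor only takes effect as a class attribute, that is, a member defined in a class body (or in a type created dynamically with type(name, bases, namespace)).
  2. A descriptor can be accessed through the class (C.a) or through an instance (obj.a). Instance access follows this lookup order:
    1. Whether or not obj.__dict__ contains a, a data descriptor named a in C.__dict__ (or in a base class along the MRO) is used first.
    2. If there is none, fall back to the ordinary entry named a in obj.__dict__. (When the class defines __slots__, the instance has no __dict__; each slot is itself exposed as a data descriptor on the class and is therefore handled by step 1.)
    3. If there is none, use a non-data descriptor named a in C.__dict__.
    4. If there is none, return the ordinary object named a in C.__dict__.
    5. If there is none, raise AttributeError (after trying __getattr__, if the class defines it).
  3. Note step 3 of the lookup order above. Everything defined with def in a class body is an instance of the function type, which implements __get__. So when an instance method is looked up as obj.foo (obj.__dict__ normally has no entry shadowing foo), the function's descriptor protocol runs and the result is a bound method rather than the function. This is why accessing a function through the class and through an instance behaves differently. In particular, the built-in staticmethod and classmethod are decorators that implement __get__, so accessing such methods also goes through the descriptor protocol.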
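
The last point can be checked by hand. The sketch below is only an illustration: the class Greeter and its methods are made-up names that do not appear in the text above. It pulls the raw function out of the class __dict__ and calls __get__ on it manually, which should give the same result as ordinary attribute access on an instance.

```python
class Greeter:
    def hello(self):
        return f"hello from {type(self).__name__}"

    @staticmethod
    def util():
        return "static"

    @classmethod
    def make(cls):
        return cls()


g = Greeter()

# The class __dict__ stores the plain function object.
raw = Greeter.__dict__["hello"]

# Instance access triggers function.__get__, which builds a bound method.
bound = raw.__get__(g, Greeter)

print(type(raw).__name__)     # function
print(type(bound).__name__)   # method
print(bound() == g.hello())   # True
print(Greeter.hello is raw)   # True: class access returns the function itself

# staticmethod and classmethod are wrapper objects with their own __get__.
print(type(Greeter.__dict__["util"]).__name__)   # staticmethod
print(type(Greeter.__dict__["make"]).__name__)   # classmethod
print(Greeter.util())                            # static
print(isinstance(Greeter.make(), Greeter))       # True
```

Because function is a non-data descriptor, assigning g.hello = something would store the value in g.__dict__ and shadow the method for that one instance, which matches step 2 of the lookup order.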
{
"cells": [
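{
"cell_type": "markdown",
"metadata": {},
"source": [
"# OpenCV (cv2) - A Tutorial on Python's Core Computer Vision Library\n",
"\n",
"Welcome to the OpenCV tutorial! OpenCV (Open Source Computer Vision Library) is a powerful open-source library for computer vision and machine learning. It contains thousands of optimized algorithms for processing images and video, including feature detection, object recognition, and tracking.\n",
"\n",
"**Why does OpenCV matter for ML/DL/data science?**\n",
"\n",
"1. **Foundation for image/video processing**: It provides the basic functions for reading, writing, displaying, and manipulating images and video.\n",
"2. **Data preprocessing and augmentation**: Before image data is fed to a deep learning model, OpenCV is often used for resizing, cropping, color space conversion, filtering, and augmentation.\n",
"3. **Feature extraction**: It includes many classic computer vision feature extractors (such as SIFT, SURF, and ORB), although modern DL models usually learn features directly.\n",
"4. **Integration with other libraries**: Images read by OpenCV are represented as NumPy arrays, so they work seamlessly with other scientific libraries (NumPy, SciPy, Matplotlib, PyTorch, TensorFlow).\n",
"5. **Widely used**: Applications range from simple image editing to complex real-time vision systems (such as autonomous driving and robot vision).\n",
"\n",
"**Note**: OpenCV's Python interface is used by importing the `cv2` module.\n",
"\n",
"**This tutorial covers the core basic operations of OpenCV:**\n",
"\n",
"1. Installation and setup\n",
"2. Reading, displaying, and saving images (`imread`, `imshow`, `imwrite`)\n",
"3. Basic image properties and pixel access\n",
"4. Color space conversion (`cvtColor`)\n",
"5. Resizing, rotating, and translating images\n",
"6. Image thresholding (`threshold`)\n",
"7. Image filtering and blurring\n",
"8. Edge detection (Canny)\n",
"9. Drawing shapes and text\n",
"10. (Overview) Reading and processing video (`VideoCapture`)"
]
},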
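{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Installation and Setup\n",
"\n",
"You need to install the OpenCV Python package, usually `opencv-python`.\n",
"\n",
"```bash\n",
"pip install opencv-python numpy matplotlib\n",
"```\n",
"If you need the extra contrib modules, install `opencv-contrib-python` instead (it contains everything in `opencv-python` plus the additional modules). Do not install both. Patented algorithms may carry legal risk, so judge for yourself: non-free algorithms such as SURF are generally not enabled in the prebuilt pip wheels, while SIFT is available in recent releases now that its patent has expired.\n",
"\n",
"```bash\n",
"# pip install opencv-contrib-python \n",
"```"
]
},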
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import cv2\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import os\n",
"\n",
"# Render Matplotlib figures inline in Jupyter Notebook\n",
"%matplotlib inline \n",
"\n",
"print(f\"OpenCV version: {cv2.__version__}\")\n",
"\n",
"# --- Helper function to display images in Jupyter ---\n",
"def display_image(title, image):\n",
"    \"\"\"Displays a BGR or grayscale image using Matplotlib, handling color conversion.\"\"\"\n",
"    if image is None:\n",
"        print(f\"Error: Image '{title}' is None.\")\n",
"        return\n",
"    # OpenCV loads images in BGR format by default\n",
"    # Matplotlib expects RGB format\n",
"    if len(image.shape) == 3 and image.shape[2] == 3:  # Check if it's a color image\n",
"        image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n",
"    else:  # Grayscale image\n",
"        image_rgb = image\n",
"\n",
"    plt.figure(figsize=(5, 4))\n",
"    plt.imshow(image_rgb, cmap='gray' if len(image.shape) == 2 else None)\n",
"    plt.title(title)\n",
"    plt.axis('off')  # Hide axes\n",
"    plt.show()\n",
"\n",
"# --- Create a simple dummy image for testing ---\n",
"dummy_image_path = \"dummy_image.png\"\n",
"img_height, img_width = 100, 150\n",
"dummy_img = np.zeros((img_height, img_width, 3), dtype=np.uint8)  # Black image\n",
"# Draw a filled white rectangle (thickness=-1 means filled)\n",
"cv2.rectangle(dummy_img, (30, 20), (120, 80), (255, 255, 255), -1) \n",
"cv2.imwrite(dummy_image_path, dummy_img)\n",
"print(f\"Dummy image created at {dummy_image_path}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Reading, Displaying, and Saving Images\n",
"\n",
"* **`cv2.imread(filepath, flags)`**: Reads an image file.\n",
"    * `flags`: Controls the read mode. Common values:\n",
"        * `cv2.IMREAD_COLOR` (default): Loads a color image in BGR order, ignoring transparency.\n",
"        * `cv2.IMREAD_GRAYSCALE`: Loads the image in grayscale.\n",
"        * `cv2.IMREAD_UNCHANGED`: Loads the image as is, including the alpha (transparency) channel.\n",
"* **`cv2.imshow(window_name, image)`**: Displays an image in a separate OpenCV window (**not normally used directly in Jupyter, because it needs a GUI event loop**).\n",
"* **`cv2.imwrite(filepath, image)`**: Saves an image to a file.\n",
"\n",
"**Note**: In Jupyter we usually display images with `matplotlib.pyplot.imshow` (as in the `display_image` helper above)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"--- Reading, Displaying, Saving Images ---\")\n",
"\n",
"# 1. Read the image (cv2.imread returns None if the file cannot be read)\n",
"img_color = cv2.imread(dummy_image_path, cv2.IMREAD_COLOR)\n",
"img_gray = cv2.imread(dummy_image_path, cv2.IMREAD_GRAYSCALE)\n",
"\n",
"if img_color is not None:\n",
"    print(f\"Color image loaded successfully. Shape: {img_color.shape}\")\n",
"    # 2. Display the image (using the Matplotlib helper)\n",
"    display_image(\"Color Image (BGR loaded, RGB displayed)\", img_color)\n",
"else:\n",
"    print(f\"Error loading color image from {dummy_image_path}\")\n",
"\n",
"if img_gray is not None:\n",
"    print(f\"Grayscale image loaded successfully. Shape: {img_gray.shape}\")\n",
"    display_image(\"Grayscale Image\", img_gray)\n",
"else:\n",
"    print(f\"Error loading grayscale image from {dummy_image_path}\")\n",
"\n",
"# 3. Save the image\n",
"output_gray_path = \"dummy_gray_saved.png\"\n",
"if img_gray is not None:\n",
"    success = cv2.imwrite(output_gray_path, img_gray)\n",
"    if success:\n",
"        print(f\"Grayscale image saved to {output_gray_path}\")\n",
"        # Clean up saved gray image\n",
"        if os.path.exists(output_gray_path):\n",
"            os.remove(output_gray_path)\n",
"    else:\n",
"        print(f\"Failed to save image to {output_gray_path}\")\n",
"\n",
"# Clean up the original dummy image (img_color stays in memory for later cells)\n",
"if os.path.exists(dummy_image_path):\n",
"    os.remove(dummy_image_path)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Basic Image Properties and Pixel Access\n",
"\n",
"In Python, OpenCV images are represented as NumPy arrays."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Re-create a small 3x3 color image (each pixel is in BGR order)\n",
"img = np.array([[[0, 0, 255], [0, 255, 0], [255, 0, 0]],\n",
"                [[255, 255, 0], [255, 0, 255], [0, 255, 255]],\n",
"                [[50, 50, 50], [150, 150, 150], [250, 250, 250]]], dtype=np.uint8)\n",
"\n",
"print(\"--- Image Properties and Pixel Access ---\")\n",
"print(f\"Image shape (Height, Width, Channels): {img.shape}\")\n",
"print(f\"Image height: {img.shape[0]} pixels\")\n",
"print(f\"Image width: {img.shape[1]} pixels\")\n",
"print(f\"Number of channels: {img.shape[2] if len(img.shape) == 3 else 1}\")\n",
"print(f\"Image data type: {img.dtype}\")\n",
"# img.size counts array elements (height * width * channels), not pixels\n",
"print(f\"Total number of array elements: {img.size}\")\n",
"print(f\"Total number of pixels: {img.shape[0] * img.shape[1]}\")\n",
"\n",
"# Access pixel values (note that OpenCV uses BGR order)\n",
"# Index with (row, column), i.e. (y, x)\n",
"px_top_left = img[0, 0]  # (y=0, x=0)\n",
"print(f\"\\nPixel at (0, 0) [BGR]: {px_top_left}\")  # [0, 0, 255] -> Blue=0, Green=0, Red=255\n",
"\n",
"# Access the value of a single channel\n",
"blue_channel_top_left = img[0, 0, 0]\n",
"print(f\"Blue channel value at (0, 0): {blue_channel_top_left}\")\n",
"\n",
"# Modify a pixel value (on a copy, so img itself stays unchanged)\n",
"img_copy = img.copy()\n",
"img_copy[0, 0] = [255, 255, 255]  # Set top-left pixel to white\n",
"print(f\"Pixel at (0, 0) after modification [BGR]: {img_copy[0, 0]}\")\n",
"display_image(\"Image with top-left pixel modified\", img_copy)\n",
"\n",
"# Access an image region (ROI - Region of Interest)\n",
"# using NumPy slicing\n",
"roi = img[1:3, 0:2]  # Rows 1 to 2, Columns 0 to 1\n",
"print(f\"\\nROI shape: {roi.shape}\")\n",
"display_image(\"Region of Interest (ROI)\", roi)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Color Space Conversion (`cvtColor`)\n",
"\n",
"`cv2.cvtColor(image, code)` converts an image from one color space to another.\n",
"Common values of `code` include:\n",
"* `cv2.COLOR_BGR2GRAY`: BGR to grayscale\n",
"* `cv2.COLOR_BGR2RGB`: BGR to RGB (for display with Matplotlib)\n",
"* `cv2.COLOR_RGB2BGR`: RGB to BGR\n",
"* `cv2.COLOR_BGR2HSV`: BGR to HSV (hue, saturation, value)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Use the color image img from the previous cell\n",
"print(\"--- Color Space Conversion ---\")\n",
"display_image(\"Original Image (BGR loaded)\", img)\n",
"\n",
"# Convert to grayscale\n",
"img_gray_cvt = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)\n",
"print(f\"Grayscale image shape: {img_gray_cvt.shape}\")\n",
"display_image(\"Converted to Grayscale\", img_gray_cvt)\n",
"\n",
"# Convert to HSV (commonly used for color detection)\n",
"img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)\n",
"print(f\"HSV image shape: {img_hsv.shape}\")\n",
"# Displaying an HSV image directly is usually not very intuitive\n",
"# display_image(\"Converted to HSV\", img_hsv) \n",
"print(\"HSV image generated (not displayed here).\")\n",
"\n",
"# BGR to RGB for Matplotlib\n",
"img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)\n",
"# img_rgb is already RGB, so show it with Matplotlib directly;\n",
"# display_image would apply BGR->RGB a second time and swap the channels back.\n",
"plt.imshow(img_rgb)\n",
"plt.title(\"Converted to RGB (for Matplotlib)\")\n",
"plt.axis('off')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Resizing, Rotating, and Translating Images\n",
"\n",
"* **Resizing**: `cv2.resize(image, dsize, fx, fy, interpolation)`\n",
"* **Rotation**: Usually implemented as an affine transformation: compute the rotation matrix with `cv2.getRotationMatrix2D()` and apply it with `cv2.warpAffine()`.\n",
"* **Translation**: Also an affine transformation; you build the translation matrix yourself."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Use img from the previous cells\n",
"print(\"--- Image Transformations ---\")\n",
"display_image(\"Original Image\", img)\n",
"h, w = img.shape[:2]\n",
"\n",
"# --- Resizing --- \n",
"# Method 1: specify the target size (dsize) as (width, height)\n",
"new_width, new_height = w // 2, h // 2\n",
"img_resized_dsize = cv2.resize(img, (new_width, new_height), interpolation=cv2.INTER_LINEAR)\n",
"print(f\"Resized shape (dsize): {img_resized_dsize.shape}\")\n",
"display_image(\"Resized (Half Size using dsize)\", img_resized_dsize)\n",
"\n",
"# Method 2: specify scale factors (fx, fy)\n",
"img_resized_factor = cv2.resize(img, None, fx=1.5, fy=0.8, interpolation=cv2.INTER_CUBIC)\n",
"print(f\"Resized shape (factors): {img_resized_factor.shape}\")\n",
"display_image(\"Resized (Using factors fx=1.5, fy=0.8)\", img_resized_factor)\n",
"\n",
"# --- Rotation --- \n",
"center = (w // 2, h // 2)\n",
"angle = 45  # Rotation angle in degrees (positive = counter-clockwise)\n",
"scale = 1.0  # Scale factor\n",
"rotation_matrix = cv2.getRotationMatrix2D(center, angle, scale)\n",
"print(f\"\\nRotation Matrix (for 45 deg):\\n{rotation_matrix}\")\n",
"img_rotated = cv2.warpAffine(img, rotation_matrix, (w, h))\n",
"display_image(f\"Rotated {angle} Degrees\", img_rotated)\n",
"\n",
"# --- Translation --- \n",
"tx, ty = w // 4, h // 4  # Shift right by tx and down by ty\n",
"translation_matrix = np.float32([[1, 0, tx], [0, 1, ty]])\n",
"print(f\"\\nTranslation Matrix (tx={tx}, ty={ty}):\\n{translation_matrix}\")\n",
"img_translated = cv2.warpAffine(img, translation_matrix, (w, h))\n",
"display_image(f\"Translated by ({tx}, {ty})\", img_translated)"
]
},
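{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Image Thresholding (`threshold`)\n",
"\n",
"Thresholding converts a grayscale image into a binary image (usually black and white). Each pixel is compared with the threshold and set to one of two predefined values depending on the result.\n",
"`retval, dst = cv2.threshold(src, thresh, maxval, type)`\n",
"\n",
"* `src`: Input grayscale image.\n",
"* `thresh`: Threshold value.\n",
"* `maxval`: The value assigned to pixels that satisfy the condition; used by the `THRESH_BINARY` and `THRESH_BINARY_INV` types.\n",
"* `type`: Thresholding type. Common values:\n",
"    * `cv2.THRESH_BINARY`: maxval if pixel > thresh, otherwise 0.\n",
"    * `cv2.THRESH_BINARY_INV`: 0 if pixel > thresh, otherwise maxval.\n",
"    * `cv2.THRESH_TRUNC`: thresh if pixel > thresh, otherwise unchanged.\n",
"    * `cv2.THRESH_TOZERO`: unchanged if pixel > thresh, otherwise 0.\n",
"    * `cv2.THRESH_TOZERO_INV`: 0 if pixel > thresh, otherwise unchanged.\n",
"    * `cv2.THRESH_OTSU`: Otsu's binarization, which computes the threshold automatically (the `thresh` argument is then ignored, but a value such as 0 must still be passed). It is a flag that is combined with another type, usually `THRESH_BINARY`.\n",
"    * `cv2.THRESH_TRIANGLE`: Another automatic threshold method, used in the same way as Otsu."
]
},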
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Use the grayscale image created earlier\n",
"if 'img_gray_cvt' in locals() and img_gray_cvt is not None:\n",
"    print(\"--- Thresholding --- \")\n",
"    display_image(\"Original Grayscale for Thresholding\", img_gray_cvt)\n",
"\n",
"    # Simple binary threshold\n",
"    thresh_value = 100\n",
"    max_value = 255\n",
"    ret1, thresh_binary = cv2.threshold(img_gray_cvt, thresh_value, max_value, cv2.THRESH_BINARY)\n",
"    print(f\"Threshold value used (Binary): {ret1}\")\n",
"    display_image(f\"Binary Threshold (>{thresh_value})\", thresh_binary)\n",
"\n",
"    ret2, thresh_binary_inv = cv2.threshold(img_gray_cvt, thresh_value, max_value, cv2.THRESH_BINARY_INV)\n",
"    display_image(f\"Inverse Binary Threshold (>{thresh_value})\", thresh_binary_inv)\n",
"\n",
"    # Otsu's Binarization (finds the threshold automatically)\n",
"    ret_otsu, thresh_otsu = cv2.threshold(img_gray_cvt, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)\n",
"    print(f\"Optimal threshold value found by Otsu: {ret_otsu}\")\n",
"    display_image(\"Otsu's Binarization\", thresh_otsu)\n",
"else:\n",
"    print(\"Skipping Thresholding example as grayscale image is not available.\")"
]
},
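{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. Image Filtering and Blurring\n",
"\n",
"Filtering is used to smooth images, remove noise, and so on. Common methods include:\n",
"* **Mean filter (`cv2.blur`)**: Replaces the center pixel with the average of the pixels inside the kernel window.\n",
"* **Gaussian filter (`cv2.GaussianBlur`)**: Computes a weighted average with a Gaussian kernel; the center pixel has the largest weight and the weights fall off with distance, which gives a more natural result.\n",
"* **Median filter (`cv2.medianBlur`)**: Replaces the center pixel with the median of the pixels inside the kernel window; especially effective against salt-and-pepper noise.\n",
"* **Bilateral filter (`cv2.bilateralFilter`)**: Smooths the image while keeping edges sharp (computationally expensive, hence slower)."
]
},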
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Add some noise so the effect of filtering is visible\n",
"if 'img_color' in locals() and img_color is not None:\n",
"    print(\"--- Image Filtering/Blurring --- \")\n",
"    noise = np.random.randint(0, 50, img_color.shape, dtype='uint8')\n",
"    img_noisy = cv2.add(img_color, noise)  # Add random noise (cv2.add saturates at 255)\n",
"    display_image(\"Noisy Image\", img_noisy)\n",
"\n",
"    # Mean filter\n",
"    # Odd kernel sizes are conventional for cv2.blur and required by GaussianBlur\n",
"    kernel_size = (5, 5)\n",
"    img_blurred_avg = cv2.blur(img_noisy, kernel_size)\n",
"    display_image(\"Average Blurred\", img_blurred_avg)\n",
"\n",
"    # Gaussian filter\n",
"    # The third argument is the standard deviation in X; if 0, it is computed from the kernel size\n",
"    img_blurred_gaussian = cv2.GaussianBlur(img_noisy, kernel_size, 0)\n",
"    display_image(\"Gaussian Blurred\", img_blurred_gaussian)\n",
"\n",
"    # Median filter (works well on salt-and-pepper noise; the effect may be subtle here)\n",
"    # The kernel size must be an odd integer greater than 1\n",
"    img_blurred_median = cv2.medianBlur(img_noisy, 5) \n",
"    display_image(\"Median Blurred\", img_blurred_median)\n",
"else:\n",
"    print(\"Skipping Filtering example as color image is not available.\")"
]
},
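{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 8. Edge Detection (Canny)\n",
"\n",
"`cv2.Canny(image, threshold1, threshold2)` implements the popular Canny edge detection algorithm.\n",
"* The Canny algorithm has several stages: Gaussian smoothing, computing gradient magnitude and direction, non-maximum suppression, double thresholding, and edge tracking by hysteresis.\n",
"* `threshold1` and `threshold2` are the lower and upper bounds used for the double threshold."
]
},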
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Run edge detection on the grayscale image\n",
"if 'img_gray_cvt' in locals() and img_gray_cvt is not None:\n",
"    print(\"--- Canny Edge Detection --- \")\n",
"    display_image(\"Grayscale for Edge Detection\", img_gray_cvt)\n",
"\n",
"    low_threshold = 50\n",
"    high_threshold = 150\n",
"    edges = cv2.Canny(img_gray_cvt, low_threshold, high_threshold)\n",
"\n",
"    display_image(\"Canny Edges\", edges)\n",
"else:\n",
"    print(\"Skipping Canny Edge Detection example.\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 9. Drawing Shapes and Text\n",
"\n",
"OpenCV provides functions for drawing lines, rectangles, circles, text, and more on an image."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create an image with a black background\n",
"canvas = np.zeros((300, 500, 3), dtype=\"uint8\")\n",
"print(\"--- Drawing Shapes and Text ---\")\n",
"\n",
"# Draw a line: cv2.line(image, start_point, end_point, color, thickness)\n",
"cv2.line(canvas, (0, 0), (500, 300), (0, 255, 0), 3)  # Green diagonal\n",
"\n",
"# Draw a rectangle: cv2.rectangle(image, top_left, bottom_right, color, thickness)\n",
"# thickness=-1 fills the rectangle\n",
"cv2.rectangle(canvas, (50, 50), (200, 150), (0, 0, 255), 5)  # Red outline\n",
"cv2.rectangle(canvas, (250, 80), (350, 180), (255, 0, 0), -1)  # Blue fill\n",
"\n",
"# Draw a circle: cv2.circle(image, center_coordinates, radius, color, thickness)\n",
"cv2.circle(canvas, (400, 100), 50, (255, 255, 0), -1)  # Filled cyan circle\n",
"\n",
"# Add text: cv2.putText(image, text, org(bottom-left), fontFace, fontScale, color, thickness, lineType)\n",
"font = cv2.FONT_HERSHEY_SIMPLEX\n",
"cv2.putText(canvas, 'OpenCV Shapes!', (50, 250), font, 1.5, (255, 255, 255), 2, cv2.LINE_AA)\n",
"\n",
"display_image(\"Canvas with Shapes and Text\", canvas)"
]
},
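{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 10. (Overview) Reading and Processing Video (`VideoCapture`)\n",
"\n",
"The `cv2.VideoCapture` class captures frames from a camera or a video file.\n",
"\n",
"**Basic workflow:**\n",
"1. Create a `VideoCapture` object: `cap = cv2.VideoCapture(0)` (camera 0) or `cap = cv2.VideoCapture(\"myvideo.mp4\")`.\n",
"2. Check that it opened successfully: `cap.isOpened()`.\n",
"3. Read frames in a loop: `ret, frame = cap.read()`.\n",
"    * `ret` is a boolean that indicates whether a frame was read successfully.\n",
"    * `frame` is the image frame that was read (a NumPy array).\n",
"4. Process each `frame` (for example, convert to grayscale, apply a filter, detect features).\n",
"5. Display the processed frame (for example with `cv2.imshow`, in a script).\n",
"6. Check the exit condition (for example, the 'q' key being pressed).\n",
"7. Release resources: `cap.release()` and `cv2.destroyAllWindows()` (in a script).\n",
"\n",
"**Note**: Running a video processing loop directly in Jupyter can be inconvenient, and `cv2.imshow` usually does not work there. Such loops are normally implemented in a standalone Python script."
]
},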
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"--- Video Processing Introduction ---\")\n",
"\n",
"# Pseudo-code only, because running a camera or a long video loop does not suit a notebook\n",
"print(\"Video processing typically involves these steps (pseudo-code):\")\n",
"print(\"cap = cv2.VideoCapture(0) # Or video file path\")\n",
"print(\"if not cap.isOpened(): print('Error opening video source')\")\n",
"print(\"while True:\")\n",
"print(\"    ret, frame = cap.read()\")\n",
"print(\"    if not ret: break # End of video or error\")\n",
"print(\"    # --- Process the frame --- \")\n",
"print(\"    # gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)\")\n",
"print(\"    # edges = cv2.Canny(gray, 50, 150)\")\n",
"print(\"    # cv2.imshow('Processed Frame', edges) # Display (in script)\")\n",
"print(\"    # if cv2.waitKey(1) & 0xFF == ord('q'): break # Exit on 'q' key\")\n",
"print(\"cap.release()\")\n",
"print(\"cv2.destroyAllWindows()\")\n",
"\n",
"# Try to read one frame from a video file as an example (if 'sample_video.mp4' exists)\n",
"sample_video_path = 'sample_video.mp4'  # You need to supply a video file\n",
"cap = None\n",
"if os.path.exists(sample_video_path):\n",
"    try:\n",
"        cap = cv2.VideoCapture(sample_video_path)\n",
"        if cap.isOpened():\n",
"            ret, frame = cap.read()\n",
"            if ret:\n",
"                print(f\"\\nSuccessfully read one frame from '{sample_video_path}'\")\n",
"                display_image(\"First Frame of Video\", frame)\n",
"            else:\n",
"                print(f\"Could not read frame from '{sample_video_path}'\")\n",
"        else:\n",
"            print(f\"Could not open video file '{sample_video_path}'\")\n",
"    except Exception as e:\n",
"        print(f\"Error processing sample video: {e}\")\n",
"    finally:\n",
"        if cap is not None:\n",
"            cap.release()\n",
"else:\n",
"    print(f\"\\nSample video '{sample_video_path}' not found. Skipping video frame reading example.\")"
]
},
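{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summary\n",
"\n",
"OpenCV is an indispensable library for computer vision tasks. It provides a large set of basic tools for processing and analyzing images and video.\n",
"\n",
"**Key takeaways:**\n",
"* Use `cv2.imread` and `cv2.imwrite` to read and write images.\n",
"* Images are represented in Python as NumPy arrays (usually in BGR order).\n",
"* Use `cv2.cvtColor` to convert between color spaces.\n",
"* Know the basic operations: resizing, rotation, thresholding, filtering, and edge detection.\n",
"* Shapes and text can be drawn on images.\n",
"* `cv2.VideoCapture` handles video streams.\n",
"* In Jupyter, images are usually displayed with Matplotlib.\n",
"\n",
"OpenCV covers far more than this, including advanced topics such as feature detection, object tracking, camera calibration, and deep learning model integration (the DNN module). The official documentation and tutorials are important resources for further study."
]
}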
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 5
}

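Some Reflections on Python as a Dynamic Language

Python is well known to be a fully dynamic language, which shows in:

  1. Dynamic type binding
  2. Runtime checking
  3. Object structure (not just values) can be modified dynamically
  4. Reflection
  5. Everything is an object (instance, class, method)
  6. Code can be executed dynamically (eval, exec)
  7. Duck typing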
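
Several of the items above can be seen in a few lines. This is a minimal sketch for illustration; Point, Bag, and the manhattan method are hypothetical names introduced here, not taken from any library.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# Item 3: the structure of an object can change at runtime, not just its values.
p.z = 3      # add an attribute to one instance
del p.y      # remove an attribute from that instance
Point.manhattan = lambda self: abs(self.x) + abs(self.z)  # add a method to the class

# Item 4: reflection, i.e. inspecting and accessing attributes by name.
fields = sorted(vars(p))        # ['x', 'z']
x_value = getattr(p, "x")       # 1
has_y = hasattr(p, "y")         # False

# Item 5: classes are objects too and can be bound to other names.
alias = Point
same_type = type(p) is alias    # True, although p no longer has a y attribute

# Item 6: code can be assembled and evaluated at runtime.
total = eval("p.x + p.z", {"p": p})   # 4


# Item 7: duck typing. len() only needs __len__, not a particular base class.
class Bag:
    def __len__(self):
        return 5


print(fields, x_value, has_y, same_type, total, p.manhattan(), len(Bag()))
```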

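A dynamic language imposes fewer constraints and is easier for newcomers to pick up, but the price is high runtime overhead, and the code is so far removed from the machine-level execution that it is hard to know how it actually runs.

There are also a few points that I consider more serious flaws. They are laid out below.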

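It Breaks the Semantics of OOP

Most popular programming languages support the OOP paradigm, that is, inheritance and polymorphism. Likewise, Python can be purely imperative for simple tasks, or fully object-oriented for complex ones.

However, its dynamic features undermine the structure of OOP:

  1. Blurred types: attributes or methods can be added to or removed from any instance at runtime (a static language, by contrast, only lets you change their values at runtime). An instance modified this way arguably no longer belongs to its original type, since it now clearly differs from it. Yet its built-in __class__ attribute still points to the original type, which muddles the notion of what the type is. Conforming to a class should mean conforming in content, not just in name.
  2. Broken inheritance, in two respects:
    1. Most code in practice does not inherit from abstract interfaces. The abc module provides ABC, the base class for abstract interfaces; the classic approach is to derive your own abstract class from ABC, then derive concrete classes from that abstract class and implement the abstract methods. PEP 544, however, adds typing.Protocol as a structural alternative that many consider more Pythonic: the concrete class inherits from no abstract class at all, and as long as it implements the required methods, a static type checker treats it as conforming to the Protocol.
    2. No concrete parent class is needed either. As in the previous point, a class with no parent (other than object) can still define methods with the same names and so expose the same call interface as a would-be parent. Semantically, the class definition then reveals nothing about its relationship to other classes. The code base can be a loose collection in which no two classes are related by inheritance.
  3. Broken polymorphism: no parameter or return value is restricted to a type by default. Passing a subtype where a parent type is expected therefore carries little meaning, again because any type can be modified dynamically to satisfy the requirement.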
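
The contrast between nominal and structural typing, and the blurred-type problem from the first point, can be sketched as follows. All class and function names here (Shape, Square, HasArea, Circle, total_area) are hypothetical and chosen only for illustration; the built-in generic list[HasArea] assumes Python 3.9 or later.

```python
import math
from abc import ABC, abstractmethod
from typing import Protocol, runtime_checkable


class Shape(ABC):                 # nominal interface: subclasses must inherit
    @abstractmethod
    def area(self) -> float: ...


class Square(Shape):
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side ** 2


@runtime_checkable
class HasArea(Protocol):          # structural interface: only the method matters
    def area(self) -> float: ...


class Circle:                     # inherits from nothing except object
    def __init__(self, r: float) -> None:
        self.r = r

    def area(self) -> float:
        return math.pi * self.r ** 2


def total_area(shapes: list[HasArea]) -> float:
    return sum(s.area() for s in shapes)


circle_is_shape = isinstance(Circle(1.0), Shape)        # False: no inheritance
circle_has_area = isinstance(Circle(1.0), HasArea)      # True: method is present
combined = total_area([Square(2.0), Circle(1.0)])       # 4 + pi

# Blurred types: strip an attribute and the instance still claims its class.
sq = Square(2.0)
del sq.side
still_square = isinstance(sq, Square)                   # True
try:
    sq.area()
    area_failed = False
except AttributeError:
    area_failed = True                                  # the object no longer works

print(circle_is_shape, circle_has_area, still_square, area_failed)
```

Note that isinstance against a runtime_checkable Protocol only checks that the method exists, not its signature; full checking of Protocol conformance is the job of a static type checker.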

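It Undermines Design Patterns

Classic patterns such as Factory, Abstract Factory, and Visitor rely heavily on inheritance and polymorphism. In Python's design, however, the dynamic features make these patterns largely redundant. A commonly seen library that does use a design pattern is transformers: its from_pretrained family of methods is a factory that selects a concrete constructor from a string name and returns a concrete subclass, while the declared output type of the factory is a base class shared by all models.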
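
The idea of a string-keyed factory can be sketched in plain Python. This is not how transformers implements from_pretrained; BaseModel, from_name, TinyEncoder, TinyDecoder, and the key strings are all hypothetical names made up for this example.

```python
class BaseModel:
    """Hypothetical common base class with a name-to-class registry."""

    _registry: dict = {}

    def __init_subclass__(cls, key: str = "", **kwargs):
        # Each subclass registers itself under the key given in its class statement.
        super().__init_subclass__(**kwargs)
        if key:
            BaseModel._registry[key] = cls

    @classmethod
    def from_name(cls, name: str) -> "BaseModel":
        # Factory: the string selects the concrete constructor.
        try:
            return cls._registry[name]()
        except KeyError:
            raise ValueError(f"unknown model name: {name!r}") from None

    def describe(self) -> str:
        raise NotImplementedError


class TinyEncoder(BaseModel, key="tiny-encoder"):
    def describe(self) -> str:
        return "encoder-only"


class TinyDecoder(BaseModel, key="tiny-decoder"):
    def describe(self) -> str:
        return "decoder-only"


model = BaseModel.from_name("tiny-decoder")
print(type(model).__name__, model.describe())   # TinyDecoder decoder-only
```

The caller only sees the declared return type BaseModel, while the concrete class is decided by the string at runtime. Thanks to duck typing, the registry could just as well hold classes that do not inherit from BaseModel at all, which is the author's point about the pattern losing its structural role.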

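Security Issues

Python code generally does not manage pointers directly, so out-of-bounds pointers, wild pointers, and dangling pointers are mostly a non-issue. The garbage collector reclaims memory automatically, so these safety problems need no attention while coding. Python has security problems of its own, though. Attacking unmanaged (native) code used to be comparatively hard: injected code has to run reliably without corrupting the existing structure and crashing the program (segmentation fault). In Python, by contrast, arbitrary code can be injected to alter the original logic, and because that logic is not fixed content in a code segment, the attacker needs no such extra care. The contents of globals() and locals() can also be modified by hand at runtime, which carries some risk (although writes to locals() inside a function are not guaranteed to take effect). Another danger is code failing on mismatched types: types are only determined at runtime, so nothing can be guaranteed ahead of time, and a type error may be raised and crash the program.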
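
Both risks are easy to reproduce. In the sketch below, a dict named namespace stands in for a module's globals, and check_password, login, and total are hypothetical functions written for this example; it is a toy demonstration, not a description of a real attack.

```python
source = '''
def check_password(pw):
    return pw == "s3cret"

def login(pw):
    return "granted" if check_password(pw) else "denied"
'''

# exec runs the source with `namespace` as its global scope.
namespace = {}
exec(source, namespace)
login = namespace["login"]

before = login("guess")                          # "denied"

# Rebinding one global name changes the behavior of login, whose own code is untouched.
namespace["check_password"] = lambda pw: True
after = login("guess")                           # "granted"


def total(values):
    return sum(values)


# The type mismatch is only discovered when the call actually runs.
try:
    total([1, "2", 3])
    type_error = None
except TypeError as exc:
    type_error = str(exc)

print(before, after)
print("TypeError:", type_error)
```

login looks up check_password in its globals at call time, which is why replacing the entry is enough. Static type checkers and input validation reduce the second risk but cannot remove it, since annotations are not enforced at runtime.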

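Conclusion

I come from a C++ background, but have been programming in Python for the past few years. Python has also held the top spot in popularity for years, by a wide margin, which has a lot to do with its flexibility. For a language aimed at a broad audience, such flexibility is necessary. Despite all the ways in which Python is less rigorous, as listed above, a programmer can still choose to write rigorous object-oriented code. So the quality of a program depends not on the language but on the programmer, who is responsible for writing maintainable, clear, and well-structured code~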


KuRRe8 commented May 8, 2025

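Back to top

Whether you have insights, questions, or just want to chat, feel free to post here!

There are a lot of documents, so an ipynb sometimes fails to render; that is a browser performance issue, and refreshing the page fixes it.

Or git clone the Gist and read it locally.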
