@nobucshirai
nobucshirai / annotated_pages_extractor.py
Created February 12, 2025 03:38
PDF Annotation Extractor – This script processes PDF files, detecting and extracting only the pages that contain annotations. Useful for reviewing highlighted or commented content.
#!/usr/bin/env python3
"""
Extract annotated pages from PDF files.
This script reads one or more PDF files, checks each page for annotations,
and writes a new PDF containing only those pages that contain annotations.
If an output filename is not provided, the script uses the input file's basename
but adds "_extracted" before the ".pdf" extension.
Before overwriting an existing file, the user is prompted for confirmation.
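The output-naming and overwrite-confirmation behaviour described in this docstring can be sketched with the standard library alone. This is a minimal illustration, not the gist's actual code: the helper names `default_output_name` and `may_write` are hypothetical, and the injectable `ask` parameter is an assumption added so the prompt can be exercised without a terminal. Annotation detection itself is not shown, since the preview does not say which PDF library the script uses.

```python
from pathlib import Path
from typing import Callable


def default_output_name(input_path: str, suffix: str = "_extracted") -> str:
    """Build the default output filename from the input file's basename.

    Hypothetical helper: 'docs/report.pdf' becomes 'report_extracted.pdf'.
    The directory part is dropped; where the file is written is left to the caller.
    """
    src = Path(input_path)
    return f"{src.stem}{suffix}{src.suffix}"


def may_write(path: Path, ask: Callable[[str], str] = input) -> bool:
    """Return True if `path` does not exist yet, or if the user confirms overwriting it.

    `ask` defaults to the built-in input(); it is a parameter only so that
    the prompt can be replaced in tests (an assumption of this sketch).
    """
    if not path.exists():
        return True
    answer = ask(f"{path} already exists. Overwrite? [y/N] ")
    return answer.strip().lower() in {"y", "yes"}
```

A caller would compute the name with `default_output_name(input_file)` when no output filename is given, then write the result only if `may_write(Path(name))` returns True.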
nobucshirai / img2pdf.py
Last active February 18, 2025 02:22
Image to PDF Converter: Easily combine multiple images into a single PDF file. Just provide image paths and an optional output name. Perfect for quick document assembly tasks.
#!/usr/bin/env python3
"""
Merge image files into a single PDF, optionally annotating images with their filenames using ImageMagick.
The grid layout is controlled by the number of rows and columns.
Use the --with-text flag to enable filename annotation (default is without text).
You can adjust the annotation font size by using the --font-scale option.
"""
import argparse
import os
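The docstring says the grid layout is controlled by the number of rows and columns. One way such a layout could be computed is sketched below. It is an assumption, not the gist's implementation: the function name `layout_pages` is hypothetical, and the sketch assumes images fill each page row by row, with any overflow spilling onto further pages.

```python
from typing import List, Sequence


def layout_pages(images: Sequence[str], rows: int, cols: int) -> List[List[List[str]]]:
    """Split image paths into pages, each page being a list of rows of at most `cols` items.

    Hypothetical helper: a page holds up to rows * cols images, filled row by row.
    The last page (and its last row) may be only partially filled.
    """
    if rows < 1 or cols < 1:
        raise ValueError("rows and cols must be positive integers")
    per_page = rows * cols
    pages: List[List[List[str]]] = []
    for start in range(0, len(images), per_page):
        chunk = list(images[start:start + per_page])
        # Cut the page's images into rows of `cols` items each.
        pages.append([chunk[r:r + cols] for r in range(0, len(chunk), cols)])
    return pages
```

With five images on a 2 x 2 grid, this yields one full page and a second page holding the single remaining image.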