This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
from __future__ import annotations
"""Parallel OCR batch processor (Ollama + Qwen 2.5‑VL)
Refactor – 26 Jul 2025 (retry tweak + graceful Ctrl‑C + size ordering)
====================================================================
* Persistent `requests.Session` (1 connection reused)
* Single‑stage text cleanup via `_finalize_text()` – no second pass
* Unified `Text:` namespace prefix (NO trailing space)
* Prompt no longer instructs model to output a prefix
* **NEW:** configurable retry strategy – default is **one retry** (2 total attempts)
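The retry bullet above ("one retry", two total attempts) can be illustrated with a small standard-library sketch. The name `call_with_retry` and its `retries` and `delay` parameters are hypothetical and are not taken from the script itself; the preview does not show how the script implements its retry strategy, so treat this only as one plausible shape.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(fn: Callable[[], T], retries: int = 1, delay: float = 0.0) -> T:
    """Call fn(); on failure, retry up to `retries` more times (hypothetical helper)."""
    # One retry means two total attempts; never go below a single attempt.
    attempts = max(1, retries + 1)
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            time.sleep(delay)  # optional pause before the next attempt
    raise AssertionError("unreachable")
```

In a script like the one described, `fn` would presumably wrap a single request made through the shared `requests.Session`, so that every attempt reuses the same connection.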
import os
from pathlib import Path

def update_symlinks(base_path):
    """
    Create/update an 'All' folder containing symlinks to files found in all
    subdirectories of the base_path (except 'All' itself). The directory structure
    is replicated: for each top-level folder, its internal structure is reproduced
    inside the 'All' folder.
    """
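The body of `update_symlinks` is not shown in this preview. A minimal sketch of the behavior the docstring describes might look like the following. The function name `mirror_symlinks`, the `all_name` parameter, and the choice to merge each top-level folder's internal structure directly under 'All' (rather than under 'All/<folder>') are assumptions, not the author's implementation.

```python
import os
from pathlib import Path


def mirror_symlinks(base_path, all_name="All"):
    """Rebuild symlinks under base_path/All; return how many links were created."""
    base = Path(base_path).resolve()
    all_dir = base / all_name
    all_dir.mkdir(exist_ok=True)
    created = 0
    for top in sorted(base.iterdir()):
        # Skip loose files and the 'All' folder itself.
        if not top.is_dir() or top.name == all_name:
            continue
        for root, _dirs, files in os.walk(top):
            # Assumption: the path relative to the top-level folder is reused under 'All'.
            target_dir = all_dir / Path(root).relative_to(top)
            target_dir.mkdir(parents=True, exist_ok=True)
            for name in files:
                link = target_dir / name
                # Replace a stale link or file so repeated runs stay idempotent.
                if link.is_symlink() or link.is_file():
                    link.unlink()
                link.symlink_to(Path(root) / name)
                created += 1
    return created
```

With this layout, two top-level folders that contain the same relative path collide, and the later one wins. Creating symlinks may also require extra privileges on Windows.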
import os
import shutil
from datetime import datetime
from pathlib import Path

def get_file_year(file_path, min_year, max_year):
    """Determine a file's year based on creation or modification date."""
    try:
        ctime = os.path.getctime(file_path)
        mtime = os.path.getmtime(file_path)
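The preview cuts off inside the `try` block, so the rest of `get_file_year` is unknown. As a hedged illustration of how such a lookup could work, the sketch below takes the earlier of the two timestamps and accepts it only if the year falls inside the given range. The name `guess_file_year`, the use of `min(ctime, mtime)`, and the `None` return for out-of-range or unreadable files are all assumptions.

```python
import os
from datetime import datetime


def guess_file_year(file_path, min_year, max_year):
    """Return the file's year if it lies in [min_year, max_year], else None."""
    try:
        ctime = os.path.getctime(file_path)
        mtime = os.path.getmtime(file_path)
    except OSError:
        return None  # missing or unreadable file
    # Assumption: the earlier timestamp is the better guess at the file's origin.
    year = datetime.fromtimestamp(min(ctime, mtime)).year
    return year if min_year <= year <= max_year else None
```

Note that on Unix, `os.path.getctime` reports the last metadata change rather than the creation time, so copied or re-permissioned files often end up dated by their modification time.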