<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css" integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" crossorigin="anonymous">
<style type="text/css">
.gt .diff {
color: green;
# Needs OCR-D/core#327 OCR-D/ocrd_olena#10 OCR-D/ocrd_segment#11 bertsky/ocrd_cis
# Runs a preprocessing and resegmentation workflow for GT annotation,
# then extracts page images along with JSON descriptions of region polygons and classes;
# finally, creates a flattened directory under $TARGET.
# Run: preprocess-ocrd-gt.sh [TARGET-DIRECTORY [METS-FILE]]
# (default METS-FILE is every mets.xml found anywhere under CWD)
TARGET=${1:-../1000pages-crop-sauvola-denoise-deskew-repair}
WORKSPACES=${2:-$(find . -name mets.xml)}
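
The two defaults above rely on `${N:-default}` parameter expansion, so both arguments are optional. Since the rest of the script is not shown here, the following is only a minimal sketch of how those variables might be consumed. The loop body and the helper `flat_name` are hypothetical and not taken from the original script; the naming scheme for the flattened directory is an assumption made for illustration.

```shell
# Hypothetical helper (not part of the original script): turn the directory
# of a METS path into a single flat name, e.g. ./a/b/mets.xml becomes a_b.
flat_name() (
    dir=$(dirname "$1")                # ./a/b
    dir=${dir#./}                      # a/b
    printf '%s\n' "$dir" | tr '/' '_'  # a_b
)

# Same defaults as in the script: both positional arguments are optional.
TARGET=${1:-../1000pages-crop-sauvola-denoise-deskew-repair}
WORKSPACES=${2:-$(find . -name mets.xml)}

# Unquoted on purpose: the list is split on whitespace, so this sketch
# assumes that no METS path contains spaces.
for mets in $WORKSPACES; do
    echo "would process $mets into $TARGET/$(flat_name "$mets")"
done
```

Called without arguments, this only prints one line for each `mets.xml` found under the current directory; the actual processing steps would replace the `echo`.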