@innermond
Created October 16, 2014 14:39
A picture is worth a thousand words, really? Let's see how many, indeed!
#!/bin/bash
# directory containing the PNG images to OCR
path=$1
# file that collects the OCR text of every image
collector=$2
# loop over all PNG images and OCR each one
# (a quoted glob instead of parsing ls, so names with spaces survive)
for filename in "$path"/*.png; do
  ocrized="${filename%.*}.txt"
  # the -f test also skips the literal pattern when no PNG file matches
  if [ -f "$filename" ]; then
    echo "ocr-ize $filename"
    # tesseract appends .txt to the output base name itself
    tesseract "$filename" "${filename%.*}"
    echo "append ocr $ocrized to collector file $collector"
    cat "$ocrized" >> "$collector"
  fi
done
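The description asks how many words a picture is worth, but the script stops once the OCR text is collected. A minimal sketch of the missing count, assuming the collector file has already been written by the loop above. `count_words` is a hypothetical helper name, not part of the original gist.

```shell
#!/bin/bash
# count_words: hypothetical helper, not part of the original script.
# Prints the number of whitespace-separated words in the given file.
count_words() {
  local collector=$1
  local words
  if [ ! -f "$collector" ]; then
    echo "no such file: $collector" >&2
    return 1
  fi
  # reading from stdin makes wc print only the number, without the file name
  words=$(wc -w < "$collector")
  # arithmetic expansion strips the padding some wc implementations add
  echo "$((words))"
}

# Usage, with the collector file passed as the second argument of the OCR script:
# count_words "$collector"
```

Dividing that number by the count of PNG files processed would give an average of words per picture, for whatever the OCR managed to recognize.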