@jeffrafter
Created July 23, 2013 01:25
Fonts... leptonica, jbig2enc, tesseract and you
1. brew install ghostscript
2. Download leptonica: http://code.google.com/p/leptonica/downloads/detail?name=leptonica-1.69.tar.bz2
tar xvjf leptonica-1.69.tar.bz2
cd leptonica-1.69
./configure
make
make install
Get a pdf:
curl -O http://www.rand.org/content/dam/rand/pubs/papers/2008/P4874.pdf
Run the pdf2tiff program (in prog):
mkdir 4874
leptonica/leptonica-1.69/prog/pdf2tiff P4874.pdf 4874
Remove pages you don't need:
rm 4874/4874002.tif
rm 4874/4874005.tif
rm 4874/4874020.tif
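If there are many pages to drop, the rm lines above can be wrapped in a small loop. This is only a sketch: remove_pages is a made-up helper name, and it assumes the pages are named <prefix><3-digit page>.tif, as in 4874/4874002.tif.

```shell
# Hypothetical helper, not part of leptonica or jbig2enc.
# Usage: remove_pages <dir> <prefix> <page> [<page> ...]
# Pass page numbers without leading zeros (2, not 002), because printf
# would read a leading zero as octal.
remove_pages() {
  dir=$1
  prefix=$2
  shift 2
  for page in "$@"; do
    # Pad the page number to three digits to match the assumed file names.
    rm -f "$dir/$prefix$(printf '%03d' "$page").tif"
  done
}
```

For the three pages above that would be: remove_pages 4874 4874 2 5 20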
You can deskew:
leptonica/leptonica-1.69/prog/skewtest 4874/4874001.tif 4874/4874001-s.tif
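To deskew every page rather than one, something like the loop below may work. It is a sketch that assumes skewtest takes an input file and an output file, as in the line above. SKEWTEST and deskew_all are names invented here, and the -s suffix just follows the example.

```shell
# Assumed location of the skewtest binary built in step 2; override if yours differs.
SKEWTEST=${SKEWTEST:-leptonica/leptonica-1.69/prog/skewtest}

# Hypothetical helper: write <page>-s.tif next to every <page>.tif in a directory.
deskew_all() {
  dir=$1
  for tif in "$dir"/*.tif; do
    [ -e "$tif" ] || continue                # nothing matched the glob
    case $tif in *-s.tif) continue ;; esac   # skip output from an earlier run
    "$SKEWTEST" "$tif" "${tif%.tif}-s.tif" || return 1
  done
}
```

Run it as: deskew_all 4874
The originals stay in place, so a glob like 4874/4874*.tif will then match both the original and the deskewed files. Delete or move whichever set you don't want before encoding.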
3. git clone https://github.com/agl/jbig2enc
cd jbig2enc
./autogen.sh
./configure
# Turn on DUMP_ALL_SYMBOLS and DUMP_SYMBOL_GRAPH in jbig2enc.c
make
make install
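Before moving on, it can help to confirm that steps 1 and 3 actually put their binaries on your PATH. A minimal sketch follows. missing_tools is a made-up helper, gs is the Ghostscript binary, and jbig2 is the jbig2enc binary.

```shell
# Hypothetical helper: print each named tool that is not on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return $rc
}
```

Run it as: missing_tools gs jbig2
No output means both were found.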
4. Run: jbig2 -s -t 0.5 -a 4874/4874*.tif