@HiCraigChen
Last active September 20, 2019 04:43
pdftotext Python code
import subprocess

from chalice import Chalice

app = Chalice(app_name='pdftotext')

PDF_URL = "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"
PDFTOTEXT = "lib/poppler-utils-0.26/usr/bin/pdftotext"


@app.route('/')
def index():
    # Get a sample pdf file; check=True raises if curl exits with an error
    subprocess.run(["curl", PDF_URL, "-o", "/tmp/file.pdf"], check=True)
    # Execute the bundled pdftotext binary
    subprocess.run([PDFTOTEXT, "/tmp/file.pdf", "/tmp/out.txt"], check=True)
    # Read the .txt file and return its lines
    with open("/tmp/out.txt", "r") as f:
        data = f.readlines()
    return {'text': data}
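
The conversion step could also be factored into a small helper so it can be exercised outside the Chalice route. The sketch below is one possible variation, not part of the original gist: `pdf_to_lines` and its `pdftotext_bin` parameter are hypothetical names, and it assumes the binary accepts `pdftotext <input.pdf> <output.txt>` (as in the route above) and writes UTF-8 text.

```python
import subprocess
import tempfile
from pathlib import Path


def pdf_to_lines(pdf_path, pdftotext_bin="pdftotext"):
    """Convert a PDF to text and return its lines (hypothetical helper).

    pdftotext_bin is the path to the pdftotext executable; in the gist
    this would be "lib/poppler-utils-0.26/usr/bin/pdftotext".
    """
    # Use a private temporary directory so concurrent calls do not
    # overwrite each other's output file.
    with tempfile.TemporaryDirectory() as tmp_dir:
        out_path = Path(tmp_dir) / "out.txt"
        # check=True raises CalledProcessError if the conversion fails.
        subprocess.run(
            [pdftotext_bin, str(pdf_path), str(out_path)],
            check=True,
        )
        # Assumption: the output is UTF-8; undecodable bytes are replaced.
        text = out_path.read_text(encoding="utf-8", errors="replace")
    # splitlines() drops the trailing newline characters that
    # readlines() would keep.
    return text.splitlines()
```

Inside the route, this would be called as `pdf_to_lines("/tmp/file.pdf", "lib/poppler-utils-0.26/usr/bin/pdftotext")`. Unlike the original, the returned lines carry no trailing `\n`.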