@tsh-code
Created February 9, 2024 13:09
config = {
    "directory": "OCR/Invoices1",
    "ocr_library": "pytesseract",
    "output_file": "results-bc-without.txt",
    "statistics_file": "statistics-bc-without.txt",
    "apply_cropping": False,
    "resize_to_fhd": True,
    "deskew_image": True,
    "threshold_method": "niblack",
    "skip_ocr_processing": False,
    "split_with_commas": False,
    "apply_nlp": False,
    "data_extraction_strategy": "regexp",
    "output_directory": "cropping",
    "zipped_folder": "packed-cropping",
    "save_intermediate_steps": False,
}
run_ocr(config)
run_data_extraction(config)
pack_folder_in_parts(config)
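
The config above is a plain dict, so a misspelled key or a string where a flag is expected would only surface once the pipeline is running. A minimal sketch of a pre-flight check follows. `validate_config` and `EXPECTED_TYPES` are hypothetical names, not part of the gist, and the expected types are inferred only from the values shown above, not from the implementation of `run_ocr`.

```python
# Hypothetical helper, not part of the original gist.
# Expected types are inferred from the values in the config shown above.
EXPECTED_TYPES = {
    "directory": str,
    "ocr_library": str,
    "output_file": str,
    "statistics_file": str,
    "apply_cropping": bool,
    "resize_to_fhd": bool,
    "deskew_image": bool,
    "threshold_method": str,
    "skip_ocr_processing": bool,
    "split_with_commas": bool,
    "apply_nlp": bool,
    "data_extraction_strategy": str,
    "output_directory": str,
    "zipped_folder": str,
    "save_intermediate_steps": bool,
}


def validate_config(config):
    """Return a list of problems found in config (empty if none)."""
    problems = []
    # Report keys that are missing or hold a value of the wrong type.
    for key, expected in EXPECTED_TYPES.items():
        if key not in config:
            problems.append(f"missing key: {key}")
        elif not isinstance(config[key], expected):
            actual = type(config[key]).__name__
            problems.append(f"{key}: expected {expected.__name__}, got {actual}")
    # Report keys this sketch does not know about (likely typos).
    for key in config:
        if key not in EXPECTED_TYPES:
            problems.append(f"unknown key: {key}")
    return problems
```

One way to use it is to call `validate_config(config)` before `run_ocr(config)` and stop if the returned list is not empty.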