@aino-prashant
Last active May 27, 2020 04:43
OCR Tesseract
Ristoratori, commercianti, parrucchieri: il punto su Verona
Non era semplice scon- gate le liberta personali ed mai fermate, quelle di prima
tentare tutti, ma il premier esimi giuristi, non certo que- necessita, e sarebbe stato
Conte, con I'ultimo decreto, sto giornale, stanno par- __folle il contrario. Alcuni im-
é riuscito nellimpresa. A lando apertamente di prov- _ prenditori in base ai codici
prescindere dalle simpatieo vedimenti incostituzionali. Ateco e al silenzio-assenso
antipatie politiche € un pro- Le categorie produttive so- delle prefetture si sono fatti
fluvio di proteste. Ai cittadini no in ginocchio. Alcune, ad il segno della croce (...)
sono state nuovamente ne- onor del vero, non si sono SEGUE A PAG.2
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title></title>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<meta name='ocr-system' content='tesseract' />
</head>
<body>
<div class='ocr_page' id='page_1'
title='image "src/test/resources/images/2265928100.jpg"; bbox 0 0 1146 372; ppageno 0'>
<div class='ocr_carea' id='block_1_1' title="bbox 19 16 1119 341">
<p class='ocr_par' id='par_1_1' lang='eng'
title="bbox 19 16 1119 341">
<span class='ocr_line' id='line_1_1'
title="bbox 20 16 1119 67; baseline 0 -12; x_size 42.278099; x_descenders 7; x_ascenders 9.2780981"><span
class='ocrx_word' id='word_1_1'
title='bbox 20 18 227 62; x_wconf 92'><strong><em>Ristoratori,</em></strong></span>
<span class='ocrx_word' id='word_1_2'
title='bbox 239 24 511 62; x_wconf 91'><strong><em>commercianti,</em></strong></span>
<span class='ocrx_word' id='word_1_3'
title='bbox 522 16 764 67; x_wconf 90'><strong><em>parrucchieri:</em></strong></span>
<span class='ocrx_word' id='word_1_4'
title='bbox 776 16 800 55; x_wconf 95'><strong><em>il</em></strong></span>
<span class='ocrx_word' id='word_1_5'
title='bbox 811 24 921 67; x_wconf 96'><strong><em>punto</em></strong></span>
<span class='ocrx_word' id='word_1_6'
title='bbox 934 29 973 55; x_wconf 96'><strong><em>su</em></strong></span>
<span class='ocrx_word' id='word_1_7'
title='bbox 982 18 1119 55; x_wconf 96'><strong><em>Verona</em></strong></span>
</span> <span class='ocr_line' id='line_1_2'
title="bbox 21 84 1118 110; baseline 0 -6; x_size 26; x_descenders 6; x_ascenders 5"><span
class='ocrx_word' id='word_1_8'
title='bbox 21 84 69 104; x_wconf 96'><strong><em>Non</em></strong></span>
<span class='ocrx_word' id='word_1_9'
title='bbox 93 89 132 104; x_wconf 93'><strong><em>era</em></strong></span>
<span class='ocrx_word' id='word_1_10'
title='bbox 156 84 267 110; x_wconf 90'><strong><em>semplice</em></strong></span>
<span class='ocrx_word' id='word_1_11'
title='bbox 291 89 359 104; x_wconf 90'><strong><em>scon-</em></strong></span>
<span class='ocrx_word' id='word_1_12'
title='bbox 400 85 451 110; x_wconf 78'><strong><em>gate</em></strong></span>
<span class='ocrx_word' id='word_1_13'
title='bbox 466 84 485 104; x_wconf 92'><strong><em>le</em></strong></span>
<span class='ocrx_word' id='word_1_14'
title='bbox 500 84 571 104; x_wconf 91'><strong><em>liberta</em></strong></span>
<span class='ocrx_word' id='word_1_15'
title='bbox 587 84 695 110; x_wconf 90'><strong><em>personali</em></strong></span>
<span class='ocrx_word' id='word_1_16'
title='bbox 710 84 737 104; x_wconf 94'><strong><em>ed</em></strong></span>
<span class='ocrx_word' id='word_1_17'
title='bbox 780 84 821 104; x_wconf 52'><strong><em>mai</em></strong></span>
<span class='ocrx_word' id='word_1_18'
title='bbox 831 84 930 108; x_wconf 92'><strong><em>fermate,</em></strong></span>
<span class='ocrx_word' id='word_1_19'
title='bbox 941 84 1012 110; x_wconf 96'><strong><em>quelle</em></strong></span>
<span class='ocrx_word' id='word_1_20'
title='bbox 1022 84 1041 104; x_wconf 96'><strong><em>di</em></strong></span>
<span class='ocrx_word' id='word_1_21'
title='bbox 1052 84 1118 109; x_wconf 96'><strong><em>prima</em></strong></span>
</span> <span class='ocr_line' id='line_1_3'
title="bbox 20 117 1118 143; baseline 0.001 -6; x_size 27; x_descenders 6; x_ascenders 5"><span
class='ocrx_word' id='word_1_22'
title='bbox 20 118 104 138; x_wconf 92'><strong><em>tentare</em></strong></span>
<span class='ocrx_word' id='word_1_23'
title='bbox 120 117 169 141; x_wconf 95'><strong><em>tutti,</em></strong></span>
<span class='ocrx_word' id='word_1_24'
title='bbox 188 122 223 138; x_wconf 95'><strong><em>ma</em></strong></span>
<span class='ocrx_word' id='word_1_25'
title='bbox 241 117 250 137; x_wconf 96'><strong><em>il</em></strong></span>
<span class='ocrx_word' id='word_1_26'
title='bbox 268 117 360 143; x_wconf 92'><strong><em>premier</em></strong></span>
<span class='ocrx_word' id='word_1_27'
title='bbox 400 117 455 138; x_wconf 45'><strong><em>esimi</em></strong></span>
<span class='ocrx_word' id='word_1_28'
title='bbox 459 117 554 143; x_wconf 45'><strong><em>giuristi,</em></strong></span>
<span class='ocrx_word' id='word_1_29'
title='bbox 565 122 608 138; x_wconf 96'><strong><em>non</em></strong></span>
<span class='ocrx_word' id='word_1_30'
title='bbox 617 118 676 138; x_wconf 79'><strong><em>certo</em></strong></span>
<span class='ocrx_word' id='word_1_31'
title='bbox 685 122 739 143; x_wconf 39'><strong><em>que-</em></strong></span>
<span class='ocrx_word' id='word_1_32'
title='bbox 780 117 900 141; x_wconf 39'><strong><em>necessita,</em></strong></span>
<span class='ocrx_word' id='word_1_33'
title='bbox 917 122 931 138; x_wconf 92'><strong><em>e</em></strong></span>
<span class='ocrx_word' id='word_1_34'
title='bbox 947 117 1044 138; x_wconf 92'><strong><em>sarebbe</em></strong></span>
<span class='ocrx_word' id='word_1_35'
title='bbox 1061 118 1118 138; x_wconf 96'><strong><em>stato</em></strong></span>
</span> <span class='ocr_line' id='line_1_4'
title="bbox 20 150 1118 176; baseline 0 -5; x_size 25; x_descenders 5; x_ascenders 5"><span
class='ocrx_word' id='word_1_36'
title='bbox 20 150 103 174; x_wconf 95'><strong><em>Conte,</em></strong></span>
<span class='ocrx_word' id='word_1_37'
title='bbox 115 156 156 171; x_wconf 92'><strong><em>con</em></strong></span>
<span class='ocrx_word' id='word_1_38'
title='bbox 169 151 250 171; x_wconf 39'><strong><em>I&#39;ultimo</em></strong></span>
<span class='ocrx_word' id='word_1_39'
title='bbox 261 151 357 174; x_wconf 91'><strong><em>decreto,</em></strong></span>
<span class='ocrx_word' id='word_1_40'
title='bbox 400 151 436 171; x_wconf 93'><strong><em>sto</em></strong></span>
<span class='ocrx_word' id='word_1_41'
title='bbox 457 151 563 176; x_wconf 91'><strong><em>giornale,</em></strong></span>
<span class='ocrx_word' id='word_1_42'
title='bbox 585 152 668 171; x_wconf 92'><strong><em>stanno</em></strong></span>
<span class='ocrx_word' id='word_1_43'
title='bbox 691 156 739 176; x_wconf 62'><strong><em>par-</em></strong></span>
<span class='ocrx_word' id='word_1_44'
title='bbox 779 150 828 171; x_wconf 18'><strong><em>__folle</em></strong></span>
<span class='ocrx_word' id='word_1_45'
title='bbox 841 151 850 171; x_wconf 93'><strong><em>il</em></strong></span>
<span class='ocrx_word' id='word_1_46'
title='bbox 863 151 974 171; x_wconf 90'><strong><em>contrario.</em></strong></span>
<span class='ocrx_word' id='word_1_47'
title='bbox 996 151 1069 171; x_wconf 90'><strong><em>Alcuni</em></strong></span>
<span class='ocrx_word' id='word_1_48'
title='bbox 1083 150 1118 171; x_wconf 92'><strong><em>im-</em></strong></span>
</span> <span class='ocr_line' id='line_1_5'
title="bbox 20 184 1117 210; baseline 0 -6; x_size 25; x_descenders 5; x_ascenders 5"><span
class='ocrx_word' id='word_1_49'
title='bbox 20 184 34 204; x_wconf 84'><strong><em>é</em></strong></span>
<span class='ocrx_word' id='word_1_50'
title='bbox 54 184 138 204; x_wconf 86'><strong><em>riuscito</em></strong></span>
<span class='ocrx_word' id='word_1_51'
title='bbox 158 184 307 209; x_wconf 82'><strong><em>nellimpresa.</em></strong></span>
<span class='ocrx_word' id='word_1_52'
title='bbox 343 184 361 204; x_wconf 77'><strong><em>A</em></strong></span>
<span class='ocrx_word' id='word_1_53'
title='bbox 401 184 465 204; x_wconf 61'><strong><em>lando</em></strong></span>
<span class='ocrx_word' id='word_1_54'
title='bbox 479 185 631 209; x_wconf 91'><strong><em>apertamente</em></strong></span>
<span class='ocrx_word' id='word_1_55'
title='bbox 645 184 663 204; x_wconf 91'><strong><em>di</em></strong></span>
<span class='ocrx_word' id='word_1_56'
title='bbox 678 189 738 209; x_wconf 73'><strong><em>prov-</em></strong></span>
<span class='ocrx_word' id='word_1_57'
title='bbox 0 0 1146 372; x_wconf 3'><strong><em>_</em></strong></span>
<span class='ocrx_word' id='word_1_58'
title='bbox 780 184 891 210; x_wconf 30'><strong><em>prenditori</em></strong></span>
<span class='ocrx_word' id='word_1_59'
title='bbox 908 184 926 204; x_wconf 95'><strong><em>in</em></strong></span>
<span class='ocrx_word' id='word_1_60'
title='bbox 943 184 1000 204; x_wconf 95'><strong><em>base</em></strong></span>
<span class='ocrx_word' id='word_1_61'
title='bbox 1015 184 1034 204; x_wconf 91'><strong><em>ai</em></strong></span>
<span class='ocrx_word' id='word_1_62'
title='bbox 1050 184 1117 204; x_wconf 91'><strong><em>codici</em></strong></span>
</span> <span class='ocr_line' id='line_1_6'
title="bbox 21 217 1118 243; baseline 0 -5; x_size 26; x_descenders 5; x_ascenders 5"><span
class='ocrx_word' id='word_1_63'
title='bbox 21 217 162 243; x_wconf 92'><strong><em>prescindere</em></strong></span>
<span class='ocrx_word' id='word_1_64'
title='bbox 170 217 227 238; x_wconf 93'><strong><em>dalle</em></strong></span>
<span class='ocrx_word' id='word_1_65'
title='bbox 236 217 359 243; x_wconf 23'><strong><em>simpatieo</em></strong></span>
<span class='ocrx_word' id='word_1_66'
title='bbox 400 217 520 238; x_wconf 24'><strong><em>vedimenti</em></strong></span>
<span class='ocrx_word' id='word_1_67'
title='bbox 545 217 737 238; x_wconf 88'><strong><em>incostituzionali.</em></strong></span>
<span class='ocrx_word' id='word_1_68'
title='bbox 779 217 848 238; x_wconf 34'><strong><em>Ateco</em></strong></span>
<span class='ocrx_word' id='word_1_69'
title='bbox 860 222 874 238; x_wconf 93'><strong><em>e</em></strong></span>
<span class='ocrx_word' id='word_1_70'
title='bbox 886 217 904 238; x_wconf 93'><strong><em>al</em></strong></span>
<span class='ocrx_word' id='word_1_71'
title='bbox 917 217 1118 238; x_wconf 91'><strong><em>silenzio-assenso</em></strong></span>
</span> <span class='ocr_line' id='line_1_7'
title="bbox 20 250 1117 276; baseline 0 -5; x_size 26; x_descenders 5; x_ascenders 6"><span
class='ocrx_word' id='word_1_72'
title='bbox 20 250 122 276; x_wconf 92'><strong><em>antipatie</em></strong></span>
<span class='ocrx_word' id='word_1_73'
title='bbox 135 250 233 276; x_wconf 91'><strong><em>politiche</em></strong></span>
<span class='ocrx_word' id='word_1_74'
title='bbox 245 250 258 271; x_wconf 61'><strong><em>€</em></strong></span>
<span class='ocrx_word' id='word_1_75'
title='bbox 271 256 299 271; x_wconf 91'><strong><em>un</em></strong></span>
<span class='ocrx_word' id='word_1_76'
title='bbox 312 256 359 276; x_wconf 91'><strong><em>pro-</em></strong></span>
<span class='ocrx_word' id='word_1_77'
title='bbox 401 250 429 271; x_wconf 92'><strong><em>Le</em></strong></span>
<span class='ocrx_word' id='word_1_78'
title='bbox 443 251 554 276; x_wconf 93'><strong><em>categorie</em></strong></span>
<span class='ocrx_word' id='word_1_79'
title='bbox 569 250 687 276; x_wconf 92'><strong><em>produttive</em></strong></span>
<span class='ocrx_word' id='word_1_80'
title='bbox 702 256 739 271; x_wconf 93'><strong><em>so-</em></strong></span>
<span class='ocrx_word' id='word_1_81'
title='bbox 780 251 836 271; x_wconf 84'><strong><em>delle</em></strong></span>
<span class='ocrx_word' id='word_1_82'
title='bbox 849 250 963 276; x_wconf 93'><strong><em>prefetture</em></strong></span>
<span class='ocrx_word' id='word_1_83'
title='bbox 976 251 993 271; x_wconf 96'><strong><em>si</em></strong></span>
<span class='ocrx_word' id='word_1_84'
title='bbox 1006 256 1048 271; x_wconf 93'><strong><em>sono</em></strong></span>
<span class='ocrx_word' id='word_1_85'
title='bbox 1050 250 1117 271; x_wconf 92'><strong><em>fatti</em></strong></span>
</span> <span class='ocr_line' id='line_1_8'
title="bbox 19 283 1118 310; baseline 0 -6; x_size 26; x_descenders 6; x_ascenders 5"><span
class='ocrx_word' id='word_1_86'
title='bbox 19 283 82 304; x_wconf 93'><strong><em>fluvio</em></strong></span>
<span class='ocrx_word' id='word_1_87'
title='bbox 92 284 111 304; x_wconf 92'><strong><em>di</em></strong></span>
<span class='ocrx_word' id='word_1_88'
title='bbox 122 285 225 310; x_wconf 92'><strong><em>proteste.</em></strong></span>
<span class='ocrx_word' id='word_1_89'
title='bbox 234 284 257 304; x_wconf 91'><strong><em>Ai</em></strong></span>
<span class='ocrx_word' id='word_1_90'
title='bbox 268 284 358 304; x_wconf 87'><strong><em>cittadini</em></strong></span>
<span class='ocrx_word' id='word_1_91'
title='bbox 401 289 429 304; x_wconf 86'><strong><em>no</em></strong></span>
<span class='ocrx_word' id='word_1_92'
title='bbox 442 284 460 304; x_wconf 92'><strong><em>in</em></strong></span>
<span class='ocrx_word' id='word_1_93'
title='bbox 474 284 594 310; x_wconf 92'><strong><em>ginocchio.</em></strong></span>
<span class='ocrx_word' id='word_1_94'
title='bbox 606 284 695 308; x_wconf 91'><strong><em>Alcune,</em></strong></span>
<span class='ocrx_word' id='word_1_95'
title='bbox 710 284 737 304; x_wconf 0'><strong><em>ad</em></strong></span>
<span class='ocrx_word' id='word_1_96'
title='bbox 780 284 790 304; x_wconf 0'><strong><em>il</em></strong></span>
<span class='ocrx_word' id='word_1_97'
title='bbox 812 289 887 310; x_wconf 89'><strong><em>segno</em></strong></span>
<span class='ocrx_word' id='word_1_98'
title='bbox 908 284 966 304; x_wconf 90'><strong><em>della</em></strong></span>
<span class='ocrx_word' id='word_1_99'
title='bbox 988 289 1055 304; x_wconf 90'><strong><em>croce</em></strong></span>
<span class='ocrx_word' id='word_1_100'
title='bbox 1077 283 1118 310; x_wconf 89'><strong><em>(...)</em></strong></span>
</span> <span class='ocr_line' id='line_1_9'
title="bbox 20 317 993 341; baseline -0.001 -3; x_size 26.90715; x_descenders 5.9071507; x_ascenders 5"><span
class='ocrx_word' id='word_1_101'
title='bbox 20 322 78 338; x_wconf 96'><strong><em>sono</em></strong></span>
<span class='ocrx_word' id='word_1_102'
title='bbox 90 318 147 338; x_wconf 93'><strong><em>state</em></strong></span>
<span class='ocrx_word' id='word_1_103'
title='bbox 160 318 309 338; x_wconf 92'><strong><em>nuovamente</em></strong></span>
<span class='ocrx_word' id='word_1_104'
title='bbox 321 322 359 338; x_wconf 7'><strong><em>ne-</em></strong></span>
<span class='ocrx_word' id='word_1_105'
title='bbox 400 322 454 338; x_wconf 7'><strong><em>onor</em></strong></span>
<span class='ocrx_word' id='word_1_106'
title='bbox 467 317 501 338; x_wconf 96'><strong><em>del</em></strong></span>
<span class='ocrx_word' id='word_1_107'
title='bbox 516 322 574 341; x_wconf 96'><strong><em>vero,</em></strong></span>
<span class='ocrx_word' id='word_1_108'
title='bbox 591 322 633 338; x_wconf 96'><strong><em>non</em></strong></span>
<span class='ocrx_word' id='word_1_109'
title='bbox 648 317 665 338; x_wconf 95'><strong><em>si</em></strong></span>
<span class='ocrx_word' id='word_1_110'
title='bbox 681 322 738 338; x_wconf 81'><strong><em>sono</em></strong></span>
<span class='ocrx_word' id='word_1_111'
title='bbox 780 317 877 338; x_wconf 53'><strong><em>SEGUE</em></strong></span>
<span class='ocrx_word' id='word_1_112'
title='bbox 884 317 902 337; x_wconf 53'><strong><em>A</em></strong></span>
<span class='ocrx_word' id='word_1_113'
title='bbox 912 317 993 338; x_wconf 86'><strong><em>PAG.2</em></strong></span>
</span>
</p>
</div>
</div>
</body>
</html>
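The hOCR above stores a bounding box and a word confidence (`x_wconf`) in the `title` attribute of every `ocrx_word` span. The following is a minimal sketch, using only the Python standard library, of how one might list the words Tesseract was least sure about. The names `HocrWordParser` and `low_confidence_words` and the threshold of 60 are illustrative assumptions, not part of Tesseract or of this gist.

```python
import re
from html.parser import HTMLParser


class HocrWordParser(HTMLParser):
    """Collect (text, bbox, confidence) for every ocrx_word span (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.words = []        # finished (text, bbox, conf) tuples
        self._current = None   # (bbox, conf) of the word span being read
        self._text = []        # text fragments seen inside that span

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and attrs.get("class") == "ocrx_word":
            title = attrs.get("title") or ""
            bbox = re.search(r"bbox (\d+) (\d+) (\d+) (\d+)", title)
            conf = re.search(r"x_wconf (\d+)", title)
            self._current = (
                tuple(int(n) for n in bbox.groups()) if bbox else None,
                int(conf.group(1)) if conf else None,
            )
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Word spans contain only <strong>/<em>, so the next </span> closes the word.
        if tag == "span" and self._current is not None:
            bbox, conf = self._current
            self.words.append(("".join(self._text).strip(), bbox, conf))
            self._current = None


def low_confidence_words(hocr, threshold=60):
    """Return words whose x_wconf is below threshold (60 is an arbitrary choice)."""
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return [w for w in parser.words if w[2] is not None and w[2] < threshold]


# Two words copied from line_1_4 of the hOCR above.
sample = """
<span class='ocr_line' id='line_1_4' title="bbox 20 150 1118 176">
<span class='ocrx_word' id='word_1_36'
title='bbox 20 150 103 174; x_wconf 95'><strong><em>Conte,</em></strong></span>
<span class='ocrx_word' id='word_1_44'
title='bbox 779 150 828 171; x_wconf 18'><strong><em>__folle</em></strong></span>
</span>
"""

print(low_confidence_words(sample))
```

Run against the full hOCR, a filter like this would surface entries such as `__folle` (x_wconf 18) and the stray `_` (x_wconf 3), which sit at the column boundaries of the scanned page.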
@aino-prashant

Image
2265928100
