| """ | |
| groq_ocr.py | |
| Processes newspaper images using Groq's vision API and extracts individual | |
| articles to a CSV. Each row represents one article with associated page metadata. | |
| Usage: | |
| python newspaper_ocr.py --input_dir processed_output/images --output ocr_results.csv | |
| Requirements: |
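The docstring above describes an output with one CSV row per article, each carrying its page metadata. A minimal sketch of that output step, using only the standard library, might look like the following. The column names (`page_file`, `article_index`, `headline`, `text`), the helper names, and the sample records are all hypothetical; the Groq vision call itself is not shown, and the script's actual schema may differ.

```python
import csv
import io

# Hypothetical column layout: page metadata first, then article fields.
FIELDS = ["page_file", "article_index", "headline", "text"]


def articles_to_rows(page_file, articles):
    """Turn the articles found on one page into one flat row per article."""
    rows = []
    for index, article in enumerate(articles):
        rows.append(
            {
                "page_file": page_file,
                "article_index": index,
                "headline": article.get("headline", ""),
                "text": article.get("text", ""),
            }
        )
    return rows


def write_articles_csv(rows, handle):
    """Write a header line followed by one CSV line per article row."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Placeholder records standing in for whatever the vision model returns.
sample_articles = [
    {"headline": "Sample headline one", "text": "Sample body one."},
    {"headline": "Sample headline two", "text": "Sample body two."},
]
rows = articles_to_rows("page_001.jpg", sample_articles)

buffer = io.StringIO()
write_articles_csv(rows, buffer)
print(buffer.getvalue())
```

Keeping the page-level fields on every row makes each article self-describing, at the cost of repeating the page metadata.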
| # AUFBAU | |
| ## RECONSTRUCTION | |
| **AN AMERICAN WEEKLY PUBLISHED IN NEW YORK** | |
| **by The New World Club, Inc., 209 West 48th Street, New York 19, N. Y. Phone: CIrcle 7-4462** | |
| *Entered as second-class matter January 20, 1934, at the Post Office New York, N. Y. under Act of March 3, 1879* | |
| **Vol. XV—No. 26 | NEW YORK, N. Y., FRIDAY, JULY 1, 1949 | Price 10¢** | |
| *** | |
| > **Zunächst in "Aufbau":** |
| **Role:** You are a precise archaeological document analyst specializing in the digitization of field notebooks and excavation catalogues. | |
| **Task:** | |
| 1. Perform a spatial analysis of the document to distinguish between text blocks, artifact photographs/sketches, and marginalia. | |
| 2. Extract metadata and create a brief 2-3 sentence overview of the document's contents. | |
| 3. Transcribe the document EXACTLY as written into a valid YAML structure. | |
| 4. Extract archaeological entities into specific categories based only on explicit mentions. | |
| **Critical Rules:** | |
| - **Zero Hallucination:** Only include information directly visible in the image. If a word is illegible, mark it as `[illegible]`. |
| index | fruit | quantity | color | store |
|---|---|---|---|---|
| 0 | apple | 12 | red | loblaws |
| 1 | banana | 18 | yellow | farm boy |
| 2 | grape | 30 | purple | freshco |
| 3 | cherry | 4 | red | iga |
| 4 | watermelon | 2 | green | farm boy |
| 5 | raspberries | 23 | red | iga |
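The table above looks like a rendered CSV file. As a rough sketch, assuming the underlying file uses the same column names, the rows could be read with Python's standard `csv` module and the quantities totalled per store. The `CSV_TEXT` literal and the function name `quantity_by_store` are illustrative, not part of the original file.

```python
import csv
import io
from collections import defaultdict

# The table's rows, re-typed as CSV text for illustration.
CSV_TEXT = """index,fruit,quantity,color,store
0,apple,12,red,loblaws
1,banana,18,yellow,farm boy
2,grape,30,purple,freshco
3,cherry,4,red,iga
4,watermelon,2,green,farm boy
5,raspberries,23,red,iga
"""


def quantity_by_store(text):
    """Sum the quantity column for each distinct store."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(text)):
        totals[row["store"]] += int(row["quantity"])
    return dict(totals)


totals = quantity_by_store(CSV_TEXT)
print(totals)
```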
| import mesa | |
| class LetterAgent(mesa.Agent): | |
|     def __init__(self, model): | |
|         super().__init__(model) | |
|         self.letters_sent = 0 | |
|         self.letters_received = 0 | |
|     def step(self): | |
|         print(f"Hi, I am agent {self.unique_id}.") |
| %%capture | |
| !python3 -m pip install git+https://github.com/p-lod/plodlib | |
| !pip install requests_cache | |
| !pip install rdflib | |
| import plodlib | |
| import json | |
| import pandas as pd | |
| from string import Template | |
| import rdflib as rdf |
| Input Text: | |
| This archive presents appendices B-I and supplementary material resulting from the programme of archaeological works undertaken during the construction scheme to widen the A1 trunk road between Dishforth and Leeming Bar in North Yorkshire. The Iron Age to early medieval evidence from Healam Bridge, along with other evidence for Roman activity along the route is published in two volumes | |
| Extracted Entities: | |
| Input Text: | |
| This collection comprises images, spreadsheets, reports, vector graphics, and scanned site records and drawings from archaeological recording by Archaeological Research Services at Lower Radbourne Deserted Medieval Village, Warwickshire. The work was undertaken between April and December 2021. Area C32070 was dominated by intercutting features predominantly dated to two broad phases, prehistoric and medieval. The prehistoric features were represented by a large ring ditch, potentially dating to the Early Bronze Age, four smaller potential Bronze Age ring ditches and a series of inte |