@birkin
Created August 28, 2025 15:23
warc gist experimentation for tracking downloaded warcs
{
  "files": [
    {
      "seedId": "seed123",
      "crawlJobId": "job456",
      "crawlTimestamp": "20171003231334202",
      "service": "ARCHIVEIT",
      "collectionId": 7310,
      "filePath": "/pairtree/73/10/warcs/20171003231334202-00000.warc.gz",
      "serialNumber": "00000",
      "fileFormat": "warc.gz",
      "crawlDates": ["2017-10-03T23:13:34Z"]
    },
    {
      "seedId": "seed123",
      "crawlJobId": "job456",
      "crawlTimestamp": "20171003231334202",
      "service": "ARCHIVEIT",
      "collectionId": 7310,
      "filePath": "/pairtree/73/10/warcs/20171003231334202-00001.warc.gz",
      "serialNumber": "00001",
      "fileFormat": "warc.gz",
      "crawlDates": ["2017-10-03T23:13:34Z"]
    },
    {
      "seedId": "seed999",
      "crawlJobId": "job777",
      "crawlTimestamp": "20181205094510225",
      "service": "ARCHIVEIT",
      "collectionId": 10234,
      "filePath": "/pairtree/102/34/warcs/20181205094510225-00000.warc.gz",
      "serialNumber": "00000",
      "fileFormat": "warc.gz",
      "crawlDates": ["2018-12-05T09:45:10Z"]
    }
  ]
}
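
One way a manifest shaped like the one above might be consumed is sketched below in Python. The helper names `index_warcs` and `is_tracked` are hypothetical and not part of the gist; the sketch assumes only the `files` list and the `collectionId`, `crawlJobId`, `serialNumber`, and `filePath` keys shown above, and it embeds a trimmed two-entry sample so it runs on its own.

```python
import json
from collections import defaultdict


def index_warcs(manifest):
    """Group file paths by (collectionId, crawlJobId), ordered by serialNumber.

    Hypothetical helper: assumes every entry carries the keys used below.
    """
    groups = defaultdict(list)
    for entry in manifest["files"]:
        key = (entry["collectionId"], entry["crawlJobId"])
        groups[key].append((entry["serialNumber"], entry["filePath"]))
    # Zero-padded serial numbers sort correctly as plain strings.
    return {key: [path for _, path in sorted(items)] for key, items in groups.items()}


def is_tracked(manifest, file_path):
    """Return True if file_path is already recorded in the manifest."""
    return any(entry["filePath"] == file_path for entry in manifest["files"])


# Trimmed sample in the same shape as the manifest above (entries out of order on purpose).
sample = json.loads("""
{
  "files": [
    {"crawlJobId": "job456", "collectionId": 7310, "serialNumber": "00001",
     "filePath": "/pairtree/73/10/warcs/20171003231334202-00001.warc.gz"},
    {"crawlJobId": "job456", "collectionId": 7310, "serialNumber": "00000",
     "filePath": "/pairtree/73/10/warcs/20171003231334202-00000.warc.gz"}
  ]
}
""")

index = index_warcs(sample)
for key, paths in index.items():
    print(key, paths)
```

Checking `is_tracked` before fetching a WARC would let a downloader skip files it has already recorded; whether the real workflow keys on `filePath` or on some other field is an assumption of this sketch.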