@pbinkley
pbinkley / IA-IIIF-bookmarklet.js
Last active September 15, 2023 16:35
Bookmarklet that copies an Internet Archive IIIF manifest URI to the clipboard when you are viewing a regular Internet Archive item.
javascript:(()=>{const id=/https:\/\/archive\.org\/details\/([^/?#]+)/.exec(location.href);if(id){const manifest="https://iiif.archive.org/iiif/"+id[1]+"/manifest.json";navigator.clipboard.writeText(manifest).then(()=>alert("Copied manifest URI: "+manifest),()=>alert("Could not copy manifest URI: "+manifest));}else{alert("This isn't an archive.org item.");}})();
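The same URL rewriting can be sketched outside the browser, for example to build manifest URIs for a list of item pages. This is a minimal Python sketch, not part of the gist: the function name `manifest_uri` is hypothetical, the URI template is the one the bookmarklet uses, and the pattern assumes the item identifier ends at the first `/`, `?`, or `#`.

```python
import re

# Capture the item identifier that follows /details/ in an archive.org URL.
# Assumption: the identifier ends at the first "/", "?" or "#".
DETAILS_RE = re.compile(r"https://archive\.org/details/([^/?#]+)")


def manifest_uri(page_url):
    """Return the IIIF manifest URI for an archive.org item page, or None."""
    match = DETAILS_RE.match(page_url)
    if match is None:
        return None
    # Same template as the bookmarklet: base + identifier + /manifest.json
    return "https://iiif.archive.org/iiif/" + match.group(1) + "/manifest.json"


if __name__ == "__main__":
    print(manifest_uri("https://archive.org/details/manualonmethodso00robe"))
    print(manifest_uri("https://example.org/not-an-item"))
```

The second call prints `None`, which corresponds to the bookmarklet's "This isn't an archive.org item." branch.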
pbinkley / watch.py
Created April 1, 2022 14:20
Script to tail list of files downloaded by browsertrix
import tailer
import json
import urllib.parse
for line in tailer.follow(open("crawls/collections/library-sumdu-edu-ua/pages/pages.jsonl")):
    data = json.loads(line)
    url = data['url']
    print(urllib.parse.unquote(url))
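The per-line step of the script (parse JSON, take `url`, percent-decode it) could be made more tolerant of lines that are not page records. A minimal sketch, assuming that some lines may be blank, partially written, or lack a `url` key; the helper name `decoded_url` and the sample URL are hypothetical and not from the gist.

```python
import json
import urllib.parse


def decoded_url(line):
    """Return the percent-decoded URL from one JSONL line, or None if unusable."""
    line = line.strip()
    if not line:
        return None
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        # Assumption: a line may be read while it is still being written.
        return None
    if not isinstance(record, dict):
        return None
    url = record.get("url")
    if not isinstance(url, str):
        # Assumption: some records (such as a header line) carry no "url".
        return None
    return urllib.parse.unquote(url)


if __name__ == "__main__":
    # Made-up sample line with a percent-encoded Cyrillic path segment.
    sample = '{"url": "https://example.org/%D0%B1%D1%96%D0%B1"}'
    print(decoded_url(sample))
```

Inside the `tailer.follow(...)` loop, the helper would replace the three-line body, printing only when the result is not `None`.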
pbinkley / crawl-config.yaml
Created March 15, 2022 18:54
SUCHO browsertrix config for a MediaWiki site
collection: "wiki-library-kr-ua"
workers: 16
saveState: always
seeds:
  - url: https://wiki.library.kr.ua/
    include: .*\.wiki\.library\.kr\.ua/
    exclude:
      - .*action\=.*
      - .*page\=.*
      - .*limit\=.*
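To see which URLs the three exclude patterns would drop, they can be tried against sample URLs. This is a rough Python sketch under the assumption that each pattern is treated as a regular expression searched against the full URL; the crawler's actual matching rules may differ. The function name `is_excluded` and the sample paths are hypothetical.

```python
import re

# Exclude patterns copied from the config above.
EXCLUDE = [r".*action\=.*", r".*page\=.*", r".*limit\=.*"]


def is_excluded(url, patterns=EXCLUDE):
    """Return True if any exclude pattern matches somewhere in the URL."""
    return any(re.search(pattern, url) for pattern in patterns)


if __name__ == "__main__":
    # Hypothetical URLs on the seed host, for illustration only.
    samples = [
        "https://wiki.library.kr.ua/index.php?title=Main",
        "https://wiki.library.kr.ua/index.php?title=Main&action=edit",
        "https://wiki.library.kr.ua/index.php?title=Special:Log&limit=500",
    ]
    for url in samples:
        print(url, "excluded" if is_excluded(url) else "kept")
```

Patterns like these skip edit, history, and paginated listing views, which multiply the number of URLs on a wiki without adding new page content.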
{
  "catalog": [
    {
      "manifestId": "https://pbinkley.github.io/rcb-manual/iiif/manualonmethodso00robe/manifest.json"
    },
    {
      "manifestId": "https://iiif.archive.org/iiif/3/manualonmethodso00robe/manifest.json"
    }
  ],
  "companionWindows": {
pbinkley@Hoffnung:~/Projects/iiif/annonatate/annonatate-sourcecode-pbinkley (debug)
$ python annonatate/flaskserver.py
b05940aac8848aa8d00f
* Serving Flask app "flaskserver" (lazy loading)
* Environment: development
* Debug mode: on
* Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
* Restarting with inotify reloader
b05940aac8848aa8d00f
* Debugger is active!
$ python annonatate/flaskserver.py
b05940aac8848aa8d00f
* Serving Flask app "flaskserver" (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: on
* Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
* Restarting with inotify reloader
b05940aac8848aa8d00f
pbinkley / youtube-comments.json
Created May 6, 2021 17:58
YouTube comments demo sitemap for webscraper.io
{
  "_id": "youtube-comments",
  "startUrl": [
    "https://www.youtube.com/watch?v=DLctAw4JZXE"
  ],
  "selectors": [
    {
      "id": "comment",
      "type": "SelectorElement",
      "parentSelectors": [
pbinkley / mirador.md
Created May 5, 2021 04:03
Mirador demo page for Wax sites
---
title: Mirador
layout: default
---

{% assign iiif_collections = site.collections | where_exp: "coll", "coll['images']['source']" %}
{% assign default_collection = iiif_collections[0]['label'] %}
{% assign default_item = site.data[default_collection][0]['pid'] %}

pbinkley / index.html
Last active June 3, 2021 15:30
Self-contained html file for showing OpenSeadragon view of a IIIF image
<!DOCTYPE html>
<html>
<head>
<meta charset='utf-8'>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="shortcut icon" type="image/png" href="/images/logo16.png">
<title>IIIF Tile Source | OpenSeadragon</title>
<link rel='stylesheet'
pbinkley / LICENSE.txt
Last active February 25, 2021 01:13 — forked from mejackreed/LICENSE.txt
Leaflet-IIIF Basic Example
MIT License
Copyright (c) 2016 Jack Reed
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions: