@peterk
peterk / summarize.py
Created February 9, 2020 10:20
A short script to test text summarization with the KB BERT model
from summarizer import Summarizer # see https://github.com/dmmiller612/bert-extractive-summarizer
import transformers
import os
import sys
# load text file to summarize
filename = sys.argv[1]
print("Summarizing %s" % filename)
body = ""
@peterk
peterk / libris_auth_strindberg.json
Created October 21, 2018 09:17
Authority data for August Strindberg in Libris
{
"@context": "/context.jsonld",
"@id": "https://libris.kb.se/tr574vdc33gk2cc",
"@type": "Record",
"_marcUncompleted": [
{
"375": {
"ind1": " ",
"ind2": " ",
"subfields": [
@peterk
peterk / valkompass.json
Created May 15, 2018 07:04
Expressen election compass (valkompass) - third-party requests
{
"assets.expressen.se": {
"hostname": "assets.expressen.se",
"favicon": "",
"firstPartyHostnames": [
"www.expressen.se"
],
"firstParty": false,
"thirdParties": []
},
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX sdo: <http://schema.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT (COUNT(*) AS ?count)
WHERE {
?person wdt:P31 wd:Q5 .
@peterk
peterk / checklist.md
Last active March 31, 2017 08:39
Alpha Library Privacy Checklist (0.1)

Protection of privacy in the library environment (DRAFT)

v0.1 (roughly Google-translated from Swedish; sorry for the poor English).

Libraries should support the democratic development of society by contributing to the dissemination of knowledge and freedom of opinion. Internet access is therefore an important service to provide to library visitors. However, this does not mean that visitors should be able to do whatever they want with library equipment or access other visitors' information. Patrons should be able to trust that their use of library services does not infringe on their privacy.

This document is a first draft of a checklist aimed at reducing the risk of intrusion into visitors' privacy when they use digital services in a library environment. The checklist is not exhaustive. The idea is that a librarian can use the checklist to get a basic idea of how well a library protects patron privacy. This should be used in discussion with colleague

# coding: utf8
from openpyxl import Workbook, load_workbook
from openpyxl.compat import range
import networkx as nx
# This script reads an NB export in Excel and writes a GEXF file for further processing in Gephi
wb = Workbook()
<?xml version='1.0' encoding='utf-8'?>
<metadata>
<records>
<record>
<source>https://data.kb.se/datasets/2014/10/suecia/14818031%2C1.tif/</source>
<title>Skoklosters slott och sockenkyrka</title>
<filename>Suecia antiqua (SELIBR 14818031)-1</filename>
<Commons_filename>Suecia antiqua (SELIBR 14818031)-1</Commons_filename>
<description>Skoklosters slott och sockenkyrka by Dahlbergh, Erik, 1625-1703.</description>
<permissions>{{Kungliga biblioteket image|libris-id=14818031}}
@peterk
peterk / suecia.py
Last active June 6, 2016 05:56
Draft script to prepare Suecia images for Wikimedia commons upload
import re
import requests
from lxml import html
from lxml.builder import E
from lxml.etree import tostring
url = "https://data.kb.se/datasets/2014/10/suecia/"
template = "{{Kungliga biblioteket image|libris-id=%s|url=%s}}"
def getmeta(libris_id):
ext_ip="10.0.0.1" # Change this value to your Vultr IP
server "default" {
listen on $ext_ip port 80
}
types {
text/css css ;
text/html htm html ;
text/plain txt ;
image/gif gif ;