@ssp
ssp / git-big.md
Created December 2, 2012 13:00
git with a lot of data

Testing how git behaves when dealing with many files

  • XML: 391041 files with 1 authority record each as XML (100 directories with about 3500-4000 files each), taking about 1.5 GB of disk space
  • JSON: another copy of the same data in 350 MB of JSON batch upload files for CouchDB

git add . … 200 seconds CPU time @ 25% CPU
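A measurement like this can be reproduced on a smaller scale with synthetic data. The following is only a sketch: the helper name `make_test_repo`, the `dirN/recordN.xml` layout and the one-line record content are assumptions for illustration, not the original data set.

```shell
#!/bin/sh
# Hypothetical helper: create a fresh git repository containing
# DIRS directories with PER_DIR tiny XML files each.
# Runs in a subshell so the caller's working directory is untouched.
make_test_repo() (
    repo=$1; dirs=$2; per_dir=$3
    mkdir -p "$repo" && cd "$repo" || exit 1
    git init -q . || exit 1
    d=1
    while [ "$d" -le "$dirs" ]; do
        mkdir -p "dir$d"
        f=1
        while [ "$f" -le "$per_dir" ]; do
            # Placeholder content, not a real authority record.
            printf '<record id="%s-%s"/>\n' "$d" "$f" > "dir$d/record$f.xml"
            f=$((f + 1))
        done
        d=$((d + 1))
    done
)

# Small, quick example: 3 directories with 4 files each.
repo=$(mktemp -d)/repo
make_test_repo "$repo" 3 4
git -C "$repo" add .
# Print the number of staged files.
git -C "$repo" ls-files | wc -l
```

To get timings, raise the two numbers (for example 100 directories with 4000 files each) and run the `git add` step under your shell's `time` command.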

ssp / emf-converter.sh
Created November 13, 2012 14:54
Convert Windows EMF to usable graphics files with a crazy chain of tools.
#! /bin/sh
# Convert Windows EMF to usable graphics files with a crazy chain of tools.
# 2012 Sven-S. Porst, SUB Göttingen <[email protected]>
# Configuration
BASEPATH=/Users/ssp/SUB/edfu/emf
WINE=wine
METAFILE2EPS=$BASEPATH/metafile2eps-linux/metafile2eps.exe
PS2PDF=ps2pdf
INKSCAPE=/Applications/Graphik/Inkscape.app/Contents/Resources/bin/inkscape
ssp / base-attribute.diff
Created August 31, 2012 10:09
diff on Metaproxy’s router_flexml.cpp to allow base attributes when reading tags
--- a/src/router_flexml.cpp
+++ b/src/router_flexml.cpp
@@ -119,6 +119,8 @@ void mp::RouterFleXML::Rep::parse_xml_filters(xmlDocPtr doc,
id_value = value;
else if (name == "type")
type_value = value;
+ else if (name == "base")
+ ;// Ignore XInclude base attribute.
else
throw mp::XMLError("Only attribute id or type allowed"
ssp / Template
Created July 30, 2012 13:28
xmlinclude Opac usage
plugin.tx_xmlinclude.settings {
parseAsHTML = 1
cookiePassthrough.1 = DB
cookiePassthrough.2 = PSC_1
XSL.51 = fileadmin/xsl/Opac.xsl
useRealURL = 1
}
page.CSS_inlineStyle (
#content h1 {
ssp / gist:3205626
Created July 30, 2012 08:09
grep searchterms from query logs
# limit to:
# * searches
# * not SX20 presentation (used for the Neuerwerbungen (new acquisitions) RSS feed)
# extract:
# * query term in TRM parameter
grep SRCHA access_log | grep TRM | grep -v SX20 | grep -v "XML=1" | sed -e "s/.*TRM=//" -e "s/[& ].*//" -e "s/+/ /g" > searchterms
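To see what the filter chain keeps and what it drops, it can be run against a few sample lines. This is a sketch only: the function name `extract_searchterms` and the log lines (including the parameter carrying `SX20`) are made up for illustration, and real access log lines will look different.

```shell
#!/bin/sh
# Same grep/sed chain as above, reading log lines from stdin.
extract_searchterms() {
    grep SRCHA | grep TRM | grep -v SX20 | grep -v "XML=1" |
        sed -e "s/.*TRM=//" -e "s/[& ].*//" -e "s/+/ /g"
}

# Hypothetical log lines: only the first one passes all filters.
extract_searchterms <<'EOF'
1.2.3.4 - - [30/Jul/2012:08:09:00 +0200] "GET /DB=1/CMD?ACT=SRCHA&IKT=1016&TRM=graph+theory HTTP/1.1" 200 1234
1.2.3.4 - - [30/Jul/2012:08:09:05 +0200] "GET /DB=1/CMD?ACT=SRCHA&PRS=SX20&TRM=new+books HTTP/1.1" 200 1234
1.2.3.4 - - [30/Jul/2012:08:09:09 +0200] "GET /DB=1/CMD?ACT=SRCHA&TRM=algebra&XML=1 HTTP/1.1" 200 1234
EOF
```

With these three lines the only output is `graph theory`. The chain only turns `+` into spaces; percent-encoded characters in a term stay encoded.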
ssp / gist:2966967
Created June 21, 2012 16:51
UnicodeChecker + Spotlight + command line
# Assuming you have UnicodeChecker’s Spotlight support installed on your Mac,
# you can use mdfind to find unicode characters on your machine.
# A bit of command line wrangling later you may find the character names.
mdfind roman numeral uccharacter | xargs -I FILENAME -L 1 sh -c 'plutil -convert xml1 -o - "FILENAME" | xpath "//string[preceding-sibling::key/text()=\"name\"]/text()"; echo ""' 2>/dev/null
Opac query: MSC classes changed between MSC 2000 and MSC 2010
from: http://msc2010.org/msc2000to2010.html
msc 05e25 or 05e35 or 15a90 or 34d40 or 34m20 or 34m37 or 35a05 or 35a07 or 35j45 or 35j55 or 35j85 or 35q72 or 35q80 or 39a11 or 65q05 or 97c90
GSO: 388 hits
Göttingen Opac: 435 hits
ssp / gist:1778585
Created February 9, 2012 08:54
Solr URLs
Delete all documents:
/solr/update/?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E
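The `stream.body` parameter is a percent-encoded XML delete command. A small sketch to make it readable (the function name `decode_body` is made up, and it only handles the two escapes that occur in this URL):

```shell
#!/bin/sh
# Decode only the two escapes used above: %3C is "<" and %3E is ">".
decode_body() {
    sed -e 's/%3C/</g' -e 's/%3E/>/g'
}

printf '%s\n' '%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E' | decode_body

# The same command could also be sent as a POST body, for example
# (host and port are assumptions, adjust them to your installation):
#   curl 'http://localhost:8983/solr/update' -H 'Content-Type: text/xml' \
#        --data-binary '<delete><query>*:*</query></delete>'
```

This prints `<delete><query>*:*</query></delete>`, a delete-by-query that matches every document.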
ssp / pazpar2-curl.sh
Created January 24, 2012 20:22
curl commands for starting a pazpar2 session, searching and querying results
#!/usr/bin/env sh
curl "http://vlib.sub.uni-goettingen.de/pazpar2/search.pz2?command=init&service=vlib" | grep init | sed -e 's/.*<session>//' | sed -e 's/<\/session>.*//' > /tmp/sessionID ; curl "http://vlib.sub.uni-goettingen.de/pazpar2/search.pz2?command=search&query=gaga&session=`cat /tmp/sessionID`" ; curl "http://vlib.sub.uni-goettingen.de/pazpar2/search.pz2?command=show&session=`cat /tmp/sessionID`" ; curl "http://vlib.sub.uni-goettingen.de/pazpar2/search.pz2?command=show&session=`cat /tmp/sessionID`" ; curl "http://vlib.sub.uni-goettingen.de/pazpar2/search.pz2?command=show&session=`cat /tmp/sessionID`" ; curl "http://vlib.sub.uni-goettingen.de/pazpar2/search.pz2?command=show&session=`cat /tmp/sessionID`" ; curl "http://vlib.sub.uni-goettingen.de/pazpar2/search.pz2?command=show&session=`cat /tmp/sessionID`" ; curl "http://vlib.sub.uni-goettingen.de/pazpar2/search.pz2?command=show&session=`cat /tmp/sessionID`" ; curl "http://vlib.sub.uni-goettingen.de/pazpar2/search.pz2?command=show&session=`cat /tmp/se
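The session ID can also be kept in a shell variable instead of `/tmp/sessionID`. The sketch below reuses the same grep/sed steps in a function with the made-up name `extract_session` and runs them on a sample init response. The sample XML is an assumption about what the server returns, not captured output.

```shell
#!/bin/sh
# Pull the session ID out of a pazpar2 init response read from stdin.
extract_session() {
    grep init | sed -e 's/.*<session>//' | sed -e 's/<\/session>.*//'
}

# Assumed shape of an init response (illustrative values).
sample='<init><status>OK</status><session>12345</session></init>'
session=$(printf '%s\n' "$sample" | extract_session)

# Query string for the follow-up requests.
echo "command=show&session=$session"

# Against the live service the first step would become:
#   session=$(curl -s "http://vlib.sub.uni-goettingen.de/pazpar2/search.pz2?command=init&service=vlib" | extract_session)
```

This prints `command=show&session=12345`. Repeating the `show` request then only needs `$session` and no temporary file.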
ssp / pz2-log-analysis.sh
Created January 23, 2012 16:47
pazpar2 log analysis
# Try to grep rough usage information from a pazpar2 log:
# Count how often each service has been initialised and
# add a + to the service name if pz:allow is used.
# Both for current and archived logs
cat pazpar2.log | grep "command=init" | sed "s/.*search.pz2?command=init&service=\([-a-zA-Z0-9]*\)/\1/g" | sed "s/.pz:allow.*/+/g" | sort | uniq -c
zcat pazpar2.log.2.gz | grep "command=init" | sed "s/.*search.pz2?command=init&service=\([-a-zA-Z0-9]*\)/\1/g" | sed "s/.pz:allow.*/+/g" | sort | uniq -c
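The effect of the two sed steps is easier to follow on a handful of lines. In this sketch the function name `count_services`, the log line layout and the form of the `pz:allow` parameter are assumptions. Only the `search.pz2?command=init&service=...` part matters to the expressions.

```shell
#!/bin/sh
# Same pipeline as above, reading log lines from stdin.
count_services() {
    grep "command=init" |
        sed "s/.*search.pz2?command=init&service=\([-a-zA-Z0-9]*\)/\1/g" |
        sed "s/.pz:allow.*/+/g" |
        sort | uniq -c
}

# Hypothetical log excerpt: two plain inits for "vlib",
# one init for "math" that uses pz:allow, one unrelated search.
count_services <<'EOF'
10:00:01 [log] Request: GET /pazpar2/search.pz2?command=init&service=vlib
10:00:05 [log] Request: GET /pazpar2/search.pz2?command=search&query=gaga&session=1
10:02:11 [log] Request: GET /pazpar2/search.pz2?command=init&service=vlib
10:03:40 [log] Request: GET /pazpar2/search.pz2?command=init&service=math&pz:allow%5Bsome-db%5D=1
EOF
```

For this input the result is a count of 1 for `math+` and 2 for `vlib`. Anything that follows the service name other than a `pz:allow` parameter stays on the line, so real log lines with trailing text would be counted as separate entries.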