Brian Tingle tingletech

Vertex type 1: Cached Descriptive Summary Data

  • properties: title, source, rights, etc.
  • out edges: obtain_content, source_metadata, access point
  • in edges: obtain_content, descriptive_summary

Vertex type 2: vernacular metadata (descriptive and technical)

  • properties: blob and parsed original data
  • out edges: referenced_file
  • in edges: source_metadata
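
As a rough illustration of how the two vertex types relate, the sketch below models them with a small in-memory structure. The `Vertex` class, the `connect` helper, the `kind` strings, and the sample property values are all hypothetical; the notes above do not say which graph store or API is intended. Only the edge label `source_metadata` and the property names come from the lists.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Vertex:
    """Hypothetical in-memory vertex; not tied to any graph database."""
    kind: str
    properties: dict = field(default_factory=dict)
    out_edges: list = field(default_factory=list)  # (label, target vertex)
    in_edges: list = field(default_factory=list)   # (label, source vertex)


def connect(source, label, target):
    """Record a labeled edge on both ends so it can be walked either way."""
    source.out_edges.append((label, target))
    target.in_edges.append((label, source))


# Vertex type 1: cached descriptive summary (sample values are made up).
summary = Vertex("descriptive_summary", {"title": "Example collection"})
# Vertex type 2: vernacular metadata holding the original record.
vernacular = Vertex("vernacular_metadata", {"blob": "<record/>"})

# The summary's out edge is the vernacular vertex's in edge.
connect(summary, "source_metadata", vernacular)
```

Storing each edge on both vertices mirrors the way the lists above describe the same label as an out edge of one type and an in edge of the other.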

Amazon migration for dsc.cdlib.org

The current production environment retires 2015-Dec-15.

dsc.cdlib.org was designed to merge many Veritas VCS clusters into one virtual machine, for the migration from Solaris to VMware. In migrating to AWS, we are going to split things up again to some degree.

XTF based websites

During the era of VCS (fail-over clustering) on big Solaris boxes, it was easy to maintain high application availability through a patch cycle at the OS level. Our VMware architecture fails to provide the same level

mkdir project
cd project
which virtualenv
# if not installed `sudo easy_install virtualenv`
virtualenv .
. bin/activate
# needed for Pillow

Config

The third factor in the 12 factor methodology is "Store config in the environment".

  • REGISTRY_URL -- registry data, default to https://registry.cdlib.org

  • SOLR_API -- solr index, default to $REGISTRY_URL/solr/dc-collection

  • REGISTRY_API -- JSON authority files $REGISTRY_URL/api/v1

  • ASSET_BASE -- base URL for CSS and JS, default to ??
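
A minimal sketch of reading these settings from the environment follows. The `load_config` function and `DEFAULT_REGISTRY_URL` constant are hypothetical names, not part of the project. Treating `$REGISTRY_URL/api/v1` as the default for `REGISTRY_API` is an assumption, and `ASSET_BASE` is left unset because no default is documented.

```python
import os

# Default taken from the REGISTRY_URL entry in the list above.
DEFAULT_REGISTRY_URL = "https://registry.cdlib.org"


def load_config(environ=None):
    """Build a settings dict from environment variables (hypothetical helper)."""
    env = os.environ if environ is None else environ
    registry_url = env.get("REGISTRY_URL", DEFAULT_REGISTRY_URL).rstrip("/")
    return {
        "REGISTRY_URL": registry_url,
        # Documented default: $REGISTRY_URL/solr/dc-collection
        "SOLR_API": env.get("SOLR_API", registry_url + "/solr/dc-collection"),
        # Assumption: the listed path doubles as the default value.
        "REGISTRY_API": env.get("REGISTRY_API", registry_url + "/api/v1"),
        # No default is documented, so this stays None when unset.
        "ASSET_BASE": env.get("ASSET_BASE"),
    }
```

Passing a plain dict instead of `os.environ` makes the derived defaults easy to check, for example `load_config({"REGISTRY_URL": "http://localhost:8000"})`.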

SNAC Linking Protocol

proposal

To automate the maintenance of links to related archival collections and bibliographic resources on the SNAC identity page (ARK page), a sitemaps-based linking protocol could be established.

A pilot member would create a sitemap.txt, sitemap.xml, or siteMapIndex.xml that would list all the URLs that should be processed by the co-operative. ARKs related to the URL in the sitemap will be identified in sitemap extension elements and in the content of the URLs.

By creating a link to SNAC in your HTML and then submitting a sitemap for harvesting, you enable SNAC to automatically "link back" to your site. Sitemaps will be periodically reprocessed to maintain link accuracy.
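
The harvesting side of this proposal could start with something like the sketch below, which pulls the listed URLs out of a submitted sitemap. The function names and the `example.org` URLs are hypothetical. The sketch relies only on the standard sitemaps.org namespace and `<loc>` element, and it does not attempt the ARK extension elements, whose format the proposal does not specify.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_xml_sitemap(xml_text):
    """Return every <loc> value from a sitemap or sitemap index document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def urls_from_text_sitemap(text):
    """Return the URLs from a plain-text sitemap, one URL per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Made-up sample input for illustration only.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.org/findingaid/1</loc></url>
  <url><loc>http://example.org/findingaid/2</loc></url>
</urlset>
"""

sample_urls = urls_from_xml_sitemap(SAMPLE_SITEMAP)
```

Because both `urlset` and `sitemapindex` documents use `<loc>`, the same function covers sitemap.xml and siteMapIndex.xml. Entries from an index point at further sitemaps that would need fetching in turn.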

Apache Maven 3.1.1 (0728685237757ffbf44136acec0402957f723d9a; 2013-09-17 11:22:22-0400)
Maven home: /usr/local/Cellar/maven/3.1.1/libexec
Java version: 1.8.0, vendor: Oracle Corporation
Java home: /Library/Java/JavaVirtualMachines/jdk1.8.0.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.9.4", arch: "x86_64", family: "mac"
[INFO] Error stacktraces are turned on.
[DEBUG] Reading global settings from /usr/local/Cellar/maven/3.1.1/libexec/conf/settings.xml
[DEBUG] Reading user settings from /Users/tingle/.m2/settings.xml
[DEBUG] Using local repository at /Users/tingle/.m2/repository
tingletech / SNAC_API.md
Last active August 29, 2015 14:05
unofficial/unsupported SNAC 2 prototype API
#!/usr/bin/env python
import sys
import time
import logging
import daemonocle
from shove import Shove
def main():
    logging.basicConfig(
import subprocess
import re
from collections import defaultdict
import sys
def sanity_check_ldd(path):
    parse_ldd = re.compile('^\t(.*?)\\.')  # parse ldd output
    d = defaultdict(bool)
    # look for duplicate libraries
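
The preview above is cut off before the duplicate check itself. As a separate sketch, and not the original function body, the following shows one way to flag libraries whose base name appears more than once. It works on `ldd` output already captured as text rather than running `ldd`. The function name `find_duplicate_libraries` and the sample lines are hypothetical.

```python
import re
from collections import defaultdict

# Same pattern as above: a tab, then the library name up to the first dot.
PARSE_LDD = re.compile(r'^\t(.*?)\.')


def find_duplicate_libraries(ldd_output):
    """Return base names that occur on more than one line of ldd output."""
    seen = defaultdict(bool)
    duplicates = []
    for line in ldd_output.splitlines():
        match = PARSE_LDD.match(line)
        if not match:
            continue  # skip lines that do not look like a library entry
        name = match.group(1)
        if seen[name] and name not in duplicates:
            duplicates.append(name)
        seen[name] = True
    return duplicates


# Made-up ldd output with two versions of libssl linked in.
SAMPLE_LDD = (
    "\tlibssl.so.1.0.0 => /usr/lib/libssl.so.1.0.0 (0x1)\n"
    "\tlibssl.so.1.1 => /opt/lib/libssl.so.1.1 (0x2)\n"
    "\tlibc.so.6 => /lib/libc.so.6 (0x3)\n"
)

duplicate_names = find_duplicate_libraries(SAMPLE_LDD)
```

Because the pattern stops at the first dot, two versions of the same library collapse to one base name, which is what makes the duplicate visible.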