Ross Spencer ross-spencer

@ross-spencer
ross-spencer / walkthrough.md
Last active June 9, 2019 18:16
Walkthrough: Making pull-requests in the Archivematica project

Walkthrough: Making pull-requests in the Archivematica project

Making a new pull-request against the Archivematica project involves a number of steps. This walkthrough guides contributors through them.

Requirements

  • A Linux-based operating system.
  • A GitHub account.
  • Git installed.
@ross-spencer
ross-spencer / identifiers.json
Created January 23, 2019 20:30
Sample identifiers.json file for legacy PID binding in Archivematica
[{
"file": "objects/transcription/alto-text-0001.xml",
"identifiers": [{
"identifier": "file:///ARCH00152.dig354/transcription transcript/ARCH00152_355_0000.xml",
"identiferType": "URL"
},
{
"identifier": "http://hdl.handle.net/10622/3BF316F5-E00B-4148-B1ED-43EA61EFA263",
"identiferType": "HANDLE"
}
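
As a rough illustration of how a file shaped like the sample above could be read, the sketch below groups identifiers by file path. It is not Archivematica's own loader: the helper name `identifiers_by_file` is hypothetical, the closing brackets in `SAMPLE` are added only so the snippet parses (the preview above is cut off before them), and the key spelling `identiferType` is copied as-is from the sample.

```python
import json

# Mirrors the preview above. The closing brackets are an assumption added so
# the text is valid JSON; the real file may contain further entries.
SAMPLE = """
[{
  "file": "objects/transcription/alto-text-0001.xml",
  "identifiers": [
    {"identifier": "file:///ARCH00152.dig354/transcription transcript/ARCH00152_355_0000.xml",
     "identiferType": "URL"},
    {"identifier": "http://hdl.handle.net/10622/3BF316F5-E00B-4148-B1ED-43EA61EFA263",
     "identiferType": "HANDLE"}
  ]
}]
"""


def identifiers_by_file(text):
    """Return {file path: [(type, identifier), ...]} (hypothetical helper)."""
    result = {}
    for entry in json.loads(text):
        # Key spelling "identiferType" follows the sample, not a schema.
        pairs = [(i["identiferType"], i["identifier"]) for i in entry["identifiers"]]
        result.setdefault(entry["file"], []).extend(pairs)
    return result


mapping = identifiers_by_file(SAMPLE)
for path, pairs in mapping.items():
    for id_type, value in pairs:
        print(path, id_type, value)
```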
@ross-spencer
ross-spencer / dataset.json
Created January 17, 2019 11:11
Archivematica: Example output of Job: Convert Dataverse Structure
{
"authority": "10.5072/FK2",
"id": 1589,
"identifier": "QAWS8O",
"latestVersion": {
"UNF": "UNF:6:doAry72PFwD1Edcrhsj/Qw==",
"createTime": "2019-01-16T19:15:52Z",
"files": [
{
"dataFile": {
@ross-spencer
ross-spencer / git.md
Created January 7, 2019 11:18
Some Git Bits 'n' Pieces

Git Guidelines at Artefactual

Artefactual runs its own Git repository server on Gitolite. The implication is that any work a user completes on GitHub will be overwritten when the repository is next mirrored onto the GitHub servers.

In short, branches, pull-requests, etc. merged via GitHub will never make it onto the Artefactual servers: those servers push to GitHub, but never pull from it.

git remote set-url origin [email protected]:archivematica.git
git checkout -b dev/issue-1-my-new-branch
@ross-spencer
ross-spencer / queries.md
Last active December 4, 2019 17:05
Useful Archivematica SQL Queries

Return task and module descriptions:

select MicroServiceChainLinks.pk, MicroServiceChainLinks.microserviceGroup, 
       TasksConfigs.description, StandardTasksConfigs.execute,
       StandardTasksConfigs.arguments
from MicroServiceChainLinks
inner join TasksConfigs 
on TasksConfigs.pk = MicroServiceChainLinks.currentTask
inner join StandardTasksConfigs
@ross-spencer
ross-spencer / hashsum.sh
Last active December 30, 2018 17:33
Bash script to run the various coreutils hash utilities, code copyright Artefactual Systems Inc. GPLv3
#!/bin/bash
# Script to run the various coreutils checksum utilities suite. Tools include
# sha1sum, sha256sum etc. Easily extensible to the other algorithms in the
# same suite, for example, b2sum for Blake2 hash comparison.
#
# The script can be run standalone outside of Archivematica using a transfer
# style layout e.g.
#
# transfer/
@ross-spencer
ross-spencer / handle-search.py
Last active November 23, 2018 16:14
Search for UBC handle
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Script to connect to an Elasticsearch instance and perform a
search for ANY phrase across the index.
Based on: https://gist.github.com/ross-spencer/895b5a346729075dd98f76cd5314728c
"""
from __future__ import print_function
@ross-spencer
ross-spencer / create_dated_files.py
Last active July 13, 2020 01:46
Create files with old dates
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Script to generate a sample set of files with a random distribution of
dates. Right now, this is very likely to be a uniform distribution so numpy
needs to be explored some more.
"""
import argparse
import atexit
import datetime
import logging
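
The idea described in the docstring above (files whose dates are drawn from a uniform random distribution) might be sketched as follows. This is not the gist's own code, which is cut off after its imports: the helper names, the 1990 to 2010 date range, the file names, and the use of a temporary directory are all assumptions for illustration, and the sketch assumes Python 3.

```python
import datetime
import os
import random
import tempfile


def random_timestamp(rng, start, end):
    """Pick a POSIX timestamp uniformly between two datetimes."""
    return rng.uniform(start.timestamp(), end.timestamp())


def create_dated_file(directory, name, timestamp):
    """Create a small file, then set its access and modification times."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write("sample\n")
    os.utime(path, (timestamp, timestamp))
    return path


rng = random.Random(0)  # seeded so repeated runs give the same dates
start = datetime.datetime(1990, 1, 1)  # assumed range, not from the gist
end = datetime.datetime(2010, 1, 1)
out_dir = tempfile.mkdtemp()

paths = [
    create_dated_file(out_dir, "file-%d.txt" % i, random_timestamp(rng, start, end))
    for i in range(5)
]
for path in paths:
    print(path, datetime.datetime.fromtimestamp(os.path.getmtime(path)))
```

`os.utime` only changes the access and modification times; creation or change times recorded by the file system are left alone.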
@ross-spencer
ross-spencer / extract_ids.py
Last active August 30, 2018 20:00
Reading DSpace METS Example
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function
import sys
import xml.etree.ElementTree as et
try:
    tree = et.parse(sys.argv[1])
except IndexError:
@ross-spencer
ross-spencer / pst.md
Last active August 14, 2018 21:26
PST in the UKWA

Finding PST (email archives) in the UKWA using moonshine

Stat the UKWA Shine Service:

ross-spencer@artefactual:~/Desktop/Artefactual/moonshine$ ./moonshine-linux64 -ffb 2142444E -stat
2018/08/14 16:17:27 Searching Shine
2018/08/14 16:17:27 Created URL: https://www.webarchive.org.uk/shine/search?page=1&query=content_ffb:"2142444e"&sort=crawl_date&order=asc
2018/08/14 16:17:27 Pinging URL: https://www.webarchive.org.uk/shine/search?page=1&query=content_ffb:"2142444e"&sort=crawl_date&order=asc
2018/08/14 16:17:29 121 files discovered
2018/08/14 16:17:29 13 pages available
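
For readers who want to reproduce the query URL from the log above without moonshine, here is a minimal sketch. The function name `shine_url` is hypothetical, and the page size of 10 results is an assumption inferred from the log (121 files over 13 pages), not something the Shine service is confirmed to guarantee.

```python
import math
from urllib.parse import quote, urlencode

SHINE_SEARCH = "https://www.webarchive.org.uk/shine/search"


def shine_url(ffb_hex, page=1):
    """Build a first-four-bytes (content_ffb) search URL like the one logged above."""
    params = [
        ("page", page),
        ("query", 'content_ffb:"%s"' % ffb_hex.lower()),
        ("sort", "crawl_date"),
        ("order", "asc"),
    ]
    # Keep ':' and '"' unescaped so the result matches the logged URL.
    return SHINE_SEARCH + "?" + urlencode(params, safe=':"', quote_via=quote)


url = shine_url("2142444E")
print(url)

# The four bytes searched for are the ASCII signature "!BDN".
signature = bytes.fromhex("2142444E").decode("ascii")
print(signature)

# Assumed page size of 10: 121 results would then span 13 pages.
pages = math.ceil(121 / 10)
print(pages)
```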