import pandas as pd
# Load CSV file. NB: first remove the extra rows at the top; these need to be put back before import
#df = pd.read_csv('clean_input.csv', encoding = 'ISO-8859-1', low_memory=False)
# Load Excel directly. NB: first remove the extra rows at the top; these need to be put back before import
df = pd.read_excel('clean_input.xlsx')
# Print column headings
<data>
{
for $Record in /ead
where $Record/archdesc/did/langmaterial/language/@langcode[not(contains(., 'eng'))]
let $id := $Record/archdesc/did/unitid[1]/text()
let $title := $Record/archdesc/did/unittitle
let $repo := $Record/eadheader/eadid/@mainagencycode
let $lang := $Record/archdesc/did/langmaterial/language/text()
<data>
{
for $Record in /ead
where $Record/archdesc/dsc//title[not(@render)]
let $id := $Record/archdesc/did/unitid[1]/text()
let $title := $Record/archdesc/did/unittitle
let $repo := $Record/eadheader/eadid/@mainagencycode
(Get-ChildItem -Recurse -File) |
Where-Object { $_.Name -match '[^a-zA-Z0-9.-]' } |
Rename-Item -NewName { $_.Name -replace '[^a-zA-Z0-9.-]+', '_' } -WhatIf
# As written, this previews the renames: each run of characters other than ASCII letters, digits, periods, and hyphens is replaced by an underscore
# Remove -WhatIf from the end of line 3 and re-run to actually make the changes
# To act on directories: in line 1, replace -File with -Directory, and replace the regex with [^a-zA-Z0-9] so that periods are captured too (note that this also captures hyphens)
# In lines 2 and 3, the regex can be swapped for any other character or pattern, e.g. ' ' to replace only spaces
# Use the correct regex for files versus directories: hidden files starting with a period (e.g. .DS_Store) should not be renamed, because they would then be hard to find programmatically
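The comments above describe a preview-then-apply pattern driven by one regex. A minimal Python sketch of the same substitution logic follows, for checking what a pattern would do to a list of names before touching the filesystem. The helper names `sanitize` and `preview` are hypothetical and not part of the PowerShell snippet; only the regex is taken from it.

```python
import re

# Same character class as the PowerShell snippet: anything that is not an
# ASCII letter, digit, period, or hyphen. The '+' collapses each run into one match.
UNSAFE = re.compile(r"[^a-zA-Z0-9.-]+")


def sanitize(name: str, replacement: str = "_") -> str:
    """Replace each run of disallowed characters with the replacement string."""
    return UNSAFE.sub(replacement, name)


def preview(names):
    """Return (old, new) pairs for names that would actually change.

    Nothing is renamed here; this only reports the proposed changes.
    """
    pairs = []
    for name in names:
        new_name = sanitize(name)
        if new_name != name:
            pairs.append((name, new_name))
    return pairs


if __name__ == "__main__":
    for old, new in preview(["my file (1).txt", "notes.txt", ".DS_Store"]):
        print(f"{old} -> {new}")
```

Comparing the old and new names, rather than only testing for a match, skips names such as `.DS_Store`: the underscore is outside the allowed class, so it matches, but it is replaced by an underscore and the name is unchanged.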
xquery version "3.0";
declare namespace marc="http://www.loc.gov/MARC21/slim";
(: This XQuery matches all records containing an arbitrary term, in the raw ArchivesSpace MARC output from the OAI feed :)
<results>
{
for $MarcRecord in /repository/record/metadata/collection/record
for $word in ("alien", "Alien", "Aliens", "aliens")
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:ead="urn:isbn:1-931666-22-9"
exclude-result-prefixes="xs " version="2.0">
<xsl:template match="c">
<xsl:text>Title|Date|Box|Folder|Publish Status </xsl:text>
<xsl:apply-templates select="//c[@level = 'file']"/>
import xmltodict, json
from timeit import default_timer as timer
import os
import sys
import datetime
# This script iterates over the entire CLIO corpus and returns all RBML records as individual MARCXML records, with the bib ID as the filename
# This function wraps the JSON in a dict with a record key and casts it to an individual MARCXML record
def write_marcxml_record(record):
import requests, csv, json, urllib, time
startTime = time.time()
baseURLexact = 'http://id.loc.gov/authorities/names/'
#http://id.loc.gov/authorities/names/nr2002027244.json
with open('input_numbers.csv', 'r') as csvfile:
reader = csv.reader(csvfile, delimiter=',', quotechar='"')
for row in reader:
number = str(row[0])
<data>
{
for $Record in /ead
where $Record/archdesc/scopecontent/p[contains(., 'Mrs')]
let $id := $Record/archdesc/did/unitid[1]/text()
let $title := $Record/archdesc/did/unittitle
let $repo := $Record/eadheader/eadid/@mainagencycode
let $scopeMrs := $Record/archdesc/scopecontent/p[contains(., 'Mrs')]
import xmltodict, json
from timeit import default_timer as timer
import os
import sys
import datetime
# This script iterates over the entire CLIO corpus and returns the leader, 245$a, and 035 fields for all RBML records
# This function returns a record
def handle_record(_, record):