{
'error': {
'code': 'modification-failed',
'info': 'Malformed input: Surname, Given Name; Family Number; Age; Birth Year; Place of Birth Zimonowitz, Annie; 96; 6; 1914; New York Zimonowitz, Milton; 96; 4; 1916; New York Zimonowitz, Jack; 96; 2; 1918; New York Fernstein, James; 97; 31; 1889; New York Fernstein, Pauline; 97; 24; 1896; Russia Fernstein, George; 97; 6; 1914; New York Fernstein, Leo; 97; 3; 1917; New York Morowitz, Morris; 98; 66; 1854; Russia Morowitz, Stella; 98; 64; 1856; Russia Morowitz, Rose; 98; 16; 1904; New York Morowitz, Jacob; 98; 27; 1893; New York Corni, Gabriel; 99; 29; 1891; Turkey Corni, Sarah; 99; 25; 1895; Turkey Corni, Stella; 99; 5; 1915; New York Corni, Simon; 99; 3; 1917; New York Corni, Celia; 99; 0; 1920; New York Corni, Morris; 99; 24; 1896; Turkey Corni, Rachael; 99; 17; 1903; Turkey Fliegal, Joseph; 100; 33; 1887; Austria Fliegal, Sadie; 100; 27; 1893; Austria Fliegal, Sidney; 100; 7; 1913; New York Fliegal, Max; 100; 4; 1916; New York Fliegal, Abraham I; 100; 0; 19 |
# This script requires Wand for Python. Install using the documentation at http://docs.wand-py.org/en/0.4.1/index.html before running.
import sys, os, datetime
from wand.image import Image
list = os.listdir(os.getcwd())
tuples = []
for file in list: |
# -*- coding: utf-8 -*-
### Script courtesy of Dominic-MP. Thanks Dominic!
### NOTES: The following data is hard-coded:
### * "M251_" in source file names--this is necessary to parse the roll number. This needs to be updated for other publications or it will not be able to open the files.
### * Source XML files need to be in subdirectory titled "metadata" and have file names "M268_ROLL_metadata.xml", where "ROLL" is a four-digit number with leading zeroes
### * Following fields are all hard-coded based on M268's data: Level of description (file unit), general records type, data control group, use restriction, access restriction, online resource note, variant control number, physical occurrence, copy status, reference unit, location, media occurrence, general media type, object type, object designator, thumbnail file name.
### * All file paths must be in the form "https://opaexport-conv.s3.amazonaws.com/" + supplied path.
### * All online resources must be in the form "http://www.fold3.com/image/" |
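The naming rules in the notes above can be sketched as a few small helpers. This is only an illustration, not the script's own code: the function names, the `publication` parameter, and the example arguments are hypothetical, and the sketch assumes an identifier is simply appended to the Fold3 prefix.

```python
import os

# Prefixes quoted from the notes above.
S3_PREFIX = "https://opaexport-conv.s3.amazonaws.com/"
FOLD3_PREFIX = "http://www.fold3.com/image/"


def metadata_filename(publication, roll):
    """Build the expected source path, e.g. roll 7 of M268 -> M268_0007_metadata.xml.

    Hypothetical helper: the roll number is zero-padded to four digits and the
    file is looked up in the "metadata" subdirectory.
    """
    name = "{}_{}_metadata.xml".format(publication, str(roll).zfill(4))
    return os.path.join("metadata", name)


def file_url(supplied_path):
    """Prefix a supplied path with the S3 bucket URL (plain concatenation)."""
    return S3_PREFIX + supplied_path


def online_resource_url(image_id):
    """Append an identifier to the Fold3 image prefix (assumed form)."""
    return FOLD3_PREFIX + str(image_id)


if __name__ == "__main__":
    print(metadata_filename("M268", 7))
    print(file_url("example/path/image.jpg"))  # made-up path for illustration
    print(online_resource_url(12345))          # made-up identifier
```

Passing the publication prefix as a parameter, rather than hard-coding it, would avoid having to edit the script for each new publication.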
import os, re
file = 'm384-import-11.xml'
filenames = ['M384_0201_output.xml','M384_0202_output.xml','M384_0203_output.xml','M384_0204_output.xml','M384_0205_output.xml','M384_0206_output.xml','M384_0207_output.xml','M384_0208_output.xml','M384_0209_output.xml','M384_0210_output.xml','M384_0211_output.xml','M384_0212_output.xml','M384_0213_output.xml','M384_0214_output.xml','M384_0215_output.xml','M384_0216_output.xml','M384_0217_output.xml','M384_0218_output.xml','M384_0219_output.xml','M384_0220_output.xml']
counter = 2
outputfile = file
for fname in filenames:
    in_size = (os.stat(fname).st_size / 1000000)
# -*- coding: utf-8 -*-
import csv, xml, re, time, os, datetime
import xml.etree.ElementTree as ET
x = 0
while x < 426:
    roll = 2 + x
    file = 'M269_' + str(roll).zfill(4)
## This part takes the partner XML and reformats it to more usable XML (i.e. going from attributes to elements - http://www.ibm.com/developerworks/library/x-eleatt/). The reformatted XML is saved as a new document with "_(reformatted)" appended to the name, so that the original file is not altered. |
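The attribute-to-element reformatting described in the comment above might look roughly like the following. It is a minimal sketch using `xml.etree.ElementTree`, not the script's actual implementation; the function name and the sample XML are made up for illustration, and writing the result to a new "_(reformatted)" file is left out.

```python
import xml.etree.ElementTree as ET


def attributes_to_elements(element):
    """Recursively turn every attribute into a child element with the same name.

    Hypothetical helper: <page id="12"/> becomes <page><id>12</id></page>.
    The new children are placed before any existing children, in attribute order.
    """
    # Snapshot the existing children first so newly created ones are not revisited.
    for child in list(element):
        attributes_to_elements(child)
    for index, (name, value) in enumerate(list(element.attrib.items())):
        sub = ET.Element(name)
        sub.text = value
        element.insert(index, sub)
    element.attrib.clear()
    return element


# Made-up sample input; the partner XML's real structure is not shown in the source.
SAMPLE = '<page id="12" roll="0002"><name given="Annie"/></page>'

root = attributes_to_elements(ET.fromstring(SAMPLE))

if __name__ == "__main__":
    print(ET.tostring(root, encoding="unicode"))
```

Because the function modifies a parsed tree in memory, the original file on disk stays untouched until the result is explicitly written somewhere else.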
# -*- coding: utf-8 -*-
### NOTES: The following data is hard-coded:
### * "M268_" in source file names--this is necessary to parse the roll number. This needs to be updated for other publications or it will not be able to open the files.
### * Source XML files need to be in subdirectory titled "metadata" and have file names "M268_ROLL_metadata.xml", where "ROLL" is a four-digit number with leading zeroes
### * Following fields are all hard-coded based on M268's data: Level of description (file unit), general records type, data control group, use restriction, access restriction, online resource note, variant control number, physical occurrence, copy status, reference unit, location, media occurrence, general media type, object type, object designator, thumbnail file name.
### * All file paths must be in the form "https://opaexport-conv.s3.amazonaws.com/" + supplied path.
### * All online resources must be in the form "http://www.fold3.com/image/" + footnote ID.
### * The objects file is set to be |
import requests, json, csv, urllib, argparse
## This is what allows the user to pass the initial Wikipedia category as an argument, such as '--c "History of the United States"'.
parser = argparse.ArgumentParser()
parser.add_argument('--c', dest='cat', metavar='CAT',
                    action='store')
args = parser.parse_args()
## The script will create two CSVs. One with the articles and page views, and another that is a running list of subcategories, so that it can continue to run down the list and take each new category in turn. Here, the names of the CSVs are generated from the initial category given by the user, and a set is created, starting with that category, to ensure duplicates are not added. |
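The duplicate check described in the comment above can be illustrated with a small breadth-first walk. This is a sketch under stated assumptions: it keeps the running list of subcategories in memory rather than in a CSV, and `get_subcategories` is a hypothetical stand-in for whatever API request the script makes. The toy category graph is invented for the example.

```python
from collections import deque


def walk_categories(start, get_subcategories):
    """Visit each category once, starting from `start`.

    `get_subcategories` is a hypothetical callable that returns the names of
    the subcategories of a category. The `seen` set starts with the initial
    category so that duplicates are never queued twice.
    """
    seen = {start}
    queue = deque([start])
    visited_in_order = []
    while queue:
        current = queue.popleft()
        visited_in_order.append(current)
        for sub in get_subcategories(current):
            if sub not in seen:
                seen.add(sub)
                queue.append(sub)
    return visited_in_order


# Invented toy graph with a cycle (B links back to A) to show the de-duplication.
TOY_GRAPH = {"A": ["B", "C"], "B": ["C", "A"], "C": []}

order = walk_categories("A", lambda name: TOY_GRAPH.get(name, []))

if __name__ == "__main__":
    print(order)
```

Without the set, a cycle between categories would keep the loop running indefinitely, which is why the starting category is added to it before the walk begins.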
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
<title>Upcoming Events</title>
<meta http-equiv="Content-Type" content="text/html;" />
<meta http-equiv="Content-Language" content="en-US" />
<link rel="icon" href="http://archives.gov/favicon.ico" type="image/x-icon" />
<link rel="shortcut icon" href="http://archives.gov/favicon.ico" /> |
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import requests, json, csv, argparse
parser = argparse.ArgumentParser()
parser.add_argument('--series_NAID', dest='series_NAID', metavar='SERIES_NAID',
                    action='store')
parser.add_argument('--file_units', dest='file_units', metavar='FILE_UNITS',
                    action='store')
18503259 | 07542
18475522 | 28003
18471472 | 05386
17412775 | 14289
17408517 | 27799
17408508 | 27773
17408507 | 27772
17408488 | 27714
17408487 | 27714
17408401 | 27426