Be sure to scroll sideways, so you can take it all in.
# INSTITUTION ROUTES
institution_ptrn = /(?:[a-zA-Z0-9\-_]+)(?:\.[a-zA-Z]+)+/
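A minimal sketch of what this pattern accepts, assuming it is meant to match a domain-style institution identifier in full. The pattern carries no anchors of its own, so the sketch wraps it in `\A` and `\z`. The method name `institution_identifier?` and the sample identifiers are made up for illustration.

```ruby
# Hypothetical helper: true when the whole string matches the pattern.
def institution_identifier?(identifier)
  # Same pattern as above: one label, then one or more ".letters" parts.
  institution_ptrn = /(?:[a-zA-Z0-9\-_]+)(?:\.[a-zA-Z]+)+/
  # Anchor explicitly so partial matches inside a longer string are rejected.
  /\A#{institution_ptrn}\z/.match?(identifier)
end

puts institution_identifier?("virginia.edu")      # prints "true"
puts institution_identifier?("lib.virginia.edu")  # prints "true"
puts institution_identifier?("localhost")         # prints "false" (no dot-separated suffix)
puts institution_identifier?("test.123")          # prints "false" (suffix must be letters)
```

Note that only the first label may contain digits, hyphens, or underscores; every part after a dot must consist of letters.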
# In Rails 3.2.13 using accepts_nested_attributes_for, a scoped
# validates_uniqueness_of validator will incorrectly reject unique values.
# Below are two models, Question and AnswerChoice. The validates_uniqueness_of
# validator in AnswerChoice says that AnswerChoice label must be unique within a
# Question. This validator incorrectly rejects unique labels when they are
# reordered. Follow these steps to see the bug.
#
# This occurs when using nested resources and allowing users to edit the
# Question and AnswerChoices in a single form. The form looks like this:
# Given an S3 URL, this returns a signed S3 url that will force the browser
# to download the file rather than opening it in the browser window. The key
# is to add response-content-disposition to the query string, which tells S3
# to send the content-disposition header you specify. S3 will respect this
# parameter only if the URL is signed. You can override other S3 headers such
# as content-type using this same method. See:
#
# http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectGET.html
#
def s3_download_url(url)
curl "https://storage.googleapis.com/golang/go1.4.linux-amd64.tar.gz" > go1.4.linux-amd64.tar.gz
sudo tar -C /usr/local -xzf go1.4.linux-amd64.tar.gz
rm go1.4.linux-amd64.tar.gz
rm -rf go1.4.linux-amd64
curl "https://s3.amazonaws.com/bitly-downloads/nsq/nsq-0.3.0.linux-amd64.go1.3.3.tar.gz" > nsq-0.3.0.linux-amd64.go1.3.3.tar.gz
tar -xzf nsq-0.3.0.linux-amd64.go1.3.3.tar.gz
cd nsq-0.3.0.linux-amd64.go1.3.3/bin
cp * ~/go/bin
rm ../../nsq-0.3.0.linux-amd64.go1.3.3.tar.gz
After copying a bag (via rsync) from the ingest node, I send this to the node that asked me to replicate:
PUT http://localhost:3003/api-v1/replicate/30000000-0000-4000-a000-000000000001/
{
"from_node": "hathi",
"to_node": "aptrust",
"uuid": "00000000-0000-4000-a000-000000000003",
"replication_id": "30000000-0000-4000-a000-000000000001",
"fixity_algorithm": "sha256",
"fixity_nonce": null, |
I sync replication requests from remote nodes to my own node. Any requests in which I'm the to_node go into my processing queue. For each replication request in the queue, I do this:
1. Copy the bag from the remote node, via rsync/ssh.
2. Calculate the sha256 digest of the bag's tag manifest.
3. Send that fixity value back to the ingest node. If I get back a record in which StoreRequested == false, I delete the bag from my staging area and consider the job done.
4. If StoreRequested == true, I validate the bag by making sure all required files and tags are present, and all checksums in the manifest-sha256.txt match. If the bag is invalid, I cancel the transfer on the remote node with a cancel reason indicating that the bag did not pass validation. I delete the bag from staging, and am done.
5. If the bag is valid, I copy it to long-term storage and delete it from my staging area.
6. I update the transfer record on the ingest node to say Stored = true.
My own node does not know the bag is stored until the n
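The local parts of the steps above (the digest in step 2, the checksum check in step 4, and the branching in steps 3 through 6) could be sketched roughly as follows. This is an illustration under assumptions, not the actual replication code: the method names are hypothetical, the tag manifest is assumed to be `tagmanifest-sha256.txt` at the bag root, and manifest lines are assumed to have the form `checksum  relative/path`. The rsync copy and the calls to the remote node are left out.

```ruby
require 'digest'

# Step 2 (hypothetical helper): hex sha256 digest of the bag's tag manifest.
# Assumes the tag manifest is tagmanifest-sha256.txt at the bag root.
def tag_manifest_fixity(bag_dir)
  Digest::SHA256.file(File.join(bag_dir, 'tagmanifest-sha256.txt')).hexdigest
end

# Step 4, checksum part only (hypothetical helper): every entry in
# manifest-sha256.txt must name an existing file whose digest matches.
def payload_checksums_match?(bag_dir)
  manifest = File.join(bag_dir, 'manifest-sha256.txt')
  return false unless File.file?(manifest)

  File.readlines(manifest, chomp: true).reject(&:empty?).all? do |line|
    expected, relative_path = line.split(/\s+/, 2)
    path = File.join(bag_dir, relative_path.to_s)
    File.file?(path) && Digest::SHA256.file(path).hexdigest == expected
  end
end

# Steps 3-6 (hypothetical helper): pick the next action from the
# StoreRequested flag returned by the ingest node and the validation result.
def next_action(store_requested:, bag_valid:)
  return :delete_from_staging unless store_requested
  return :cancel_transfer_and_delete unless bag_valid

  :copy_to_storage_and_mark_stored
end
```

Keeping the decision in a small pure function makes the three outcomes easy to test without touching the network or the staging area.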
// In the code below, extract is a tar-stream extract object.
// See https://github.com/mafintosh/tar-stream/blob/master/extract.js
//
// This 'entry' handler does not care about the contents of the entry,
// it just has to restructure the tar header data into a format that
// the caller understands. It then passes 'stream' to a PassThrough
// which discards the contents and allows tar-stream to move on to
// the next entry.
//
// The problem with the initial pipe to PassThrough on line 35 is that
// This code shows how to stop a JavaScript async.queue after | |
// the first error. | |
const async = require('async'); | |
// Create a queue that calls the write function (defined below). | |
// Concurrency is set to 1 to guarantee sequential operation. | |
let q = async.queue(write, 1); | |
// The drain function is called when all tasks in the queue are complete. |