This contains the files I used to perform the timings, as well as the timings themselves.
The timings cover processing one bag with 60,000 small files and one bag with one large (10GB) file. Scripts related to the bag with many files are named like *-lots, and scripts related to the bag with one large file are named like *-large.
$ ruby --version
ruby 2.0.0p247 (2013-06-27 revision 41674) [x86_64-darwin12.4.0]
The Ruby version is a fork of https://github.com/tipr/bagit with these changes.
The actual script that I'm running uses these changes. It's listed below as bagit-dir.
$ go version
go version go1.1.2 darwin/amd64
The Go version is built on https://github.com/APTrust/bagins.
I generated the input files using the rand-lots and rand-large scripts. Output went into the directories bag-lots and bag-large.
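For readers who want to build comparable inputs, here is a minimal sketch of how such files could be generated. It is not the actual rand-lots or rand-large script: the function names make_lots and make_large, the file names, and the per-file size are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; NOT the actual rand-lots / rand-large scripts.

# make_lots DIR COUNT BYTES
# Write COUNT files of BYTES random bytes each into DIR.
make_lots() {
    dir=$1
    count=$2
    bytes=$3
    mkdir -p "$dir"
    i=1
    while [ "$i" -le "$count" ]; do
        # File naming (file-N.bin) is an assumption for illustration.
        head -c "$bytes" /dev/urandom > "$dir/file-$i.bin"
        i=$((i + 1))
    done
}

# make_large DIR BYTES
# Write a single file of BYTES random bytes into DIR.
make_large() {
    dir=$1
    bytes=$2
    mkdir -p "$dir"
    head -c "$bytes" /dev/urandom > "$dir/large.bin"
}
```

Under these assumptions, `make_lots bag-lots 60000 1024` would produce 60,000 files of 1 KiB each (the real small-file size is not stated here), and `make_large bag-large 10737418240` would produce one file of 10 × 1024³ bytes, reading "10GB" as 10 GiB.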
Timings were done with the time-lots and time-large scripts. They ran each processor five times and used the UNIX utility time. Kind of the sledgehammer approach to benchmarking.
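The repeat-and-time loop described above can be sketched roughly as follows. This is an illustration, not the actual time-lots or time-large script: the function name time_runs is hypothetical, and it assumes bash, where time is a shell keyword (other shells fall back to the external time utility, if one is installed).

```shell
#!/usr/bin/env bash
# Hypothetical sketch; NOT the actual time-lots / time-large scripts.

# time_runs N CMD [ARGS...]
# Run CMD with ARGS N times in a row, timing each run with `time`.
# The "run N" header goes to stdout; timing figures go to stderr.
time_runs() {
    runs=$1
    shift
    n=1
    while [ "$n" -le "$runs" ]; do
        echo "run $n: $*"
        time "$@"
        n=$((n + 1))
    done
}
```

A call such as `time_runs 5 ./bagit-dir bag-lots` would match the five runs per processor mentioned above; the argument passed to bagit-dir is an assumption, since its command line is not shown here. Repeated runs like this do not control for disk caching, which is part of why it is a sledgehammer approach.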