Skip to content

Instantly share code, notes, and snippets.

@dbrady
Created May 26, 2011 06:05
Show Gist options
  • Save dbrady/992630 to your computer and use it in GitHub Desktop.
Save dbrady/992630 to your computer and use it in GitHub Desktop.
Unit Tests for CSV demonstrating that Enumerable is supported only at the whim of the underlying StringIO seek position
#!/usr/bin/env ruby
unless RUBY_VERSION.start_with? "1.9"
puts "This test file is for Ruby 1.9.x"
exit -1
end
require 'csv'
require 'test/unit'
# Hi James!
# This is written with love and respect. I'm a little cranky with this
# code at the moment, but I think once you see what I'm seeing you'll
# probably agree.
# 0. This is maddening. I just went to pull up the docco that I was
# working off of today, and I can't find it. I just found THIS page,
# which DOES have the examples I was hoping to find:
#
# http://www.ruby-doc.org/ruby-1.9/index.html
#
# Hang on, lemme go crack open the work laptop. Oh, wild. I was
# working entirely off of the CSV::Table documentation, because that's
# where Google led me.
# 1. I didn't get that CSV was a file I/O based metaphor from its
# documentation at all. That's because it doesn't HAVE much
# documentation. This thing NEEDS some examples desperately. Stack
# Overflow has a few examples of people saying "is there a simple
# example of how to use CSV?" It would be REALLY helpful if those
# simple examples were in the rdoc.
# 2. Note that my use case is probably simple and edge-casey. I'm not
# streaming data. I'm grabbing a CSV file wholesale, reading it in one
# go, and then pushing it out to an HTML table--that isn't allowed to
# get friendly with the report's native data format, so NO, it can't
# just use CSV's iteration helpers. :-)
# 3. I TOTALLY expected that CSV would uphold the Enumerable contract,
# because it says "Modules included: Enumerable" right on the tin.
# dbrady's not so very humble opinion: this cost me an hour of
# debugging time late in the afternoon of a deploy day. If you don't
# want to change it because I'm just using it wrong, PLEASE document
# this. I read your existing documentation in good faith and with a
# careful eye and was led down the garden path. (See point 0 above,
# though--I didn't know there was a CSV page apart from CSV::Table. I
# blame the crappy 4-iframe bullshit indexing of RDoc, that forces me
# to use Google to search the damn documentation.)
# Then again, if you're not shocked and appalled by the tests that
# follow, what I *really* want to know is who are you and what have
# you done with James Edward Gray II?
# The tests that follow are in two groups. test_workaround_* tests DO
# pass, but demonstrate the need to completely reach up CSV's butt and
# violate its encapsulation rather vigorously. All other tests fail,
# demonstrating the problem that CSV presents an Enumerable and/or
# random-access interface that is completely dependent on the StringIO
# seek position.
class TestCsvSeekDependencies < Test::Unit::TestCase
def setup
@csv = CSV.new(<<PETS, :headers => true)
"name","type","age"
Noni,"Border Collie/Golden Retriever Mix",8
BamBam,Cat,2
Pebbles,Cat,2
PETS
end
def test_sanity
assert true, "This really should have been true."
end
def test_headers_do_not_depend_on_first_fetch
assert_equal ["name", "type", "age"], @csv.headers
end
def test_workaround_headers_do_not_depend_on_first_fetch
@csv.count
assert_equal ["name", "type", "age"], @csv.headers
end
def test_headers_does_not_get_erased_by_rewind
@csv.count
assert_equal ["name", "type", "age"], @csv.headers
@csv.rewind
assert_equal ["name", "type", "age"], @csv.headers
end
def test_count_does_not_depend_on_seek_position
assert_equal 3, @csv.count
assert_equal 3, @csv.count
end
def test_workaround_count_does_not_depend_on_seek_position
assert_equal 3, @csv.count
@csv.rewind
assert_equal 3, @csv.count
@csv.rewind
assert_equal 3, @csv.count
end
def test_first_does_not_depend_on_seek_position
assert_equal "Noni", @csv.first["name"]
assert_equal "Noni", @csv.first["name"]
end
def test_workaround_first_does_not_depend_on_seek_position
assert_equal "Noni", @csv.first["name"]
@csv.rewind
assert_equal "Noni", @csv.first["name"]
end
def test_first_does_not_depend_on_count_not_having_been_called_previously
assert_equal "Noni", @csv.first["name"]
assert_equal 3, @csv.count
end
def test_workaround_first_does_not_depend_on_count_not_having_been_called_previously
assert_equal "Noni", @csv.first["name"]
@csv.rewind
assert_equal 3, @csv.count
end
def test_each_should_not_depend_on_each_not_having_been_called
names = []
@csv.each do |row|
names << row["name"]
end
assert_equal ["Noni", "BamBam", "Pebbles"], names
names = []
@csv.each do |row|
names << row["name"]
end
assert_equal ["Noni", "BamBam", "Pebbles"], names
end
def test_each_should_not_depend_on_first_not_having_been_called
names = []
@csv.first
@csv.each do |row|
names << row["name"]
end
assert_equal ["Noni", "BamBam", "Pebbles"], names
end
def test_map_should_not_depend_on_geez_we_get_it_already
names = @csv.map {|row| row["name"] }
assert_equal ["Noni", "BamBam", "Pebbles"], names
names = @csv.map {|row| row["name"] }
assert_equal ["Noni", "BamBam", "Pebbles"], names
end
def test_anyp_should_not_depend_on_well_you_see_the_problem_now
assert_equal true, @csv.any? {|row| row["name"] == "BamBam"}
# next line passes, because seek pos is left at line 2
assert_equal true, @csv.any? {|row| row["name"] == "Pebbles"}
# fails because seek pos is now at EOF
assert_equal true, @csv.any? {|row| row["name"] == "Noni"}
end
def test_etc_and_so_forth_for_all_of_enumerable
@csv.count
assert_equal [["name", "BamBam"], ["type", "Cat"], ["age", "2"]], @csv.find {|row| row["name"] == "BamBam"}.to_a
assert_equal 12, @csv.inject(0) {|a,b| a + b["age"].to_i}
assert_equal [["name", "Pebbles"], ["type", "Cat"], ["age", "2"]], @csv.drop(2)[0].to_a
# ...etc...
end
def test_if_youre_gonna_play_in_texas_you_gotta_have_grep_in_the_band
# this would ALSO be dependent on seek position if it wasn't broken
assert_not_equal [], @csv.grep(/BamBam/)
end
end
# So... there you go. I hope you find this useful, and I *REALLY* hope
# you'll continue to support Enumerable in CSV. By which I mean start
# supporting. *cough* ;-)
# It may not be possible to leave the seek position alone while
# guaranteeing that Enumerable functions like map() and first() work
# correctly. You may well need to split Enumerable out of CSV
# altogether. If you do, like I said above, PLEASE provide a way to
# permit it, even if it means casting the whole file to_a or similar.
# Cf. ruby's IO#read_lines() method that just gives you back an array.
# Thanks for all your hard work! I'm grumpy because of time lost and
# the mega schedule stress it induced. I am NOT grumpy that you
# created this library and put it out there. Which is why I'm giving
# you unit tests rather than a complaint form. :-)
@JEG2
Copy link

JEG2 commented May 27, 2011 via email

@dbrady
Copy link
Author

dbrady commented May 27, 2011 via email

@JEG2
Copy link

JEG2 commented May 27, 2011 via email

@dbrady
Copy link
Author

dbrady commented May 27, 2011 via email

@JEG2
Copy link

JEG2 commented May 27, 2011 via email

@dbrady
Copy link
Author

dbrady commented May 27, 2011 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment