@kitplummer
Created December 7, 2009 18:27
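#code
require "fileutils"
require "shellwords"

module PDFToHTMLR
  class PDFToHTMLRError < RuntimeError; end

  VERSION = '0.1.0'

  class PdfFile
    attr_reader :path, :target

    def initialize(input_path, target_path)
      @path = input_path
      @target = target_path
      # Check that the input file exists before attempting a conversion.
      raise PDFToHTMLRError, "invalid file path" unless File.exist?(@path)
    end

    # Runs pdftohtml and returns the generated HTML as a String.
    def convert
      # Escape the path so spaces or shell metacharacters cannot break the command.
      `pdftohtml -stdout #{Shellwords.escape(@path)}`
    end
  end
end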
#test code
def test_string_from_pdffile
  file = PdfFile.new(TEST_PDF_PATH, nil)
  contents = file.convert
  #p "OBJECT #{contents}"
  assert_equal "String", contents.class.to_s
  #assert_equal "TEST_HTML.to_s", contents
  assert_equal IO.read("convert_test.html"), contents
end
#test results
Loaded suite /Users/kplummer/Development/Dozer/pdftohtmlr/test/pdftohtmlr_test
Started
"ERROR: invalid file path"
..F
Finished in 0.023023 seconds.
1) Failure:
test_string_from_pdffile:31
<"<!DOCTYPE HTML PUBLIC \\\"-//W3C//DTD HTML 4.01 Transitional//EN\\\">\\n<HTML>\\n<HEAD>\\n<TITLE>untitled</TITLE>\\n<META http-equiv=\\\"Content-Type\\\" content=\\\"text/html; charset=ISO-8859-1\\\">\\n<META name=\\\"generator\\\" content=\\\"pdftohtml 0.40\\\">\\n<META name=\\\"author\\\" content=\\\"Kit Plummer\\\">\\n<META name=\\\"keywords\\\" content=\\\"\\\">\\n<META name=\\\"date\\\" content=\\\"2009-12-07T00:01:15+00:00\\\">\\n</HEAD>\\n<BODY bgcolor=\\\"#A0A0A0\\\" vlink=\\\"blue\\\" link=\\\"blue\\\">\\n<A name=1></a>This is a test PDF document for the PDFtoHTMLR tool.<br>\\n<hr>\\n</BODY>\\n</HTML>\\n"> expected but was
<"<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">\n<HTML>\n<HEAD>\n<TITLE>untitled</TITLE>\n<META http-equiv=\"Content-Type\" content=\"text/html; charset=ISO-8859-1\">\n<META name=\"generator\" content=\"pdftohtml 0.40\">\n<META name=\"author\" content=\"Kit Plummer\">\n<META name=\"keywords\" content=\"\">\n<META name=\"date\" content=\"2009-12-07T00:01:15+00:00\">\n</HEAD>\n<BODY bgcolor=\"#A0A0A0\" vlink=\"blue\" link=\"blue\">\n<A name=1></a>This is a test PDF document for the PDFtoHTMLR tool.<br>\n<hr>\n</BODY>\n</HTML>\n">.
3 tests, 4 assertions, 1 failures, 0 errors
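
One reading of the failure above: the expected value is printed with doubled escapes (\\n and \\\"), which suggests that convert_test.html holds the escaped text produced by p or inspect rather than the raw HTML. The sketch below reproduces that kind of mismatch in memory. The shortened HTML string and the variable names are illustrative assumptions, not taken from the project.

```ruby
# Hypothetical, shortened stand-in for the string returned by convert.
actual = "<HTML>\n<BODY bgcolor=\"#A0A0A0\">\n</BODY>\n</HTML>\n"

# Assumption: the fixture was created by pasting inspect/p output into a file,
# so it contains a backslash followed by "n" instead of a real newline.
fixture_text = actual.inspect[1..-2] # drop the surrounding quotes

puts fixture_text == actual        # the comparison the test performs
puts fixture_text.include?("\\n")  # is a literal backslash-n in the fixture?
puts actual.include?("\\n")        # is a literal backslash-n in the real output?
```

If that is the cause, regenerating the fixture from the raw string, for example with File.open("convert_test.html", "w") { |f| f.write(contents) }, instead of from printed output may let the assertion pass.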