@seyyah
Forked from l3thal/gist:6836116
Created June 18, 2018 13:37
docx unique word count
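#!/usr/bin/ruby
require 'zip'       # rubyzip >= 1.0, where Zip::ZipFile was renamed to Zip::File
require 'nokogiri'

class Docx
  # Returns the number of unique whitespace-separated tokens in a .docx file.
  def self.word_count(file)
    # The block form closes the archive once the count has been computed.
    Zip::File.open(file) do |zip|
      xml = zip.find_entry("word/document.xml").get_input_stream.read
      Nokogiri::XML(xml).text.split.uniq.length
    end
  end
end

puts Docx.word_count("sample.docx")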
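
Note that splitting on whitespace treats "The", "the" and "the," as three different words. A minimal sketch of a stricter count, assuming case-insensitive matching and ignoring punctuation is what you want. `unique_words` is a hypothetical helper, not part of the original gist, and it uses only the Ruby standard library:

```ruby
# Hypothetical helper (not in the original gist): returns the unique words
# in a string, ignoring case and surrounding punctuation.
def unique_words(text)
  # Lowercase first, then keep runs of letters, allowing inner apostrophes
  # so that "cat's" stays one word.
  text.downcase.scan(/[[:alpha:]]+(?:'[[:alpha:]]+)*/).uniq
end

sample = "The cat. The CAT's hat, the hat!"
puts sample.split.uniq.length      # whitespace-only count, as in the gist
puts unique_words(sample).length   # normalized count
```

To use it with the script above, the text extracted from `word/document.xml` would be passed to `unique_words` instead of calling `split.uniq` on it directly.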