Last active
June 29, 2017 12:13
Convert Docx for Office 365.
require "fileutils"
# Version 3.1
#
# Description: This script fixes docx files that fail to open in the new Office 365
# with an "Xml parsing error". The error is caused by a deprecated XML attribute
# (xml:space="preserve"). This script parses the docx and removes every occurrence.
#
# Requires 7-Zip installed and added to the PATH.
#
# Usage:
# ruby convertToDocx365 file.docx
#
# This will generate a converted docx file.
# Remember to open the docx and refresh the index table.
# @ToDo replace 7zip and use a gem to compress the file.
# Begin:
# requires an XML file name as parameter to replace its content
module ConvertToDocx365
  # Removes every xml:space attribute from the given XML file, in place.
  def self.Convert file_name
    puts "file to open : #{file_name}"
    # required to match both "preserve" and "preserver"
    match_pattern = /xml:space="preserv(er"|e")/
    text = File.read(file_name)
    text.gsub!(match_pattern, "")
    # Write the changes back to the file
    File.open(file_name, "w") { |file| file.puts text }
    puts "matches were replaced OK"
    return true
  end

  # Prints a shell command and then executes it.
  def self.run(cmd)
    puts cmd
    system cmd
  end
end
puts "Conversion begins:"
docx_file = ARGV[0]
# stop early when no file name was given, instead of failing on nil below
abort "usage: ruby convertToDocx365 file.docx" if docx_file.nil?
if docx_file.include?(' ')
  # the shell commands below do not quote the file name
  puts "the file name contains spaces"
  abort
end
target_doc = 'document.xml'
target_dir = "#{docx_file}_temp"
puts File.basename(docx_file)
puts "extracting #{target_doc}"
# x  = extract keeping the folder structure
# -r = recursive
# -o = output folder (the folder name follows the switch without a space)
CMD_UNZIP = "7z x #{docx_file} -o#{target_dir} -r"
ConvertToDocx365::run CMD_UNZIP
puts "Replacing patterns"
res = ConvertToDocx365::Convert File.join('.', target_dir, 'word', target_doc)
if res then
  # rename the original file to "old"
  CMD_MOVE = "mv #{docx_file} #{docx_file}.old"
  ConvertToDocx365::run CMD_MOVE
  # a = add to zip
  CMD_ZIP = "7z a #{docx_file} .\\#{target_dir}\\* -r"
  ConvertToDocx365::run CMD_ZIP
  puts "Done"
else
  puts "Failed"
end
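
The substitution step can be tried on its own, without 7-Zip or a real document. The sketch below applies the script's regular expression to an in-memory string. The helper name `strip_xml_space` and the sample XML are made up for illustration and are not part of the script.

```ruby
# Same pattern the script uses; it matches both "preserve" and "preserver".
MATCH_PATTERN = /xml:space="preserv(er"|e")/

# Hypothetical helper: returns a copy of the XML text without the attribute.
def strip_xml_space(xml)
  xml.gsub(MATCH_PATTERN, "")
end

# Made-up sample resembling a fragment of word/document.xml.
sample = '<w:t xml:space="preserve"> Hello</w:t><w:t xml:space="preserver">World</w:t>'
cleaned = strip_xml_space(sample)
puts cleaned
# prints: <w:t > Hello</w:t><w:t >World</w:t>
```

Only the attribute itself is removed, so the space that preceded it stays inside the tag. That leftover space is still well-formed XML.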