@innermond
Created November 21, 2014 13:56
Convert docx file to an epub file
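#!/bin/bash
# Convert every .docx file in a directory to .epub (via LibreOffice and pandoc).
# $1 = path to the directory where the docx files are stored
dir="$1"
for dox in "$dir"/*.docx; do
    # skip the literal pattern when the directory has no .docx files
    [ -e "$dox" ] || continue
    # same path, with spaces in the file name replaced by underscores
    # (the directory part is left untouched so mv still finds its target)
    name="${dox##*/}"
    doc="$dir/${name// /_}"
    # path without extension
    noext="${doc%.*}"
    echo "$doc"
    # rename the file if its name contained spaces
    if [ "$dox" != "$doc" ]; then
        mv -- "$dox" "$doc"
    fi
    # convert docx to UTF-8 plain text, written into the same directory
    soffice --headless --convert-to "txt:Text (encoded):UTF8" "$doc" --outdir "$dir"
    # prepare paragraphs for pandoc: drop empty lines and put a blank line
    # before every remaining line so each one becomes its own paragraph
    sed -i -r ':a;/^$/{d;ba};x;s/^.+$/\n/g;p;x' "$noext".txt
    # make the epub
    pandoc "$noext".txt -o "$noext".epub
done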
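
The renaming step in the script can be tried out on its own before running it against real files. The sketch below is an illustration only, not part of the gist: `sanitize_path` is a hypothetical helper name, and it assumes that only the file name, not the directory, should have its spaces replaced.

```shell
#!/bin/bash
# Hypothetical helper (not in the original gist): replace spaces with
# underscores in the file name only, leaving the directory part as it is.
sanitize_path() {
    local path="$1"
    local dir base
    dir="$(dirname -- "$path")"    # directory part, e.g. "my books"
    base="$(basename -- "$path")"  # file name part, e.g. "first draft.docx"
    printf '%s/%s\n' "$dir" "${base// /_}"
}

sanitize_path "my books/first draft.docx"
# prints: my books/first_draft.docx
```

Keeping the directory part unchanged matters when the directory name itself contains spaces, since the renamed path then still points into a directory that exists.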