@andreldm
Created March 6, 2015 14:10
Script to convert doc[x] to html
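# Converts every .doc/.docx file under -docpath to filtered HTML through Word's COM automation.
# Requires Microsoft Word on the machine that runs the script.
# Usage: .\doc2html.ps1 -docpath "C:\full\path\to\docs\folder" -htmlpath "C:\full\path\output\folder"
# If -htmlpath is omitted, the HTML files are written to -docpath.
param([string]$docpath, [string]$htmlpath = $docpath)
# Load the Word interop assembly so the WdSaveFormat enum below resolves.
# This must come after param(), which has to be the first statement in the script.
[void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.Office.Interop.Word")
# Collect the source documents, including those in subfolders.
$srcfiles = Get-ChildItem $docpath -Include *.doc,*.docx -Recurse
$saveFormat = [Enum]::Parse([Microsoft.Office.Interop.Word.WdSaveFormat], "wdFormatFilteredHTML")
# Start Word in the background, without showing its window.
$word = New-Object -ComObject Word.Application
$word.Visible = $false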
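# Opens one document read-only and saves it as filtered HTML in $htmlpath.
function Save-AsFilteredHtml($doc)
{
    # Documents.Open(FileName, ConfirmConversions, ReadOnly)
    $opendoc = $word.Documents.Open($doc.FullName, $false, $true)
    # Swap the .doc/.docx extension for .html; the dot is escaped so only a real extension matches.
    $out = $doc.Name -replace '\.docx?$', '.html'
    $opendoc.SaveAs("$htmlpath\$out", $saveFormat)
    # Close without saving changes to the source document.
    $opendoc.Close($false)
}
foreach ($doc in $srcfiles)
{
    Write-Host "Processing :" $doc.Name
    Save-AsFilteredHtml $doc
}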
# Shut down the background Word process.
$word.Quit()