@oiotoxt
Last active May 9, 2018 17:06
#Requires -version 3.0
##############################################################################
# Filename : txt2utf8.ps1
# Version : v1.0
# Gist URL : https://gist.github.com/oiotoxt/5586ccbce94ba7018ba4
##############################################################################
#
# Convert text files recursively to UTF-8
#
# \ option : input directory
# \ option : with or without BOM
# \ option : CR+LF ==> LF
# \ option : filter
#
# Assumption
# Input files without a BOM are assumed to be encoded in the system code page (e.g. CP1252, CP949)
#
# Hint
# For Remote Signed : Run <Set-ExecutionPolicy RemoteSigned>
# For Unrestricted : Run <Set-ExecutionPolicy Unrestricted>
##############################################################################
# config
# input directory (relative or absolute. write drive-name in capital letter)
$_srcDir = "TextFolder"
#$_srcDir = "C:\TextFolder"
# with or without BOM
$_withBom = $TRUE
# CR+LF ==> LF
$_forceLf = $TRUE
# filter
[String[]] $_include = @("*.*")
#[String[]] $_include = @("*.h", "*.cpp")
# filter
[String[]] $_exclude = @("")
#[String[]] $_exclude = @("*.log")
##############################################################################
# output directory
$_dstDir = $_srcDir
if ($_withBom) {
    $_dstDir = $_dstDir + ".UTF8+BOM"
} else {
    $_dstDir = $_dstDir + ".UTF8+NOBOM"
}
if ($_forceLf) {
    $_dstDir = $_dstDir + "&LF"
}
##############################################################################
# converter ( any ==> UTF8 )
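Function ConvertToUtf8(
    [String] $srcDir, [String] $dstDir,
    [String[]] $include, [String[]] $exclude,
    [bool] $withBom, [bool] $forceLF) {
    $cnt = 0
    $encoding = New-Object System.Text.UTF8Encoding($withBom)
    # Resolve both roots to absolute paths so the destination is built from a
    # path prefix instead of a substring match (String.Replace would rewrite
    # every occurrence of the folder name in the path).
    $srcRoot = (Resolve-Path -LiteralPath $srcDir -ErrorAction Stop).ProviderPath
    $dstRoot = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($dstDir)
    foreach ($item in Get-ChildItem $srcRoot -Include $include -Exclude $exclude -Recurse) {
        # Skip directories
        if ($item.PSIsContainer) {
            continue
        }
        if (!$item.FullName.StartsWith($srcRoot, [StringComparison]::OrdinalIgnoreCase)) {
            Write-Host "Error. Please compare the following two lines."
            Write-Host ">> `$item.FullName :" $item.FullName
            Write-Host ">> `$srcRoot :" $srcRoot
            return
        }
        $dst = $dstRoot + $item.FullName.Substring($srcRoot.Length)
        $dstParent = Split-Path $dst -Parent
        if (!(Test-Path -LiteralPath $dstParent)) {
            New-Item $dstParent -ItemType Directory | Out-Null
        }
        # Pass 1: in Windows PowerShell, Get-Content decodes by BOM, or as the
        # system code page when the file has none.
        Get-Content -LiteralPath $item.FullName |
            Out-File -LiteralPath $dst -Encoding utf8
        # Pass 2: optionally convert CR+LF to LF, then rewrite the file with
        # the requested BOM setting.
        $fileContents = [IO.File]::ReadAllText($dst)
        if ($forceLF) {
            $fileContents = $fileContents -Replace "`r`n", "`n"
        }
        [System.IO.File]::WriteAllText($dst, $fileContents, $encoding)
        Write-Host "------------------------------------------------------------------------------"
        Write-Host "`$src :" $item.FullName
        Write-Host "`$dst :" $dst
        Write-Host "LENGTH :" $item.Length "==>" (Get-Item -LiteralPath $dst).Length
        $cnt++
    }
    Write-Host "`n* * * * * * * *" $cnt "file(s) converted * * * * * * * *"
}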
##############################################################################
# run
ConvertToUtf8 $_srcDir $_dstDir $_include $_exclude $_withBom $_forceLf
@oiotoxt
Author

oiotoxt commented Jan 7, 2015

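This is a Windows PowerShell script that converts text files to UTF-8 in bulk.

Options

  • Specify the source folder
  • Choose whether to add a BOM (Byte Order Mark)
  • Choose the line ending (CRLF, LF)
  • File filter (e.g. select only *.cpp)

Caution!

  • If the specified source folder contains files without a BOM, they are assumed to be encoded in the current system code page (Korean = CP949). (That is, they are assumed not to be UTF-8 files, so non-ASCII text in BOM-less UTF-8 input would be decoded incorrectly.)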
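
Because of that assumption, it can help to see which files lack a BOM before converting. The following is only a sketch: `Test-Utf8Bom` is a hypothetical helper that is not part of txt2utf8.ps1, and it checks only for the UTF-8 BOM bytes (EF BB BF), not for UTF-16 or other marks.

```powershell
# Hypothetical helper (not part of txt2utf8.ps1): returns $true when the file
# starts with the UTF-8 BOM bytes EF BB BF.
function Test-Utf8Bom {
    param([Parameter(Mandatory = $true)][string] $Path)

    # Resolve to an absolute path first: .NET file APIs use the process
    # working directory, which can differ from the PowerShell location.
    $fullPath = (Resolve-Path -LiteralPath $Path -ErrorAction Stop).ProviderPath
    $buffer = New-Object byte[] 3
    $stream = [System.IO.File]::OpenRead($fullPath)
    try {
        $read = $stream.Read($buffer, 0, 3)
    } finally {
        $stream.Dispose()
    }
    $hasBom = ($read -eq 3) -and ($buffer[0] -eq 0xEF) -and ($buffer[1] -eq 0xBB) -and ($buffer[2] -eq 0xBF)
    return $hasBom
}

# Example use (assumes a folder named "TextFolder" exists):
# Get-ChildItem "TextFolder" -Recurse -File |
#     Where-Object { -not (Test-Utf8Bom $_.FullName) } |
#     ForEach-Object { $_.FullName }
```

Any file this lists that is actually UTF-8 would need to be excluded through `$_exclude` or handled separately, since the script would treat it as code-page text.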

Scenario

  1. Run PowerShell as administrator
  2. Run the following command

Set-ExecutionPolicy Unrestricted

  3. Edit the following line in txt2utf8.ps1

$_srcDir = "TextFolder"

  4. Run the following command

./txt2utf8.ps1
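
As a narrower alternative to the machine-wide `Set-ExecutionPolicy` step, the policy can be relaxed for the current session only. This is a sketch rather than part of the original instructions; it assumes Windows PowerShell and that txt2utf8.ps1 is in the current directory. A copy downloaded from the internet may still be blocked under `RemoteSigned` until it is unblocked (for example with `Unblock-File`).

```powershell
# Applies only to this PowerShell process; the machine-wide policy is left
# untouched and the setting is discarded when the session ends.
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope Process

# Then run the script as in the last step of the scenario.
./txt2utf8.ps1
```

The `Process` scope does not require an elevated (administrator) prompt.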
