Last active
August 16, 2024 05:32
PowerShell function that emulates Out-File for creating UTF-8-encoded files *without a BOM* (byte-order mark).
<#

Prerequisites: PowerShell version 3 or above.
License: MIT
Author: Michael Klement <[email protected]>

DOWNLOAD and DEFINITION OF THE FUNCTION:

  irm https://gist.github.com/mklement0/8689b9b5123a9ba11df7214f82a673be/raw/Out-FileUtf8NoBom.ps1 | iex

The above directly defines the function below in your session and offers guidance for making it available in future
sessions too. To silence the guidance information, append 4>$null

CAVEAT: If you run this command *from a script*, you'll get a spurious warning about dot-sourcing, which you can ignore
and suppress by appending 3>$null. However, it's best to avoid calling this command from scripts, because later versions
of this Gist aren't guaranteed to be backward-compatible; however, you can modify the command to lock in a
*specific revision* of this Gist, which is guaranteed not to change: see the instructions at
https://gist.github.com/mklement0/880624fd665073bb439dfff5d71da886?permalink_comment_id=4296669#gistcomment-4296669

DOWNLOAD ONLY:

  irm https://gist.github.com/mklement0/8689b9b5123a9ba11df7214f82a673be/raw > Out-FileUtf8NoBom.ps1

Downloads to the specified file, which you then need to dot-source to make the function available in the current session:

  . ./Out-FileUtf8NoBom.ps1

To learn what the function does:
  * see the next comment block
  * or, once downloaded and defined, invoke the function with -? or pass its name to Get-Help.

#>
function Out-FileUtf8NoBom {
  <#
  .SYNOPSIS
    Outputs to a UTF-8-encoded file *without a BOM* (byte-order mark).

  .DESCRIPTION
    Mimics the most important aspects of Out-File:
      * Input objects are sent to Out-String first.
      * -Append allows you to append to an existing file, -NoClobber prevents
        overwriting of an existing file.
      * -Width allows you to specify the line width for the text representations
        of input objects that aren't strings.
    However, it is not a complete implementation of all Out-File parameters:
      * Only a literal output path is supported, and only as a parameter.
      * -Force is not supported.
      * Conversely, an extra -UseLF switch is supported for using LF-only newlines.

  .NOTES
    The raison d'être for this advanced function is that Windows PowerShell
    lacks the ability to write UTF-8 files without a BOM: using -Encoding UTF8
    invariably prepends a BOM.

    Copyright (c) 2017, 2022 Michael Klement <[email protected]> (http://same2u.net),
    released under the [MIT license](https://spdx.org/licenses/MIT#licenseText).
  #>

  [CmdletBinding(PositionalBinding=$false)]
  param(
    [Parameter(Mandatory, Position = 0)] [string] $LiteralPath,
    [switch] $Append,
    [switch] $NoClobber,
    [AllowNull()] [int] $Width,
    [switch] $UseLF,
    [Parameter(ValueFromPipeline)] $InputObject
  )
  begin {

    # Convert the input path to a full one, since .NET's working dir. usually
    # differs from PowerShell's.
    $dir = Split-Path -LiteralPath $LiteralPath
    if ($dir) { $dir = Convert-Path -ErrorAction Stop -LiteralPath $dir } else { $dir = $pwd.ProviderPath }
    $LiteralPath = [IO.Path]::Combine($dir, [IO.Path]::GetFileName($LiteralPath))

    # If -NoClobber was specified, throw an exception if the target file already
    # exists. -LiteralPath ensures that characters such as [ and ] in the path
    # are not interpreted as wildcards.
    if ($NoClobber -and (Test-Path -LiteralPath $LiteralPath)) {
      Throw [IO.IOException] "The file '$LiteralPath' already exists."
    }

    # Create a StreamWriter object.
    # Note that we take advantage of the fact that the StreamWriter class by default:
    # - uses UTF-8 encoding
    # - without a BOM.
    $sw = New-Object System.IO.StreamWriter $LiteralPath, $Append

    $htOutStringArgs = @{}
    if ($Width) { $htOutStringArgs += @{ Width = $Width } }

    try {
      # Create the script block with the command to use in the steppable pipeline.
      $scriptCmd = {
        & Microsoft.PowerShell.Utility\Out-String -Stream @htOutStringArgs |
          . { process { if ($UseLF) { $sw.Write(($_ + "`n")) } else { $sw.WriteLine($_) } } }
      }

      $steppablePipeline = $scriptCmd.GetSteppablePipeline($myInvocation.CommandOrigin)
      $steppablePipeline.Begin($PSCmdlet)
    }
    catch {
      # Release the file handle before rethrowing, so that the file isn't left locked.
      $sw.Dispose()
      throw
    }

  }

  process {
    $steppablePipeline.Process($_)
  }

  end {
    $steppablePipeline.End()
    $sw.Dispose()
  }

}

# --------------------------------
# GENERIC INSTALLATION HELPER CODE
# --------------------------------
#    Provides guidance for making the function persistently available when
#    this script is either directly invoked from the originating Gist or
#    dot-sourced after download.
#    IMPORTANT:
#       * DO NOT USE `exit` in the code below, because it would exit
#         the calling shell when Invoke-Expression is used to directly
#         execute this script's content from GitHub.
#       * Because the typical invocation is DOT-SOURCED (via Invoke-Expression),
#         do not define variables or alter the session state via Set-StrictMode, ...
#         *except in child scopes*, via & { ... }

if ($MyInvocation.Line -eq '') {
  # Most likely, this code is being executed via Invoke-Expression directly
  # from gist.github.com

  # To simulate for testing with a local script, use the following:
  # Note: Be sure to use a path and to use "/" as the separator.
  #  iex (Get-Content -Raw ./script.ps1)

  # Derive the function name from the invocation command, via the enclosing
  # script name presumed to be contained in the URL.
  # NOTE: Unfortunately, when invoked via Invoke-Expression, $MyInvocation.MyCommand.ScriptBlock
  #       with the actual script content is NOT available, so we cannot extract
  #       the function name this way.
  & {

    param($invocationCmdLine)

    # Try to extract the function name from the URL.
    $funcName = $invocationCmdLine -replace '^.+/(.+?)(?:\.ps1).*$', '$1'
    if ($funcName -eq $invocationCmdLine) {
      # Function name could not be extracted, just provide a generic message.
      # Note: Hypothetically, we could try to extract the Gist ID from the URL
      #       and use the REST API to determine the first filename.
      Write-Verbose -Verbose "Function is now defined in this session."
    }
    else {

      # Indicate that the function is now defined and also show how to
      # add it to the $PROFILE or convert it to a script file.
      Write-Verbose -Verbose @"
Function `"$funcName`" is now defined in this session.
* If you want to add this function to your `$PROFILE, run the following:
"``nfunction $funcName {``n`${function:$funcName}``n}" | Add-Content `$PROFILE
* If you want to convert this function into a script file that you can invoke
directly, run:
"`${function:$funcName}" | Set-Content $funcName.ps1 -Encoding $('utf8' + ('', 'bom')[[bool] (Get-Variable -ErrorAction Ignore IsCoreCLR -ValueOnly)])
"@
    }

  } $MyInvocation.MyCommand.Definition # Pass the original invocation command line to the script block.

}
else {
  # Invocation presumably as a local file after manual download,
  # either dot-sourced (as it should be) or mistakenly directly.

  & {

    param($originalInvocation)

    # Parse this file to reliably extract the name of the embedded function,
    # irrespective of the name of the script file.
    $ast = $originalInvocation.MyCommand.ScriptBlock.Ast
    $funcName = $ast.Find( { $args[0] -is [System.Management.Automation.Language.FunctionDefinitionAst] }, $false).Name

    if ($originalInvocation.InvocationName -eq '.') {
      # Being dot-sourced as a file.

      # Provide a hint that the function is now loaded and provide
      # guidance for how to add it to the $PROFILE.
      Write-Verbose -Verbose @"
Function `"$funcName`" is now defined in this session.
If you want to add this function to your `$PROFILE, run the following:
"``nfunction $funcName {``n`${function:$funcName}``n}" | Add-Content `$PROFILE
"@

    }
    else {
      # Mistakenly directly invoked.

      # Issue a warning that the function definition didn't take effect and
      # provide guidance for reinvocation and adding to the $PROFILE.
      Write-Warning @"
This script contains a definition for function "$funcName", but this definition
only takes effect if you dot-source this script.
To define this function for the current session, run:
. "$($originalInvocation.MyCommand.Path)"
"@
    }

  } $MyInvocation # Pass the original invocation info to the helper script block.

}
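
The function's comments point out that `System.IO.StreamWriter` defaults to UTF-8 without a BOM and that a full path is needed because .NET's working directory usually differs from PowerShell's. The following is only a minimal sketch of that underlying technique, not a substitute for the function; the file name `no-bom-demo.txt` and the variable names are assumptions made for illustration.

```powershell
# Hypothetical demo file in the temp directory. A full path is used because
# .NET's working directory usually differs from PowerShell's.
$demoPath = Join-Path ([IO.Path]::GetTempPath()) 'no-bom-demo.txt'

# The StreamWriter(path, append) constructor defaults to UTF-8 without a BOM.
$sw = New-Object System.IO.StreamWriter $demoPath, $false
try {
  # [char] 0xE9 is "é"; it is built from its code point so that the sample
  # does not depend on how this script file itself is encoded.
  # The explicit "`n" writes an LF-only newline, comparable to -UseLF.
  $sw.Write("caf$([char] 0xE9)`n")
}
finally {
  # Always release the file handle, even if writing fails.
  $sw.Dispose()
}

# Inspect the raw bytes: a UTF-8 BOM would be the leading sequence EF BB BF.
$bytes = [IO.File]::ReadAllBytes($demoPath)
$hasBom = $bytes.Length -ge 3 -and $bytes[0] -eq 0xEF -and $bytes[1] -eq 0xBB -and $bytes[2] -eq 0xBF
$hex = ($bytes | ForEach-Object { $_.ToString('X2') }) -join ' '

"Has BOM: $hasBom"
"Bytes:   $hex"
```

Here `$hasBom` should be `$false`, and the bytes should be `63 61 66 C3 A9 0A`: the file starts directly with the text, with no `EF BB BF` prefix.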
Thanks for pointing out the problem, @hunandy14 - and sorry that your $PROFILE got corrupted.
Yes, in Windows PowerShell, `>>` (in effect, `Out-File -Append`) blindly appends UTF-16LE-encoded text, which can corrupt a file that uses a different encoding, as happened here.
(In PowerShell (Core) 7+, everything now defaults to BOM-less UTF-8, so this function isn't necessary to begin with.)
The simpler solution is to use `Add-Content`, which actually tries to match the encoding of the preexisting content (and in Windows PowerShell that content may actually be ANSI-encoded).
I've updated the code accordingly.
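
To illustrate the difference, here is a minimal sketch that appends to a stand-in for `$PROFILE` with `Add-Content` and then checks that the existing non-ASCII text survived. The file name `profile-demo.ps1` and the function `Get-Demo` are hypothetical; the appended line is deliberately ASCII-only, on the assumption that in Windows PowerShell non-ASCII text appended to a BOM-less file would be written as ANSI.

```powershell
# Hypothetical stand-in for $PROFILE: a BOM-less UTF-8 file with non-ASCII text.
$demoProfile = Join-Path ([IO.Path]::GetTempPath()) 'profile-demo.ps1'

# A comment line containing two Chinese characters, built from code points so
# that the sample does not depend on how this script file itself is encoded.
$original = "# $([char] 0x4E2D)$([char] 0x6587)"
$utf8NoBom = New-Object System.Text.UTF8Encoding $false
[IO.File]::WriteAllText($demoProfile, $original + "`n", $utf8NoBom)

# Append an ASCII-only line with Add-Content rather than with >>
'function Get-Demo { "demo" }' | Add-Content -LiteralPath $demoProfile

# Read the file back as UTF-8 and check that the original text is intact.
$text = [IO.File]::ReadAllText($demoProfile, $utf8NoBom)
$intact = $text.Contains($original) -and $text.Contains('function Get-Demo')
"Original text intact: $intact"
```

With `>>` in Windows PowerShell, the appended line would instead be written as UTF-16LE, leaving the file with mixed encodings.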
Hi, I appreciate your article; it helped me figure out my problem with the UTF-8 format.
My $PROFILE file was originally in UTF-8 format and contains Chinese text.
When I follow the instructions below, the $PROFILE file turns into ISO 8859-2 format, which damages the Chinese text and turns it into garbled characters.
Maybe it would be better to amend it like the following? (I think this problem only happens to people who don't use English.)