@tucotuco
Created July 23, 2012 22:30
UTF8 BOM prepender
#!/usr/bin/awk -f
# Prepends the three-byte UTF-8 byte order mark (BOM), \xEF\xBB\xBF,
# to a file to explicitly signal UTF-8 encoding.
# Assumptions:
# The original file is actually UTF-8 encoded.
# Example (output goes to stdout, so redirect it to the destination file):
# gawk -f utf8er.awk source.csv > dest_with_bom.csv
# Prepend the BOM to the first record unless it already starts with one.
NR == 1 && $0 !~ /^\xEF\xBB\xBF/ { printf "%s", "\xEF\xBB\xBF" }
# Print every record unchanged.
{ print }
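
One way to try the same idea from the command line and confirm the result is sketched below. It is not the script above verbatim: it assumes only a POSIX awk, so the BOM is written with octal escapes (\357\273\277) instead of \x escapes, and the check uses index() rather than a regular expression. The file names are hypothetical.

```shell
# Create a small UTF-8 sample without a BOM (hypothetical file name).
printf 'id,name\n1,caf\303\251\n' > source.csv

# Same logic as the script: emit the BOM before the first record
# unless that record already starts with it, then print every record.
awk 'BEGIN { bom = "\357\273\277" }
     NR == 1 && index($0, bom) != 1 { printf "%s", bom }
     { print }' source.csv > dest_with_bom.csv

# Dump the first three bytes in hex; a BOM shows up as: ef bb bf
head -c 3 dest_with_bom.csv | od -An -tx1
```

Running the awk command again on dest_with_bom.csv should leave it unchanged, since the first record then already starts with the BOM.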