@nickleefly
Last active December 19, 2015 10:09
Linux command one-liners: sed one-liners, awk one-liners, Perl one-liners. The Carriage Return and Line Feed Characters -- How They Affect Text Files On Different Platforms. How do I convert between Unix and Windows text files?
HANDY ONE-LINE SCRIPTS FOR AWK 30 April 2008
Compiled by Eric Pement - eric [at] pement.org version 0.27
Latest version of this file (in English) is usually at:
http://www.pement.org/awk/awk1line.txt
This file will also be available in other languages:
Chinese - http://ximix.org/translation/awk1line_zh-CN.txt
USAGE:
Unix: awk '/pattern/ {print "$1"}' # standard Unix shells
DOS/Win: awk '/pattern/ {print "$1"}' # compiled with DJGPP, Cygwin
awk "/pattern/ {print \"$1\"}" # GnuWin32, UnxUtils, Mingw
Note that the DJGPP compilation (for DOS or Windows-32) permits an awk
script to follow Unix quoting syntax '/like/ {"this"}'. HOWEVER, if the
command interpreter is CMD.EXE or COMMAND.COM, single quotes will not
protect the redirection arrows (<, >) nor do they protect pipes (|).
These are special symbols which require "double quotes" to protect them
from interpretation as operating system directives. If the command
interpreter is bash, ksh or another Unix shell, then single and double
quotes will follow the standard Unix usage.
Users of MS-DOS or Microsoft Windows must remember that the percent
sign (%) is used to indicate environment variables, so this symbol must
be doubled (%%) to yield a single percent sign visible to awk.
If a script will not need to be quoted in Unix, DOS, or CMD, then I
normally omit the quote marks. If an example is peculiar to GNU awk,
the command 'gawk' will be used. Please notify me if you find errors or
new commands to add to this list (total length under 65 characters). I
usually try to put the shortest script first. To conserve space, I
normally use '1' instead of '{print}' to print each line. Either one
will work.
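As a quick check of the '1' shorthand described above, the sketch below runs both forms on the same input. The helper name sample and its two-line input are made up for illustration.

```shell
# Hypothetical two-line sample input.
sample() { printf 'alpha\nbeta\n'; }

# A pattern with no action defaults to printing the line, and 1 is always true.
sample | awk '1'

# The explicit form prints the same two lines.
sample | awk '{print}'
```

Both commands should print the input unchanged, so either spelling can be used in the scripts that follow.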
FILE SPACING:
# double space a file
awk '1;{print ""}'
awk 'BEGIN{ORS="\n\n"};1'
# double space a file which already has blank lines in it. Output file
# should contain no more than one blank line between lines of text.
# NOTE: On Unix systems, DOS lines which have only CRLF (\r\n) are
# often treated as non-blank, and thus 'NF' alone will return TRUE.
awk 'NF{print $0 "\n"}'
# triple space a file
awk '1;{print "\n"}'
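The CRLF note above can be tried out with a small sketch. The helper dos_sample and its contents are assumptions for illustration; the first count depends on whether the local awk treats \r as whitespace, so only the second result is stated.

```shell
# Hypothetical sample: three CRLF-terminated lines, the middle one "blank".
dos_sample() { printf 'one\r\n\r\ntwo\r\n'; }

# NF alone: on awks that do not treat \r as whitespace, the CR-only line
# still counts as having one field and is kept.
dos_sample | awk 'NF' | awk 'END{print NR}'

# Stripping a trailing CR first lets NF see that line as empty.
dos_sample | awk '{sub(/\r$/,"")} NF' | awk 'END{print NR}'
```

The second pipeline reports 2 lines, which is the behavior the double-spacing script expects from Unix-format input.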
NUMBERING AND CALCULATIONS:
# precede each line by its line number FOR THAT FILE (left alignment).
# Using a tab (\t) instead of space will preserve margins.
awk '{print FNR "\t" $0}' files*
# precede each line by its line number FOR ALL FILES TOGETHER, with tab.
awk '{print NR "\t" $0}' files*
# number each line of a file (number on left, right-aligned)
# Double the percent signs if typing from the DOS command prompt.
awk '{printf("%5d : %s\n", NR,$0)}'
# number each line of file, but only print numbers if line is not blank
# Remember caveats about Unix treatment of \r (mentioned above)
awk 'NF{$0=++a " :" $0};1'
awk '{print (NF? ++a " :" :"") $0}'
# count lines (emulates "wc -l")
awk 'END{print NR}'
# print the sums of the fields of every line
awk '{s=0; for (i=1; i<=NF; i++) s=s+$i; print s}'
# add all fields in all lines and print the sum
awk '{for (i=1; i<=NF; i++) s=s+$i}; END{print s}'
# print every line after replacing each field with its absolute value
awk '{for (i=1; i<=NF; i++) if ($i < 0) $i = -$i; print }'
awk '{for (i=1; i<=NF; i++) $i = ($i < 0) ? -$i : $i; print }'
# print the total number of fields ("words") in all lines
awk '{ total = total + NF }; END {print total}' file
# print the total number of lines that contain "Beth"
awk '/Beth/{n++}; END {print n+0}' file
# print the largest first field and the line that contains it
# Note: $1 > max compares values (numerically or lexically), not lengths
awk '$1 > max {max=$1; maxline=$0}; END{ print max, maxline}'
# print the number of fields in each line, followed by the line
awk '{ print NF ":" $0 } '
# print the last field of each line
awk '{ print $NF }'
# print the last field of the last line
awk '{ field = $NF }; END{ print field }'
# print every line with more than 4 fields
awk 'NF > 4'
# print every line where the value of the last field is > 4
awk '$NF > 4'
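The difference between FNR and NR in the numbering entries at the top of this section only shows up with more than one input file. A minimal sketch, assuming two throwaway files created in a scratch directory (the file names are made up):

```shell
# Hypothetical two-line files in a temporary directory.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\nd\n' > "$dir/two.txt"

# FNR restarts at 1 for each file; NR keeps counting across all files.
out=$(awk '{print FNR, NR, $0}' "$dir/one.txt" "$dir/two.txt")
printf '%s\n' "$out"

rm -rf "$dir"
```

The last line printed is "2 4 d": second line of its own file, fourth line overall.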
STRING CREATION:
# create a string of a specific length (e.g., generate 513 spaces)
awk 'BEGIN{while (a++<513) s=s " "; print s}'
# insert a string of specific length at a certain character position
# Example: insert 49 spaces after column #6 of each input line.
gawk --re-interval 'BEGIN{while(a++<49)s=s " "};{sub(/^.{6}/,"&" s)};1'
ARRAY CREATION:
# These next 2 entries are not one-line scripts, but the technique
# is so handy that it merits inclusion here.
# create an array named "month", indexed by numbers, so that month[1]
# is 'Jan', month[2] is 'Feb', month[3] is 'Mar' and so on.
split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", month, " ")
# create an array named "mdigit", indexed by strings, so that
# mdigit["Jan"] is 1, mdigit["Feb"] is 2, etc. Requires "month" array
for (i=1; i<=12; i++) mdigit[month[i]] = i
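Since the two array entries above are fragments, here is one way they might be combined into a runnable program. The wrapper function month_demo and the two lookups it prints are illustrative assumptions.

```shell
# Build both arrays in a BEGIN block, then look one value up in each.
month_demo() {
    awk 'BEGIN {
        split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", month, " ")
        for (i = 1; i <= 12; i++) mdigit[month[i]] = i
        print month[3], mdigit["Oct"]
    }'
}

month_demo
```

This prints "Mar 10": month[] maps a number to a name and mdigit[] maps the name back to its number.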
TEXT CONVERSION AND SUBSTITUTION:
# IN UNIX ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format
awk '{sub(/\r$/,"")};1' # assumes EACH line ends with Ctrl-M
# IN UNIX ENVIRONMENT: convert Unix newlines (LF) to DOS format
awk '{sub(/$/,"\r")};1'
# IN DOS ENVIRONMENT: convert Unix newlines (LF) to DOS format
awk 1
# IN DOS ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format
# Cannot be done with DOS versions of awk, other than gawk:
gawk -v BINMODE="w" '1' infile >outfile
# Use "tr" instead.
tr -d \r <infile >outfile # GNU tr version 1.22 or higher
# delete leading whitespace (spaces, tabs) from front of each line
# aligns all text flush left
awk '{sub(/^[ \t]+/, "")};1'
# delete trailing whitespace (spaces, tabs) from end of each line
awk '{sub(/[ \t]+$/, "")};1'
# delete BOTH leading and trailing whitespace from each line
awk '{gsub(/^[ \t]+|[ \t]+$/,"")};1'
awk '{$1=$1};1' # also removes extra space between fields
# insert 5 blank spaces at beginning of each line (make page offset)
awk '{sub(/^/, "     ")};1'
# align all text flush right on a 79-column width
awk '{printf "%79s\n", $0}' file*
# center all text on a 79-character width
awk '{l=length();s=int((79-l)/2); printf "%"(s+l)"s\n",$0}' file*
# substitute (find and replace) "foo" with "bar" on each line
awk '{sub(/foo/,"bar")}; 1' # replace only 1st instance
gawk '{$0=gensub(/foo/,"bar",4)}; 1' # replace only 4th instance
awk '{gsub(/foo/,"bar")}; 1' # replace ALL instances in a line
# substitute "foo" with "bar" ONLY for lines which contain "baz"
awk '/baz/{gsub(/foo/, "bar")}; 1'
# substitute "foo" with "bar" EXCEPT for lines which contain "baz"
awk '!/baz/{gsub(/foo/, "bar")}; 1'
# change "scarlet" or "ruby" or "puce" to "red"
awk '{gsub(/scarlet|ruby|puce/, "red")}; 1'
# reverse order of lines (emulates "tac")
awk '{a[i++]=$0} END {for (j=i-1; j>=0;) print a[j--] }' file*
# if a line ends with a backslash, append the next line to it (fails if
# there are multiple lines ending with backslash...)
awk '/\\$/ {sub(/\\$/,""); getline t; print $0 t; next}; 1' file*
# print and sort the login names of all users
awk -F ":" '{print $1 | "sort" }' /etc/passwd
# print the first 2 fields, in opposite order, of every line
awk '{print $2, $1}' file
# switch the first 2 fields of every line
awk '{temp = $1; $1 = $2; $2 = temp; print}' file
# print every line, deleting the second field of that line
awk '{ $2 = ""; print }'
# print in reverse order the fields of every line
awk '{for (i=NF; i>0; i--) printf("%s ",$i);print ""}' file
# concatenate every 5 lines of input, using a comma separator
# between fields
awk 'ORS=NR%5?",":"\n"' file
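The backslash-continuation entry above is noted as failing when several consecutive lines end with a backslash. One possible variant, offered only as a sketch, loops until the joined line no longer ends with a backslash. The function name join_continued and the variable cont are made up here.

```shell
# Keep appending the next line while the current one ends in a backslash.
join_continued() {
    awk '{
        while (/\\$/) {
            sub(/\\$/, "")
            if ((getline cont) > 0) { $0 = $0 cont } else { break }
        }
        print
    }'
}

# Two continued lines followed by a normal one.
printf 'a \\\nb \\\nc\nd\n' | join_continued
```

With this input the first three lines come out as the single line "a b c", followed by "d".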
SELECTIVE PRINTING OF CERTAIN LINES:
# print first 10 lines of file (emulates behavior of "head")
awk 'NR < 11'
# print first line of file (emulates "head -1")
awk 'NR>1{exit};1'
# print the last 2 lines of a file (emulates "tail -2")
awk '{y=x "\n" $0; x=$0};END{print y}'
# print the last line of a file (emulates "tail -1")
awk 'END{print}'
# print only lines which match regular expression (emulates "grep")
awk '/regex/'
# print only lines which do NOT match regex (emulates "grep -v")
awk '!/regex/'
# print any line where field #5 is equal to "abc123"
awk '$5 == "abc123"'
# print only those lines where field #5 is NOT equal to "abc123"
# This will also print lines which have less than 5 fields.
awk '$5 != "abc123"'
awk '!($5 == "abc123")'
# matching a field against a regular expression
awk '$7 ~ /^[a-f]/' # print line if field #7 matches regex
awk '$7 !~ /^[a-f]/' # print line if field #7 does NOT match regex
# print the line immediately before a regex, but not the line
# containing the regex
awk '/regex/{print x};{x=$0}'
awk '/regex/{print (NR==1 ? "match on line 1" : x)};{x=$0}'
# print the line immediately after a regex, but not the line
# containing the regex
awk '/regex/{getline;print}'
# grep for AAA and BBB and CCC (in any order on the same line)
awk '/AAA/ && /BBB/ && /CCC/'
# grep for AAA and BBB and CCC (in that order)
awk '/AAA.*BBB.*CCC/'
# print only lines of 65 characters or longer
awk 'length > 64'
# print only lines of less than 65 characters
awk 'length < 65'
# print section of file from regular expression to end of file
awk '/regex/,0'
awk '/regex/,EOF'
# print section of file based on line numbers (lines 8-12, inclusive)
awk 'NR==8,NR==12'
# print line number 52
awk 'NR==52'
awk 'NR==52 {print;exit}' # more efficient on large files
# print section of file between two regular expressions (inclusive)
awk '/Iowa/,/Montana/' # case sensitive
SELECTIVE DELETION OF CERTAIN LINES:
# delete ALL blank lines from a file (same as "grep '.' ")
awk NF
awk '/./'
# remove duplicate, consecutive lines (emulates "uniq")
awk 'NR==1 || ($0 "") != a; {a=$0}' # string comparison; avoids using $0 as a regex
# remove duplicate, nonconsecutive lines
awk '!a[$0]++' # most concise script
awk '!($0 in a){a[$0];print}' # most efficient script
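The concise de-duplication script works because a[$0]++ yields the count before the increment: 0 (false) the first time a line is seen, so the negation is true and the line prints. A small sketch, with the function name dedupe and array name seen chosen for illustration:

```shell
# Print a line only the first time it appears.
dedupe() { awk '!seen[$0]++'; }

printf 'x\ny\nx\nz\ny\n' | dedupe
```

The output keeps the first occurrence of each line in its original order: x, y, z.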
CREDITS AND THANKS:
Special thanks to the late Peter S. Tillier (U.K.) for helping me with
the first release of this FAQ file, and to Daniel Jana, Yisu Dong, and
others for their suggestions and corrections.
For additional syntax instructions, including the way to apply editing
commands from a disk file instead of the command line, consult:
"sed & awk, 2nd Edition," by Dale Dougherty and Arnold Robbins
(O'Reilly, 1997)
"UNIX Text Processing," by Dale Dougherty and Tim O'Reilly (Hayden
Books, 1987)
"GAWK: Effective awk Programming," 3d edition, by Arnold D. Robbins
(O'Reilly, 2003) or at http://www.gnu.org/software/gawk/manual/
To fully exploit the power of awk, one must understand "regular
expressions." For detailed discussion of regular expressions, see
"Mastering Regular Expressions, 3d edition" by Jeffrey Friedl (O'Reilly,
2006).
The info and manual ("man") pages on Unix systems may be helpful (try
"man awk", "man nawk", "man gawk", "man regexp", or the section on
regular expressions in "man ed").
USE OF '\t' IN awk SCRIPTS: For clarity in documentation, I have used
'\t' to indicate a tab character (0x09) in the scripts. All versions of
awk should recognize this abbreviation.
#---end of file---

How do I convert between Unix and Windows text files?

The format of Windows and Unix text files differs slightly. In Windows, lines end with both the line feed and carriage return ASCII characters, but Unix uses only a line feed. As a consequence, some Windows applications will not show the line breaks in Unix-format files. Likewise, Unix programs may display the carriage returns in Windows text files with Ctrl-m ( ^M ) characters at the end of each line.
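Before converting anything, it can help to confirm which line endings a file actually has. The sketch below uses od and tr, both standard Unix tools; the helper name count_cr is an assumption for illustration.

```shell
# Count carriage-return bytes on standard input.
count_cr() { tr -cd '\r' | wc -c | tr -d ' '; }

# od -c shows the raw bytes, so a Windows line ends in \r \n.
printf 'hi\r\n' | od -c

printf 'hi\r\n' | count_cr   # one CR in a Windows-style line
printf 'hi\n'   | count_cr   # none in a Unix-style line
```

A non-zero count on a file that should be in Unix format suggests it needs one of the conversions described below.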

There are many ways to solve this problem. This document provides instructions for using FTP, screen capture, unix2dos and dos2unix, tr, awk, Perl, and vi to do the conversion. To use these utilities, the files you are converting must be on a Unix computer.

Note: In the instructions below, replace unixfile.txt with the name of your Unix file, and replace winfile.txt with the Windows filename.

FTP

When using an FTP program to move a text file between Unix and Windows, be sure the file is transferred in ASCII format, so the document is transformed into a text format appropriate for the host. Some FTP programs, especially graphical applications (e.g., Hummingbird FTP), do this automatically. If you are using command line FTP, before you begin the transfer, enter:

  ascii

Note: You need to use a client that supports secure FTP to transfer files to and from Indiana University's central systems. For more, see "At IU, what SSH/SFTP clients are supported and where can I get them?"

dos2unix and unix2dos

The utilities dos2unix and unix2dos are available for converting files from the Unix command line.

To convert a Windows file to a Unix file, enter:

  dos2unix winfile.txt unixfile.txt

To convert a Unix file to Windows, enter:

  unix2dos unixfile.txt winfile.txt

tr

You can use tr to remove all carriage returns and Ctrl-z ( ^Z ) characters from a Windows file:

  tr -d '\15\32' < winfile.txt > unixfile.txt

However, you cannot use tr to convert a document from Unix format to Windows.

awk

To use awk to convert a Windows file to Unix, enter:

  awk '{ sub("\r$", ""); print }' winfile.txt > unixfile.txt

To convert a Unix file to Windows, enter:

  awk 'sub("$", "\r")' unixfile.txt > winfile.txt

Older versions of awk do not include the sub function. In such cases, use the same command, but replace awk with gawk or nawk.
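To see the two awk conversions working together, the following sketch wraps each in a function and round-trips a small sample. The function names unix_to_dos and dos_to_unix are made up for this example and read standard input rather than named files.

```shell
# Append a CR to every line (Unix -> Windows).
unix_to_dos() { awk '{ sub(/$/, "\r"); print }'; }

# Remove a trailing CR from every line (Windows -> Unix).
dos_to_unix() { awk '{ sub(/\r$/, ""); print }'; }

# Inspect the converted bytes, then convert back.
printf 'a\nb\n' | unix_to_dos | od -c
printf 'a\nb\n' | unix_to_dos | dos_to_unix
```

The round trip should reproduce the original two lines, and the od output should show \r \n at the end of each converted line.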

Perl

To convert a Windows text file to a Unix text file using Perl, enter:

  perl -p -e 's/\r$//' < winfile.txt > unixfile.txt

To convert from a Unix text file to a Windows text file, enter:

  perl -p -e 's/\n/\r\n/' < unixfile.txt > winfile.txt

You must use single quotation marks in either command line. This prevents your shell from trying to evaluate anything inside.

vi

In vi, you can remove carriage return ( ^M ) characters with the following command:

  :1,$s/^M//g

Note: To input the ^M character, press Ctrl-v , and then press Enter or return.

In vim, use :set ff=unix to convert to Unix; use :set ff=dos to convert to Windows.

This document was developed with support from National Science Foundation (NSF) grant OCI-1053575. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.


danielmiessler.com | study | crlf

If you're like I used to be, you always have trouble remembering the difference between how Windows and Linux terminate lines in text files. Does Windows add the extra stuff, or does Linux? What exactly is the extra stuff? How do I get the stuff out?

Well, hopefully by the end of this you'll be taken care of once and for all.

The Characters

First and foremost, let's establish what the characters are and the differences between them. Both characters are control characters, meaning they're invisible and meant to keep track of something within an application rather than be interfaced with by the user directly. The Carriage Return (CR) is represented by ASCII number 13, and came from the movement of a typewriter to the left of a sheet of paper. Think "returning of the carriage" to the left.

The Line Feed is represented by ASCII number 10, and it harkens back to the action of a typewriter rolling a piece of paper up by one line. Interestingly enough, the combination of these two functions is integrated into the ENTER/RETURN key. Also known as the CRLF character, this handy shortcut both moves you to the left and down a line.

Usage

Essentially, the crux of the whole CR / LF / File Corruption issue is the fact that Windows, Macs, and *Nix terminate text file lines differently. Below is a list of how they break down:

  • *Nix uses the LF character
  • Macs use the CR character (classic Mac OS only; Mac OS X and later use LF)
  • Windows uses both -- with the CR coming before the LF

How this ends up playing out is that if you write a file in Windows and transfer it bit for bit to a *Nix machine, it'll have extra CR characters that can cause all sorts of havoc. On the other hand, if you transfer a file from a *Nix machine to a Windows machine in the same way, you'll end up with a bunch of lines joined together by little boxes where there are supposed to be line breaks (because the lines are lacking the CR character).
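The three conventions listed above can be told apart by counting CR and LF bytes. The sketch below is a rough heuristic, not a robust detector: the function name eol_kind and the sample files are assumptions, and files with mixed endings are simply reported as dos.

```shell
# Classify a file by which line-ending bytes it contains.
eol_kind() {
    cr=$(tr -cd '\r' < "$1" | wc -c | tr -d ' ')
    lf=$(tr -cd '\n' < "$1" | wc -c | tr -d ' ')
    if [ "$cr" -eq 0 ] && [ "$lf" -gt 0 ]; then echo unix
    elif [ "$lf" -eq 0 ] && [ "$cr" -gt 0 ]; then echo mac
    elif [ "$cr" -gt 0 ] && [ "$lf" -gt 0 ]; then echo dos
    else echo unknown
    fi
}

# Hypothetical sample files, one per convention.
dir=$(mktemp -d)
printf 'a\nb\n'     > "$dir/unix.txt"
printf 'a\r\nb\r\n' > "$dir/dos.txt"
printf 'a\rb\r'     > "$dir/mac.txt"
for f in unix dos mac; do eol_kind "$dir/$f.txt"; done
rm -rf "$dir"
```

Run against the three samples, the loop prints unix, dos, and mac in that order.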

How To Fix It

The good news is that there are plenty of ways to fix this problem. To start with, if you have ever used one of the more advanced FTP programs you've probably noticed the Binary and ASCII options. Well, if you use Binary, files are transferred "bit for bit", or exactly as they are between the source and destination. If a text file is transferred between a *Nix and Windows box (or vice versa) using this mode the symptoms mentioned above will surface.

If you use the ASCII mode, however, and you perform that same transfer, the CR / LF conversions are done for you, i.e. if it's a Windows --> *Nix transfer, the CR characters will be removed, and if it's a *Nix --> Windows transfer they will be added.

In addition, you can always use tr to translate from one to another:

Windows --> NIX: tr -d '\r' < windowsfile > nixfile // delete the carriage returns

Mac --> NIX: tr '\r' '\n' < macfile > nixfile // translate carriage returns into newlines

NIX --> Mac: tr '\n' '\r' < nixfile > macfile // translate newlines into carriage returns

Yet another option is to do this from within vi like so:

:set fileformat=unix
:w

You can simply change the format among the three (unix, mac, and dos) in this fashion. And when you save via :w, it rewrites the file in the correct format.


Useful One-Line Scripts for Perl Jan 28 2012 | version 1.08
-------------------------------- ----------- ------------
Compiled by Peteris Krumins ([email protected], @pkrumins on Twitter)
http://www.catonmat.net -- good coders code, great reuse
Latest version of this file is always at:
http://www.catonmat.net/download/perl1line.txt
This file is also available in other languages:
(None at the moment.)
Please email me [email protected] if you wish to translate it.
Perl One-Liners on Github:
https://github.com/pkrumins/perl1line.txt
You can send me pull requests over GitHub! I accept bug fixes,
new one-liners, translations and everything else related.
I have also written "Perl One-Liners Explained" ebook that's based on
this file. It explains all the one-liners here. Get it at:
http://www.catonmat.net/blog/perl-book/
These one-liners work both on UNIX systems and Windows. Most likely your
UNIX system already has Perl. For Windows get the Strawberry Perl at:
http://www.strawberryperl.com/
Table of contents:
1. File Spacing
2. Line Numbering
3. Calculations
4. String Creation and Array Creation
5. Text Conversion and Substitution
6. Selective Printing and Deleting of Certain Lines
7. Handy Regular Expressions
8. Perl tricks
FILE SPACING
------------
# Double space a file
perl -pe '$\="\n"'
perl -pe 'BEGIN { $\="\n" }'
perl -pe '$_ .= "\n"'
perl -pe 's/$/\n/'
# Double space a file, except the blank lines
perl -pe '$_ .= "\n" unless /^$/'
perl -pe '$_ .= "\n" if /\S/'
# Triple space a file
perl -pe '$\="\n\n"'
perl -pe '$_.="\n\n"'
# N-space a file
perl -pe '$_.="\n"x7'
# Add a blank line before every line
perl -pe 's//\n/'
# Remove all blank lines
perl -ne 'print unless /^$/'
perl -lne 'print if length'
perl -ne 'print if /\S/'
# Remove all consecutive blank lines, leaving just one
perl -00 -pe ''
perl -00pe0
# Compress/expand all blank lines into N consecutive ones
perl -00 -pe '$_.="\n"x4'
# Fold a file so that every set of 10 lines becomes one tab-separated line
perl -lpe '$\ = $. % 10 ? "\t" : "\n"'
LINE NUMBERING
--------------
# Number all lines in a file
perl -pe '$_ = "$. $_"'
# Number only non-empty lines in a file
perl -pe '$_ = ++$a." $_" if /./'
# Number and print only non-empty lines in a file (drop empty lines)
perl -ne 'print ++$a." $_" if /./'
# Number all lines but print line numbers only for non-empty lines
perl -pe '$_ = "$. $_" if /./'
# Number only lines that match a pattern, print others unmodified
perl -pe '$_ = ++$a." $_" if /regex/'
# Number and print only lines that match a pattern
perl -ne 'print ++$a." $_" if /regex/'
# Number all lines, but print line numbers only for lines that match a pattern
perl -pe '$_ = "$. $_" if /regex/'
# Number all lines in a file using a custom format (emulate cat -n)
perl -ne 'printf "%-5d %s", $., $_'
# Print the total number of lines in a file (emulate wc -l)
perl -lne 'END { print $. }'
perl -le 'print $n=()=<>'
perl -le 'print scalar(()=<>)'
perl -le 'print scalar(@foo=<>)'
perl -ne '}{print $.'
perl -nE '}{say $.'
# Print the number of non-empty lines in a file
perl -le 'print scalar(grep{/./}<>)'
perl -le 'print ~~grep{/./}<>'
perl -le 'print~~grep/./,<>'
perl -E 'say~~grep/./,<>'
# Print the number of empty lines in a file
perl -lne '$a++ if /^$/; END {print $a+0}'
perl -le 'print scalar(grep{/^$/}<>)'
perl -le 'print ~~grep{/^$/}<>'
perl -E 'say~~grep{/^$/}<>'
# Print the number of lines in a file that match a pattern (emulate grep -c)
perl -lne '$a++ if /regex/; END {print $a+0}'
perl -nE '$a++ if /regex/; END {say $a+0}'
CALCULATIONS
------------
# Check if a number is a prime
perl -lne '(1x$_) !~ /^1?$|^(11+?)\1+$/ && print "$_ is prime"'
# Print the sum of all the fields on a line
perl -MList::Util=sum -alne 'print sum @F'
# Print the sum of all the fields on all lines
perl -MList::Util=sum -alne 'push @S,@F; END { print sum @S }'
perl -MList::Util=sum -alne '$s += sum @F; END { print $s }'
# Shuffle all fields on a line
perl -MList::Util=shuffle -alne 'print "@{[shuffle @F]}"'
perl -MList::Util=shuffle -alne 'print join " ", shuffle @F'
# Find the minimum element on a line
perl -MList::Util=min -alne 'print min @F'
# Find the minimum element over all the lines
perl -MList::Util=min -alne '@M = (@M, @F); END { print min @M }'
perl -MList::Util=min -alne '$min = min @F; $rmin = $min unless defined $rmin && $min > $rmin; END { print $rmin }'
# Find the maximum element on a line
perl -MList::Util=max -alne 'print max @F'
# Find the maximum element over all the lines
perl -MList::Util=max -alne '@M = (@M, @F); END { print max @M }'
# Replace each field with its absolute value
perl -alne 'print "@{[map { abs } @F]}"'
# Find the total number of fields (words) on each line
perl -alne 'print scalar @F'
# Print the total number of fields (words) on each line followed by the line
perl -alne 'print scalar @F, " $_"'
# Find the total number of fields (words) on all lines
perl -alne '$t += @F; END { print $t}'
# Print the total number of fields that match a pattern
perl -alne 'map { /regex/ && $t++ } @F; END { print $t }'
perl -alne '$t += /regex/ for @F; END { print $t }'
perl -alne '$t += grep /regex/, @F; END { print $t }'
# Print the total number of lines that match a pattern
perl -lne '/regex/ && $t++; END { print $t }'
# Print the number PI to n decimal places
perl -Mbignum=bpi -le 'print bpi(n)'
# Print the number PI to 39 decimal places
perl -Mbignum=PI -le 'print PI'
# Print the number E to n decimal places
perl -Mbignum=bexp -le 'print bexp(1,n+1)'
# Print the number E to 39 decimal places
perl -Mbignum=e -le 'print e'
# Print UNIX time (seconds since Jan 1, 1970, 00:00:00 UTC)
perl -le 'print time'
# Print GMT (Greenwich Mean Time) and local computer time
perl -le 'print scalar gmtime'
perl -le 'print scalar localtime'
# Print local computer time in H:M:S format
perl -le 'print join ":", (localtime)[2,1,0]'
# Print yesterday's date
perl -MPOSIX -le '@now = localtime; $now[3] -= 1; print scalar localtime mktime @now'
# Print date 14 months, 9 days and 7 seconds ago
perl -MPOSIX -le '@now = localtime; $now[0] -= 7; $now[4] -= 14; $now[3] -= 9; print scalar localtime mktime @now'
# Prepend timestamps to stdout (GMT, localtime)
tail -f logfile | perl -ne 'print scalar gmtime," ",$_'
tail -f logfile | perl -ne 'print scalar localtime," ",$_'
# Calculate factorial of 5
perl -MMath::BigInt -le 'print Math::BigInt->new(5)->bfac()'
perl -le '$f = 1; $f *= $_ for 1..5; print $f'
# Calculate greatest common divisor (GCD)
perl -MMath::BigInt=bgcd -le 'print bgcd(@list_of_numbers)'
# Calculate GCD of numbers 20 and 35 using Euclid's algorithm
perl -le '$n = 20; $m = 35; ($m,$n) = ($n,$m%$n) while $n; print $m'
# Calculate least common multiple (LCM) of numbers 35, 20 and 8
perl -MMath::BigInt=blcm -le 'print blcm(35,20,8)'
# Calculate LCM of 20 and 35 using Euclid's formula: n*m/gcd(n,m)
perl -le '$a = $n = 20; $b = $m = 35; ($m,$n) = ($n,$m%$n) while $n; print $a*$b/$m'
# Generate 10 random numbers between 5 and 15 (excluding 15)
perl -le '$n=10; $min=5; $max=15; $, = " "; print map { int(rand($max-$min))+$min } 1..$n'
# Find and print all permutations of a list
perl -MAlgorithm::Permute -le '$l = [1,2,3,4,5]; $p = Algorithm::Permute->new($l); print @r while @r = $p->next'
# Generate the power set
perl -MList::PowerSet=powerset -le '@l = (1,2,3,4,5); for (@{powerset(@l)}) { print "@$_" }'
# Convert an IP address to unsigned integer
perl -le '$i=3; $u += ($_<<8*$i--) for "127.0.0.1" =~ /(\d+)/g; print $u'
perl -le '$ip="127.0.0.1"; $ip =~ s/(\d+)\.?/sprintf("%02x", $1)/ge; print hex($ip)'
perl -le 'print unpack("N", 127.0.0.1)'
perl -MSocket -le 'print unpack("N", inet_aton("127.0.0.1"))'
# Convert an unsigned integer to an IP address
perl -MSocket -le 'print inet_ntoa(pack("N", 2130706433))'
perl -le '$ip = 2130706433; print join ".", map { (($ip>>8*($_))&0xFF) } reverse 0..3'
perl -le '$ip = 2130706433; $, = "."; print map { (($ip>>8*($_))&0xFF) } reverse 0..3'
STRING CREATION AND ARRAY CREATION
----------------------------------
# Generate and print the alphabet
perl -le 'print a..z'
perl -le 'print ("a".."z")'
perl -le '$, = ","; print ("a".."z")'
perl -le 'print join ",", ("a".."z")'
# Generate and print all the strings from "a" to "zz"
perl -le 'print ("a".."zz")'
perl -le 'print "aa".."zz"' # two-letter strings only, "aa" through "zz"
# Create a hex lookup table
@hex = (0..9, "a".."f")
# Convert a decimal number to hex using @hex lookup table
perl -le '$num = 255; @hex = (0..9, "a".."f"); while ($num) { $s = $hex[($num%16)&15].$s; $num = int $num/16 } print $s'
perl -le '$hex = sprintf("%x", 255); print $hex'
# Convert a hex string back to decimal
perl -le '$num = "ff"; print hex $num'
# Generate a random 8 character password
perl -le 'print map { ("a".."z")[rand 26] } 1..8'
perl -le 'print map { ("a".."z", 0..9)[rand 36] } 1..8'
# Create a string of specific length
perl -le 'print "a"x50'
# Create a repeated list of elements
perl -le '@list = (1,2)x20; print "@list"'
# Create an array from a string
@months = split ' ', "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec"
@months = qw/Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec/
# Create a string from an array
@stuff = ("hello", 0..9, "world"); $string = join '-', @stuff
# Find the numeric values for characters in the string
perl -le 'print join ", ", map { ord } split //, "hello world"'
# Convert a list of numeric ASCII values into a string
perl -le '@ascii = (99, 111, 100, 105, 110, 103); print pack("C*", @ascii)'
perl -le '@ascii = (99, 111, 100, 105, 110, 103); print map { chr } @ascii'
# Generate an array with odd numbers from 1 to 100
perl -le '@odd = grep {$_ % 2 == 1} 1..100; print "@odd"'
perl -le '@odd = grep { $_ & 1 } 1..100; print "@odd"'
# Generate an array with even numbers from 1 to 100
perl -le '@even = grep {$_ % 2 == 0} 1..100; print "@even"'
# Find the length of the string
perl -le 'print length "one-liners are great"'
# Find the number of elements in an array
perl -le '@array = ("a".."z"); print scalar @array'
perl -le '@array = ("a".."z"); print $#array + 1'
TEXT CONVERSION AND SUBSTITUTION
--------------------------------
# ROT13 a string
'y/A-Za-z/N-ZA-Mn-za-m/'
# ROT 13 a file
perl -lpe 'y/A-Za-z/N-ZA-Mn-za-m/' file
# Base64 encode a string
perl -MMIME::Base64 -e 'print encode_base64("string")'
perl -MMIME::Base64 -0777 -ne 'print encode_base64($_)' file
# Base64 decode a string
perl -MMIME::Base64 -le 'print decode_base64("base64string")'
perl -MMIME::Base64 -ne 'print decode_base64($_)' file
# URL-escape a string
perl -MURI::Escape -le 'print uri_escape($string)'
# URL-unescape a string
perl -MURI::Escape -le 'print uri_unescape($string)'
# HTML-encode a string
perl -MHTML::Entities -le 'print encode_entities($string)'
# HTML-decode a string
perl -MHTML::Entities -le 'print decode_entities($string)'
# Convert all text to uppercase
perl -nle 'print uc'
perl -ple '$_=uc'
perl -nle 'print "\U$_"'
# Convert all text to lowercase
perl -nle 'print lc'
perl -ple '$_=lc'
perl -nle 'print "\L$_"'
# Uppercase only the first letter of each line (lowercase the rest)
perl -nle 'print ucfirst lc'
perl -nle 'print "\u\L$_"'
# Invert the letter case
perl -ple 'y/A-Za-z/a-zA-Z/'
# Camel case each line
perl -ple 's/(\w+)/\u$1/g'
perl -ple 's/(?<![\x27])(\w+)/\u$1/g'
# Strip leading whitespace (spaces, tabs) from the beginning of each line
perl -ple 's/^[ \t]+//'
perl -ple 's/^\s+//'
# Strip trailing whitespace (space, tabs) from the end of each line
perl -ple 's/[ \t]+$//'
# Strip whitespace from the beginning and end of each line
perl -ple 's/^[ \t]+|[ \t]+$//g'
# Convert UNIX newlines to DOS/Windows newlines
perl -pe 's|\n|\r\n|'
# Convert DOS/Windows newlines to UNIX newlines
perl -pe 's|\r\n|\n|'
# Convert UNIX newlines to Mac newlines
perl -pe 's|\n|\r|'
# Substitute (find and replace) "foo" with "bar" on each line
perl -pe 's/foo/bar/'
# Substitute (find and replace) all "foo"s with "bar" on each line
perl -pe 's/foo/bar/g'
# Substitute (find and replace) "foo" with "bar" on lines that match "baz"
perl -pe '/baz/ && s/foo/bar/'
# Binary patch a file (find and replace a given array of bytes as hex numbers)
perl -pi -e 's/\x89\xD8\x48\x8B/\x90\x90\x48\x8B/g' file
SELECTIVE PRINTING AND DELETING OF CERTAIN LINES
------------------------------------------------
# Print the first line of a file (emulate head -1)
perl -ne 'print; exit'
# Print the first 10 lines of a file (emulate head -10)
perl -ne 'print if $. <= 10'
perl -ne '$. <= 10 && print'
perl -ne 'print if 1..10'
# Print the last line of a file (emulate tail -1)
perl -ne '$last = $_; END { print $last }'
perl -ne 'print if eof'
# Print the last 10 lines of a file (emulate tail -10)
perl -ne 'push @a, $_; @a = @a[@a-10..$#a] if @a > 10; END { print @a }'
# Print only lines that match a regular expression
perl -ne '/regex/ && print'
# Print only lines that do not match a regular expression
perl -ne '!/regex/ && print'
# Print the line before a line that matches a regular expression
perl -ne '/regex/ && $last && print $last; $last = $_'
# Print the line after a line that matches a regular expression
perl -ne 'if ($p) { print; $p = 0 } $p++ if /regex/'
# Print lines that match regex AAA and regex BBB in any order
perl -ne '/AAA/ && /BBB/ && print'
# Print lines that match neither regex AAA nor regex BBB
perl -ne '!/AAA/ && !/BBB/ && print'
# Print lines that match regex AAA followed by regex BBB followed by CCC
perl -ne '/AAA.*BBB.*CCC/ && print'
# Print lines that are 80 chars or longer
perl -ne 'print if length >= 80'
# Print lines that are less than 80 chars in length
perl -ne 'print if length < 80'
# Print only line 13
perl -ne '$. == 13 && print && exit'
# Print all lines except line 27
perl -ne '$. != 27 && print'
perl -ne 'print if $. != 27'
# Print only lines 13, 19 and 67
perl -ne 'print if $. == 13 || $. == 19 || $. == 67'
perl -ne 'print if int($.) ~~ (13, 19, 67)'
# Print all lines between two regexes (including lines that match regex)
perl -ne 'print if /regex1/../regex2/'
# Print all lines from line 17 to line 30
perl -ne 'print if $. >= 17 && $. <= 30'
perl -ne 'print if int($.) ~~ (17..30)'
perl -ne 'print if grep { $_ == $. } 17..30'
# Print the longest line
perl -ne '$l = $_ if length($_) > length($l); END { print $l }'
# Print the shortest line
perl -ne '$s = $_ if $. == 1; $s = $_ if length($_) < length($s); END { print $s }'
# Print all lines that contain a number
perl -ne 'print if /\d/'
# Find all lines that contain only a number
perl -ne 'print if /^\d+$/'
# Print all lines that contain only alphabetic characters
perl -ne 'print if /^[[:alpha:]]+$/'
# Print every second line, starting with the first line
perl -ne 'print if $. % 2'
# Print every second line, starting with the second line
perl -ne 'print if $. % 2 == 0'
# Print all lines that repeat
perl -ne 'print if ++$a{$_} == 2'
# Print all unique lines
perl -ne 'print unless $a{$_}++'
# Print the first field (word) of every line (emulate cut -f 1 -d ' ')
perl -alne 'print $F[0]'
HANDY REGULAR EXPRESSIONS
-------------------------
# Match something that looks like an IP address
/^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/
/^(\d{1,3}\.){3}\d{1,3}$/
# Test if a number is in range 0-255
/^([0-9]|[0-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])$/
# Match an IP address
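# (qr/.../ is used because the pattern itself contains "|")
my $ip_part = qr/([0-9]|[0-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])/;
if ($ip =~ /^($ip_part\.){3}$ip_part$/) {
say "valid ip";    # say needs "use feature 'say'" (or perl -E)
}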
# Check if the string looks like an email address
/\S+@\S+\.\S+/
# Check if the string is a decimal number
/^\d+$/
/^[+-]?\d+$/
/^[+-]?\d+\.?\d*$/
# Check if the string is a hexadecimal number
/^0x[0-9a-f]+$/i
# Check if the string is an octal number
/^0[0-7]+$/
# Check if the string is binary
/^[01]+$/
# Check if a word appears twice in the string
/(word).*\1/
# Increase all numbers by one in the string
$str =~ s/(\d+)/$1+1/ge
# Extract HTTP User-Agent string from the HTTP headers
/^User-Agent: (.+)$/
# Match printable ASCII characters
/[ -~]/
# Match unprintable ASCII characters
/[^ -~]/
# Match text between two HTML tags
m|<strong>([^<]*)</strong>|
m|<strong>(.*?)</strong>|
# Replace all <b> tags with <strong>
$html =~ s|<(/)?b>|<$1strong>|g
# Extract all matches from a regular expression
my @matches = $text =~ /regex/g;
PERL TRICKS
-----------
# Print the version of a Perl module
perl -MModule -le 'print $Module::VERSION'
perl -MLWP::UserAgent -le 'print $LWP::UserAgent::VERSION'
PERL ONE-LINERS EXPLAINED E-BOOK
--------------------------------
I have written an ebook based on the one-liners in this file. If you wish to
support my work and learn more about these one-liners, you can get a copy
of my ebook at:
http://www.catonmat.net/blog/perl-book/
The ebook is based on the 7-part article series that I wrote on my blog.
In the ebook I reviewed all the one-liners, improved explanations, added
new ones, and added two new chapters - introduction to Perl one-liners
and summary of commonly used special variables.
You can read the original article series here:
http://www.catonmat.net/blog/perl-one-liners-explained-part-one/
http://www.catonmat.net/blog/perl-one-liners-explained-part-two/
http://www.catonmat.net/blog/perl-one-liners-explained-part-three/
http://www.catonmat.net/blog/perl-one-liners-explained-part-four/
http://www.catonmat.net/blog/perl-one-liners-explained-part-five/
http://www.catonmat.net/blog/perl-one-liners-explained-part-six/
http://www.catonmat.net/blog/perl-one-liners-explained-part-seven/
CREDITS
-------
Andy Lester http://www.petdance.com
Shlomi Fish http://www.shlomifish.org
Madars Virza http://www.madars.org
caffecaldo https://github.com/caffecaldo
Kirk Kimmel https://github.com/kimmel
avar https://github.com/avar
rent0n
FOUND A BUG? HAVE ANOTHER ONE-LINER?
------------------------------------
Email bugs and new one-liners to me at [email protected]!
HAVE FUN
--------
I hope you found these one-liners useful. Have fun!
#---end of file---
-------------------------------------------------------------------------
USEFUL ONE-LINE SCRIPTS FOR SED (Unix stream editor) Dec. 29, 2005
Compiled by Eric Pement - pemente[at]northpark[dot]edu version 5.5
Latest version of this file (in English) is usually at:
http://sed.sourceforge.net/sed1line.txt
http://www.pement.org/sed/sed1line.txt
This file will also be available in other languages:
Chinese - http://sed.sourceforge.net/sed1line_zh-CN.html
Czech - http://sed.sourceforge.net/sed1line_cz.html
Dutch - http://sed.sourceforge.net/sed1line_nl.html
French - http://sed.sourceforge.net/sed1line_fr.html
German - http://sed.sourceforge.net/sed1line_de.html
Italian - (pending)
Portuguese - http://sed.sourceforge.net/sed1line_pt-BR.html
Spanish - (pending)
FILE SPACING:
# double space a file
sed G
# double space a file which already has blank lines in it. Output file
# should contain no more than one blank line between lines of text.
sed '/^$/d;G'
# triple space a file
sed 'G;G'
# undo double-spacing (assumes even-numbered lines are always blank)
sed 'n;d'
# insert a blank line above every line which matches "regex"
sed '/regex/{x;p;x;}'
# insert a blank line below every line which matches "regex"
sed '/regex/G'
# insert a blank line above and below every line which matches "regex"
sed '/regex/{x;p;x;G;}'
NUMBERING:
# number each line of a file (simple left alignment). Using a tab (see
# note on '\t' at end of file) instead of space will preserve margins.
sed = filename | sed 'N;s/\n/\t/'
# number each line of a file (number on left, right-aligned)
sed = filename | sed 'N; s/^/     /; s/ *\(.\{6,\}\)\n/\1  /'
# number each line of file, but only print numbers if line is not blank
sed '/./=' filename | sed '/./N; s/\n/ /'
# count lines (emulates "wc -l")
sed -n '$='
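The numbering scripts above work in two stages: "sed =" prints each line
number on a line of its own, and the second sed joins that number to the
text that follows it. A small sketch with made-up sample input (a space is
used as the separator here instead of a tab):

```shell
# Sample input held in a variable so the example needs no files.
input=$(printf 'alpha\nbeta\ngamma\n')

# Stage 1 alone: each line number appears on its own line, before the text.
stage1=$(printf '%s\n' "$input" | sed =)

# Stage 2: N appends the text line to the number line; s/// then replaces
# the embedded newline with a space.
numbered=$(printf '%s\n' "$input" | sed = | sed 'N;s/\n/ /')

printf '%s\n' "$numbered"
```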
TEXT CONVERSION AND SUBSTITUTION:
# IN UNIX ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format.
sed 's/.$//' # assumes that all lines end with CR/LF
sed 's/^M$//' # in bash/tcsh, press Ctrl-V then Ctrl-M
sed 's/\x0D$//' # works on ssed, gsed 3.02.80 or higher
# IN UNIX ENVIRONMENT: convert Unix newlines (LF) to DOS format.
sed "s/$/`echo -e \\\r`/" # command line under ksh
sed 's/$'"/`echo \\\r`/" # command line under bash
sed "s/$/`echo \\\r`/" # command line under zsh
sed 's/$/\r/' # gsed 3.02.80 or higher
# IN DOS ENVIRONMENT: convert Unix newlines (LF) to DOS format.
sed "s/$//" # method 1
sed -n p # method 2
# IN DOS ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format.
# Can only be done with UnxUtils sed, version 4.0.7 or higher. The
# UnxUtils version can be identified by the custom "--text" switch
# which appears when you use the "--help" switch. Otherwise, changing
# DOS newlines to Unix newlines cannot be done with sed in a DOS
# environment. Use "tr" instead.
sed "s/\r//" infile >outfile # UnxUtils sed v4.0.7 or higher
tr -d \r <infile >outfile # GNU tr version 1.22 or higher
# delete leading whitespace (spaces, tabs) from front of each line
# aligns all text flush left
sed 's/^[ \t]*//' # see note on '\t' at end of file
# delete trailing whitespace (spaces, tabs) from end of each line
sed 's/[ \t]*$//' # see note on '\t' at end of file
# delete BOTH leading and trailing whitespace from each line
sed 's/^[ \t]*//;s/[ \t]*$//'
# insert 5 blank spaces at beginning of each line (make page offset)
sed 's/^/     /'
# align all text flush right on a 79-column width
sed -e :a -e 's/^.\{1,78\}$/ &/;ta' # set at 78 plus 1 space
# center all text in the middle of 79-column width. In method 1,
# spaces at the beginning of the line are significant, and trailing
# spaces are appended at the end of the line. In method 2, spaces at
# the beginning of the line are discarded in centering the line, and
# no trailing spaces appear at the end of lines.
sed -e :a -e 's/^.\{1,77\}$/ & /;ta' # method 1
sed -e :a -e 's/^.\{1,77\}$/ &/;ta' -e 's/\( *\)\1/\1/' # method 2
# substitute (find and replace) "foo" with "bar" on each line
sed 's/foo/bar/' # replaces only 1st instance in a line
sed 's/foo/bar/4' # replaces only 4th instance in a line
sed 's/foo/bar/g' # replaces ALL instances in a line
sed 's/\(.*\)foo\(.*foo\)/\1bar\2/' # replace the next-to-last case
sed 's/\(.*\)foo/\1bar/' # replace only the last case
# substitute "foo" with "bar" ONLY for lines which contain "baz"
sed '/baz/s/foo/bar/g'
# substitute "foo" with "bar" EXCEPT for lines which contain "baz"
sed '/baz/!s/foo/bar/g'
# change "scarlet" or "ruby" or "puce" to "red"
sed 's/scarlet/red/g;s/ruby/red/g;s/puce/red/g' # most seds
gsed 's/scarlet\|ruby\|puce/red/g' # GNU sed only
# reverse order of lines (emulates "tac")
# bug/feature in HHsed v1.5 causes blank lines to be deleted
sed '1!G;h;$!d' # method 1
sed -n '1!G;h;$p' # method 2
# reverse each character on the line (emulates "rev")
sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//'
# join pairs of lines side-by-side (like "paste")
sed '$!N;s/\n/ /'
# if a line ends with a backslash, append the next line to it
sed -e :a -e '/\\$/N; s/\\\n//; ta'
# if a line begins with an equal sign, append it to the previous line
# and replace the "=" with a single space
sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D'
# add commas to numeric strings, changing "1234567" to "1,234,567"
gsed ':a;s/\B[0-9]\{3\}\>/,&/;ta' # GNU sed
sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta' # other seds
# add commas to numbers with decimal points and minus signs (GNU sed)
gsed -r ':a;s/(^|[^0-9.])([0-9]+)([0-9]{3})/\1\2,\3/g;ta'
# add a blank line every 5 lines (after lines 5, 10, 15, 20, etc.)
gsed '0~5G' # GNU sed only
sed 'n;n;n;n;G;' # other seds
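The flush-right and centering scripts in this section are loops: the "ta"
command branches back to label "a" for as long as the substitution keeps
succeeding, so padding is added until the line reaches the target width. A
sketch using an 11-column width instead of 79, purely so the result is easy
to inspect (the width and the sample text are assumptions):

```shell
# Width 11 instead of 79: the limits 10 and 9 play the roles that
# 78 and 77 play in the 79-column one-liners.
right=$(printf 'abc\n' | sed -e :a -e 's/^.\{1,10\}$/ &/;ta')
center=$(printf 'abc\n' | sed -e :a -e 's/^.\{1,9\}$/ & /;ta')

# Brackets make the padding visible.
printf '[%s]\n' "$right" "$center"
```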
SELECTIVE PRINTING OF CERTAIN LINES:
# print first 10 lines of file (emulates behavior of "head")
sed 10q
# print first line of file (emulates "head -1")
sed q
# print the last 10 lines of a file (emulates "tail")
sed -e :a -e '$q;N;11,$D;ba'
# print the last 2 lines of a file (emulates "tail -2")
sed '$!N;$!D'
# print the last line of a file (emulates "tail -1")
sed '$!d' # method 1
sed -n '$p' # method 2
# print the next-to-the-last line of a file
sed -e '$!{h;d;}' -e x # for 1-line files, print blank line
sed -e '1{$q;}' -e '$!{h;d;}' -e x # for 1-line files, print the line
sed -e '1{$d;}' -e '$!{h;d;}' -e x # for 1-line files, print nothing
# print only lines which match regular expression (emulates "grep")
sed -n '/regexp/p' # method 1
sed '/regexp/!d' # method 2
# print only lines which do NOT match regexp (emulates "grep -v")
sed -n '/regexp/!p' # method 1, corresponds to above
sed '/regexp/d' # method 2, simpler syntax
# print the line immediately before a regexp, but not the line
# containing the regexp
sed -n '/regexp/{g;1!p;};h'
# print the line immediately after a regexp, but not the line
# containing the regexp
sed -n '/regexp/{n;p;}'
# print 1 line of context before and after regexp, with line number
# indicating where the regexp occurred (similar to "grep -A1 -B1")
sed -n -e '/regexp/{=;x;1!p;g;$!N;p;D;}' -e h
# grep for AAA and BBB and CCC (in any order)
sed '/AAA/!d; /BBB/!d; /CCC/!d'
# grep for AAA and BBB and CCC (in that order)
sed '/AAA.*BBB.*CCC/!d'
# grep for AAA or BBB or CCC (emulates "egrep")
sed -e '/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d # most seds
gsed '/AAA\|BBB\|CCC/!d' # GNU sed only
# print paragraph if it contains AAA (blank lines separate paragraphs)
# HHsed v1.5 must insert a 'G;' after 'x;' in the next 3 scripts below
sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;'
# print paragraph if it contains AAA and BBB and CCC (in any order)
sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;/BBB/!d;/CCC/!d'
# print paragraph if it contains AAA or BBB or CCC
sed -e '/./{H;$!d;}' -e 'x;/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d
gsed '/./{H;$!d;};x;/AAA\|BBB\|CCC/b;d' # GNU sed only
# print only lines of 65 characters or longer
sed -n '/^.\{65\}/p'
# print only lines of less than 65 characters
sed -n '/^.\{65\}/!p' # method 1, corresponds to above
sed '/^.\{65\}/d' # method 2, simpler syntax
# print section of file from regular expression to end of file
sed -n '/regexp/,$p'
# print section of file based on line numbers (lines 8-12, inclusive)
sed -n '8,12p' # method 1
sed '8,12!d' # method 2
# print line number 52
sed -n '52p' # method 1
sed '52!d' # method 2
sed '52q;d' # method 3, efficient on large files
# beginning at line 3, print every 7th line
gsed -n '3~7p' # GNU sed only
sed -n '3,${p;n;n;n;n;n;n;}' # other seds
# print section of file between two regular expressions (inclusive)
sed -n '/Iowa/,/Montana/p' # case sensitive
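The paragraph-oriented scripts in this section collect each non-blank line
in the hold space (H) and only examine the collected paragraph once a blank
line or the end of the file is reached (x). A sketch with made-up input of
two paragraphs, of which only the first contains AAA:

```shell
# Two paragraphs separated by one blank line.
text=$(printf 'p1 line\nAAA here\n\np2 line\nmore\n')

# Same script as the "print paragraph if it contains AAA" entry.
hits=$(printf '%s\n' "$text" | sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;')

printf '%s\n' "$hits"
```

Because H always appends with a leading newline, the printed paragraph
starts with an empty line.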
SELECTIVE DELETION OF CERTAIN LINES:
# print all of file EXCEPT section between 2 regular expressions
sed '/Iowa/,/Montana/d'
# delete duplicate, consecutive lines from a file (emulates "uniq").
# First line in a set of duplicate lines is kept, rest are deleted.
sed '$!N; /^\(.*\)\n\1$/!P; D'
# delete duplicate, nonconsecutive lines from a file. Beware not to
# overflow the buffer size of the hold space, or else use GNU sed.
sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P'
# delete all lines except duplicate lines (emulates "uniq -d").
sed '$!N; s/^\(.*\)\n\1$/\1/; t; D'
# delete the first 10 lines of a file
sed '1,10d'
# delete the last line of a file
sed '$d'
# delete the last 2 lines of a file
sed 'N;$!P;$!D;$d'
# delete the last 10 lines of a file
sed -e :a -e '$d;N;2,10ba' -e 'P;D' # method 1
sed -n -e :a -e '1,10!{P;N;D;};N;ba' # method 2
# delete every 8th line
gsed '0~8d' # GNU sed only
sed 'n;n;n;n;n;n;n;d;' # other seds
# delete lines matching pattern
sed '/pattern/d'
# delete ALL blank lines from a file (same as "grep '.' ")
sed '/^$/d' # method 1
sed '/./!d' # method 2
# delete all CONSECUTIVE blank lines from file except the first; also
# deletes all blank lines from top and end of file (emulates "cat -s")
sed '/./,/^$/!d' # method 1, allows 0 blanks at top, 1 at EOF
sed '/^$/N;/\n$/D' # method 2, allows 1 blank at top, 0 at EOF
# delete all CONSECUTIVE blank lines from file except the first 2:
sed '/^$/N;/\n$/N;//D'
# delete all leading blank lines at top of file
sed '/./,$!d'
# delete all trailing blank lines at end of file
sed -e :a -e '/^\n*$/{$d;N;ba' -e '}' # works on all seds
sed -e :a -e '/^\n*$/N;/\n$/ba' # ditto, except for gsed 3.02.*
# delete the last line of each paragraph
sed -n '/^$/{p;h;};/./{x;/./p;}'
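The "uniq" emulation in this section keeps a two-line window in the pattern
space: N appends the next line, P prints the first line of the window only
when the two lines differ, and D drops the first line and restarts with the
remainder. A sketch on made-up input:

```shell
# Consecutive duplicates: "a" twice and "c" twice.
dupes=$(printf 'a\na\nb\nc\nc\n')

# Same script as the "delete duplicate, consecutive lines" entry.
deduped=$(printf '%s\n' "$dupes" | sed '$!N; /^\(.*\)\n\1$/!P; D')

printf '%s\n' "$deduped"
```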
SPECIAL APPLICATIONS:
# remove nroff overstrikes (char, backspace) from man pages. The 'echo'
# command may need an -e switch if you use Unix System V or bash shell.
sed "s/.`echo \\\b`//g" # double quotes required for Unix environment
sed 's/.^H//g' # in bash/tcsh, press Ctrl-V and then Ctrl-H
sed 's/.\x08//g' # hex expression for sed 1.5, GNU sed, ssed
# get Usenet/e-mail message header
sed '/^$/q' # deletes everything after first blank line
# get Usenet/e-mail message body
sed '1,/^$/d' # deletes everything up to first blank line
# get Subject header, but remove initial "Subject: " portion
sed '/^Subject: */!d; s///;q'
# get return address header
sed '/^Reply-To:/q; /^From:/h; /./d;g;q'
# parse out the address proper. Pulls out the e-mail address by itself
# from the 1-line return address header (see preceding script)
sed 's/ *(.*)//; s/>.*//; s/.*[:<] *//'
# add a leading angle bracket and space to each line (quote a message)
sed 's/^/> /'
# delete leading angle bracket & space from each line (unquote a message)
sed 's/^> //'
# remove most HTML tags (accommodates multiple-line tags)
sed -e :a -e 's/<[^>]*>//g;/</N;//ba'
# extract multi-part uuencoded binaries, removing extraneous header
# info, so that only the uuencoded portion remains. Files passed to
# sed must be passed in the proper order. Version 1 can be entered
# from the command line; version 2 can be made into an executable
# Unix shell script. (Modified from a script by Rahul Dhesi.)
sed '/^end/,/^begin/d' file1 file2 ... fileX | uudecode # vers. 1
sed '/^end/,/^begin/d' "$@" | uudecode # vers. 2
# sort paragraphs of file alphabetically. Paragraphs are separated by blank
# lines. GNU sed uses \v for vertical tab, or any unique char will do.
sed '/./{H;d;};x;s/\n/={NL}=/g' file | sort | sed '1s/={NL}=//;s/={NL}=/\n/g'
gsed '/./{H;d};x;y/\n/\v/' file | sort | sed '1s/\v//;y/\v/\n/'
# zip up each .TXT file individually, deleting the source file and
# setting the name of each .ZIP file to the basename of the .TXT file
# (under DOS: the "dir /b" switch returns bare filenames in all caps).
echo @echo off >zipup.bat
dir /b *.txt | sed "s/^\(.*\)\.TXT/pkzip -mo \1 \1.TXT/" >>zipup.bat
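The message-handling scripts in this section all depend on the first blank
line separating the header from the body. A sketch on a made-up message
(the address and subject are placeholders):

```shell
# A tiny message: two header lines, a blank line, two body lines.
msg=$(printf 'From: a@example.com\nSubject: Hello there\n\nBody line 1\nBody line 2\n')

# Header: print up to and including the first blank line, then quit.
header=$(printf '%s\n' "$msg" | sed '/^$/q')

# Body: delete everything up to and including the first blank line.
body=$(printf '%s\n' "$msg" | sed '1,/^$/d')

# Subject: the empty regex in s/// reuses the address pattern.
subject=$(printf '%s\n' "$msg" | sed '/^Subject: */!d; s///;q')

printf 'subject=%s\n' "$subject"
```

Note that the shell's $(...) strips the trailing blank line from the
captured header.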
TYPICAL USE: Sed takes one or more editing commands and applies all of
them, in sequence, to each line of input. After all the commands have
been applied to the first input line, that line is output and a second
input line is taken for processing, and the cycle repeats. The
preceding examples assume that input comes from the standard input
device (i.e., the console; normally this will be piped input). One or
more filenames can be appended to the command line if the input does
not come from stdin. Output is sent to stdout (the screen). Thus:
cat filename | sed '10q' # uses piped input
sed '10q' filename # same effect, avoids a useless "cat"
sed '10q' filename > newfile # redirects output to disk
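The three invocation styles above can be compared directly. This sketch
assumes mktemp is available and uses a made-up three-line file. It also
shows that several commands are applied in sequence to each line.

```shell
# Temporary sample file (removed again at the end).
tmp=$(mktemp)
printf 'first\nsecond\nthird\n' > "$tmp"

piped=$(cat "$tmp" | sed '2q')      # piped input
direct=$(sed '2q' "$tmp")           # filename on the command line
sed '2q' "$tmp" > "$tmp.out"        # output redirected to disk
redirected=$(cat "$tmp.out")

# Commands run in order on each line: "a" becomes "b", then "b" becomes "c".
chained=$(printf 'a\n' | sed -e 's/a/b/' -e 's/b/c/')

rm -f "$tmp" "$tmp.out"
printf '%s\n' "$piped" "$chained"
```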
For additional syntax instructions, including the way to apply editing
commands from a disk file instead of the command line, consult "sed &
awk, 2nd Edition," by Dale Dougherty and Arnold Robbins (O'Reilly,
1997; http://www.ora.com), "UNIX Text Processing," by Dale Dougherty
and Tim O'Reilly (Hayden Books, 1987) or the tutorials by Mike Arst
distributed in U-SEDIT2.ZIP (many sites). To fully exploit the power
of sed, one must understand "regular expressions." For this, see
"Mastering Regular Expressions" by Jeffrey Friedl (O'Reilly, 1997).
The manual ("man") pages on Unix systems may be helpful (try "man
sed", "man regexp", or the subsection on regular expressions in "man
ed"), but man pages are notoriously difficult. They are not written to
teach sed use or regexps to first-time users, but as a reference text
for those already acquainted with these tools.
QUOTING SYNTAX: The preceding examples use single quotes ('...')
instead of double quotes ("...") to enclose editing commands, since
sed is typically used on a Unix platform. Single quotes prevent the
Unix shell from interpreting the dollar sign ($) and backquotes
(`...`), which are expanded by the shell if they are enclosed in
double quotes. Users of the "csh" shell and derivatives will also need
to quote the exclamation mark (!) with the backslash (i.e., \!) to
properly run the examples listed above, even within single quotes.
Versions of sed written for DOS invariably require double quotes
("...") instead of single quotes to enclose editing commands.
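A short sketch of the difference in a POSIX shell, with made-up input.
Inside double quotes the dollar sign has to be escaped to reach sed
unchanged, but double quotes are useful when a shell variable should be
expanded into the script. The variable name "word" is hypothetical.

```shell
# Single quotes: the $ address reaches sed untouched.
last_single=$(printf 'x\ny\nz\n' | sed -n '$p')

# Double quotes: the $ must be escaped so the shell leaves it alone.
last_double=$(printf 'x\ny\nz\n' | sed -n "\$p")

# Double quotes on purpose: let the shell expand $word into the script.
word=foo
replaced=$(printf 'foo baz\n' | sed "s/$word/bar/")

printf '%s\n' "$last_single" "$last_double" "$replaced"
```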
USE OF '\t' IN SED SCRIPTS: For clarity in documentation, we have used
the expression '\t' to indicate a tab character (0x09) in the scripts.
However, most versions of sed do not recognize the '\t' abbreviation,
so when typing these scripts from the command line, you should press
the TAB key instead. '\t' is supported as a regular expression
metacharacter in awk, perl, HHsed, sedmod, and GNU sed v3.02.80.
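In a shell script, where pressing the TAB key is awkward, one workaround is
to have printf produce the tab and interpolate it into the sed script. This
is a sketch, not the only way to do it. The variable name "tab" is an
assumption.

```shell
# Store a literal tab character in a shell variable.
tab=$(printf '\t')

# Double quotes let the shell insert the tab into the bracket expression.
stripped=$(printf ' \t  indented\n' | sed "s/^[ $tab]*//")

printf '[%s]\n' "$stripped"
```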
VERSIONS OF SED: Versions of sed do differ, and some slight syntax
variation is to be expected. In particular, most do not support the
use of labels (:name) or branch instructions (b,t) within editing
commands, except at the end of those commands. We have used the syntax
which will be portable to most users of sed, even though the popular
GNU versions of sed allow a more succinct syntax. When the reader sees
a fairly long command such as this:
sed -e '/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d
it is heartening to know that GNU sed will let you reduce it to:
sed '/AAA/b;/BBB/b;/CCC/b;d' # or even
sed '/AAA\|BBB\|CCC/b;d'
In addition, remember that while many versions of sed accept a command
like "/one/ s/RE1/RE2/", some do NOT allow "/one/! s/RE1/RE2/", which
contains space before the 's'. Omit the space when typing the command.
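A sketch of the portable forms on made-up input. The multiple -e script is
checked against "grep -E", used here only as a reference for the expected
lines, and the negated substitution is written without a space after the
"!".

```shell
sample=$(printf 'AAA one\nnothing\ntwo BBB\nCCC three\nnone\n')

# Portable form: one -e per branch, then delete whatever did not branch.
portable=$(printf '%s\n' "$sample" | sed -e '/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d)

# Reference result from grep -E for comparison.
reference=$(printf '%s\n' "$sample" | grep -E 'AAA|BBB|CCC')

# Negated address with no space before the s command.
negated=$(printf 'foo baz\nfoo qux\n' | sed '/baz/!s/foo/bar/')

printf '%s\n' "$portable"
```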
OPTIMIZING FOR SPEED: If execution speed needs to be increased (due to
large input files or slow processors or hard disks), substitution will
be executed more quickly if the "find" expression is specified before
giving the "s/.../.../" instruction. Thus:
sed 's/foo/bar/g' filename # standard replace command
sed '/foo/ s/foo/bar/g' filename # executes more quickly
sed '/foo/ s//bar/g' filename # shorthand sed syntax
On line selection or deletion in which you only need to output lines
from the first part of the file, a "quit" command (q) in the script
will drastically reduce processing time for large files. Thus:
sed -n '45,50p' filename # print line nos. 45-50 of a file
sed -n '51q;45,50p' filename # same, but executes much faster
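Both optimizations leave the output unchanged, which is easy to confirm on
generated input. This sketch checks equivalence only and makes no timing
claim. The helper "numbers" is a made-up stand-in for a large file.

```shell
# Hypothetical input generator: prints the integers 1 to 200, one per line.
numbers() { i=1; while [ "$i" -le 200 ]; do echo "$i"; i=$((i + 1)); done; }

# Early quit: sed stops reading at line 51 instead of scanning to the end.
full=$(numbers | sed -n '45,50p')
early=$(numbers | sed -n '51q;45,50p')

# Address first, then the shorthand empty regex in the s command.
plain=$(printf 'foo foo\nother\n' | sed 's/foo/bar/g')
guarded=$(printf 'foo foo\nother\n' | sed '/foo/ s//bar/g')

printf '%s\n' "$early"
```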
If you have any additional scripts to contribute or if you find errors
in this document, please send e-mail to the compiler. Indicate the
version of sed you used, the operating system it was compiled for, and
the nature of the problem. To qualify as a one-liner, the command line
must be 65 characters or less. Various scripts in this file have been
written or contributed by:
Al Aab # founder of "seders" list
Edgar Allen # various
Yiorgos Adamopoulos # various
Dale Dougherty # author of "sed & awk"
Carlos Duarte # author of "do it with sed"
Eric Pement # author of this document
Ken Pizzini # author of GNU sed v3.02
S.G. Ravenhall # great de-html script
Greg Ubben # many contributions & much help
-------------------------------------------------------------------------