Learning Bash and Unix as I go. This file documents some of the things I learn along the way in a scratchpad fashion.
- http://www.pixelbeat.org/programming/shell_script_mistakes.html
- http://mywiki.wooledge.org/BashPitfalls
- https://dev.to/thiht/shell-scripts-matter
- https://github.com/j1elo/shell-snippets/blob/master/template.sh
- https://natelandau.com/boilerplate-shell-script-template/
- https://www.linuxjournal.com/content/bash-trap-command
- http://mywiki.wooledge.org/BashFAQ/031 The difference between '[' and '[[' in conditionals.
There are three "standard streams" through which bits flow between the computer and its environment (physical I/O):
0 = STDIN, 1 = STDOUT, 2 = STDERR
https://en.wikipedia.org/wiki/Standard_streams
These are abstracted by the shell running in a text terminal on a modern computer.
You can redirect and pipe streams connecting the output of one process as the input to the next.
# cat = print a file to STDOUT; pipe STDOUT to STDIN for wc -l = count the number of lines
# from STDIN and print to STDOUT
$ cat file.txt | wc -l
# Equivalent to the above command; however, wc reads from the FILE argument instead of STDIN and prints to STDOUT
$ wc -l file.txt
- cmd < FILE means use the content of FILE as STDIN, instead of whatever you type on the keyboard.
- cmd > FILE means redirect the data from STDOUT and write it to FILE, instead of the screen (terminal).
- cmd < FILE1 > FILE2 combines both: read from FILE1 and write to FILE2.
- cmd >> FILE means redirect the data from STDOUT and append it to the end of FILE, instead of the screen (terminal).
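A couple of concrete sketches (unsorted.txt and notes.log are just hypothetical file names here):
$ sort < unsorted.txt > sorted.txt
$ echo "done" >> notes.log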
You could also use HERE documents. These take a literal block of text and treat it as if it were a separate file. You could redirect a HERE document to STDIN, or use it as a FILE argument.
https://en.wikipedia.org/wiki/Here_document
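A minimal sketch: the lines between the EOF markers are fed to wc -l as STDIN, just as if they came from a file:
$ wc -l <<EOF
first line
second line
EOF
2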
Print the exit status of the last command that was executed.
echo $?
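For example, false always exits with status 1 and true with status 0:
$ false
$ echo $?
1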
Execute a command and keep a log file of the STDOUT output (use 2> instead if you want to capture STDERR):
ping google.com > ping.log
Which is helpful if we want to execute ping in the background and keep it going:
nohup ping google.com > ping.log &
- nohup makes a running process immune to HUP (hangup) signals, that is, the SIGHUP signal sent on logout.
- the & makes the command run as a background process.
More info about redirection: https://en.wikipedia.org/wiki/Redirection_(computing)
Maybe I forgot to put a program in the background. No worries. Press CTRL-Z in the terminal to get a prompt. This will send a SIGTSTP signal to the foreground program, effectively suspending it. If we press CTRL-C, we send a SIGINT signal, which interrupts the foreground program, effectively aborting/halting execution.
bg
disown -a
This will resume the suspended program in the background, as if it had been started with &. The disown command will remove the job from the shell's job list, making sure it won't receive a SIGHUP when the terminal closes. However, the job stays connected to the terminal, so when the terminal is closed, the job might fail if it tries to read from STDIN or write to STDOUT.
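A typical sequence might look like this, with sleep 600 standing in for any long-running command:
$ sleep 600
^Z
[1]+  Stopped                 sleep 600
$ bg
[1]+ sleep 600 &
$ disown -a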
See: https://unix.stackexchange.com/questions/3886/difference-between-nohup-disown-and
zcat is like cat, but for gzip-compressed (.gz) files.
zcat collection.tar.gz
Inverting a search for a string: that is, find all lines that do NOT match this pattern:
grep -v ".*string"
Recursively search files in the cwd and its children for lines matching this pattern:
grep -r ".*string" .
Grep using a regex:
grep -G '^apple' fruits.txt
Grep using a regex incoming from a file. Note: our regex.txt file can hold multiple regexes to match each line against:
grep -Gf regex.txt fruits.txt
Suppose our regex file is a CSV with multiple columns, delimited by a comma. And we want to use the first column as a regex:
grep -Gf <(cut -d ',' -f 1 regex.txt) fruits.txt
- Note that between < and ( there is NO space. Bash will complain if you leave a space.
- The block between () is executed separately as a subcommand. Its STDOUT is presented to grep as the file for the -Gf switches.
- The cut command: -d defines the delimiter, and -f 1 means extract the first column.
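Process substitution on its own, as a minimal sketch: Bash replaces the <( ... ) block with a file-like path (e.g. /dev/fd/63; the exact number varies) containing the block's STDOUT:
$ wc -l <(printf 'a\nb\n')
2 /dev/fd/63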
We can manipulate the regex before sending it to grep with sed:
grep -Gf <(cut -d ',' -f 1 regex.txt | sed -e 's/^/^/') fruits.txt
This will add a ^ in front of each regex, anchoring it to the start of the line.
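A quick check of what that sed expression does to a single line:
$ echo "apple" | sed -e 's/^/^/'
^apple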
Let's do something with ISO 639 codes. Given this CSV data, count the number of ISO 639-1 codes:
ISO_639_2,ISO_639_1,eng_label,fre_label
(none)/,(none),Serbo-Croatian,serbo-croate
aar/,aa,Afar,afar
abk/,ab,Abkhazian,abkhaze
ace/,,Achinese,aceh
ach/,,Acoli,acoli
ada/,,Adangme,adangme
ady/,,Adyghe; Adygei,adyghé
afa/,,Afro-Asiatic languages,"afro-asiatiques, langues"
afh/,,Afrihili,afrihili
afr/,af,Afrikaans,afrikaans
ain/,,Ainu,aïnou
aka/,ak,Akan,akan
akk/,,Akkadian,akkadien
alb/sqi,sq,Albanian,albanais
Bash command:
cut -d ',' -f 2 iso-639.csv | grep -v -e "^\s*$" | wc -l
- cut the file and take the 2nd column, containing the ISO 639-1 codes
- grep -v => fetch everything that does NOT match the pattern
- grep -e => match this regular expression pattern
- "^\s*$" => match all strings consisting of 0 or more whitespace characters (the empty string, or whitespace-only strings)
- So, basically, an inverted search on "empty strings" filters out the blank values.
- wc -l => count the lines
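On the sample data above this prints 7, because the header value ISO_639_1 and the literal (none) also survive the empty-line filter. A sketch that skips those as well (tail -n +2 drops the header row), printing 5 for this sample:
tail -n +2 iso-639.csv | cut -d ',' -f 2 | grep -v -e "^\s*$" -e "^(none)$" | wc -l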
sed -e "s/.*\$\$a([^)][^)]*)\([^\$][^\$]*\).*/\1/"
sed will do a find/replace based on pattern matching, from STDIN to STDOUT. Note the [^\$][^\$]* and [^)][^)]* for matching one or more characters that are not a literal $ or ), respectively. We need to repeat the pattern because sed uses basic regular expressions (BRE), which lack +, unless you add the -E flag (an alias for -r) to switch to extended regular expressions (ERE).
See: https://www.gnu.org/software/sed/manual/sed.html#BRE-vs-ERE
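A sketch of what that expression extracts, assuming a hypothetical MARC-like input line containing a $$a(...)... subfield:
$ echo 'xx$$a(DE-588)4074335-4$$2gnd' | sed -e "s/.*\$\$a([^)][^)]*)\([^\$][^\$]*\).*/\1/"
4074335-4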
sed -e "/^\$/d"
This will delete all empty lines from a given input (file or stdin)
see: https://www.cyberciti.biz/faq/using-sed-to-delete-empty-lines/
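A quick sketch:
$ printf 'one\n\ntwo\n' | sed -e "/^\$/d"
one
two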
With a list of items in a file fruits.txt:
apple
apple
orange
banana
Let's do an aggregated count:
cat fruits.txt | sort | uniq -c | sort -n
1 banana
1 orange
2 apple
xargs -n 1 -a fruits.txt echo
xargs will take a bunch of input and map it onto the arguments of a command, one item at a time. We can even do explicit substitution: in this example, FOO is replaced by each input line through the -I switch:
xargs -a fruits.txt -I FOO echo FOO
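Both invocations run echo once per line of fruits.txt, so the output is simply:
apple
apple
orange
banana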
Let's compare two files and filter file A with the values from file B. $1 is the first column in both files:
awk 'NR==FNR{a[$1];next} ($1) in a' file.b.txt file.a.txt
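How it works: NR==FNR is only true while awk reads the first file (file.b.txt), so a[$1] collects its first-column values and next skips the rest of the rule; for the second file, lines whose $1 exists in a trigger the default print action. A sketch with hypothetical contents for both files:
$ cat file.b.txt
apple 1
banana 2
$ cat file.a.txt
apple red
orange orange
banana yellow
$ awk 'NR==FNR{a[$1];next} ($1) in a' file.b.txt file.a.txt
apple red
banana yellow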