Last active
June 24, 2020 09:09
Example of split/join of fields with AWK
This example shows how $0 and the other $n fields change during the execution
of an AWK script. In a sense this is what using AWK is all about.

Consider the following awk script, with " 1 2 3 4 5" as input.

    {
        print_fields();

        $1 = "";
        $3 = "";
        print_fields();

        $0 = $0;
        print_fields();

        $1 = $1;
        print_fields();
    }

Suppose print_fields() prints the value of $0 and each of the fields as an
array. The output of the script would be this:

    $0: 1 2 3 4 5
    [1,2,3,4,5]

    $0: 2  4 5
    [,2,,4,5]

    $0: 2  4 5
    [2,4,5]

    $0:2 4 5
    [2,4,5]

Understanding why the output looks like this explains A LOT about how AWK
works. A couple of things to note here:

 - $0 is the _original_ record, which is different from "fields joined by
   OFS". That is why the first $0 still has its leading space.

 - Assigning to any field other than $0 causes $0 to be rebuilt from the
   fields, joined by the output field separator OFS (not FS). Empty fields
   are kept, so clearing $3 leaves two consecutive spaces in $0. In some
   fake Python syntax this means something like

       $0 = "{OFS}".join(fields)

 - Assigning to $0 causes field splitting to be executed. Field splitting in
   AWK repopulates the field variables by separating $0 by the current value
   of FS. With the default FS (a single space), leading and trailing
   whitespace is dropped and consecutive whitespace counts as one separator,
   which is why the empty fields disappear.

       [$1..$NF] = $0.split("{FS}")

This essentially explains why there is no split/join function for fields in
AWK. AWK is actually executing one of these behaviors implicitly, every time
we assign to field variables.
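
As a small sketch of these two implicit operations, the shell snippet below
uses a field assignment as a "join" and an assignment to $0 as a "split". The
input strings and the shell variable names (joined, split_result) are made up
for illustration, and the snippet assumes a POSIX-compatible awk.

```sh
# Join: assigning to any field rebuilds $0 from the fields, separated by OFS.
joined=$(echo "a b c" | awk 'BEGIN { OFS = "-" } { $1 = $1; print }')
echo "$joined"

# Split: assigning to $0 re-splits the new value into fields using FS.
# The input record itself is irrelevant here because $0 is overwritten.
split_result=$(echo "ignored" | awk 'BEGIN { FS = ":" } { $0 = "x:y:z"; print NF, $2 }')
echo "$split_result"
```

The first command prints a-b-c and the second prints 3 y. Note that the join
uses OFS while the split uses FS.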

===========
FULL SOURCE

field_test.awk
---------------------------------------
# Print $0 followed by all fields as a comma-separated list.
# res and i are extra parameters, which is AWK's idiom for local variables.
function print_fields(    res, i)
{
    res = $1;
    for (i = 2; i <= NF; i++) {
        res = res "," $i;
    }
    printf("$0:%s\n[%s]\n", $0, res);
    print "";
}

{
    print_fields();

    $1 = "";
    $3 = "";
    print_fields();

    # Force field splitting
    $0 = $0;
    print_fields();

    # Force record recompilation (rebuild $0 from the fields)
    $1 = $1;
    print_fields();
}
---------------------------------------

Run with:

    $ echo " 1 2 3 4 5" | awk -f field_test.awk
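
The differences between the values of $0 are mostly whitespace, which is easy
to miss in a terminal. As a rough sketch, the one-liners below wrap $0 in
brackets after some of the steps. The shell variable names are only for
illustration, and the results assume the default FS and OFS.

```sh
# $0 right after clearing $1 and $3: rebuilt with OFS, empty fields included.
after_clear=$(echo " 1 2 3 4 5" | awk '{ $1 = ""; $3 = ""; print "[" $0 "]" }')
echo "$after_clear"

# $0 after re-splitting ($0 = $0) and rebuilding ($1 = $1): empty fields are gone.
after_rebuild=$(echo " 1 2 3 4 5" | awk '{ $1 = ""; $3 = ""; $0 = $0; $1 = $1; print "[" $0 "]" }')
echo "$after_rebuild"
```

The first result keeps the leading space and the double space left by the
empty fields; the second has neither.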