@rcanepa
Last active August 5, 2016 22:50
UNIX & Linux Grep
grep searches for strings or patterns inside text, such as the contents of a file or the output of another command.
Basic usage:
grep some-string some-file
-> This returns every line of the file named 'some-file' that contains the string 'some-string'.
ls | grep some-string
-> This returns every file name in the current directory that contains the string 'some-string'.
Wildcards:
The . wildcard matches any single character: exactly one character of any kind must appear at that position, and everything else in the pattern must match too.
grep 'function .e' some-file
-> This returns all lines of the file 'some-file' that contain 'function ', then any one character, then an 'e'. Match examples:
function revisions {
function delete {
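A quick way to try the . wildcard without a real file is to feed the lines to grep on standard input. This is only a sketch; the third input line, 'function push {', is made up here to show a non-match:

```shell
# Two lines from the notes plus one made-up line that should NOT match,
# because the character after 'function p' is 'u', not 'e'.
dot_matches=$(printf '%s\n' \
  'function revisions {' \
  'function delete {' \
  'function push {' | grep 'function .e')

echo "$dot_matches"
```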
The * wildcard means that the character before it may be repeated 0 or more times. Examples:
grep 'h.*h' 'some-file'
-> This matches any line containing an 'h', followed by 0 or more characters of any kind, and then another 'h'. Match examples:
hash
the boy is going to the town (Match on: he boy ... to th)
a* matches "", "a", "aa", ...
aa* matches "a", "aa", "aaa", ...
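To see which part of a line was actually matched, the -o option can help. It is not part of POSIX grep, but GNU grep and BSD grep both provide it, so treat this as a sketch for those implementations:

```shell
# -o prints only the matched text. '.*' is greedy, so the match runs
# from the first 'h' to the last 'h' on the line.
greedy=$(echo 'the boy is going to the town' | grep -o 'h.*h')
echo "$greedy"

# 'aa*' needs at least one 'a'. 'a*' also matches the empty string,
# so it selects every line, even lines without any 'a'.
at_least_one_a=$(printf '%s\n' 'b' 'a' 'baab' | grep 'aa*')
every_line=$(printf '%s\n' 'b' 'a' 'baab' | grep 'a*')
echo "$at_least_one_a"
echo "$every_line"
```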
To escape a special character (so that grep does not interpret its special meaning), put the \ character before it.
To find all lines that contain the string 'image.gif', the dot must be escaped:
grep 'image\.gif' 'some-file'
If we have the following list of files:
bower.json
dummyjsfile
gulpfile.js
karma.conf.js
package.json
protractor.conf.js
Executing ls | grep '.js' will return all files, because in this case the dot is a wildcard that matches any character. To match a literal dot in the file name, escape the dot in the grep statement:
ls | grep '\.js' <- this will return all files that contain the string '.js' in their name.
Likewise, the unescaped dot in grep 'somefile.ext' will match file names like these:
somefile-ext
somefile.ext
somefileaext
somefilebext
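The difference between the escaped and unescaped dot can be checked by counting matches with grep -c. This sketch replays the file names listed above through a small helper function (list_files is a name invented for this example) instead of calling ls:

```shell
# Helper that prints the example file names, one per line.
list_files() {
  printf '%s\n' bower.json dummyjsfile gulpfile.js \
    karma.conf.js package.json protractor.conf.js
}

# Unescaped dot: any character followed by 'js', so 'dummyjsfile' matches too.
unescaped_count=$(list_files | grep -c '.js')

# Escaped dot: only names that contain a literal '.js'.
escaped_count=$(list_files | grep -c '\.js')

# The same idea applied to the 'somefile.ext' example.
literal_ext_count=$(printf '%s\n' somefile-ext somefile.ext somefileaext |
  grep -c 'somefile\.ext')

echo "$unescaped_count $escaped_count $literal_ext_count"
```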
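The escaped ? (written \?) makes the character before it optional: that character may be present once or not at all. In basic regular expressions \? is a GNU extension; with grep -E the same operator is written as a plain ?. For example:
grep 'abc\?' <- will match 'abc' and 'ab'
ls | grep '\.jso\?' <- will match file names that contain '.js' or '.jso'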
An expression surrounded by escaped parentheses is treated as a single unit. So, in combination with \?, it is possible to specify an optional substring.
grep 'abcd\(mm\)\?xyz' <- will match 'abcdmmxyz' and 'abcdxyz'
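A small sketch of the optional character and the optional group. It assumes a grep that accepts the GNU-style \? in basic regular expressions; the grep -E line shows the extended-syntax spelling of the same group pattern. The input strings 'ac' and 'abcdmxyz' are made up as non-matches:

```shell
# 'c' is optional, so both 'ab' and 'abc' match; 'ac' does not.
optional_char=$(printf '%s\n' 'ab' 'abc' 'ac' | grep 'abc\?')

# The whole group 'mm' is optional; a single 'm' does not fit.
optional_group=$(printf '%s\n' 'abcdmmxyz' 'abcdxyz' 'abcdmxyz' |
  grep 'abcd\(mm\)\?xyz')

# Same group pattern in extended syntax (no backslashes needed).
optional_group_ere=$(printf '%s\n' 'abcdmmxyz' 'abcdxyz' 'abcdmxyz' |
  grep -E 'abcd(mm)?xyz')

echo "$optional_char"
echo "$optional_group"
```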
The [] brackets match a single character from a list. Quote the pattern so that the shell does not expand the brackets as a file name glob.
grep '[hH]' <- will match strings such as "hello" and "Hello"
grep '[0-9]' <- will match the strings "1", "2", "a3bc", but it won't match "a" or "house"
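The bracket examples can be tried on standard input as follows. This is just a sketch; the word 'jello' is an invented non-match, and the pattern '[hH]ello' is used instead of '[hH]' so that the difference is visible:

```shell
# One character out of the list 'h' or 'H', followed by 'ello'.
h_words=$(printf '%s\n' 'hello' 'Hello' 'jello' | grep '[hH]ello')

# Any line that contains at least one digit.
with_digit=$(printf '%s\n' '1' '2' 'a3bc' 'a' 'house' | grep '[0-9]')

echo "$h_words"
echo "$with_digit"
```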
There are also named character classes (the equivalences below hold in the C/POSIX locale):
[[:alpha:]] is the same as [a-zA-Z]
[[:upper:]] is the same as [A-Z]
[[:lower:]] is the same as [a-z]
[[:digit:]] is the same as [0-9]
[[:alnum:]] is the same as [0-9a-zA-Z]
[[:space:]] matches any whitespace character, including spaces and tabs
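A short sketch of two of these classes in use; all input strings here are made up for illustration:

```shell
# Count the lines that contain whitespace (a space or a tab here).
with_space_count=$(printf 'nospace\nhas space\nhas\ttab\n' |
  grep -c '[[:space:]]')

# Select the lines that contain at least one upper-case letter.
with_upper=$(printf '%s\n' 'abc' 'aBc' '123' | grep '[[:upper:]]')

echo "$with_space_count"
echo "$with_upper"
```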
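The [] can also match any character that is not in the list. This is done by putting a ^ in the first position inside the [].
grep '([^x]*)a' <- will match '()a', 'x(b)ax', '(dsakjheqw123123)a' but it won't match something like '(x)a' (anything with an x between the parentheses). The pattern is quoted because unquoted parentheses are shell syntax; inside a basic regular expression, unescaped parentheses are literal characters.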
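The negated list can be checked against the four strings mentioned above. A minimal sketch, with the pattern quoted so the shell passes it to grep unchanged:

```shell
# '(' and ')' are literal here; '[^x]*' is any run of characters
# other than 'x', so '(x)a' is the only line that fails to match.
no_x_inside=$(printf '%s\n' '()a' 'x(b)ax' '(dsakjheqw123123)a' '(x)a' |
  grep '([^x]*)a')

echo "$no_x_inside"
```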
To match a specific number of repetitions of a pattern, follow the pattern with \{n\}, where n is the number of repetitions.
grep '[[:digit:]]\{3\}' <- will match any line with at least 3 consecutive digits
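grep '+[[:digit:]]\{3\}-[[:digit:]]\{5\}' <- will match strings like '... +562-12356 ...' (in a basic regular expression an unescaped + is a literal plus sign, whereas \+ is the GNU "one or more" operator)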
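A sketch of the repetition counts in action. The literal plus sign is written as [+] here so that it is read as a plain character whichever regular expression dialect is in use; the input lines other than the '+562-12356' sample are invented:

```shell
# At least three consecutive digits somewhere in the line.
three_digits=$(printf '%s\n' 'a1b22' 'abc12345' | grep '[[:digit:]]\{3\}')

# A '+', exactly three digits, a '-', then five digits.
phone_like=$(printf '%s\n' \
  'call +562-12356 now' \
  'call 562-12356 now' \
  'call +56-12356 now' | grep '[+][[:digit:]]\{3\}-[[:digit:]]\{5\}')

echo "$three_digits"
echo "$phone_like"
```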
The $ character matches the end of the line. The ^ character matches the beginning of the line.
grep '^[[:alnum:]]\+@[[:alnum:]]\+\.[[:alpha:]]\+$' <- will match any line made of 1 or more alphanumeric characters, an @, 1 or more alphanumeric characters, a ., and then 1 or more alphabetic characters up to the end of the line.
Match examples:
[email protected]
[email protected]
[email protected]
Don't match examples:
@xac.com (empty string before @)
[email protected] (empty string after @ and before .)
as@1as. (empty string after .)
[email protected] (not [[:alpha:]] character after .)
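The anchored pattern can be exercised like this. It is a sketch that assumes a grep accepting the GNU-style \+ in basic regular expressions; the example.com, example.org and example.c0m addresses are invented for the demonstration, not taken from the original examples:

```shell
# Only lines that consist entirely of alnum@alnum.alpha are selected.
valid=$(printf '%s\n' \
  'user1@example.com' \
  'admin@example.org' \
  '@xac.com' \
  'as@1as.' \
  'user@example.c0m' |
  grep '^[[:alnum:]]\+@[[:alnum:]]\+\.[[:alpha:]]\+$')

echo "$valid"
```

Because of the ^ and $ anchors, extra text before or after the address on the same line also prevents a match.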
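ll | grep '^[d].*[[:space:]]\+\.' <- hidden directories: lines of the long listing whose permission string starts with a 'd' and whose name starts with a . (ll is assumed here to be a shell alias for a long listing that includes hidden files, such as ls -la)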
drwxr-xr-x 30 rcanepa staff 1.0K Aug 5 17:41 .
drwxr-xr-x 7 rcanepa staff 238B Feb 10 14:04 ..
drwxr-xr-x 15 rcanepa staff 510B Aug 5 18:41 .git
drwxr-xr-x 10 rcanepa staff 340B Feb 10 14:04 .idea
drwxr-xr-x 4 rcanepa staff 136B Aug 3 16:22 .tmp
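Since ll is an alias and its output depends on the directory, the pattern is easier to test against fixed sample lines. In this sketch the '.git' and 'gulpfile.js' lines come from the listings in these notes, while the 'src' and '.gitignore' lines are made up:

```shell
# Keep lines that start with 'd' and contain whitespace followed by '.'.
hidden_dirs=$(printf '%s\n' \
  'drwxr-xr-x 15 rcanepa staff 510B Aug 5 18:41 .git' \
  '-rw-r--r-- 1 rcanepa staff 232B Feb 10 14:04 gulpfile.js' \
  'drwxr-xr-x 3 rcanepa staff 102B Aug 5 17:00 src' \
  '-rw-r--r-- 1 rcanepa staff 12B Aug 5 17:00 .gitignore' |
  grep '^[d].*[[:space:]]\+\.')

echo "$hidden_dirs"
```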
To match either of two patterns (alternation), use the escaped | character.
ll | grep '\(\.js$\)\|\(\.sh$\)' <- will match files whose names end in .js or .sh
-rwxr-xr-x 1 rcanepa staff 512B Mar 10 19:22 deploy.sh
-rw-r--r-- 1 rcanepa staff 232B Feb 10 14:04 gulpfile.js
-rw-r--r-- 1 rcanepa staff 243B Feb 10 14:04 karma.conf.js
-rw-r--r-- 1 rcanepa staff 708B Feb 10 14:04 protractor.conf.js
-rwxr-xr-x 1 rcanepa staff 769B Aug 5 17:00 stats.sh
ls | grep 'a.*\(\.js\|\.sh\)$' <- matches file names that contain an 'a' and end in '.js' or '.sh'
karma.conf.js
protractor.conf.js
stats.sh
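The alternation examples can be reproduced on a fixed list of names. This sketch assumes a grep that accepts the GNU-style \| in basic regular expressions and also shows the grep -E spelling; list_names is a helper name invented here, and package.json is included as a non-match:

```shell
# Helper that prints the example file names, one per line.
list_names() {
  printf '%s\n' deploy.sh gulpfile.js karma.conf.js \
    package.json protractor.conf.js stats.sh
}

# Names ending in '.js' or '.sh' (basic syntax, escaped operators).
js_or_sh=$(list_names | grep '\.\(js\|sh\)$')

# The same selection in extended syntax.
js_or_sh_ere=$(list_names | grep -E '\.(js|sh)$')

# Names that contain an 'a' and end in '.js' or '.sh'.
with_a=$(list_names | grep 'a.*\(\.js\|\.sh\)$')

echo "$js_or_sh"
echo "$with_a"
```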