@paul-d-ray
Created May 26, 2025 21:57
Using RG and PDF files

RG and PDF files

To search inside PDF files, use rga.

RGA

If you want to search inside PDFs (and many other file types like Office documents, archives, e-books, etc.), you can use ripgrep-all (rga), which is a wrapper around ripgrep that adds support for these formats.

rga is available for Windows and can be installed using package managers like Chocolatey or Scoop.

rga -i -e 'duffel' --no-ignore --hidden -g '*.pdf' 'h:/personal/invoices/rei'

RG and Ignore options

With the command line below, ripgrep will not look inside any folder named versions.

rg -i  -g "*.nu" -g "!**/versions/**" --pcre2 --regexp '^(?!#).*nuls' 'D:\work\nu' 'C:\Users\Paul_Ray\AppData\Roaming'

Component-by-Component Explanation

  • rg: This invokes the ripgrep command-line tool, designed for searching files for patterns using regular expressions.

  • -i: Flag: --ignore-case Purpose: Makes the search case-insensitive. For example, the pattern nuls will match NULS, nuls, NuLs, etc.

  • -g "*.nu": Flag: --glob Purpose: Filters files to include only those with the .nu extension (e.g., script.nu, example.nu). The glob pattern *.nu matches any file name ending in .nu. Details: The quotes around "*.nu" ensure the glob pattern reaches ripgrep unchanged, especially in shells that might expand wildcards (e.g., Bash).

  • -g "!**/versions/**": Flag: --glob Purpose: Excludes files in any directory named versions (or subdirectories thereof) from the search. Details: The ! prefix negates the glob, meaning files matching this pattern are skipped. The ** syntax is a recursive wildcard, so !**/versions/** excludes files in any versions directory at any depth (e.g., D:\work\nu\versions, D:\work\nu\project\versions, etc.).

  • --pcre2: Flag: --pcre2 (or -P) Purpose: Enables the PCRE2 (Perl-Compatible Regular Expressions) regex engine, which allows advanced regex features like the negative lookahead used in the provided regex pattern. Ripgrep’s default regex engine does not support look-around or backreferences, so this pattern needs PCRE2.

  • --regexp '^(?!#).*nuls': Flag: --regexp (or -e) Purpose: Specifies the regular expression to search for in the files. Regex Breakdown: ^: Matches the start of a line. (?!#): A negative lookahead that ensures the line does not start with a # (useful for skipping comments in many file formats, like scripts where # denotes a comment). .*: Matches any characters (zero or more) after the start of the line. nuls: Matches the literal string nuls. Combined Effect: Matches any line that contains nuls (case-insensitive due to -i) and does not start with #. For example, it would match This is nuls or nuls example, but not # nuls comment.

  • 'D:\work\nu': Argument: Specifies the first directory to search. Purpose: Ripgrep will recursively search all files in D:\work\nu (and its subdirectories) that match the glob pattern *.nu and are not excluded by !**/versions/**. Details: The backslashes (\) are Windows-style path separators. Ripgrep accepts both backslashes and forward slashes, but forward slashes (D:/work/nu) are more portable.

  • 'C:\Users\Paul_Ray\AppData\Roaming': Argument: Specifies the second directory to search. Purpose: Ripgrep will also recursively search all files in C:\Users\Paul_Ray\AppData\Roaming (and its subdirectories) that match the glob pattern *.nu and are not excluded by !**/versions/**. Details: Like the first path, this uses Windows-style backslashes and is enclosed in quotes to handle potential spaces or special characters.

What the Command Does

This command searches for lines containing the string nuls (case-insensitive) in .nu files located in either D:\work\nu or C:\Users\Paul_Ray\AppData\Roaming, excluding any files in directories named versions. Lines starting with # (e.g., comments) are ignored due to the regex ^(?!#).*nuls. The PCRE2 engine ensures the negative lookahead in the regex is processed correctly.
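
To see what the globs and the regex select without running ripgrep, here is a minimal Python sketch. It is an approximation, not ripgrep itself: Python's re module stands in for PCRE2 (both accept this lookahead syntax), re.IGNORECASE stands in for -i, and the helper names is_searched and matching_lines, along with the sample paths and sample text, are made up for illustration.

```python
import re
from pathlib import PurePosixPath

# Same pattern as in the rg command; re.IGNORECASE plays the role of -i.
PATTERN = re.compile(r"^(?!#).*nuls", re.IGNORECASE)


def is_searched(path: str) -> bool:
    """Approximate -g "*.nu" -g "!**/versions/**" (hypothetical helper).

    Keeps files whose name ends in .nu and that do not sit under
    any directory named versions.
    """
    p = PurePosixPath(path)
    in_versions_dir = "versions" in p.parts[:-1]
    return p.name.endswith(".nu") and not in_versions_dir


def matching_lines(text: str) -> list[str]:
    """Return the lines the regex would report (hypothetical helper)."""
    return [line for line in text.splitlines() if PATTERN.search(line)]


# Made-up sample data, using forward slashes for portability.
sample_paths = [
    "D:/work/nu/script.nu",
    "D:/work/nu/versions/old.nu",
    "D:/work/nu/project/versions/sub/old.nu",
    "D:/work/nu/notes.txt",
]
sample_text = (
    "This is nuls\n"
    "# nuls comment\n"
    "NULS example\n"
    "  # indented nuls comment\n"
    "nothing here\n"
)

print([p for p in sample_paths if is_searched(p)])
# ['D:/work/nu/script.nu']
print(matching_lines(sample_text))
# ['This is nuls', 'NULS example', '  # indented nuls comment']
```

Note the third result: the lookahead only checks the first character of the line, so a comment that is indented with spaces still matches. If indented comments should be skipped too, a pattern such as ^(?!\s*#).*nuls is one possible variation, offered here as a suggestion and not part of the original command.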
