@rmtbb
Created September 3, 2025 22:23
List Git Files - List raw GitHub URLs for files in a repo or subfolder, filtered by extensions
#!/usr/bin/env bash
# listgitfiles.sh
# List raw GitHub URLs for files in a repo or subfolder, filtered by extensions.
# Full README is embedded and printed only with: listgitfiles.sh -h|--help
set -euo pipefail
# ------------------------- Minimal Help (usage) -------------------------
print_min_help() {
  cat <<'USAGE'
Usage:
  listgitfiles.sh <repo_or_url> [extensions]

Examples:
  listgitfiles.sh https://github.com/rmtbb/superjackthegame/tree/main/heroimages
  listgitfiles.sh rmtbb/superjackthegame@main "png,webp"
  listgitfiles.sh rmtbb/superjackthegame/heroimages "jpg,jpeg"

Notes:
  - [extensions] is comma-separated (no dots), case-insensitive.
  - Defaults: png,jpg,jpeg,gif,webp,svg,avif
  - Requires: curl, jq

For full help: listgitfiles.sh --help
USAGE
}
# ------------------------- Full README (for --help) -------------------------
print_full_help() {
  cat <<'README'
listgitfiles.sh - List raw GitHub file URLs from a repo or subfolder

USAGE
  listgitfiles.sh <repo_or_url> [extensions]

ARGUMENTS
  <repo_or_url>
    Any of:
      - owner/repo
      - owner/repo@branch
      - owner/repo/path/to/folder
      - owner/repo/path@branch
      - https://github.com/owner/repo
      - https://github.com/owner/repo/tree/<branch>
      - https://github.com/owner/repo/tree/<branch>/path/to/folder

  [extensions] (optional)
    Comma-separated list of file extensions (no dots), case-insensitive.
    Default: png,jpg,jpeg,gif,webp,svg,avif

EXAMPLES
  # All images in a folder (branch inferred from URL)
  listgitfiles.sh "https://github.com/rmtbb/superjackthegame/tree/main/heroimages"

  # Only png + webp in that folder
  listgitfiles.sh "https://github.com/rmtbb/superjackthegame/tree/main/heroimages" "png,webp"

  # Whole repo on a branch, filter to jpg/jpeg
  listgitfiles.sh "rmtbb/superjackthegame@main" "jpg,jpeg"

  # Subfolder with implicit default branch
  listgitfiles.sh "rmtbb/superjackthegame/heroimages"

OUTPUT
  Newline-separated list of raw content URLs, e.g.:
    https://raw.githubusercontent.com/<owner>/<repo>/<branch>/<path/to/file.ext>

REQUIREMENTS
  - curl
  - jq
  - Optional: GITHUB_TOKEN (for higher API rate limits)

HOW IT WORKS
  - Discovers/uses branch (from URL or @branch, else repo default via API).
  - Fetches a recursive tree for the branch:
      https://api.github.com/repos/<owner>/<repo>/git/trees/<branch>?recursive=1
  - Filters by (optional) subfolder prefix and file extensions.
  - Prints raw URLs via https://raw.githubusercontent.com/.

TIPS
  - Set GITHUB_TOKEN to avoid strict unauthenticated rate limits:
      export GITHUB_TOKEN=ghp_xxx...
  - Pipe to a file:
      listgitfiles.sh <repo> > urls.txt
README
}
err() { printf "Error: %s\n" "$*" >&2; }

need_tool() {
  command -v "$1" >/dev/null 2>&1 || { err "Required tool '$1' not found in PATH."; exit 1; }
}
# ------------------------- Dependencies -------------------------
need_tool curl
need_tool jq
# ------------------------- Flags -------------------------
if [[ "${1:-}" == "-h" || "${1:-}" == "--help" ]]; then
  print_full_help
  exit 0
fi
# ------------------------- Args & Basic Validation -------------------------
if [[ $# -lt 1 || $# -gt 2 ]]; then
  err "Incorrect usage."
  echo
  print_min_help
  exit 2
fi
INPUT="$1"
EXTS_RAW="${2:-png,jpg,jpeg,gif,webp,svg,avif}"
# Normalize extensions -> regex like \.(png|jpg|jpeg)$ (case-insensitive at jq)
IFS=',' read -r -a EXTS_ARR <<< "$EXTS_RAW"
if [[ ${#EXTS_ARR[@]} -eq 0 ]]; then
  err "Could not parse extensions list."
  echo
  print_min_help
  exit 2
fi
EXT_REGEX="\\.($(printf "%s|" "${EXTS_ARR[@]}" | sed 's/|$//'))$"
# ------------------------- Parse repo/url input -------------------------
OWNER=""
REPO=""
BRANCH=""
PATH_PREFIX=""
trim_slashes() { sed 's#^/*##; s#/*$##'; }

if [[ "$INPUT" =~ ^https?://github\.com/([^/]+)/([^/]+)($|/.*) ]]; then
  OWNER="${BASH_REMATCH[1]}"
  REPO="${BASH_REMATCH[2]}"
  REST="${BASH_REMATCH[3]}"
  # /tree/<branch>[/path...]
  if [[ "$REST" =~ ^/tree/([^/]+)(/.*)?$ ]]; then
    BRANCH="${BASH_REMATCH[1]}"
    PATH_PREFIX="${BASH_REMATCH[2]:-/}"
    PATH_PREFIX="$(echo "$PATH_PREFIX" | trim_slashes)"
  else
    BRANCH=""
    PATH_PREFIX=""
  fi
else
  # Non-URL: owner/repo[/path][@branch]
  MAIN_PART="$INPUT"
  if [[ "$INPUT" == *@* ]]; then
    MAIN_PART="${INPUT%@*}"
    BRANCH="${INPUT##*@}"
  fi
  IFS='/' read -r -a PARTS <<< "$MAIN_PART"
  if [[ ${#PARTS[@]} -lt 2 ]]; then
    err "Could not parse owner/repo from input: '$INPUT'"
    echo
    print_min_help
    exit 2
  fi
  OWNER="${PARTS[0]}"
  REPO="${PARTS[1]}"
  if [[ ${#PARTS[@]} -gt 2 ]]; then
    PATH_PREFIX="$(printf "/%s" "${PARTS[@]:2}" | sed 's#^/##')"
  else
    PATH_PREFIX=""
  fi
fi
# ------------------------- GitHub API helper -------------------------
api_get() {
  local url="$1"
  if [[ -n "${GITHUB_TOKEN:-}" ]]; then
    curl -fsSL -H "Authorization: Bearer $GITHUB_TOKEN" -H "Accept: application/vnd.github+json" "$url"
  else
    curl -fsSL -H "Accept: application/vnd.github+json" "$url"
  fi
}
# Discover default branch if not given
if [[ -z "$BRANCH" ]]; then
  # The trailing "|| BRANCH=..." keeps set -e / pipefail from aborting the script
  # when the API call fails, so the main/master probe below can still run.
  BRANCH="$(api_get "https://api.github.com/repos/$OWNER/$REPO" | jq -r '.default_branch // empty')" || BRANCH=""
  if [[ -z "$BRANCH" ]]; then
    # Fallback probe main/master
    for b in main master; do
      if api_get "https://api.github.com/repos/$OWNER/$REPO/git/trees/$b?recursive=1" >/dev/null 2>&1; then
        BRANCH="$b"
        break
      fi
    done
  fi
  if [[ -z "$BRANCH" ]]; then
    err "Could not discover default branch for $OWNER/$REPO."
    exit 3
  fi
fi
# ------------------------- Fetch tree & print raw URLs -------------------------
TREE_JSON="$(api_get "https://api.github.com/repos/$OWNER/$REPO/git/trees/$BRANCH?recursive=1")" || {
  err "Failed to fetch repository tree. Check repo/branch and your network."
  exit 4
}
if [[ "$(echo "$TREE_JSON" | jq -r 'has("tree")')" != "true" ]]; then
  MESSAGE="$(echo "$TREE_JSON" | jq -r '.message // empty')"
  if [[ -n "$MESSAGE" ]]; then
    err "GitHub API error: $MESSAGE"
  else
    err "Unexpected API response; no 'tree' key."
  fi
  exit 4
fi
jq -r \
  --arg owner "$OWNER" \
  --arg repo "$REPO" \
  --arg branch "$BRANCH" \
  --arg path_prefix "${PATH_PREFIX}" \
  --arg path_prefix_trimmed "$(echo "${PATH_PREFIX}" | trim_slashes)" \
  --arg ext_regex "$EXT_REGEX" \
  '
  .tree[]
  | select(.type=="blob")
  | .path as $p
  | select(
      ($p | test($ext_regex; "i"))
      and
      (
        ($path_prefix == "") or
        # Match on a directory boundary so that "hero" does not also match "heroimages/".
        ($p | startswith($path_prefix_trimmed + "/"))
      )
    )
  # Percent-encode each path segment so that names containing spaces or "#" yield valid URLs.
  | "https://raw.githubusercontent.com/\($owner)/\($repo)/\($branch)/\($p | split("/") | map(@uri) | join("/"))"
  ' <<< "$TREE_JSON"
rmtbb commented Sep 3, 2025

listgitfiles.sh - by Remote BB

List raw GitHub file URLs from a repo or subfolder, filtered by extensions.

Usage

listgitfiles.sh <repo_or_url> [extensions]

Arguments

  • <repo_or_url>

    • Any of:
      • owner/repo
      • owner/repo@branch
      • owner/repo/path/to/folder
      • owner/repo/path@branch
      • https://github.com/owner/repo
      • https://github.com/owner/repo/tree/<branch>
      • https://github.com/owner/repo/tree/<branch>/path/to/folder
  • [extensions] (optional)
    Comma-separated list of file extensions (no dots), case-insensitive.
    Default: png,jpg,jpeg,gif,webp,svg,avif
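
The non-URL forms above can be split with plain bash string operations. The following is a minimal sketch of that idea, not the script's exact code; `parse_spec` and its `key=value` output format are hypothetical names chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split "owner/repo[/path][@branch]" into its parts.
# Prints one line: owner=<o> repo=<r> path=<p> branch=<b> (path and branch may be empty).
# Returns 1 when the input has fewer than two "/"-separated parts.
parse_spec() {
  local spec="$1" main="$1" branch="" path=""
  local -a parts
  if [[ "$spec" == *@* ]]; then
    main="${spec%@*}"      # everything before the last "@"
    branch="${spec##*@}"   # everything after the last "@"
  fi
  IFS='/' read -r -a parts <<< "$main"
  (( ${#parts[@]} >= 2 )) || return 1
  if (( ${#parts[@]} > 2 )); then
    local IFS='/'
    path="${parts[*]:2}"   # re-join the remaining parts with "/"
  fi
  printf 'owner=%s repo=%s path=%s branch=%s\n' "${parts[0]}" "${parts[1]}" "$path" "$branch"
}
```

For example, `parse_spec "owner/repo/path/to/folder@dev"` reports `path=path/to/folder` and `branch=dev`. Because the split happens at the last `@`, a path that itself contains `@` would be misread under this approach.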

Examples

# All images in a folder (branch inferred from URL)
listgitfiles.sh "https://github.com/rmtbb/superjackthegame/tree/main/heroimages"

# Only png + webp in that folder
listgitfiles.sh "https://github.com/rmtbb/superjackthegame/tree/main/heroimages" "png,webp"

# Whole repo on a branch, filter to jpg/jpeg
listgitfiles.sh "rmtbb/superjackthegame@main" "jpg,jpeg"

# Subfolder with implicit default branch
listgitfiles.sh "rmtbb/superjackthegame/heroimages"

Output

Newline-separated list of raw content URLs, e.g.:

https://raw.githubusercontent.com/<owner>/<repo>/<branch>/<path/to/file.ext>

Requirements

  • curl
  • jq
  • Optional: GITHUB_TOKEN (for higher API rate limits)

How it works

  • Discovers/uses branch (from URL or @branch, else repo default via API).
  • Fetches a recursive tree for the branch:
    https://api.github.com/repos/<owner>/<repo>/git/trees/<branch>?recursive=1
    
  • Filters by (optional) subfolder prefix and file extensions.
  • Prints raw URLs via https://raw.githubusercontent.com/.
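
The filtering step can be illustrated without calling the API. The script itself filters with jq; the sketch below is a bash-only stand-in that reads paths from stdin, and the function names (`build_ext_regex`, `matches_ext`, `filter_paths`) are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the prefix + extension filter on a list of paths.

# Turn "png,webp" into the regex \.(png|webp)$
build_ext_regex() {
  local IFS=','
  local -a exts
  read -r -a exts <<< "$1"
  IFS='|'
  printf '\\.(%s)$' "${exts[*]}"
}

# Case-insensitive regex match: $1 = path, $2 = regex.
matches_ext() {
  local rc=1
  shopt -s nocasematch
  [[ "$1" =~ $2 ]] && rc=0
  shopt -u nocasematch
  return "$rc"
}

# Read paths on stdin; print those under folder $1 (may be empty)
# whose extension is in the comma-separated list $2.
filter_paths() {
  local prefix="${1%/}" regex path
  regex="$(build_ext_regex "$2")"
  while IFS= read -r path; do
    # Case-sensitive prefix check on a directory boundary.
    if [[ -n "$prefix" && "$path" != "$prefix"/* ]]; then
      continue
    fi
    if matches_ext "$path" "$regex"; then
      printf '%s\n' "$path"
    fi
  done
}
```

Piping a list of paths through `filter_paths "heroimages" "png,webp"` keeps only files under `heroimages/` with a matching extension; a sibling folder such as `hero/` is not matched because the prefix is compared with a trailing slash.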

Tips

  • Set GITHUB_TOKEN to avoid strict unauthenticated rate limits:
    export GITHUB_TOKEN=ghp_xxx...
  • Pipe to a file:
    listgitfiles.sh <repo> > urls.txt
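
Once the URLs are saved to a file, they can be downloaded while keeping the repository's folder layout. This is a rough sketch under stated assumptions: `local_path_for` and `download_all` are hypothetical helpers, the branch name is assumed to contain no `/`, and paths are used exactly as they appear in the URL (no percent-decoding).

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a raw URL to a local relative path by dropping
# the scheme, host, owner, repo, and branch components.
# Assumes the branch name contains no "/".
local_path_for() {
  local rest="${1#https://raw.githubusercontent.com/}"
  rest="${rest#*/}"   # drop owner
  rest="${rest#*/}"   # drop repo
  rest="${rest#*/}"   # drop branch
  printf '%s\n' "$rest"
}

# Download every URL listed in file $1 into directory $2 (default: downloads).
download_all() {
  local list="$1" dest="${2:-downloads}" url target
  while IFS= read -r url; do
    [[ -n "$url" ]] || continue          # skip blank lines
    target="$dest/$(local_path_for "$url")"
    mkdir -p "$(dirname "$target")"
    curl -fsSL -o "$target" "$url"
  done < "$list"
}
```

With these functions sourced, `download_all urls.txt images` would fetch each listed file into `images/`, recreating subfolders as needed.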
