Last active
January 13, 2024 14:41
#!/bin/bash
# This script downloads the contents of a GitHub repo
# and places them in a local directory.
#
# Usage:
#   download-repo.sh <repo> <output-path> <nested-path> <branch-name>
#
# Example:
#   download-repo.sh wattenberger/kumiko ./kumiko-assets public/assets master
#
# You'll get rate-limited by GitHub, so create a PAT here:
#   https://github.com/settings/tokens
# This will also let you download from private repos.
GITHUB_TOKEN="YOUR_TOKEN_HERE"
repo=$1
# split repo name to username and repository name
repo_name=$(echo "${repo}" | cut -d/ -f2)
repo_user=$(echo "${repo}" | cut -d/ -f1)
output_path=$2
nested_path=$3
branch_name=$4
# if no branch_name is given, use main or master or the first one listed
if [ -z "$branch_name" ]; then
  # get branches from repo
  branches_string=$(curl -s -H "Authorization: token ${GITHUB_TOKEN}" "https://api.github.com/repos/${repo_user}/${repo_name}/branches")
  branches=$(echo "${branches_string}" | jq -r '.[] | .name')
  # match whole branch names so that e.g. "maintenance" does not count as "main"
  if echo "${branches}" | grep -qx "main"; then
    branch_name="main"
  elif echo "${branches}" | grep -qx "master"; then
    branch_name="master"
  else
    branch_name=$(echo "${branches_string}" | jq -r '.[0] | .name')
  fi
  echo "fetching from branch ${branch_name}"
fi
# if no output_path is given, use the repo name
if [ -z "$output_path" ]; then
  output_path="./${repo_name}"
fi
url="https://api.github.com/repos/${repo}/git/trees/${branch_name}?recursive=1"
# fetch repo data
full_tree_string=$(curl -s -H "Authorization: token ${GITHUB_TOKEN}" "${url}")
# get paths where type is not tree
paths=$(echo "${full_tree_string}" | jq -r '.tree[] | select(.type != "tree") | .path')
# if no paths found, exit
if [ -z "${paths}" ]; then
  echo "No files found in this repo, more info at ${url}"
  exit 1
fi
# filter out lines that don't start with nested_path and remove nested_path prefix
paths=$(echo "${paths}" | grep -E "^${nested_path}" | sed "s|^${nested_path}||")
number_of_paths=$(echo "${paths}" | wc -l | sed "s/^[ \t]*//")
echo "Found ${number_of_paths} files, fetching contents..."
mkdir -p "${output_path}/"
# refuse to overwrite files that already exist when redirecting with ">"
set -o noclobber
# fetch contents for each line in paths
for path in ${paths}; do
  echo "Fetching ${path}..."
  # use the selected branch rather than a hard-coded "master"
  url="https://raw.githubusercontent.com/${repo}/${branch_name}/${nested_path}${path}"
  path_without_filename=$(dirname "/${path}")
  full_path="${output_path}${path_without_filename}"
  mkdir -p "${full_path}/"
  # download and save file from url
  curl -s -H "Authorization: token ${GITHUB_TOKEN}" "${url}" > "${output_path}/${path}"
done
echo "All set! 🌈"
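The branch fallback described in the script's comments (use main, else master, else the first branch listed) can be tried offline with a small function. This is only a sketch: `pick_branch` is a hypothetical name that does not appear in the script, and it assumes the branch names arrive on stdin, one per line, as `jq -r '.[] | .name'` would print them.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): choose a branch
# from a newline-separated list of branch names read from stdin.
pick_branch() {
  local branches
  branches=$(cat)
  # -x matches whole lines only, so "maintenance" is not mistaken for "main"
  if printf '%s\n' "${branches}" | grep -qx "main"; then
    echo "main"
  elif printf '%s\n' "${branches}" | grep -qx "master"; then
    echo "master"
  else
    # fall back to the first branch listed
    printf '%s\n' "${branches}" | head -n 1
  fi
}
```

For example, `printf 'dev\nmain\n' | pick_branch` selects `main` even though `dev` is listed first.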
Also, I think you cannot download assets from private repositories via https://raw.githubusercontent.com. If you hit the Raw button on a file in a private repository, a ?token=... gets appended to the https://raw.githubusercontent.com/{repo}/{branch}/{path} URL. Without it you'll get a 404.
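To make the URL shape in this comment concrete, here is a minimal sketch that assembles a raw URL from its three parts. `raw_url` is a hypothetical helper, not something the script defines, and the file paths used with it below are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: build the raw.githubusercontent.com URL for one file.
# Arguments: <user/repo> <branch> <path inside the repo>
raw_url() {
  local repo="$1" branch="$2" path="$3"
  # drop a leading slash so the URL does not end up with a doubled "/"
  path="${path#/}"
  printf 'https://raw.githubusercontent.com/%s/%s/%s\n' "${repo}" "${branch}" "${path}"
}
```

Calling `raw_url wattenberger/kumiko master public/assets/a.png` prints the URL for that (hypothetical) file on the `master` branch.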
I tested it and it works with private repositories 🎉
ah thanks for the flag - updated the code so that it looks for a specified branch, or main, or master
works now 🎉
I also totally missed that I have to set the GITHUB_TOKEN value. If I leave the default value, it will not send an unauthenticated request; it will send invalid authentication.
And downloading from private repositories worked too, so please ignore my comment above. Very cool!
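One way to avoid sending the placeholder as a credential is to attach the header only when a real token is set. The sketch below assumes the placeholder is exactly `YOUR_TOKEN_HERE`, as in the script; `auth_header` and `github_curl` are hypothetical names, not part of the original.

```shell
#!/bin/bash
GITHUB_TOKEN="${GITHUB_TOKEN:-}"
PLACEHOLDER_TOKEN="YOUR_TOKEN_HERE"  # the default value used in the script

# Hypothetical helper: print the Authorization header only when a real token
# is set; print nothing for an empty or placeholder token.
auth_header() {
  if [ -n "${GITHUB_TOKEN}" ] && [ "${GITHUB_TOKEN}" != "${PLACEHOLDER_TOKEN}" ]; then
    printf 'Authorization: token %s' "${GITHUB_TOKEN}"
  fi
}

# Hypothetical wrapper: call curl with the header if there is one,
# otherwise send a plain unauthenticated request.
github_curl() {
  header=$(auth_header)
  if [ -n "${header}" ]; then
    curl -s -H "${header}" "$@"
  else
    curl -s "$@"
  fi
}
```

With this in place, `github_curl "https://api.github.com/repos/${repo}/branches"` would fall back to an unauthenticated request when the token was never filled in, which only works for public repos.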
Hi Amelia, I tried
but it only created the empty "data" folder; it didn't download the octokit.csv file. I also get a jq error. I also tried the same command from your tweet, but got the same result.
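The thread does not say what caused the jq error, so the following is only one thing to rule out: that `curl` or `jq` is missing from the machine. A minimal sketch of an up-front dependency check, with `require_cmd` and `check_dependencies` as hypothetical names:

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the named command is on PATH.
require_cmd() {
  if ! command -v "$1" > /dev/null 2>&1; then
    echo "Missing required command: $1" >&2
    return 1
  fi
}

# The script relies on curl and jq; report clearly if either is absent.
check_dependencies() {
  require_cmd curl && require_cmd jq
}
```

Running `check_dependencies || exit 1` near the top of the script would stop early with a readable message instead of failing partway through.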