Clone all repositories of a GitHub user
curl -s https://api.github.com/users/milanboers/repos | grep \"clone_url\" | awk '{print $2}' | sed -e 's/"//g' -e 's/,//g' | xargs -n1 git clone
In my case the API returns 30 entries per page, not 100. Either way, the following solution is independent of the page size:
gubAll() {
    page=1
    repos=0
    # Fetch the first page of the user's repositories and keep only the clone_url lines.
    entries=$(curl -s "https://api.github.com/users/$1/repos?page=$page" | grep \"clone_url\")
    while [[ -n $entries ]]
    do
        repos=$(( repos + $(echo "$entries" | wc -l) ))
        # Extract the URL value, strip quotes and trailing commas, and clone each repo.
        echo "$entries" | awk '{print $2}' | sed -e 's/"//g' -e 's/,//g' | xargs -n1 git clone
        page=$(( page + 1 ))
        entries=$(curl -s "https://api.github.com/users/$1/repos?page=$page" | grep \"clone_url\")
    done
    echo "Pages: $(( page - 1 )), repos: $repos."
}
Then:
$ gubAll milanboers
will grab all 32 repos (at the time of writing) from milanboers.
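The GitHub API also accepts a per_page parameter (maximum 100), so a variant of the same loop needs far fewer requests. A rough sketch along the same lines; the gubAll100 name is just for illustration:

# Same loop as gubAll, but requesting 100 repos per page (the API maximum)
# so large accounts need far fewer requests.
gubAll100() {
    page=1
    entries=$(curl -s "https://api.github.com/users/$1/repos?per_page=100&page=$page" | grep \"clone_url\")
    while [[ -n $entries ]]
    do
        echo "$entries" | awk '{print $2}' | sed -e 's/"//g' -e 's/,//g' | xargs -n1 git clone
        page=$(( page + 1 ))
        entries=$(curl -s "https://api.github.com/users/$1/repos?per_page=100&page=$page" | grep \"clone_url\")
    done
}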
This downloads as many repos as possible from an organization simultaneously, using GNU parallel:
ORGANIZATION=myorg
NUM_REPOS=2000
gh repo list "$ORGANIZATION" -L "$NUM_REPOS" --json name |
    jq -r --arg org "$ORGANIZATION" '$org + "/" + .[].name' |
    parallel gh repo clone
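If cloning everything at once hits API rate limits or saturates the network, GNU parallel's -j option caps the number of concurrent jobs. A sketch of the same pipeline limited to 8 clones at a time (the limit of 8 is an arbitrary choice):

gh repo list "$ORGANIZATION" -L "$NUM_REPOS" --json name |
    jq -r --arg org "$ORGANIZATION" '$org + "/" + .[].name' |
    parallel -j 8 gh repo clone    # -j 8: run at most 8 clones concurrently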
((page_count = public_repos / 100 + 1))
+10 for this smart move: fetching the repository count from the API and then using it to drive the pagination expands the crawl limits.
I'll try your tool at some point; right now I have no task that needs it, but still, good job! 👍
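For reference, a rough sketch of that idea, assuming jq is available; the user variable name is just for illustration:

user=milanboers
# The profile endpoint reports the total number of public repositories.
public_repos=$(curl -s "https://api.github.com/users/$user" | jq .public_repos)
# With 100 repos per page, this is how many pages need to be fetched.
((page_count = public_repos / 100 + 1))
echo "Pages to fetch: $page_count"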