#!/bin/bash
# Usage: clone_all_repos.sh <organization> [output directory]
ORG=$1
PER_PAGE=100
GIT_OUTPUT_DIRECTORY=${2:-"/tmp/${ORG}_repos"}
# Both the token and the organization are required.
if [ -z "$GITHUB_TOKEN" ]; then
    echo -e "Variable GITHUB_TOKEN isn't set! Please specify your GitHub token.\n\nMore info: https://help.github.com/articles/creating-a-personal-access-token-for-the-command-line/"
    exit 1
fi
if [ -z "$ORG" ]; then
    echo "Variable ORG isn't set! Please specify the GitHub organization."
    exit 1
fi
mkdir -p "$GIT_OUTPUT_DIRECTORY"
echo "Cloning repos in $ORG to $GIT_OUTPUT_DIRECTORY/..."
# Fetch one page of repo names at a time and clone each repo on it.
for ((PAGE=1; ; PAGE+=1)); do
    REPO_COUNT=0
    ERROR=0
    while read -r REPO_NAME ; do
        ((REPO_COUNT++))
        echo -n "Cloning $REPO_NAME to $GIT_OUTPUT_DIRECTORY/$REPO_NAME... "
        git clone "https://github.com/$ORG/$REPO_NAME.git" "$GIT_OUTPUT_DIRECTORY/$REPO_NAME" >/dev/null 2>&1 ||
            { echo -e "ERROR: Unable to clone!" ; ERROR=1 ; continue ; }
        echo "done"
    done < <(curl -u ":$GITHUB_TOKEN" -s "https://api.github.com/orgs/$ORG/repos?per_page=$PER_PAGE&page=$PAGE" | jq -r ".[]|.name")
    # Halt if any clone on this page failed; finish once a page is not full.
    if [ $ERROR -eq 1 ] ; then exit 1 ; fi
    if [ $REPO_COUNT -ne $PER_PAGE ] ; then exit 0 ; fi
done
@hotelzululima Hello! Sorry this didn't work with user repos. When I changed L26 to the following, it cloned your repos as expected.
done < <(curl -u :$GITHUB_TOKEN -s "https://api.github.com/users/$ORG/repos?per_page=$PER_PAGE&page=$PAGE" | jq -r ".[]|.name")
(I changed https://api.github.com/orgs/ -> https://api.github.com/users/)
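One way to avoid editing the script for each account type is to build the URL from a parameter. This is only a sketch: repos_url is a hypothetical helper that is not part of the gist, and it assumes the only two account kinds of interest are orgs and users.

```shell
# Hypothetical helper, not part of the original script.
# $1 = account kind ("orgs" or "users"), $2 = account name,
# $3 = results per page, $4 = page number
repos_url() {
    case "$1" in
        orgs|users) ;;
        *) echo "repos_url: kind must be 'orgs' or 'users'" >&2; return 1 ;;
    esac
    printf 'https://api.github.com/%s/%s/repos?per_page=%s&page=%s\n' "$1" "$2" "$3" "$4"
}
```

The curl call in the script could then take "$(repos_url users "$ORG" "$PER_PAGE" "$PAGE")" as its URL argument.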
$ ./clone_all_repos.sh hotelzululima archives
Cloning repos in hotelzululima to archives/...
starting to clone 0bin to archives/0bin... done
starting to clone 0x00sec_code to archives/0x00sec_code... done
starting to clone 3D-Printing to archives/3D-Printing... done
starting to clone 3DRSoloHacks to archives/3DRSoloHacks... done
starting to clone 3DRSoloHacks-1 to archives/3DRSoloHacks-1... done
starting to clone 3snake to archives/3snake... done
starting to clone a2sv to archives/a2sv... done
...
and it stopped at 300 repos... :/ am I running into another API limitation?
hzl
@hotelzululima Sorry about that! It's working on my machine (if I had a dollar for every time I said that...).
$ ./clone_all_repos.sh hotelzululima archives
Cloning repos in hotelzululima to archives/...
Cloning 0bin to archives/... done
Cloning 0x00sec_code to archives/... done
Cloning 3D-Printing to archives/... done
...
Cloning echo-dot to archives/... done
Cloning ecu-tool to archives/... done
Cloning eda2 to archives/... done
The script has cloned 320 repos so far (eda2 is #320), so I stopped the script's execution here.
Since you said it stopped at 300 repos, it must have had trouble fetching the next page for some reason. Try replacing the main while loop with this, which prints each repo's name rather than running git clone on it, a costly operation:
while read REPO_NAME ; do
((REPO_COUNT++))
echo "$REPO_COUNT: $REPO_NAME"
done < <(curl -u :$GITHUB_TOKEN -s "https://api.github.com/users/$ORG/repos?per_page=$PER_PAGE&page=$PAGE" | jq -r ".[]|.name")
Execution should look like this:
$ ./clone_all_repos.sh hotelzululima archives
Cloning repos in hotelzululima to archives/...
1: 0bin
2: 0x00sec_code
...
20: eda2
$REPO_COUNT resets after each page, which is why we see the repo count go from 1 to 100 and then loop back to 1.
One thing that comes to mind is that your GitHub token faced some sort of rate limiting. You can run this curl command to see if that's the case:
$ curl -H "Authorization: token $GITHUB_TOKEN" https://api.github.com/rate_limit
On 1/28/20 12:30 PM, Christopher Rung wrote:
> curl -H "Authorization: token $GITHUB_TOKEN" https://api.github.com/rate_limit
curl -H "Authorization: token $GITHUB_TOKEN" https://api.github.com/rate_limit
{
"resources": {
"core": {
"limit": 5000,
"remaining": 5000,
"reset": 1580248075
},
"search": {
"limit": 30,
"remaining": 30,
"reset": 1580244535
},
"graphql": {
"limit": 5000,
"remaining": 5000,
"reset": 1580248075
},
"integration_manifest": {
"limit": 5000,
"remaining": 5000,
"reset": 1580248075
},
"source_import": {
"limit": 100,
"remaining": 100,
"reset": 1580244535
}
},
"rate": {
"limit": 5000,
"remaining": 5000,
"reset": 1580248075
}
}
Doesn't look like you've been rate limited (remaining == limit above).
Did you try simply echoing the repos rather than cloning them? Maybe try the modified while loop I posted earlier.
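For a quicker read of that response, the remaining/limit pair can be pulled out with jq. A minimal sketch, assuming the response has the resources.core shape shown in the output above; rate_summary is a hypothetical name.

```shell
# Hypothetical helper: reads a /rate_limit JSON response on stdin and
# prints "remaining/limit" for the core API, e.g. 5000/5000.
rate_summary() {
    jq -r '.resources.core | "\(.remaining)/\(.limit)"'
}

# Intended use (makes a network request, so it is left commented out):
# curl -s -H "Authorization: token $GITHUB_TOKEN" https://api.github.com/rate_limit | rate_summary
```

If the two numbers match, no requests have been counted against the token in the current window.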
echoing the repos made it all the way through..
cloning them only gets to 100 repos...
sigh
hzl
ps sorry about the disconnect earlier.. had to rescue a clients network..
turns out when running bash -vx the control flow differed between the count version and the clone version.. and the issue was
{ echo -e "ERROR: Unable to clone!" ; ERROR=1 ; continue ; }
combined with:
if [ $ERROR -eq 1 ] ; then exit 1 ; fi
ended the clone op early.. by changing the line to read :
{ echo -e "ERROR: Unable to clone!" ; ERROR=2 ; continue ; }
and depending on:
if [ $REPO_COUNT -ne $PER_PAGE ] ; then exit 0 ; fi
to break out of the git clone loop the clone all repos succeeds(least its still running...)
hzl
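An alternative to changing the ERROR value is to record which repos failed and keep going, then report them at the end. This is a sketch of that idea, not what the gist does: clone_one, clone_all, FAILED, and CLONE_FN are all hypothetical names.

```shell
FAILED=""   # space-separated names of repos that could not be cloned

# Default cloner; mirrors the git clone call in the script.
# $1 = org, $2 = repo name, $3 = output directory
clone_one() {
    git clone "https://github.com/$1/$2.git" "$3/$2" </dev/null >/dev/null 2>&1
}

# Reads repo names from stdin, one per line, and tries to clone each.
# $1 = org, $2 = output directory
# CLONE_FN is a hypothetical override hook, handy for dry runs.
clone_all() {
    while read -r name; do
        if "${CLONE_FN:-clone_one}" "$1" "$name" "$2"; then
            echo "Cloning $name... done"
        else
            echo "Cloning $name... ERROR: Unable to clone!"
            # Remember the failure instead of aborting the page.
            FAILED="${FAILED:+$FAILED }$name"
        fi
    done
}
```

Feed the names in with a redirect rather than a pipe, because a pipe runs the loop in a subshell and FAILED would be empty afterwards. Once all pages are done, a non-empty FAILED can drive the exit status.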
thanx for helping me to safeguard my "bookmarks" ..(my precious)
now since I omitted doing a git remote origin on most of these I am wondering on how to scrape it out of github(have to start reading
that api documentation next :) and add it to all the repos being git cloned..
hzl
This script is awesome. Do you guys mind if I redistribute the gist as part of a repo for bulk cloning git repos (under MIT license)?
have you tried something like this
curl -s https://api.github.com/orgs/blockchain-etl/repos | jq -r ".[].clone_url" | xargs -L1 git clone
as you already have jq installed
> Do you guys mind if I redistribute the gist as part of a repo for bulk cloning git repos (under MIT license)?
Hello @aselunar, sorry for the delay and I appreciate the compliment! Yes, that is fine with me - go ahead and use this as you wish.
@banerRana, yes, that will work for any org whose repos fit on a single page (the API returns 30 per page by default, and at most 100 with per_page). This script handles pagination.
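The pagination mentioned above can be sketched on its own, separate from the cloning. This is a minimal sketch under a few assumptions: fetch_page and list_all_repos are hypothetical names, fetch_page makes the same request as the script, and a page with fewer than PER_PAGE names (including an empty one) is treated as the last.

```shell
PER_PAGE=${PER_PAGE:-100}

# Prints one repo name per line for page $1 (same request as the script).
fetch_page() {
    curl -u ":$GITHUB_TOKEN" -s \
        "https://api.github.com/orgs/$ORG/repos?per_page=$PER_PAGE&page=$1" |
        jq -r '.[] | .name'
}

# Hypothetical helper: prints every repo name across all pages.
list_all_repos() {
    page=1
    while :; do
        names=$(fetch_page "$page")
        # Count the non-empty lines on this page (0 for an empty page).
        count=$(printf '%s\n' "$names" | grep -c .)
        [ -n "$names" ] && printf '%s\n' "$names"
        # A short or empty page means there is nothing left to fetch.
        [ "$count" -lt "$PER_PAGE" ] && break
        page=$((page + 1))
    done
}
```

The output can be piped to xargs in the same way as the one-liner, or read by a while loop as in the script.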
This script clones all repos in an organization. It iterates through each page of the repository list, parsing each repository's name and feeding it into a loop that clones the repo.
The script finishes successfully when the number of repos on the current page does not match the PER_PAGE variable, since this implies there were fewer than the maximum number of repos left to show. A consequence of this is that if the total number of repos in your organization is evenly divisible by PER_PAGE, the script requests one extra page; that page should come back empty, so the count is 0 and the script exits.
If there is an issue cloning, the script will output ERROR: Unable to clone! next to the problematic repo, and the script will halt before the next page of repos is fetched.
Arguments
organization (required): the GitHub organization whose repos are cloned
output directory (optional): where the repos are cloned; defaults to /tmp/${ORG}_repos
Dependencies
$GITHUB_TOKEN env var set
jq (brew install jq)
To clone repos in a user account, please change https://api.github.com/orgs/ to https://api.github.com/users/ (thanks, @hotelzululima!)
TODO