@xiongnemo
Created March 24, 2021 04:05
GitLab scraper: clone all public/internal repositories for every user
#!/bin/bash
# bash ./gitlab_scraper.bash <your gitlab private token> <gitlab instance base>
# make sure you have jq and git installed
# use ssh_url_to_repo instead of http_url_to_repo to clone over SSH instead of HTTP(S)
PRIVATE_TOKEN=$1
GITLAB_BASE=$2
# GitLab does not have an endpoint that returns the total user count,
# so we have to manually tell the script how many pages of users (PAGE_COUNT) to crawl
# (see the early-stop sketch after the script for an alternative).
PAGE_COUNT=114514
for (( page=1; page<=PAGE_COUNT; page++ )); do
    for user in $(curl -s -H "PRIVATE-TOKEN: $PRIVATE_TOKEN" "$GITLAB_BASE/api/v4/users?page=$page&per_page=100" | jq -r '.[].username'); do
        mkdir -p "$user"
        # skip this user if the directory cannot be entered, so clones never land in the wrong place
        cd "$user" || continue
        # assume each user has fewer than 100 repositories
        for repo in $(curl -s -H "PRIVATE-TOKEN: $PRIVATE_TOKEN" "$GITLAB_BASE/api/v4/users/$user/projects?page=1&per_page=100" | jq -r '.[].http_url_to_repo'); do
            git clone "$repo"
        done
        cd ..
    done
done
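
For example, assuming the script is saved as gitlab_scraper.bash and you have a personal access token with API read access (the token and instance URL below are placeholders), an invocation might look like this; GITLAB_BASE should be the instance base URL without a trailing slash, since the script appends /api/v4/... to it:

bash ./gitlab_scraper.bash <your-private-token> https://gitlab.example.com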
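
If you would rather not guess PAGE_COUNT, one possible alternative (a sketch, not part of the original gist) is to keep requesting pages until the users endpoint returns an empty list. It assumes PRIVATE_TOKEN and GITLAB_BASE are set as in the script above; the inner cloning logic stays the same as in the main loop:

page=1
while :; do
    users=$(curl -s -H "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
        "$GITLAB_BASE/api/v4/users?page=$page&per_page=100" | jq -r '.[].username')
    # an empty page means there are no more users to list, so stop paging
    [ -z "$users" ] && break
    for user in $users; do
        echo "would clone repositories for $user here, as in the main loop"
    done
    page=$((page + 1))
done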