@thibaudcolas
Last active June 10, 2024 08:47
Local code search across repositories

Local code search

Sample scripts to fetch recently used repositories in bulk from GitHub and GitLab, with basic filtering of specific repositories and of specific files within repositories.

Search

Search your local copy of the code with ag (The Silver Searcher), ripgrep, or your IDE.

jquery.floatThead.js
*.snap
*.map
CHANGELOG.md
README.md
.git
*.less
*.min.js
*.*.min.js
*.*.*.min.js
*.*.*.*.min.js
*.*.*.*.*.min.js
*.min.css
*.*.min.css
*.*.*.min.css
*.*.*.*.min.css
*.*.*.*.*.min.css
*.csv
*.bundle.min.js
bootstrap.js
bootstrap.css
bootstrap.custom.css
bootstrap.less
**/bootstrap/**
bootstrap.scss
bootstrap.bundle.min.js
*.bundle.js
*.*.bundle.js
modernizr.js
app.js
jquery.js
draftail.js
tiny_mce.js
select2.js
handsontable-6.2.2.full.min.js
package-lock.json
**/fixtures/**
**/migrations/**
**/south_migrations/**
**/dist/**
**/vendor/**
yarn.lock
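The pattern list above can be handed to ripgrep through its `--ignore-file` flag. The following is a minimal sketch, not the author's setup: it assumes ripgrep (`rg`) is installed, that the list is saved as `code-search.ignore` (a hypothetical filename), and `search_code` is an illustrative helper name.

```shell
#!/usr/bin/env bash
# Hypothetical location of the ignore patterns listed above.
IGNORE_FILE="${IGNORE_FILE:-./code-search.ignore}"

# search_code PATTERN [PATH...]
# Searches the given paths (or the current directory) with ripgrep,
# skipping every file that matches a pattern in $IGNORE_FILE.
search_code() {
  local pattern="$1"
  shift
  rg --ignore-file "$IGNORE_FILE" --line-number -- "$pattern" "$@"
}
```

For example, run `search_code 'StreamField' wagtail gitlab` from the folder that contains the cloned repositories. ripgrep matches the patterns in the ignore file relative to the current working directory.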
# https://docs.gitlab.com/ee/api/README.html#personalproject-access-tokens
# Create a personal access token with scope "API" at:
# https://<gitlab host>/profile/personal_access_tokens.
export GITLAB_PRIVATE_TOKEN=todo
# Define which patterns you’d like to ignore when fetching repositories from GitLab.
export GITLAB_IGNORE_REPOSITORIES='(pattern-to-ignore|another-pattern-to-ignore)'
# Define which patterns you’d like to ignore when fetching repositories from wagtail on GitHub.
export GITHUB_IGNORE_REPOSITORIES='(pattern-to-ignore|another-pattern-to-ignore)'
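Because the fetch scripts depend on these variables, it can help to check them before running anything. A minimal sketch, assuming the exports above have already been loaded into the shell; `require_env` is a hypothetical helper, and treating `todo` as unset is an assumption based on the placeholder value above.

```shell
#!/usr/bin/env bash
# require_env NAME...
# Returns non-zero if any named variable is unset, empty, or still "todo".
require_env() {
  local name
  for name in "$@"; do
    # ${!name:-} is bash indirect expansion: the value of the variable named by $name.
    if [ -z "${!name:-}" ] || [ "${!name}" = "todo" ]; then
      echo "Missing or placeholder value for $name" >&2
      return 1
    fi
  done
}

# Usage:
# require_env GITLAB_PRIVATE_TOKEN GITLAB_IGNORE_REPOSITORIES GITHUB_IGNORE_REPOSITORIES
```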
#!/usr/bin/env bash
# See https://sipb.mit.edu/doc/safe-shell/.
set -euf -o pipefail
# Fetch most active public repositories from github/wagtail.
# The URL is quoted so the shell does not treat "&" as a background operator.
curl 'https://api.github.com/orgs/wagtail/repos?per_page=50&sort=updated' > github-public-latest.json
# Convert into intermediary list of what to clone.
gron github-public-latest.json | grep -F '.git_url' | cut -d '"' -f 2 | grep -E -v "$GITHUB_IGNORE_REPOSITORIES" > github-to-fetch.txt
# Create clone script for final check.
# sed strips only the trailing ".git", so repository names containing dots stay intact.
cat github-to-fetch.txt | cut -d '/' -f 5 | sed 's/\.git$//' | awk -F ':' '{ print "git clone --depth 1 git@github.com:wagtail/"$1".git", "wagtail/"$1 }' > github-clone.sh
# Clone.
bash github-clone.sh
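The name-extraction step can also be written with bash parameter expansion instead of `cut`. A minimal sketch, assuming each input line is a `git_url` value such as `git://github.com/wagtail/wagtail.git`; `clone_command` is a hypothetical helper, not part of the original scripts.

```shell
#!/usr/bin/env bash
# clone_command GIT_URL
# Prints the shallow-clone command for one repository of the wagtail organisation.
clone_command() {
  local url="$1"
  local name="${url##*/}"   # keep the last path segment, e.g. "wagtail.git"
  name="${name%.git}"       # drop only the trailing ".git", keeping any other dots
  printf 'git clone --depth 1 git@github.com:wagtail/%s.git wagtail/%s\n' "$name" "$name"
}

clone_command 'git://github.com/wagtail/wagtail.git'
# prints: git clone --depth 1 git@github.com:wagtail/wagtail.git wagtail/wagtail
```

Removing only the `.git` suffix means a repository with a dot in its name (say, a hypothetical `my.repo`) keeps its full name.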
#!/usr/bin/env bash
# See https://sipb.mit.edu/doc/safe-shell/.
set -euf -o pipefail
# Fetch most active repositories from our GitLab.
# The URL is quoted so the shell does not interpret "&", "<", or ">".
curl --header "Private-Token: $GITLAB_PRIVATE_TOKEN" 'https://<gitlab host>/api/v4/projects?simple=true&per_page=100&order_by=last_activity_at' > gitlab-latest.json
# Convert into intermediary list of what to clone.
cat gitlab-latest.json | jq '.[] | .namespace.kind + "|" + .ssh_url_to_repo' | grep -v 'user|' | cut -d '"' -f 2 | cut -d '|' -f 2 | grep -E -v "$GITLAB_IGNORE_REPOSITORIES" > gitlab-to-fetch.txt
# Create folder structure.
cat gitlab-to-fetch.txt | cut -d ':' -f 2 | rev | cut -d '/' -f 2- | rev | xargs -I {} mkdir -p gitlab/{}
# Create clone script for final check.
cat gitlab-to-fetch.txt | rev | cut -d '.' -f 2- | rev | awk -F ':' '{ print "git clone --depth 1 "$1":"$2".git", "gitlab/"$2 }' > gitlab-clone.txt
# Clone.
bash gitlab-clone.txt
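The folder and clone steps above both derive a local path from the SSH clone URL. A minimal sketch of that mapping in plain bash, assuming URLs of the form `git@host:group/project.git`; `local_path` is a hypothetical helper and the URL below is a made-up example.

```shell
#!/usr/bin/env bash
# local_path SSH_URL
# Prints the folder under gitlab/ that mirrors the project's namespace.
local_path() {
  local url="$1"
  local path="${url#*:}"   # drop everything up to the first colon ("git@host:")
  path="${path%.git}"      # drop the trailing ".git"
  printf 'gitlab/%s\n' "$path"
}

local_path 'git@gitlab.example.com:group/subgroup/project.git'
# prints: gitlab/group/subgroup/project
```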