@petenelson
Last active January 22, 2023 19:57
Bash script: Backup GitHub Repos to S3
#!/bin/bash

DATE=$(date "+%Y-%m-%d")
GITHUB_OWNER=petenelson
BACKUPS_DIR=~/backups
TAR_FILE=github-repos-$DATE.tar.gz
S3_BUCKET=s3://github-offsite-backup

# Creates a directory if it doesn't exist
# $1: dir path
create_dir() {
    if [ ! -d "$1" ]; then
        mkdir "$1"
    fi
}

# Clones a repo locally
# $1: git repo name
# $2: owner name
# $3: backups dir
clone_repo() {
    # Change to backups dir
    cd "$3"
    # Remove any existing copy of the repo dir
    rm -rf "$1"
    # Clone the repo
    git clone "https://github.com/$2/$1.git"
    # Change to the repo dir
    cd "$1"
    # Fetch all branches
    git fetch origin
}

# Gets a list of the owner's GitHub repos that are not forks and clones them
# $1: owner name
# $2: backups dir
clone_owners_repos() {
    # Get an array of repo names
    REPOS=( $( curl -s https://api.github.com/users/$1/repos | jq -r '.[] | select( .fork == false ) | .name' ) )
    # Loop through each repo name and clone it locally
    for i in "${REPOS[@]}"
    do
        clone_repo "$i" "$1" "$2" # repo name, owner name, backups dir
    done
}

# Creates an archive of all the repos
# $1: backups dir
# $2: TAR file name
create_tarchive() {
    cd "$1"
    touch "$2"
    tar czvf "$2" .
}

# Sticks it up in S3
# $1: backups dir
# $2: file name
# $3: S3 bucket
copy_file_to_s3() {
    cd "$1"
    /usr/local/bin/aws s3 cp "$2" "$3"
}

# Removes the local archive files
# $1: backups dir
cleanup() {
    cd "$1"
    rm *.gz
}

# Run all the commands
create_dir "$BACKUPS_DIR"
clone_owners_repos "$GITHUB_OWNER" "$BACKUPS_DIR"
create_tarchive "$BACKUPS_DIR" "$TAR_FILE"
copy_file_to_s3 "$BACKUPS_DIR" "$TAR_FILE" "$S3_BUCKET"
cleanup "$BACKUPS_DIR"
@bswatson commented Feb 2, 2017

Confirm you are in the proper directory before running this rm -rf. If the cd ~/backups fails for any reason, this will execute in the cwd instead.
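
A minimal sketch of that guard inside clone_repo, so the rm -rf never runs in the wrong directory:

    # Bail out of the function if the cd fails
    cd "$3" || return 1
    rm -rf "$1"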

@bswatson commented Feb 2, 2017

Might make sense to add this as a variable and check for its existence as a precursor to executing the script.
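
Assuming the comment refers to the backups directory discussed above, a minimal sketch of that pre-check (the error message wording is illustrative):

    BACKUPS_DIR=~/backups
    # Stop before doing anything if the backups directory is missing
    if [ ! -d "$BACKUPS_DIR" ]; then
        echo "Backups directory $BACKUPS_DIR does not exist" >&2
        exit 1
    fi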

@bswatson commented Feb 2, 2017

It's easier to read all of the variables in the script if they are declared at the very top.

@bswatson commented Feb 2, 2017

General comment: break each distinct task out into its own function, then run all of the functions at the very end. You could even chain them as func1 && func2 && func3, which requires each function to return success before the next one runs.

One step further would be func1 && func2 && func3 || cleanup_failed_run, which would let you clean up directories, files, etc. in case one of the chained functions fails.
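
Applied to this script, the chained run might look like the following sketch (cleanup_failed_run is the hypothetical failure handler from the comment):

    # Each step runs only if the previous one succeeded;
    # cleanup_failed_run handles a failure anywhere in the chain.
    create_dir "$BACKUPS_DIR" \
        && clone_owners_repos "$GITHUB_OWNER" "$BACKUPS_DIR" \
        && create_tarchive "$BACKUPS_DIR" "$TAR_FILE" \
        && copy_file_to_s3 "$BACKUPS_DIR" "$TAR_FILE" "$S3_BUCKET" \
        || cleanup_failed_run "$BACKUPS_DIR"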

@lkwdwrd commented Feb 2, 2017

I would also write this script to set OWNER to $1, basically a parameter passed to the script. Makes it configuration-less. Probably want to check the first passed arg $1 exists before using it. For your own convenience you can then create an alias for yourself which runs the script passing in your name as the first arg.

.bashrc

alias github_bu='cd /dir/to/standard/location/ && ./backup_github_to_s3.sh petenelson'

Then, anywhere in the terminal:

$ github_bu
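
Inside the script itself, the argument check lkwdwrd mentions might look like this minimal sketch (the usage message is illustrative):

    # Refuse to run without an owner argument
    if [ -z "$1" ]; then
        echo "Usage: $0 <github-owner>" >&2
        exit 1
    fi
    GITHUB_OWNER=$1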

@petenelson (Author) commented
Thanks @bswatson @lkwdwrd, going to do some refactoring. FYI, I have this in my user's bin folder, which is in my PATH, so there's no need to alias it.
