When cloning a Git repository, you acquire all files contained within that repository. However, often we are only interested in a small subset of files and/or directories, thus having the entire repository can be overwhelming.
Nevertheless, since Git v1.7.0
, a feature called Sparse Checkout was introduced to address this issue. Spase checkout which change the working tree from having all tracked files present to only having a subset of those files. It can also switch which subset of files are present, or undo and go back to having all tracked files present in the working copy.
The steps to do a sparse clone are as follows:
mkdir <repo>
cd <repo>
git init
git remote add -f origin <url>
This creates an empty repository with your remote, and fetches all objects but doesn't check them out. Then do:
git config core.sparseCheckout true
Now you need to define which files/folders you want to actually check out. This is done by listing them in .git/info/sparse-checkout, eg:
echo "some/dir/" >> .git/info/sparse-checkout
echo "another/sub/tree" >> .git/info/sparse-checkout
Last but not least, update your empty repo with the state from the remote:
git pull origin master
You will now have files "checked out" for some/dir and another/sub/tree on your file system (with those paths still), and no other paths present.
You might want to have a look at the extended tutorial and you should probably read the official documentation for sparse checkout and read-tree.
As a shell function:
function git_sparse_clone() (
rurl="$1" localdir="$2" && shift 2
mkdir -p "$localdir"
cd "$localdir"
git init
git remote add -f origin "$rurl"
git config core.sparseCheckout true
# Loops over remaining args
for i; do
echo "$i" >> .git/info/sparse-checkout
done
git pull origin master
)