Steps to use the git-lfs Batch API to download content from GitHub (enterprise) when the raw content is actually on git-lfs
# github + git lfs api content extraction
# GOAL: pull a large file that was sent to git lfs (cannot download directly from github.com)
## git-lfs API: https://github.com/git-lfs/git-lfs/tree/main/docs/api
## github API: https://docs.github.com/en/rest/reference/repos#get-repository-content
OWNER=dantonnoriega
REPO=some-repo
BRANCH=develop
# -----------------
# (1) get the raw info about the object ("Accept: application/vnd.github.v3.raw")
# - we get back a response whose first line is "version https://git-lfs.github.com/spec/v1"
# - this tells us the file is a git-lfs pointer: the real data live in git-lfs (which makes them tricky to download)
# - note that if you try the `download_url` link from github, you'll get back nothing (because it's not on github)
# -----------------
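# - the contents endpoint takes the file path after /contents/; the branch goes in the `ref` query parameter
curl -s -u "$GITHUB_PIE_PAT:x-oauth-basic" -H "Accept: application/vnd.github.v3.raw" \
-X GET "https://github.com/api/v3/repos/$OWNER/$REPO/contents/data/aggregates.csv.gz?ref=$BRANCH"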
#> version https://git-lfs.github.com/spec/v1
#> oid sha256:29201c0ab4ba47be37ed5bf6b350XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
#> size 755
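# The oid and size printed above are needed again in step (3). A minimal sketch of
# pulling them out of the pointer text, assuming the three-line pointer format shown
# above. `parse_lfs_pointer` is a hypothetical helper name, not part of git-lfs.

```shell
# Hypothetical helper: read a git-lfs pointer on stdin and print "<oid> <size>".
# Exits non-zero if either field is missing (i.e. the input is not a pointer).
parse_lfs_pointer() {
  awk '
    $1 == "oid"  { sub(/^sha256:/, "", $2); oid = $2 }
    $1 == "size" { size = $2 }
    END { if (oid == "" || size == "") exit 1; print oid, size }
  '
}

# Example input copied from the response above (the X's are redactions).
POINTER='version https://git-lfs.github.com/spec/v1
oid sha256:29201c0ab4ba47be37ed5bf6b350XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
size 755'

# Split the "<oid> <size>" line into two shell variables.
read -r OID SIZE <<EOF
$(printf '%s\n' "$POINTER" | parse_lfs_pointer)
EOF
echo "oid=$OID size=$SIZE"
```

# In practice the curl command from step (1) could be piped straight into the helper.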
# -----------------
# (2) authenticate vs git-lfs to authorize a download
# - "AXXXXXXXXXXXXXXXXXXXXXXXXXXXX" is a stand in
# -----------------
ssh git@github.com git-lfs-authenticate $OWNER/$REPO.git download
#> {
#> "href": "https://github.com/_lfs/$OWNER/$REPO",
#> "header": {
#> "Authorization": "RemoteAuth AXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
#> },
#> "expires_at": "2021-02-11T23:51:25Z",
#> "expires_in": 599
#> }
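# The Authorization value above has to be copied into the request in step (3). A rough
# sketch of extracting it with sed, assuming the JSON shape shown above and that the
# token contains no escaped quotes. `lfs_auth_header` is a hypothetical helper name;
# if jq is installed, `jq -r '.header.Authorization'` would be the sturdier choice.

```shell
# Hypothetical helper: read the git-lfs-authenticate JSON on stdin and print the
# value of the "Authorization" header (e.g. "RemoteAuth A...").
lfs_auth_header() {
  sed -n 's/.*"Authorization"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example input shaped like the response above (token is a stand-in).
AUTH_JSON='{
  "href": "https://github.com/_lfs/OWNER/REPO",
  "header": {
    "Authorization": "RemoteAuth AXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
  },
  "expires_in": 599
}'

AUTH=$(printf '%s\n' "$AUTH_JSON" | lfs_auth_header)
echo "Authorization: $AUTH"
```

# The token is short-lived (see "expires_in" above), so step (3) should follow promptly.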
# -----------------
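# (3) POST the oid and size from step (1) to the git-lfs Batch API, using the RemoteAuth token from the git-lfs-authenticate step
# - the response contains the download href and a new token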
# -----------------
curl -s -X POST --url "https://github.com/_lfs/$OWNER/$REPO.git/info/lfs/objects/batch" \
-H "Authorization: RemoteAuth AXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
-H "Accept: application/vnd.git-lfs+json" \
-H "Content-Type: application/vnd.git-lfs+json" \
--data '{
"operation": "download",
"transfer": ["basic"],
"objects": [
{
"oid": "29201c0ab4ba47be37ed5bf6b350XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
"size": 755
}
]}'
#> {
#> "objects": [
#> {
#> "oid": "29201c0ab4ba47be37ed5bf6b350XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
#> "size": 755,
#> "actions": {
#> "download": {
#> "href": "https://media.github.com/lfs/10943703/objects/29201c0ab4ba47be37ed5bf6b350XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
#> "header": {
#> "Authorization": "RemoteAuth AYYYYYYYYYYYYYYYYYYYYYYYYYYYY"
#> }
#> }
#> }
#> }
#> ]
#> }
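# The request body and the href lookup above can be scripted rather than pasted by hand.
# A minimal sketch, assuming the oid is plain hex and the size is an integer, so no JSON
# escaping is needed. `lfs_batch_body` and `lfs_download_href` are hypothetical helper names.

```shell
# Hypothetical helper: print a Batch API "download" request body.
# $1 = oid (without the "sha256:" prefix), $2 = size in bytes
lfs_batch_body() {
  printf '{"operation":"download","transfer":["basic"],"objects":[{"oid":"%s","size":%s}]}\n' "$1" "$2"
}

# Hypothetical helper: read a Batch API response on stdin and print the "href" value.
# Assumes a single object in the response, as in the example above.
lfs_download_href() {
  sed -n 's/.*"href"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example values copied from step (1) (the X's are redactions).
OID=29201c0ab4ba47be37ed5bf6b350XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
SIZE=755
lfs_batch_body "$OID" "$SIZE"
```

# The printed body can be passed to the curl call above via --data "$(lfs_batch_body "$OID" "$SIZE")".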
# -----------------
# (4) download the object using the href and the new token (e.g. AYYYYYYYYYYYYYYYYYYYYYYYYYYYY) from step (3)
# - pass --output so curl writes the content to a file
# -----------------
curl -H "Authorization: RemoteAuth AYYYYYYYYYYYYYYYYYYYYYYYYYYYY" \
--url "https://media.github.com/lfs/10943703/objects/29201c0ab4ba47be37ed5bf6b350XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
--output aggregates.csv.gz
# I chose the output file name myself; the size matches the 755 bytes reported in step (1)
# Downloads ❯ stat -l aggregates.csv.gz
#> -rw-r--r-- 1 danton staff 755 Feb 11 15:50:49 2021 aggregates.csv.gz
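# Size alone is a weak check. Since the oid in the pointer is the SHA-256 of the content,
# the download can be verified against it as well. A sketch, assuming either `sha256sum`
# or `shasum` is on the PATH. `sha256_of` and `verify_lfs_object` are hypothetical helper names.

```shell
# Hypothetical helper: print the SHA-256 hex digest of a file, using whichever
# of sha256sum (common on Linux) or shasum (common on macOS) is available.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{ print $1 }'
  else
    shasum -a 256 "$1" | awk '{ print $1 }'
  fi
}

# Hypothetical helper: succeed only if the file matches both the expected oid and size.
# $1 = file, $2 = expected oid (hex, no "sha256:" prefix), $3 = expected size in bytes
verify_lfs_object() {
  actual_size=$(wc -c < "$1" | tr -d ' ')
  actual_oid=$(sha256_of "$1")
  [ "$actual_size" = "$3" ] && [ "$actual_oid" = "$2" ]
}

# Usage, with OID and SIZE taken from the pointer in step (1):
#   verify_lfs_object aggregates.csv.gz "$OID" "$SIZE" && echo "download verified"
```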