fkraeutli/downloadGitLfsFiles.md

Last active July 29, 2025 10:59

How to retrieve GIT LFS files from GitHub

Retrieving non-LFS files

The GitHub API lets you retrieve individual files from a Git repository with a tool such as curl. To do so, first retrieve the content information for the relevant file (or folder):

curl https://api.github.com/repos/{organisation}/{repository}/contents/{file or folder path}

For private repositories, authenticate using your username and a personal access token:

curl -u {username}:{personal access token} https://api.github.com/repos/{organisation}/{repository}/contents/{file or folder path}

This will return a JSON response:

{
  "name": "README.md",
  "path": "README.md",
  "sha": "41553899f901843f5339794256s2444ed351708a",
  "size": 815,
  "url": "https://api.github.com/repos/{organisation}/{repository}/contents/README.md?ref=main",
  "html_url": "https://github.com/{organisation}/{repository}/blob/main/README.md",
  "git_url": "https://api.github.com/repos/{organisation}/{repository}/git/blobs/41553899f901843f5339794256s2444ed351708a",
  "download_url": "https://raw.githubusercontent.com/{organisation}/{repository}/main/README.md?token=AAL57UOYWVQ56ZZGDGWYUAK76WFNO",
  "type": "file",
  "_links": {
    "self": "https://api.github.com/repos/{organisation}/{repository}/contents/README.md?ref=main",
    "git": "https://api.github.com/repos/{organisation}/{repository}/git/blobs/41553899f901843f5339794256s2444ed351708a",
    "html": "https://github.com/{organisation}/{repository}/blob/main/README.md"
  }
}
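
The contents request above can also be issued from a script. The following is a minimal Python sketch and not part of the original gist: the function name `build_contents_request` and the example organisation, repository, and credential values are hypothetical. It only builds the request, so it can be inspected before anything is sent.

```python
import base64
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"


def build_contents_request(owner, repo, path, username=None, token=None):
    """Build (but do not send) a GET request for the contents endpoint."""
    # Percent-encode the path but keep "/" so nested folders still work.
    quoted_path = urllib.parse.quote(path, safe="/")
    url = f"{API_ROOT}/repos/{owner}/{repo}/contents/{quoted_path}"
    request = urllib.request.Request(url)
    if username and token:
        # Equivalent of `curl -u username:token`: HTTP Basic authentication.
        credentials = f"{username}:{token}".encode("utf-8")
        encoded = base64.b64encode(credentials).decode("ascii")
        request.add_header("Authorization", f"Basic {encoded}")
    return request


# Hypothetical public repository; nothing is sent over the network here.
request = build_contents_request("some-org", "some-repo", "README.md")
print(request.full_url)
```

To send the request, pass it to `urllib.request.urlopen` and parse the body with `json.load`.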

The file can then be downloaded using the sha (for private repositories, add -u {username}:{personal access token} as before):

curl https://api.github.com/repos/{organisation}/{repository}/git/blobs/41553899f901843f5339794256s2444ed351708a

This gives another JSON response with the file contents in base64 encoding:

{
  "sha": "41553899f901843f5339794256s2444ed351708a",
  "node_id": "{node id}",
  "size": 815,
  "url": "https://api.github.com/repos/{organisation}/{repository}/git/blobs/41553899f901843f5339794256s2444ed351708a",
  "content": "{base64 encoded content}",
  "encoding": "base64"
}

Note that for smaller files, the base64 encoded content will already be included in the first call.
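
Turning the base64 string back into the file's bytes might look like the sketch below. The helper name `decode_blob_content` and the example payload are illustrative assumptions. The sketch relies only on the `content` and `encoding` fields shown in the response above.

```python
import base64


def decode_blob_content(blob):
    """Return the raw bytes of a parsed git/blobs (or contents) response."""
    if blob.get("encoding") != "base64":
        raise ValueError(f"unexpected encoding: {blob.get('encoding')!r}")
    # The base64 text may be wrapped with newlines; b64decode skips them by default.
    return base64.b64decode(blob["content"])


# Made-up payload standing in for a real API response.
example_blob = {"content": "SGVsbG8s\nIHdvcmxkIQ==\n", "encoding": "base64"}
print(decode_blob_content(example_blob))
```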

Retrieving LFS files

Retrieving an LFS file requires a few extra steps. For LFS files, decoding the base64 string will not return the file's content, but information in the following format:

version https://git-lfs.github.com/spec/v1
oid sha256:{sha}
size {filesize}
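
Pulling the sha and filesize out of that text takes only a few lines. The sketch below assumes the three-line format shown above. The name `parse_lfs_pointer` and the sample oid are made up for illustration.

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its oid (hex sha256) and size in bytes."""
    fields = {}
    for line in text.splitlines():
        # Each line is "<key> <value>", e.g. "size 12345".
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid line looks like "sha256:<hex digest>"; a missing key raises KeyError.
    algorithm, _, oid = fields["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algorithm!r}")
    return {"oid": oid, "size": int(fields["size"])}


# Made-up pointer: the oid is 64 hex characters, the size is in bytes.
pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:" + "ab" * 32 + "\n"
    "size 12345\n"
)
print(parse_lfs_pointer(pointer))
```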

Using this information, create a JSON object as follows, filling in the sha and filesize from the previous step. Note that the size is a JSON number, not a string:

{
    "operation": "download",
    "transfer": ["basic"],
    "objects": [
        {"oid": "{sha}", "size": {size}}
    ]
}

Pass this object as the data parameter of a curl request to the LFS API:

curl -X POST \
-H "Accept: application/vnd.git-lfs+json" \
-H "Content-type: application/json" \
-d '{"operation": "download", "transfer": ["basic"], "objects": [{"oid": "{sha}", "size": {size}}]}' \
https://github.com/{organisation}/{repository}.git/info/lfs/objects/batch
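
The same batch request can be assembled in Python. This is a sketch under assumptions: `build_batch_request` is a hypothetical helper, the organisation and repository names are placeholders, and the headers simply mirror the curl command above. For private repositories an Authorization header would be needed as well, which is left out here.

```python
import json
import urllib.request


def build_batch_request(owner, repo, oid, size):
    """Build (but do not send) the LFS batch request for a single object."""
    payload = {
        "operation": "download",
        "transfer": ["basic"],
        # The size is sent as a JSON number, matching the curl example.
        "objects": [{"oid": oid, "size": int(size)}],
    }
    return urllib.request.Request(
        f"https://github.com/{owner}/{repo}.git/info/lfs/objects/batch",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/vnd.git-lfs+json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; nothing is sent over the network here.
batch_request = build_batch_request("some-org", "some-repo", "ab" * 32, 12345)
print(batch_request.get_method(), batch_request.full_url)
```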

Almost there! This should return a JSON object that tells you where the file is stored:

{
  "objects": [
    {
      "oid": "{sha}",
      "size": {size},
      "actions": {
        "download": {
          "href": "https://github-cloud.s3.amazonaws.com/alambic/media/278163869/a2/42/{sha}?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=XXX%2F20210106%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20210106T104409Z&X-Amz-Expires=3600&X-Amz-Signature=XXX&X-Amz-SignedHeaders=host&actor_id=XXX&key_id=0&repo_id=XXX&token=1",
          "expires_at": "2021-01-06T11:44:09Z",
          "expires_in": 3600
        }
      }
    }
  ]
}

Download the file from the URL stated in the href attribute.
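
As a rough sketch of this last step, the code below picks the href out of a parsed batch response and streams it to disk. Both function names are hypothetical. Treating an object that carries an `error` entry as missing is an assumption about how the server reports unknown objects.

```python
import shutil
import urllib.request


def download_url_from_batch(batch_response, oid):
    """Return the download href for `oid`, or None if it is not offered."""
    for obj in batch_response.get("objects", []):
        if obj.get("oid") != oid:
            continue
        if "error" in obj:
            # Assumed: the server attaches an error entry when it lacks the object.
            return None
        return obj.get("actions", {}).get("download", {}).get("href")
    return None


def download_to_file(href, destination):
    """Stream the object behind `href` to a local file."""
    with urllib.request.urlopen(href) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)


# Made-up response in the shape shown above.
sample = {
    "objects": [
        {
            "oid": "ab" * 32,
            "size": 12345,
            "actions": {"download": {"href": "https://example.com/object", "expires_in": 3600}},
        }
    ]
}
print(download_url_from_batch(sample, "ab" * 32))
```

The href is only valid for the period given in `expires_in`, so it is best fetched right away rather than stored.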

athletic-geek commented Feb 19, 2024

Hi @fkraeutli,
Thanks for sharing the example, but how does your method deal with the following special case?

Say the content of a small file looks like the following. How can I avoid it being mistakenly treated as an LFS blob?
I think there should be an API to determine whether a file is an ordinary blob or an LFS blob.

version https://git-lfs.github.com/spec/v1
oid sha256:{sha}
size {filesize}

nikelborm commented Jan 10, 2025

@athletic-geek
I don't know how relevant this is to you now, but it may also be helpful to others, so I'm posting it here.

In short: there won't be such an API.

Git LFS is a poorly implemented abstraction on top of the git we are used to. There are no differences between a real LFS object annotation inside a git blob (file) and a fake one. No flags are attached to git blobs that could help determine whether the content of a blob is a Git LFS annotation or not. The Git LFS client turns such blobs from annotations into real big files on the fly, based on .gitattributes files. The first problem arises from the fact that .gitattributes can be placed anywhere: deep inside the directory structure, in the repo's root, or even outside the repo, e.g. inside the user's home folder. There's no pure, independent, and definitive way to safely determine whether it's a Git LFS object or not.

You can delete .gitattributes files, commit the changes, clone from a specific branch into a temporary folder using the --branch flag added to the git clone command, and as a result see that all the files that previously were big LFS files are just small text files with Git LFS annotations. That's because they are and always have been.
You can also add some rules to handle LFS annotation files into the .gitattributes file in your home folder, and they'll be applied globally.
You can manually create a file with a valid Git LFS annotation text, commit it, and you shouldn't be surprised to see that GitHub's /contents API will say that the file's size is the size you had put into the manually created file. They couldn't have known if it was real and they won't.

So the only thing left is to check if the file is a valid Git LFS annotation or not and if it is, just assume it IS a file stored inside Git LFS. GitHub's API does it the same way. If for some reason it's extremely important to know for sure, you can also ask the LFS server and determine if it has an object with a specific sha256 hash.

If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.

Duck test

My good-enough marker of a file being a valid Git LFS annotation is whether it matches the following RegExp:

const gitLFSInfoRegexp = /^version (?<version>https:\/\/git-lfs\.github\.com\/spec\/v1)\noid sha256:(?<oidSha256>[0-9a-f]{64})\nsize (?<size>[1-9][0-9]{0,11})\n$/m
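
For anyone scripting this outside JavaScript, a Python port of the same check could look like the sketch below. It is an adaptation, not the commenter's code. The name `looks_like_lfs_pointer` is made up, and `fullmatch` is used so that the whole blob must be exactly the three annotation lines, which is a little stricter than the `m` flag in the original.

```python
import re

# Same three lines as the JavaScript pattern, anchored to the whole text.
LFS_POINTER_RE = re.compile(
    r"version (?P<version>https://git-lfs\.github\.com/spec/v1)\n"
    r"oid sha256:(?P<oid_sha256>[0-9a-f]{64})\n"
    r"size (?P<size>[1-9][0-9]{0,11})\n"
)


def looks_like_lfs_pointer(text):
    """Duck test: True if `text` is exactly a Git LFS annotation."""
    return LFS_POINTER_RE.fullmatch(text) is not None


# Made-up annotation with a 64-character lowercase hex oid.
valid = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:" + "0f" * 32 + "\n"
    "size 104857600\n"
)
print(looks_like_lfs_pointer(valid), looks_like_lfs_pointer("# README\n"))
```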

nikelborm commented Jan 10, 2025

Also, you don't have to make 2 requests to get a base64 encoded string with a Git LFS annotation. You can activate the object media type by adding the Accept: application/vnd.github.v3.object header, and you'll get it right inside the /contents API response.

You'll get a response like this one:

{
    name: '100mb_file.txt',
    path: '100mb_file.txt',
    sha: '7557bc11dbc04337d33e6cd7e6b9bfa2d2d00e2b',
    size: 104857600,
    url: 'https://api.github.com/repos/fetch-gh-folder-tests/public-repo/contents/100mb_file.txt?ref=0362e8aec37c9146e1f946b27d98043a823357b7',
    html_url: 'https://github.com/fetch-gh-folder-tests/public-repo/blob/0362e8aec37c9146e1f946b27d98043a823357b7/100mb_file.txt',
    git_url: 'https://api.github.com/repos/fetch-gh-folder-tests/public-repo/git/blobs/7557bc11dbc04337d33e6cd7e6b9bfa2d2d00e2b',
    download_url: 'https://media.githubusercontent.com/media/fetch-gh-folder-tests/public-repo/0362e8aec37c9146e1f946b27d98043a823357b7/100mb_file.txt',
    type: 'file',
    content: 'dmVyc2lvbiBodHRwczovL2dpdC1sZnMuZ2l0aHViLmNvbS9zcGVjL3YxCm9p\n' +
      'ZCBzaGEyNTY6Y2VlNDFlOThkMGE2YWQ2NWNjMGVjNzdhMmJhNTBiZjI2ZDY0\n' +
      'ZGM5MDA3ZjdmMWM3ZDdkZjY4YjhiNzEyOTFhNgpzaXplIDEwNDg1NzYwMAo=\n',
    encoding: 'base64',
    _links: {
      self: 'https://api.github.com/repos/fetch-gh-folder-tests/public-repo/contents/100mb_file.txt?ref=0362e8aec37c9146e1f946b27d98043a823357b7',
      git: 'https://api.github.com/repos/fetch-gh-folder-tests/public-repo/git/blobs/7557bc11dbc04337d33e6cd7e6b9bfa2d2d00e2b',
      html: 'https://github.com/fetch-gh-folder-tests/public-repo/blob/0362e8aec37c9146e1f946b27d98043a823357b7/100mb_file.txt'
    }
  }
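
A short Python sketch of this single-request variant follows. The function names are hypothetical. The `content` value is copied from the example response above. Only the Accept header differs from a plain contents request.

```python
import base64
import urllib.request


def build_object_request(contents_url):
    """Build (but do not send) a contents request using the object media type."""
    return urllib.request.Request(
        contents_url,
        headers={"Accept": "application/vnd.github.v3.object"},
    )


def pointer_text_from_contents(contents):
    """Decode the base64 `content` field of a parsed contents response to text."""
    return base64.b64decode(contents["content"]).decode("utf-8")


# The `content` field from the example response above.
example = {
    "content": (
        "dmVyc2lvbiBodHRwczovL2dpdC1sZnMuZ2l0aHViLmNvbS9zcGVjL3YxCm9p\n"
        "ZCBzaGEyNTY6Y2VlNDFlOThkMGE2YWQ2NWNjMGVjNzdhMmJhNTBiZjI2ZDY0\n"
        "ZGM5MDA3ZjdmMWM3ZDdkZjY4YjhiNzEyOTFhNgpzaXplIDEwNDg1NzYwMAo=\n"
    ),
    "encoding": "base64",
}
print(pointer_text_from_contents(example))
```

The decoded text is the Git LFS annotation itself, so its oid and size can go straight into the batch request.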
