Through the GitHub API, it is possible to retrieve individual files from a Git repository using, e.g., curl. To do so, first retrieve the content information for the relevant file (or folder):
curl https://api.github.com/repos/{organisation}/{repository}/contents/{file or folder path}
For private repositories, authenticate using your username and a personal access token:
curl -u {username}:{personal access token} https://api.github.com/repos/{organisation}/{repository}/contents/{file or folder path}
This will return a JSON response:
{
"name": "README.md",
"path": "README.md",
"sha": "41553899f901843f5339794256s2444ed351708a",
"size": 815,
"url": "https://api.github.com/repos/{organisation}/{repository}/contents/README.md?ref=main",
"html_url": "https://github.com/{organisation}/{repository}/blob/main/README.md",
"git_url": "https://api.github.com/repos/{organisation}/{repository}/git/blobs/41553899f901843f5339794256s2444ed351708a",
"download_url": "https://raw.githubusercontent.com/{organisation}/{repository}/main/README.md?token=AAL57UOYWVQ56ZZGDGWYUAK76WFNO",
"type": "file",
"_links": {
"self": "https://api.github.com/repos/{organisation}/{repository}/contents/README.md?ref=main",
"git": "https://api.github.com/repos/{organisation}/{repository}/git/blobs/41553899f901843f5339794256s2444ed351708a",
"html": "https://github.com/{organisation}/{repository}/blob/main/README.md"
}
}
The file can then be downloaded using the sha:
curl -u {username}:{personal access token} https://api.github.com/repos/{organisation}/{repository}/git/blobs/41553899f901843f5339794256s2444ed351708a
This gives another JSON response with the file contents in base64 encoding:
{
"sha": "41553899f901843f5339794256s2444ed351708a",
"node_id": "{node id}",
"size": 815,
"url": "https://api.github.com/repos/{organisation}/{repository}/git/blobs/41553899f901843f5339794256s2444ed351708a",
"content": "{base64 encoded content}",
"encoding": "base64"
}
Note that for smaller files, the base64 encoded content will already be included in the first call.
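Decoding the content field can be scripted. The following is a minimal sketch in Python, assuming the JSON response has been saved as text; the function name decode_blob_content is made up for this example and is not part of any GitHub client.

```python
import base64
import json


def decode_blob_content(response_text: str) -> bytes:
    """Return the raw bytes stored in the base64 'content' field of a blob response."""
    blob = json.loads(response_text)
    if blob.get("encoding") != "base64":
        raise ValueError(f"unexpected encoding: {blob.get('encoding')!r}")
    # The base64 text may contain line breaks; b64decode skips them by default.
    return base64.b64decode(blob["content"])


if __name__ == "__main__":
    # Hand-made sample response, not real API output.
    sample = json.dumps({"content": "aGVsbG8g\nd29ybGQ=\n", "encoding": "base64"})
    print(decode_blob_content(sample))
```

The result is returned as bytes rather than text, so binary files survive the round trip.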
Retrieving an LFS file requires a few extra steps. For LFS files, decoding the base64 string will not return the file's content, but information in the following format:
version https://git-lfs.github.com/spec/v1
oid sha256:{sha}
size {filesize}
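The three-line pointer format above can be read with a few lines of Python. This is a sketch only: parse_lfs_pointer is a hypothetical helper, and it checks nothing beyond the shape of the text, so an ordinary file that happens to contain the same three lines would also match.

```python
LFS_VERSION = "https://git-lfs.github.com/spec/v1"


def parse_lfs_pointer(text: str):
    """Return {'oid': ..., 'size': ...} if text has the pointer shape, else None."""
    fields = {}
    for line in text.splitlines():
        # Each pointer line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    if fields.get("version") != LFS_VERSION:
        return None
    oid = fields.get("oid", "")
    size = fields.get("size", "")
    if not oid.startswith("sha256:") or not size.isdigit():
        return None
    return {"oid": oid[len("sha256:"):], "size": int(size)}


if __name__ == "__main__":
    # Hand-made pointer text with a placeholder oid.
    pointer = "version https://git-lfs.github.com/spec/v1\noid sha256:abc123\nsize 815\n"
    print(parse_lfs_pointer(pointer))
```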
Using this information, you need to create a JSON object as follows, filling in the sha and filesize information from the previous step:
{
"operation": "download",
"transfer": ["basic"],
"objects": [
{"oid": "{sha}", "size": {size}}
]
}
Pass this object as the data parameter of a curl request to the LFS API:
curl -X POST \
-H "Accept: application/vnd.git-lfs+json" \
-H "Content-type: application/json" \
-d '{"operation": "download", "transfer": ["basic"], "objects": [{"oid": "{sha}", "size": {size}}]}' \
https://github.com/{organisation}/{repository}.git/info/lfs/objects/batch
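The same request can be assembled in Python with the standard library. The sketch below mirrors the curl call above; build_lfs_batch_request and its parameters are names invented for this example, and the function only builds the request object without sending it.

```python
import json
import urllib.request


def build_lfs_batch_request(organisation, repository, oid, size):
    """Build (but do not send) the POST request for the LFS batch endpoint."""
    url = f"https://github.com/{organisation}/{repository}.git/info/lfs/objects/batch"
    body = {
        "operation": "download",
        "transfer": ["basic"],
        # size is sent as a number, as in the curl command above
        "objects": [{"oid": oid, "size": size}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Accept": "application/vnd.git-lfs+json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder organisation, repository and oid.
    request = build_lfs_batch_request("org", "repo", "abc123", 815)
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Sending the request, for example with urllib.request.urlopen(request), is left out so that the sketch runs offline.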
Almost there! This should return a JSON object that tells you where the file is stored:
{
"objects": [
{
"oid": "{sha}",
"size": {size},
"actions": {
"download": {
"href": "https://github-cloud.s3.amazonaws.com/alambic/media/278163869/a2/42/{sha}?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=XXX%2F20210106%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20210106T104409Z&X-Amz-Expires=3600&X-Amz-Signature=XXX&X-Amz-SignedHeaders=host&actor_id=XXX&key_id=0&repo_id=XXX&token=1",
"expires_at": "2021-01-06T11:44:09Z",
"expires_in": 3600
}
}
}
]
}
Download the file from the URL stated in the href attribute.
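Picking the href out of the batch response can be scripted too. A minimal sketch, assuming the response has the shape shown above and contains a single object; extract_download_url is a hypothetical helper name.

```python
import json


def extract_download_url(response_text: str) -> str:
    """Return the download href of the first object in an LFS batch response."""
    batch = json.loads(response_text)
    objects = batch.get("objects") or []
    if not objects:
        raise ValueError("batch response contains no objects")
    try:
        return objects[0]["actions"]["download"]["href"]
    except KeyError as missing:
        raise ValueError(f"no download action in response (missing {missing})") from None


if __name__ == "__main__":
    # Hand-made response with a placeholder URL.
    sample = json.dumps({
        "objects": [{
            "oid": "abc123",
            "size": 815,
            "actions": {"download": {"href": "https://example.com/file", "expires_in": 3600}},
        }]
    })
    print(extract_download_url(sample))
```

The returned URL can then be fetched with curl or with urllib.request.urlretrieve. In the sample response above, expires_in is 3600, so the link stops working after an hour.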
Hi @fkraeutli,
Thanks for sharing the example, but how does your method deal with this special case?
Say the content of a small file looks like the following. How can it be kept from being mistakenly treated as an LFS blob?
I think there should be an API to determine whether a file is an ordinary blob or an LFS blob.