// Run this in the content download page and it will trigger download for everything
var sleep = (milliseconds) => {
    return new Promise(resolve => setTimeout(resolve, milliseconds))
}

var waitTime = 1500; // ms

var x = $( "button:contains('Download')" );

for (var i = 0; i < x.length; i++)
{
    (function(idx) {
        // Wait needed because browser blocks network calls if you make too many too fast
        sleep(idx * waitTime).then(() => {
            x[idx].click();
        });
    })(i)
}
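In case it's not obvious how to run this: open the Gumroad page that lists your purchase's Download buttons, open the browser dev tools console (F12), paste the script, and press Enter; each button is then clicked at waitTime (1.5 s) intervals. Note that :contains() is a jQuery-specific selector, so the script assumes the page exposes jQuery's $ in the console.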
The HTML has changed; here's a fix:
var x = $("a.button:contains('Download')");
Thanks @virgiliu and @bwiedmann!
The HTML can also change depending on your language; in the case of Spanish:
var x = $("a.button:contains('Descargar')");
Hope you can make a tutorial on YouTube on how to download the entire library...
That's not possible. This script works only for things that you already bought and have in your library.
The goal of the script is to download your entire purchase history for archival/backup purposes; it can't bypass the payment process or download packages that you don't already own.
It might be possible to write a script that does that, but it's also rather sketchy if you get caught using it.
Here is an alternative if you want to do this from Python:
import urllib.request
from bs4 import BeautifulSoup
import argparse
import os


def download_files(download_folder):
    base_url = "<The base url with /r/>"

    with urllib.request.urlopen(base_url) as url:
        s = url.read()

    soup = BeautifulSoup(s, "html.parser")
    buttons = soup.body.find_all(attrs={"class": "button"})

    # Skip the first one, it's just a manual.
    for button in buttons:
        if "Download" in button.text:
            # Find the h4 tag in the previous div
            h4_tag = (
                button.find_previous("div")
                .find_previous("div")
                .find("h4")
                .text.lower()
                .replace(" ", "_")
                .replace(",", "")
            )
            if h4_tag:
                file_name = f"{h4_tag}.zip"
                print(f"Downloading {file_name}")

                file_url = base_url + "/" + button["data-resource-id"]
                file_path = os.path.join(download_folder, file_name)

                urllib.request.urlretrieve(file_url, file_path)
                print(f"Downloaded {file_path}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Download files to a specified folder."
    )
    parser.add_argument(
        "download_folder", type=str, help="Folder path to download files"
    )
    args = parser.parse_args()
    download_files(args.download_folder)
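If you save the script above as, say, gumroad_download.py (the filename is just an example), you'd run it as python gumroad_download.py /path/to/folder; the single positional argument is the destination folder that argparse hands to download_files(). Don't forget to replace the base_url placeholder with your own /r/ link first.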
Another Python lib that can download a single product or your entire library: https://github.com/obsessedcake/gumroad-utils
- It also preserves directory structure.
- Unfortunately it doesn't support parallel downloads.
An alternative approach that I consider way more stable would be:
- Extract the data links from the page (same as above, run it in the browser):
var x = $( "button:contains('Download')" );
var result = [];
for (var i = 0; i < x.length; i++) {
    result.push(x[i].getAttribute("data-resource-id"));
}
console.log(result);
This will output a list of the data links. Right-click on the object and copy it as JSON text.
- Now all you need to do is prepend them with
https://app.gumroad.com/r/<your_id>/
and you have functional direct download links. Drop them into your download manager of choice (or wget, curl, whatever) and you have a guaranteed successful download. You might run into minor problems with filenames, as this link will redirect you to the true URL, but basically every tool in existence has a way to handle that (see the sketch after this post). EDIT: might as well post a full lazy script:
var x = $("button:contains('Download')"); var result = []; for (var i = 0; i < x.length; i++) { result.push(x[i].getAttribute("data-resource-id")); } var currentUrl = window.location.href; var newBaseUrl = currentUrl.replace("/d/", "/r/"); var newUrls = []; result.forEach(function (resourceId) { newUrls.push(newBaseUrl + "/" + resourceId); }); var blob = new Blob([newUrls.join("\n")], { type: "text/plain" }); var a = document.createElement("a"); a.href = URL.createObjectURL(blob); a.download = "urls.txt"; a.style.display = "none"; document.body.appendChild(a); a.click(); document.body.removeChild(a); URL.revokeObjectURL(a.href);
This script creates the URLs for you, puts them into a file, and triggers a download of that file (saves it). It assumes you're on the page
https://app.gumroad.com/d/<your_id>
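If you'd rather script the second step too, wget can already handle the filename redirect with wget --content-disposition -i urls.txt. Below is a minimal Node.js (18+) sketch that does the same thing, purely illustrative: it assumes the /r/ links in urls.txt are downloadable without your browser session (if they aren't, you'd need to add your Gumroad cookies to the request headers), and it does no sanitising of the derived filenames.
// download_from_urls.mjs -- hedged sketch, not a tested downloader.
// Reads urls.txt (one URL per line, as produced by the browser script above),
// follows redirects, and names each file from Content-Disposition or the final URL.
import { readFile, writeFile } from "node:fs/promises";

const urls = (await readFile("urls.txt", "utf8")).split("\n").filter(Boolean);

for (const url of urls) {
  const res = await fetch(url, { redirect: "follow" });
  if (!res.ok) {
    console.error(`Skipping ${url}: HTTP ${res.status}`);
    continue;
  }
  // Prefer the filename from Content-Disposition; fall back to the final redirected URL.
  const cd = res.headers.get("content-disposition") || "";
  const match = cd.match(/filename="?([^";]+)"?/);
  const name = match
    ? match[1]
    : decodeURIComponent(new URL(res.url).pathname.split("/").pop() || "download.bin");
  await writeFile(name, Buffer.from(await res.arrayBuffer()));
  console.log(`Saved ${name}`);
}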
I get "HTTP request sent, awaiting response... 404 Not Found" errors. What can I do?
I have a few questions since I'm new and don't know how to use most of this. Do I paste the script into the console? How do I prepend the thing I pasted onto the URL? And should the URL be the one for the model I want to get?
Since GR is pulling a Tumblr, I was really hoping this or the Python one mentioned above would work, but neither seems to, at least for me.
I can easily run this in the console on my library page, and all I get is an empty text file. Am I just stupid, or is this no longer functional? I tried the Python options, but I honestly have no idea what I'm doing there.
I think I got it. Gumroad changed the link structure by adding "product_files?product_file_ids[]=". I simply tweaked the script above a bit so that it outputs a list of download links. Paste them into a downloader and voilà.
Thanks for the original script, since I certainly can't write one lol
var x = $("a.button:contains('Download')");
var result = [];
for (var i = 0; i < x.length; i++) {
result.push(x[i].getAttribute("data-resource-id"));
}
var currentUrl = window.location.href;
var newBaseUrl = currentUrl.replace("/d/", "/r/");
var newUrls = [];
result.forEach(function (resourceId) {
newUrls.push(newBaseUrl + "/product_files?product_file_ids%5B%5D=" + resourceId);
});
var blob = new Blob([newUrls.join("\n")], { type: "text/plain" });
var a = document.createElement("a");
a.href = URL.createObjectURL(blob);
a.download = "urls.txt";
a.style.display = "none";
document.body.appendChild(a);
a.click();
document.body.removeChild(a);
URL.revokeObjectURL(a.href);
With Gumroad cracking down on NSFW content, I tried to do something myself.
async function fetchUrls(link) {
return await fetch(link)
.then(res => res.text())
.then(text => {
let parser = new DOMParser();
let doc = parser.parseFromString(text, "text/html");
var script = doc.querySelector("script[data-component-name]");
var links = Array.from(JSON.parse(script.innerText).content.content_items).map((item) => { return "https://app.gumroad.com" + item.download_url });
return links;
});
}
Promise.all(Array.from(document.querySelectorAll("article a"))
.filter((link) => { return link.href.includes("/d/") })
.map((link) => { return link.href })
.map((link) => {
return fetchUrls(link);
})).then(function(urls){
var blob = new Blob([urls.flat(1).join("\n")], {type: "text/plain;charset=utf-8"});
var url = window.URL || window.webkitURL;
var link = url.createObjectURL(blob);
var a = document.createElement("a");
a.download = "liens_downloads_gumroad.txt";
document.body.appendChild(a);
a.href = link;
a.click();
a.remove();
});
Hope that helps people.
@Fhurai Thanks for this. I made some modifications, and this will create a txt file with all the links organized a bit better for you. You shouldn't pop this list into a downloader, though -- I did this because I didn't want all my assets in one folder, and this way I get the original store page together with the links associated with it, so I can download those to a specific folder. Hope this helps!
For anyone who stumbles across this and has no idea what to do with this, I made a little post here: https://blog.wzyss.dev/easily-archive-your-gumroad-library/
async function fetchUrls(link) {
// Fetch and process URLs from the given link
return fetch(link)
.then(res => res.text())
.then(text => {
let parser = new DOMParser();
let doc = parser.parseFromString(text, "text/html");
var script = doc.querySelector("script[data-component-name]");
var links = Array.from(JSON.parse(script.innerText).content.content_items).map((item) => { return "https://app.gumroad.com" + item.download_url });
// Return both the original link and the associated download URLs
return {link, downloads: links};
});
}
Promise.all(Array.from(document.querySelectorAll("article a"))
.filter((link) => link.href.includes("/d/"))
.map((a) => a.href)
.map((link) => {
// Fetch URLs and maintain their association with the original link
return fetchUrls(link);
}))
.then(function(results) {
// Process results to group downloads by their originating link
let groupedDownloads = results.reduce((acc, {link, downloads}) => {
acc[link] = downloads;
return acc;
}, {});
// Prepare data for export
let exportData = Object.entries(groupedDownloads).map(([page, downloads]) => {
return `${page}\n${downloads.join("\n")}`;
}).join("\n\n");
// Create a blob and download it
var blob = new Blob([exportData], {type: "text/plain;charset=utf-8"});
var url = window.URL || window.webkitURL;
var downloadLink = url.createObjectURL(blob);
var a = document.createElement("a");
a.download = "categorized_downloads_gumroad.txt";
document.body.appendChild(a);
a.href = downloadLink;
a.click();
a.remove();
});
Much appreciated, guys!
In the meantime, I resorted to downloading manually, but I'll be sure to try the updated scripts next time.
@itswzyss Good idea. I ran a few small tests with my file. Once a certain number of links is reached, Gumroad decides to cut the flow and drop the connection. So I decided to split the file into multiple files, one per artist (even if you have multiple purchases from them). 🤣
One more thing: content creators can use external links to host their files instead of Gumroad. I flag that in the files and in the console.
let promises = await Promise.all(Array.from(document.querySelectorAll("article a.stretched-link")) // Get promises with purchases download links.
.map((link) => { return link.href })
.map((link) => {
return fetch(link) // Get link from purchase link.
.then(res => res.text())
.then(text => {
let parser = new DOMParser(); // Create DOMContent from fetch content with download links.
let doc = parser.parseFromString(text, "text/html");
var script = doc.querySelector("script[data-component-name]");// Get script in which the download content JS is.
if(JSON.parse(script.innerText).content.content_items.length === 0)
console.log(JSON.parse(script.innerText).creator.name + " use an external hosting service. Please watch their files to get the purchased download links"); // Alert in console for external hosting services.
return {
artist: JSON.parse(script.innerText).creator.name,
links: (JSON.parse(script.innerText).content.content_items.length > 0 ?
JSON.parse(script.innerText).content.content_items.map((item) => { return "https://app.gumroad.com" + item.download_url }) :
["external link in following page : " + link])
};// Return both the artist and the associated download URLs (if content is in external website from gumroad, the page will be alerted).
});
}));
let timer = 0; // Timer to delay the download (to avoid download throttle).
promises // Need the promises to be resolved from here.
.reduce((acc, d) => {
const found = acc.find(a => a.artist === d.artist);
const value = d.links.flat(Infinity);
if (!found) acc.push({ artist: d.artist, links: [value] })
else found.links.push(value);
return acc;
}, [])// Regroup links per artist.
.sort(function (a, b) {
return a.artist.localeCompare(b.artist);
})// Sort artist per name.
.forEach((data) => {
setTimeout(function () {
var blob = new Blob([data.links.flat(Infinity).join("\n")], { type: "text/plain;charset=utf-8" });
var url = window.URL || window.webkitURL;
var link = url.createObjectURL(blob);// Creation of download link.
var a = document.createElement("a");
a.download = "downloads_" + data.artist + "_gumroad.txt";
document.body.appendChild(a);
a.href = link;// Creation of the download button.
a.click(); // Click to begin download.
a.remove(); // Remove the download button.
}, timer += 1500);// Delay to avoid download throttle.
});// From this, download
I wrote a thing: https://github.com/InfiniteCanvas/gumload, though it doesn't have a proper readme yet. You just install the requirements, edit config.json for your use (get the app session and guid from your cookies), and then it should work. You can set refresh to false after the first run, since it doesn't need to refetch everything on subsequent runs; that's useful when there are download errors, as it will only redownload files with mismatching sizes.
It basically has the same setup as https://github.com/obsessedcake/gumroad-utils, and I borrowed a bit of its soup JSON parsing for my code. For some reason that repo didn't work for me, or I wouldn't have written my own thing.
Well, I gave it a try. I ran "pip3 install -r requirements.txt" just to be safe, then adjusted the config file.
My config.json
{
"threads": 5,
"only_specified_creators": false,
"match_size_using_content_info": false,
"db_path": "gumload.json",
"refresh": true,
"folder": "J:\Gumroad\",
"userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36",
"_gumroad_app_session": "Redacted",
"_gumroad_guid": "redacted",
"creators": [
]
}
Console results.
C:\Users\Kevin\Desktop\gumload-master>main.py
Updating Library...
Processing creators: 100%|████████████████████████████████████████████████████████████| 85/85 [00:00<00:00, 646.28it/s]
Updating January 2015 Batch 1 : 100%|███████████████████████████| 291/291 [00:02<00:00, 124.84it/s]
Downloading everything from sakimichanpatreon[4760375177590]
ETC
Downloading everything from 风挽[5017216242919]
Downloading 0 files
So what am I doing wrong?
Nothing, I forgot to actually fetch stuff when no creators were specified. I've only done it with specifying creators, since I didn't want to download TBs of data. I'm gonna fix it now.
Ah okay. XD
I forgot to mention, I fixed it about an hour ago lol
I tried it out and it should work. Probably...
Yup, looks like it's working, thanks a bunch!!!!
I tried @Kawaru86's gumload plugin, and what I got was this:
Updating Library...
Failed to update library due to ['NoneType' object has no attribute 'string']
Downloading 0 files
My config.json:
{
"threads": 5,
"only_specified_creators": true,
"match_size_using_content_info": true,
"db_path": "gumload.json",
"refresh": true,
"folder": "E:\_GumroadResults\",
"userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 OPR/107.0.0.0",
"_gumroad_app_session": "redacted",
"_gumroad_guid": "redacted",
"creators": [
]
}
Maybe it's because the creators array was empty... How do I get the right values to put in for name, id and count? I'd also like to note that I tried the stuff in the article that itswzyss made, but JDownloader didn't work with the text files outputted by either script.
Try using
"only_specified_creators": false
instead of "only_specified_creators": true
@InfiniteCanvas thanks for your work, the script works flawlessly!
Hi all! I've updated https://github.com/obsessedcake/gumroad-utils. If anyone is interested, take a look 😄
@InfiniteCanvas @Kawaru86 Thanks, it worked!
@obsessedcake Thanks for building and sharing that. For me the original script was a one-time thing which I made public by mistake, but since people started talking I decided to leave it up even though I don't have the time to maintain it 😅
Can anyone please help me with how to download from the beginning and how to code it? That'd be a great help, thank you.
I've tried using gumroad-utils but can't download any files via the URL; I'm getting an error.
What is this error, and is there any way to fix it?