
@virgiliu
Created November 1, 2020 14:13
Gumroad bulk download
// Run this on the content download page and it will trigger a download for everything
var sleep = (milliseconds) => {
    return new Promise(resolve => setTimeout(resolve, milliseconds));
};

var waitTime = 1500; // ms
var x = $("button:contains('Download')");

for (var i = 0; i < x.length; i++) {
    (function (idx) {
        // Wait needed because the browser blocks network calls if you make too many too fast
        sleep(idx * waitTime).then(() => {
            x[idx].click();
        });
    })(i);
}
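
The same staggered-click idea can also be written with async/await. A minimal sketch (not part of the original gist), assuming jQuery is available on the page just as the original script does:

async function downloadAllStaggered(waitTime = 1500) {
    // Same selector as the gist; .get() turns the jQuery object into a plain array.
    var buttons = $("button:contains('Download')").get();
    for (var button of buttons) {
        button.click();
        // Same reasoning as the gist's comment: space the clicks out so the
        // browser doesn't block a burst of network calls.
        await new Promise(function (resolve) { setTimeout(resolve, waitTime); });
    }
}
downloadAllStaggered();
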
@skroed

skroed commented Nov 28, 2023

Here is an alternative if you want to do this from Python:

import urllib.request
from bs4 import BeautifulSoup
import argparse
import os


def download_files(download_folder):
    base_url = "<The base url with /r/>"
    with urllib.request.urlopen(base_url) as url:
        s = url.read()

    soup = BeautifulSoup(s, "html.parser")
    buttons = soup.body.find_all(attrs={"class": "button"})

    # Skip the first one, it's just a manual.
    for button in buttons:
        if "Download" in button.text:
            # Find the h4 tag two divs back
            h4_tag = (
                button.find_previous("div")
                .find_previous("div")
                .find("h4")
                .text.lower()
                .replace(" ", "_")
                .replace(",", "")
            )
            if h4_tag:
                file_name = f"{h4_tag}.zip"
                print(f"Downloading {file_name}")
                file_url = base_url + "/" + button["data-resource-id"]
                file_path = os.path.join(download_folder, file_name)
                urllib.request.urlretrieve(file_url, file_path)
                print(f"Downloaded {file_path}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Download files to a specified folder."
    )
    parser.add_argument(
        "download_folder", type=str, help="Folder path to download files"
    )
    args = parser.parse_args()

    download_files(args.download_folder)
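
Usage sketch: fill in base_url with your own /r/ link, then run the script with the download folder as its single positional argument, e.g. python gumroad_download.py ./downloads (the script name here is just an example; use whatever you saved it as).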

@obsessedcake

obsessedcake commented Dec 9, 2023

Another Python lib that can download a single product or an entire library: https://github.com/obsessedcake/gumroad-utils

  • It also preserves directory structure.
  • Unfortunately, it doesn't support parallel downloads.

@CedricMi

An alternative approach that I consider way more stable:

  1. Extract the data links from the page (same as above, run in the browser):
var x = $("button:contains('Download')");

var result = [];

for (var i = 0; i < x.length; i++) {
    result.push(x[i].getAttribute("data-resource-id"));
}

console.log(result);

This will output a list of the resource IDs. Right-click the object in the console to copy it as JSON text.

  2. Now all you need to do is prepend https://app.gumroad.com/r/<your_id>/ to each ID and you have working direct download links. Drop them into your download manager of choice (or wget, curl, whatever) and the downloads should go through. You might run into minor problems with filenames, since this link redirects you to a true URL, but basically every tool in existence has a way to handle that.
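
If your tool struggles with those redirected filenames, here is a rough console sketch (an illustration under the assumptions above, not part of the original recipe) that follows one /r/ link and recovers the final filename from the redirected URL. It assumes you run it on app.gumroad.com so your session cookie is sent:

async function probeFilename(url) {
    // Follow redirects; res.url ends up being the "true URL" mentioned above.
    const res = await fetch(url, { redirect: "follow", credentials: "include" });
    // The last path segment of the final URL is usually the filename.
    return decodeURIComponent(new URL(res.url).pathname.split("/").pop());
}
// Example (hypothetical IDs): probeFilename("https://app.gumroad.com/r/<your_id>/<resource-id>").then(console.log);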

EDIT: might as well post a full lazy script:

var x = $("button:contains('Download')");

var result = [];

for (var i = 0; i < x.length; i++) {
  result.push(x[i].getAttribute("data-resource-id"));
}

var currentUrl = window.location.href;
var newBaseUrl = currentUrl.replace("/d/", "/r/");
var newUrls = [];

result.forEach(function (resourceId) {
  newUrls.push(newBaseUrl + "/" + resourceId);
});

var blob = new Blob([newUrls.join("\n")], { type: "text/plain" });

var a = document.createElement("a");
a.href = URL.createObjectURL(blob);
a.download = "urls.txt";
a.style.display = "none";
document.body.appendChild(a);
a.click();

document.body.removeChild(a);
URL.revokeObjectURL(a.href);

This script creates the URLs for you, puts them into a file, and triggers a download of that file (saves it). It assumes you're on the page https://app.gumroad.com/d/<your_id>

I get "HTTP request sent, awaiting response... 404 Not Found" errors. What can I do?

@KhyDoesntKnowStuffYet

An alternative approach that I consider way more stable: […]

I have a few questions since I'm new and don't know how to use most of this stuff. Do I paste the script in the console? How do I prepend the IDs to the URL? And should the URL be the one with the model I want to get?

@Kawaru86

Since GR is pulling a Tumblr, I was really hoping this or the Python one mentioned above would work, but neither seems to, at least for me.
I can run this in the console on my library page easily enough, but all I get is an empty text file. Am I just stupid, or is this no longer functional? I tried the Python options, but I honestly have no idea what I'm doing there.

@retden

retden commented Mar 16, 2024

I think I got it. Gumroad changed the link structure by adding "product_files?product_file_ids[]=". I simply tweaked the script above a bit so that it outputs a list of download links. Paste them into a downloader and voilà.
Thanks for the original script, since I certainly can't write one myself lol

var x = $("a.button:contains('Download')");

var result = [];

for (var i = 0; i < x.length; i++) {
  result.push(x[i].getAttribute("data-resource-id"));
}

var currentUrl = window.location.href;
var newBaseUrl = currentUrl.replace("/d/", "/r/");
var newUrls = [];

result.forEach(function (resourceId) {
  newUrls.push(newBaseUrl + "/product_files?product_file_ids%5B%5D=" + resourceId);
});

var blob = new Blob([newUrls.join("\n")], { type: "text/plain" });

var a = document.createElement("a");
a.href = URL.createObjectURL(blob);
a.download = "urls.txt";
a.style.display = "none";
document.body.appendChild(a);
a.click();

document.body.removeChild(a);
URL.revokeObjectURL(a.href);
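
With this tweak, each line in urls.txt has the form https://app.gumroad.com/r/<your_id>/product_files?product_file_ids%5B%5D=<resource-id> (the %5B%5D is just the URL-encoded [] from the new link structure).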

@Fhurai

Fhurai commented Mar 17, 2024

With Gumroad cracking down on NSFW content, I tried to do something myself.

async function fetchUrls(link) {
    // Fetch the purchase page and pull the download links out of its embedded JSON.
    return fetch(link)
        .then(res => res.text())
        .then(text => {
            let parser = new DOMParser();
            let doc = parser.parseFromString(text, "text/html");
            var script = doc.querySelector("script[data-component-name]");
            var links = Array.from(JSON.parse(script.innerText).content.content_items)
                .map((item) => { return "https://app.gumroad.com" + item.download_url });
            return links;
        });
}
Promise.all(Array.from(document.querySelectorAll("article a"))
    .filter((link) => { return link.href.includes("/d/") })
    .map((link) => { return link.href })
    .map((link) => {
        return fetchUrls(link);
    })).then(function (urls) {
        var blob = new Blob([urls.flat(1).join("\n")], { type: "text/plain;charset=utf-8" });

        var url = window.URL || window.webkitURL;
        var link = url.createObjectURL(blob);

        var a = document.createElement("a");
        a.download = "liens_downloads_gumroad.txt";
        document.body.appendChild(a);
        a.href = link;
        a.click();
        a.remove();
    });
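
For reference, this approach relies on each /d/ page embedding its data as JSON inside a script[data-component-name] tag. Judging only by the fields the scripts in this thread read, the payload looks roughly like this (a trimmed sketch, not the full schema):

// Rough shape of JSON.parse(script.innerText), inferred from the fields
// read by the scripts in this thread; everything else is omitted.
var payload = {
    creator: { name: "<artist name>" }, // used by the per-artist variant below
    content: {
        content_items: [
            // download_url is a path; the scripts prefix it with
            // "https://app.gumroad.com" to build the full link.
            { download_url: "/<path-to-file>" }
        ]
    }
};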

Hope that helps people.

@itswzyss

itswzyss commented Mar 17, 2024

@Fhurai Thanks for this. I made some modifications, and this will create a txt file that has all the links organized a bit more for you. You shouldn't pop this list into a downloader though -- I did this because I didn't want all my assets in one folder, and this lets me get the original store page and the links associated with it, so I can then download those to a specific folder. Hope this helps!

For anyone who stumbles across this and has no idea what to do with it, I made a little post here: https://blog.wzyss.dev/easily-archive-your-gumroad-library/

async function fetchUrls(link) {
    // Fetch and process URLs from the given link
    return fetch(link)
        .then(res => res.text())
        .then(text => {
            let parser = new DOMParser();
            let doc = parser.parseFromString(text, "text/html");
            var script = doc.querySelector("script[data-component-name]");
            var links = Array.from(JSON.parse(script.innerText).content.content_items).map((item) => { return "https://app.gumroad.com" + item.download_url });
            // Return both the original link and the associated download URLs
            return {link, downloads: links};
        });
}

Promise.all(Array.from(document.querySelectorAll("article a"))
    .filter((link) => link.href.includes("/d/"))
    .map((a) => a.href)
    .map((link) => {
        // Fetch URLs and maintain their association with the original link
        return fetchUrls(link);
    }))
    .then(function(results) {
        // Process results to group downloads by their originating link
        let groupedDownloads = results.reduce((acc, {link, downloads}) => {
            acc[link] = downloads;
            return acc;
        }, {});

        // Prepare data for export
        let exportData = Object.entries(groupedDownloads).map(([page, downloads]) => {
            return `${page}\n${downloads.join("\n")}`;
        }).join("\n\n");

        // Create a blob and download it
        var blob = new Blob([exportData], {type: "text/plain;charset=utf-8"});
        var url = window.URL || window.webkitURL;
        var downloadLink = url.createObjectURL(blob);
        var a = document.createElement("a");
        a.download = "categorized_downloads_gumroad.txt";
        document.body.appendChild(a);
        a.href = downloadLink;
        a.click();
        a.remove();
    });

@CedricMi

Much appreciated, guys!
In the meantime, I resorted to downloading manually, but I'll be sure to try the updated scripts next time.

@Fhurai

Fhurai commented Mar 17, 2024

@itswzyss Good idea. I ran a few tests with my file. Once a certain number of links is reached, Gumroad decides to cut the flow and drop the connection. So I decided to split the output into multiple files, one per artist (even if you have multiple purchases from them). 🤣
One more thing: content creators can host their files on external links instead of Gumroad. I flag that in the files and in the console.

let promises = await Promise.all(Array.from(document.querySelectorAll("article a.stretched-link")) // Get promises with purchases download links.
    .map((link) => { return link.href })
    .map((link) => {
        return fetch(link) // Get link from purchase link.
            .then(res => res.text())
            .then(text => {

                let parser = new DOMParser(); // Create DOMContent from fetch content with download links.
                let doc = parser.parseFromString(text, "text/html");

                var script = doc.querySelector("script[data-component-name]"); // Get the script tag whose JSON holds the download content.
                var data = JSON.parse(script.innerText); // Parse the JSON once instead of re-parsing it for every field.

                if (data.content.content_items.length === 0)
                    console.log(data.creator.name + " uses an external hosting service. Please check their files page to get the purchased download links."); // Alert in console for external hosting services.

                return {
                    artist: data.creator.name,
                    links: (data.content.content_items.length > 0 ?
                        data.content.content_items.map((item) => { return "https://app.gumroad.com" + item.download_url }) :
                        ["external link on the following page: " + link])
                }; // Return both the artist and the associated download URLs (content hosted outside Gumroad is flagged instead).
            });
    }));

let timer = 0; // Timer to delay the download (to avoid download throttle).

promises // Need the promises to be resolved from here.
    .reduce((acc, d) => {
        const found = acc.find(a => a.artist === d.artist);
        const value = d.links.flat(Infinity);
        if (!found) acc.push({ artist: d.artist, links: [value] })
        else found.links.push(value);
        return acc;
    }, [])// Regroup links per artist.
    .sort(function (a, b) {
        return a.artist.localeCompare(b.artist);
    })// Sort artist per name.
    .forEach((data) => {

        setTimeout(function () {
            var blob = new Blob([data.links.flat(Infinity).join("\n")], { type: "text/plain;charset=utf-8" });
            var url = window.URL || window.webkitURL;
            var link = url.createObjectURL(blob);// Creation of download link.

            var a = document.createElement("a");
            a.download = "downloads_" + data.artist + "_gumroad.txt";
            document.body.appendChild(a);
            a.href = link;// Creation of the download button.


            a.click(); // Click to begin download.
            a.remove(); // Remove the download button.
        }, timer += 1500);// Delay to avoid download throttle.
    });// From here, the downloads start.

@InfiniteCanvas

I wrote a thing, https://github.com/InfiniteCanvas/gumload, though it doesn't have a proper readme yet. You just install the requirements, edit config.json for your use (get the app session and guid from your cookies), and then it should work. You can set refresh to false after the first run, since it doesn't need to refetch everything on subsequent runs. That's useful when there are download errors: it will only redownload files with mismatching sizes.
It basically has the same setup as https://github.com/obsessedcake/gumroad-utils, and I adapted a BeautifulSoup JSON-parsing bit from it into my code. For some reason that repo didn't work for me, or I wouldn't have written my own thing.

@Kawaru86

Kawaru86 commented Mar 18, 2024

Well, I gave it a try. I ran "pip3 install -r requirements.txt" just to be safe, then adjusted the config file.

My config.json (note the backslashes in the folder path have to be escaped for valid JSON):
{
  "threads": 5,
  "only_specified_creators": false,
  "match_size_using_content_info": false,
  "db_path": "gumload.json",
  "refresh": true,
  "folder": "J:\\Gumroad\\",
  "userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36",
  "_gumroad_app_session": "Redacted",
  "_gumroad_guid": "redacted",
  "creators": []
}

Console results:

C:\Users\Kevin\Desktop\gumload-master>main.py
Updating Library...
Processing creators: 100%|████████████████████████████████████████████████████████████| 85/85 [00:00<00:00, 646.28it/s]
Updating January 2015 Batch 1 : 100%|███████████████████████████| 291/291 [00:02<00:00, 124.84it/s]
Downloading everything from sakimichanpatreon[4760375177590]
ETC
Downloading everything from 风挽[5017216242919]
Downloading 0 files

So what am I doing wrong?

@InfiniteCanvas

Well, I gave it a try. I ran "pip3 install -r requirements.txt" just to be safe, then adjusted the config file. […] So what am I doing wrong?

Nothing, I forgot to actually fetch stuff when no creators were specified. I've only done it with specifying creators, since I didn't want to download TBs of data. I'm gonna fix it now.

@Kawaru86

Nothing, I forgot to actually fetch stuff when no creators were specified. I've only done it with specifying creators, since I didn't want to download TBs of data. I'm gonna fix it now.

Ah okay. XD

@InfiniteCanvas

I forgot to mention, I fixed it about an hour ago lol.
I tried it out and it should work. Probably...

@Kawaru86

I forgot to mention, I fixed it about an hour ago lol. I tried it out and it should work. Probably...

Yup, looks like it's working, thanks a bunch!!!!

@AzureArtism

AzureArtism commented Mar 21, 2024

I tried the gumload plugin that @Kawaru86 used above, and what I got was this:
Updating Library...
Failed to update library due to ['NoneType' object has no attribute 'string']
Downloading 0 files

My config.json (again with the folder path's backslashes escaped for valid JSON):
{
  "threads": 5,
  "only_specified_creators": true,
  "match_size_using_content_info": true,
  "db_path": "gumload.json",
  "refresh": true,
  "folder": "E:\\_GumroadResults\\",
  "userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 OPR/107.0.0.0",
  "_gumroad_app_session": "redacted",
  "_gumroad_guid": "redacted",
  "creators": []
}

Maybe it's because the creators array was empty... How do I get the right values to put in for name, id, and count? I'd also like to note that I tried the steps in the article itswzyss wrote, but JDownloader didn't work with the text files output by either script.

@InfiniteCanvas

Try using

"only_specified_creators": false

instead of "only_specified_creators": true

@Kawaru86

Kawaru86 commented Mar 22, 2024 via email

@rursache

rursache commented Apr 5, 2024

@InfiniteCanvas thanks for your work, the script works flawlessly!

@obsessedcake

Hi all! I've updated https://github.com/obsessedcake/gumroad-utils. If anyone's interested, take a look 😄

@AzureArtism

@InfiniteCanvas @Kawaru86 Thanks, it worked!

@virgiliu
Author

virgiliu commented May 2, 2024

Hi all! I've updated https://github.com/obsessedcake/gumroad-utils. If anyone's interested, take a look 😄

@obsessedcake Thanks for building and sharing that. For me the original script was a one-time thing which I made public by mistake, but since people started talking I decided to leave it up even though I don't have the time to maintain it 😅

@Hs211221

Can anyone please help me download from the beginning and explain how to code it? That'd be a great help, thank you.

@H1mawari

H1mawari commented Sep 24, 2024

Hi all! I've updated https://github.com/obsessedcake/gumroad-utils. If anyone's interested, take a look 😄

I've tried using the library, but I can't download any files via the URL; I'm getting this error:
[screenshot of the error]
What is this error, and is there any way to fix it?
