#!/bin/sh
#
# Generates a PDF from Publitas images (online folder service).
# Stores the generated PDF and the JSON (which may contain links).
#
# Requirements:
# - wget https://www.gnu.org/software/wget/
# - jq https://stedolan.github.io/jq/
# - imagemagick https://www.imagemagick.org/
#
# You may need to remove the PDF-related security policy for ImageMagick for this to work.
#
if [ ! "$2" ]; then
  echo "Usage: $0 <publitas_folder_url> <output_name>"
  exit 1
fi
URL="$1"
OUT="$2"
# Temporary working directory (--suffix requires GNU mktemp)
DIR=$(mktemp -d --suffix=.getpublitas)
# Extract the JSON object assigned to "var data" in the page source
wget -q -O - "$URL" | sed 's/^\s*var\s\+data\s\+=\s\+\(.*\);\s*$/\1/p;d' > "$DIR/data.json"
# For each page, pick the highest-resolution image URL that is present
jq -r '.spreads[].pages[].images | .at2400 // .at2000 // .at1600 // .at1200 // .at1000' "$DIR/data.json" > "$DIR/img_urls"
i=1
while read -r u; do
  echo "$u" > "$DIR/cur_url" # use file to be able to use base
  wget -q --base="$URL" -O "$(printf '%s/image-page-%04d.jpg' "$DIR" "$i")" -i "$DIR/cur_url"
  i=$(( i + 1 ))
done < "$DIR/img_urls"
# Zero-padded file names keep the pages in order when the glob expands
convert "$DIR"/image-page-*.jpg "$OUT.pdf"
cp "$DIR/data.json" "$OUT.json"
rm -Rf "$DIR"
Hi @Zerovelocity275 & @luduma,
I would like to point out that this script is no longer necessary, since you can just append /unsupported
to the URL and download the PDF from Publitas itself.
Oh, thank you so much, that's great.
Hi @GlowingBulb, I'm doing that and it just says: "Whoops! Something went wrong... We're sorry, but this part is no longer available." So they patched it, right? I don't know if I'm doing it right; I'm adding it at the end of the URL.
Hi @Cristark02, as far as I know it still works. Make sure that you add /unsupported to the end of the "root" URL, like this:
https://view.publitas.com/four-hands/fourhands_fall23/page/1
↓
https://view.publitas.com/four-hands/fourhands_fall23/unsupported
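For anyone who wants to script that rewrite, here is a minimal sketch. The function name `to_unsupported_url` is made up for illustration, and the assumption that page suffixes look like `/page/1` (or `/page/2-3` for spreads) comes only from the example above, not from any Publitas documentation. It only builds the URL; it does not check that Publitas still serves a PDF there.

```shell
#!/bin/sh
# Hypothetical helper: turn a Publitas viewer URL into its /unsupported variant.
# Assumes page suffixes have the form /page/<digits> or /page/<digits>-<digits>.
to_unsupported_url() {
  url="${1%/}"   # drop a single trailing slash, if present
  # Strip a trailing /page/N and any /unsupported that is already there
  root=$(printf '%s\n' "$url" | sed -e 's|/page/[0-9][0-9-]*$||' -e 's|/unsupported$||')
  printf '%s/unsupported\n' "$root"
}

to_unsupported_url "https://view.publitas.com/four-hands/fourhands_fall23/page/1"
```

For the example URL this prints https://view.publitas.com/four-hands/fourhands_fall23/unsupported, which you can then open in a browser.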
Great, still works! Thank you!
I logged in just to say I'd kiss you if I had you in front of me.
Thanks a lot!
Did you eventually manage to solve this? I am also trying to download the biology books :).