@aunetx
Last active May 26, 2021 13:05
A small script that downloads a manga volume from wakascan.com and converts it to a PDF
#!/bin/bash
set -e
PDF_DIR="$HOME/Documents/books/berserk"
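# Require exactly one argument; braces (not a subshell) so that `exit` ends the script itself.
[ $# -eq 1 ] || { echo "expected one argument, the URL of the manga's volume.
example: 'https://wakascan.com/manga/berserk/vol-05/'" >&2; exit 1; }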
DIR=$(dirname "$0")
IMG_DIR="$DIR/manga_images"
rm -rf "$IMG_DIR"
mkdir "$IMG_DIR"
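# Drop any trailing slash from the argument so the page URL ends in exactly one.
page_url="${1%/}/"
# Fetch the page once, then extract the first image URL and the chapter title from it.
html=$(curl -s "$page_url")
url=$(printf '%s\n' "$html" | grep "image-0" | sed 's/<img id="image-0" data-image-paged="0" src="//' | sed 's/" class="wp-manga-chapter-img">//')
name=$(printf '%s\n' "$html" | grep "chapter-heading" | sed 's/<h1 id="chapter-heading">//' | sed 's.</h1>..')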
url_suffix=${url: -4}
cd "$IMG_DIR"
# Download images
page=1
while : ; do
# Overwrite the trailing digits of the first image's file name with the page number
# (${#page} is the number of digits in $page, so no call to python is needed).
url_prefix=${url:0:${#url}-4-${#page}}
page_url="$url_prefix$page$url_suffix"
echo "downloading page $page..."
if ! wget -q "$page_url"; then
echo "$((page - 1)) pages downloaded, done."
break
fi
((page=page+1))
done
# Convert to PDF
if [ "$url_suffix" == ".png" ]; then
echo "png source"
convert "./*.png" "../manga.pdf"
elif [ "$url_suffix" == ".jpg" ]; then
echo "jpeg source"
find -iname "*.jpg" | parallel -I'{}' convert {} {}.pdf
pdfunite *.pdf "../manga.pdf"
rm *.pdf
else
echo "unknown source" >&2
exit 1
fi
# Make sure the destination exists before moving the finished PDF there.
mkdir -p "$PDF_DIR"
mv ../manga.pdf "$PDF_DIR/$name.pdf"
rm -rf "$IMG_DIR"
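
The page-URL construction used in the download loop can be tried in isolation, without touching the network. The sketch below is only an illustration: `build_page_url` is a hypothetical helper that is not part of the script, the `example.com` URLs are made up, and it assumes (as the loop appears to) that image file names are zero-padded numbers followed by a four-character extension such as `.jpg`.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script: rebuild the image URL
# for a given page by overwriting the trailing digits of the first image's URL.
build_page_url() {
    local first_url=$1 page=$2
    local suffix=${first_url: -4}                    # extension, e.g. ".jpg"
    local keep=$(( ${#first_url} - 4 - ${#page} ))   # characters kept before the page number
    printf '%s%s%s\n' "${first_url:0:keep}" "$page" "$suffix"
}

# Made-up URLs, assuming zero-padded file names such as 01.jpg
build_page_url "https://example.com/berserk/vol-05/01.jpg" 7
build_page_url "https://example.com/berserk/vol-05/01.jpg" 12
```

With two-digit padding this approach only holds up to page 99; a three-digit page number would start eating into the directory part of the URL.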