@coryalder
Created August 17, 2011 07:37
Extract html book files from the safari books online iPad app "Safari To Go" using php.

Save books out of Safari Books Online

From http://objectivesea.tumblr.com/post/9033067018/safaribooks

This is hard. I spent way too much time figuring this out, because I was annoyed that a book I bought (Addison-Wesley) was available online for free, but only for 45 days, after which payment was required. So I made this hack. It's probably useful to no one else, but here it is.

Requirements:

  1. iPad.
  2. Safari To Go (the Safari Books Online iPad app).
  3. A Mac (could be done on a PC, but you'd have to tweak the PHP script).

First, get Safari To Go on your iPad. Log in, and add all books you want to download to your "Offline Bookbag" (hint: tap the heart when viewing a book).

Now go get iPhone Explorer. This allows you to pull files off your iPad. You'll need to grab the Safari.sqlite file from

/<your ipad>/Apps/Safari To Go/Documents/Safari.sqlite

You also need to grab the images for each individual book. They're in

/<your ipad>/Apps/Safari To Go/Documents/<some long number>-plain-folder/

Grab the folder called "images" for each book you're saving.
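To see which books are actually marked offline before you go hunting for image folders, a small check along these lines might help. It is only a sketch: list_offline_books() is a hypothetical helper that is not part of the script below, and it assumes the ZFPID column holds the same long number that appears in the folder names on the iPad.

```php
<?php
// Sketch: list the offline books in Safari.sqlite together with their ZFPID values.
// list_offline_books() is a hypothetical helper, not part of safariparse.php.
function list_offline_books(SQLite3 $database)
{
    $books = array();
    $result = $database->query("SELECT ZTITLE, ZFPID FROM ZBOOK WHERE ZOFFLINESTATUS = 'offline'");
    if ($result === false) {
        return $books; // query failed, e.g. the ZBOOK table is missing
    }
    while ($row = $result->fetchArray(SQLITE3_ASSOC)) {
        $books[] = array('title' => $row['ZTITLE'], 'fpid' => $row['ZFPID']);
    }
    return $books;
}

// Usage: run from the folder that holds Safari.sqlite.
$path = __DIR__ . '/Safari.sqlite';
if (is_file($path)) {
    $db = new SQLite3($path, SQLITE3_OPEN_READONLY);
    foreach (list_offline_books($db) as $book) {
        // Assumption: ZFPID matches the "<some long number>" in the iPad folder name.
        echo $book['fpid'] . "\t" . $book['title'] . "\n";
    }
    $db->close();
}
```

Opening the file read-only means a mistyped path fails loudly instead of silently creating an empty database.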

Now download this PHP script I wrote, which extracts all the offline books from the "Safari.sqlite" file. Place it in the same folder as Safari.sqlite, and run it from the terminal (hint: if both are on your desktop, cd ~/Desktop/ followed by php safariparse.php will run it).
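The script names each output folder after a sanitized version of the book title, so the folder may not look exactly like the title you see in the app. The snippet below carries its own copy of sanitize() from the script so it runs on its own; the two titles are just examples picked for illustration.

```php
<?php
// Copy of sanitize() from safariparse.php, so this snippet is self-contained.
function sanitize($string = '', $is_filename = FALSE)
{
    // Replace every run of characters outside the allowed set with a single dash
    $string = preg_replace('/[^\w\-'. ($is_filename ? '~_\.' : ''). ']+/u', '-', $string);
    // Collapse repeated dashes and lowercase the result
    return mb_strtolower(preg_replace('/--+/u', '-', $string), 'UTF-8');
}

// Example titles, for illustration only.
$titles = array('Programming in Objective-C 2.0', 'C++ Primer: 5th Edition');
foreach ($titles as $title) {
    echo $title . ' => ' . sanitize($title, TRUE) . "\n";
}
// Prints:
// Programming in Objective-C 2.0 => programming-in-objective-c-2.0
// C++ Primer: 5th Edition => c-primer-5th-edition
```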

You should now have one folder for each book you've extracted. Hooray. Now comes the manual step: matching up each image folder with the right html files. It's not too hard, as I've added a hint folder to each of the book folders. You should see something like your-book-title/images/<some long number>. Find the matching image folder from the iPhone Explorer step earlier, and place it so the graphics end up at the path your-book-title/images/<some long number>/graphics/01_01.jpg, where the html can find them.
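If you have many books, the matching step could be scripted roughly like this. This is an untested-on-a-real-iPad sketch resting on two assumptions: that the hint folder name equals the number in "<some long number>-plain-folder", and that the pulled "images" folder contains the graphics folder directly. copy_dir() and place_images() are hypothetical helpers, not part of the original script.

```php
<?php
// Recursively copy the contents of $from into $to (hypothetical helper).
function copy_dir($from, $to)
{
    if (!is_dir($to)) {
        mkdir($to, 0777, true);
    }
    foreach (scandir($from) as $entry) {
        if ($entry === '.' || $entry === '..') {
            continue;
        }
        $src = $from . '/' . $entry;
        $dst = $to . '/' . $entry;
        if (is_dir($src)) {
            copy_dir($src, $dst);
        } else {
            copy($src, $dst);
        }
    }
}

// $bookDir:   a folder written by the script, e.g. ./your-book-title
// $pulledDir: the folder holding the "<number>-plain-folder" directories pulled off the iPad
// Returns the hint folder names that were filled with images.
function place_images($bookDir, $pulledDir)
{
    $placed = array();
    $hintDirs = glob($bookDir . '/images/*', GLOB_ONLYDIR) ?: array();
    foreach ($hintDirs as $hintDir) {
        $number = basename($hintDir);
        // Assumption: the hint folder name equals the number in "<number>-plain-folder".
        $source = $pulledDir . '/' . $number . '-plain-folder/images';
        if (is_dir($source)) {
            copy_dir($source, $hintDir);
            $placed[] = $number;
        }
    }
    return $placed;
}
```

Calling place_images('./your-book-title', './pulled') would then fill every hint folder that has a matching pulled folder and leave the rest for you to sort out by hand.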

You now have an offline backup of any Safari Books Online book you own on your Mac. Win.

<?php
// code to accompany this blog post http://objectivesea.tumblr.com/post/9033067018/safaribooks
try
{
    // open the database pulled off the iPad; it must sit in the same folder as this script
    $database = new SQLite3(__DIR__ . '/Safari.sqlite', SQLITE3_OPEN_READONLY);
    echo "Database connection open...\n";
    $booksQuery = "SELECT * FROM ZBOOK WHERE ZOFFLINESTATUS = 'offline'";
    if($result = $database->query($booksQuery)) {
        // count the matching rows (SQLite3Result has no row-count method)
        $bookCount = $database->querySingle("SELECT COUNT(*) FROM ZBOOK WHERE ZOFFLINESTATUS = 'offline'");
        echo "Found " . $bookCount . " books stored offline\n";
        while($row = $result->fetchArray()) {
            echo "Saving book " . $row['ZTITLE'] . "\n";
            save_a_book($row['ZTITLE'], $row['Z_PK'], $database, $row['ZFPID']);
        }
    } else echo "Database failed to return results for query: " . $booksQuery . "\n";
}
catch(Exception $e)
{
    die($e->getMessage() . "\n");
}
// write every page of one book to its own html file, in a folder named after the book
function save_a_book($name, $book_id, $database, $fpid) {
    $top = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"/></head><body>';
    $bottom = '</body></html>';
    // Z_PK is an integer key, so cast it before building the query
    $query = 'SELECT ZHTMLBODY, Z_PK, ZORDERINDEX FROM ZPAGE WHERE ZBOOK = ' . (int)$book_id . ' ORDER BY ZORDERINDEX';
    if($result = $database->query($query)) {
        // make dir
        $dir = sanitize($name, TRUE);
        mkdir_recursive($dir);
        $counter = 1;
        while($row = $result->fetchArray()) {
            $myFile = "./" . $dir . "/pg_" . $counter . ".html";
            $fh = fopen($myFile, 'w') or die("can't open file " . $myFile . "\n");
            $stringData = $top . $row['ZHTMLBODY'] . $bottom;
            fwrite($fh, $stringData);
            fclose($fh);
            $counter++;
            print("Wrote page #: {$row['Z_PK']} to file {$myFile}\n");
        }
        // hint folder: shows which image folder belongs to this book
        mkdir_recursive($dir . "/images/" . $fpid);
    }
}
// create a directory, including any missing parent directories
function mkdir_recursive($pathname)
{
    is_dir(dirname($pathname)) || mkdir_recursive(dirname($pathname));
    return is_dir($pathname) || @mkdir($pathname);
}
function sanitize($string = '', $is_filename = FALSE)
{
    // Replace all weird characters with dashes
    $string = preg_replace('/[^\w\-'. ($is_filename ? '~_\.' : ''). ']+/u', '-', $string);
    // Only allow one dash separator at a time (and make string lowercase)
    return mb_strtolower(preg_replace('/--+/u', '-', $string), 'UTF-8');
}
?>
<?php
// same as safariparse.php, except it outputs one big html file per book
// instead of putting each individual chapter in a separate html file.
try
{
    // open the database pulled off the iPad; it must sit in the same folder as this script
    $database = new SQLite3(__DIR__ . '/Safari.sqlite', SQLITE3_OPEN_READONLY);
    echo "Database connection open...\n";
    $booksQuery = "SELECT * FROM ZBOOK WHERE ZOFFLINESTATUS = 'offline'";
    if($result = $database->query($booksQuery)) {
        // count the matching rows (SQLite3Result has no row-count method)
        $bookCount = $database->querySingle("SELECT COUNT(*) FROM ZBOOK WHERE ZOFFLINESTATUS = 'offline'");
        echo "Found " . $bookCount . " books stored offline\n";
        while($row = $result->fetchArray()) {
            echo "Saving book " . $row['ZTITLE'] . "\n";
            save_a_book($row['ZTITLE'], $row['Z_PK'], $database, $row['ZFPID']);
        }
    } else echo "Database failed to return results for query: " . $booksQuery . "\n";
}
catch(Exception $e)
{
    die($e->getMessage() . "\n");
}
// write all pages of one book, in order, into a single html file named after the book
function save_a_book($name, $book_id, $database, $fpid) {
    // escape the title so characters like & or < cannot break the markup
    $top = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"/><title>' . htmlspecialchars($name) . '</title></head><body>';
    $bottom = '</body></html>';
    // Z_PK is an integer key, so cast it before building the query
    $query = 'SELECT ZHTMLBODY, Z_PK, ZORDERINDEX FROM ZPAGE WHERE ZBOOK = ' . (int)$book_id . ' ORDER BY ZORDERINDEX';
    if($result = $database->query($query)) {
        $myFile = "./" . sanitize($name, TRUE) . ".html";
        $fh = fopen($myFile, 'w') or die("can't open file " . $myFile . "\n");
        fwrite($fh, $top);
        while($row = $result->fetchArray()) {
            fwrite($fh, $row['ZHTMLBODY']);
            print("Wrote page #: {$row['Z_PK']} to file {$myFile}\n");
        }
        fwrite($fh, $bottom);
        fclose($fh);
        // make dir: hint folder showing which image folder belongs to this book
        mkdir_recursive(sanitize($name, TRUE) . "/images/" . $fpid);
    }
}
// create a directory, including any missing parent directories
function mkdir_recursive($pathname)
{
    is_dir(dirname($pathname)) || mkdir_recursive(dirname($pathname));
    return is_dir($pathname) || @mkdir($pathname);
}
function sanitize($string = '', $is_filename = FALSE)
{
    // Replace all weird characters with dashes
    $string = preg_replace('/[^\w\-'. ($is_filename ? '~_\.' : ''). ']+/u', '-', $string);
    // Only allow one dash separator at a time (and make string lowercase)
    return mb_strtolower(preg_replace('/--+/u', '-', $string), 'UTF-8');
}
?>
@gsking

gsking commented Dec 10, 2011

Thank you for the excellent tip, Cory! I just tried this and it looks like the zhtmlbody is now encrypted. Any advice on how to decrypt it?

GSK

@bwanaaa

bwanaaa commented Jun 29, 2013

Yes, I'd like to know this as well. The stuff in the body of the HTML is gobbledy gook. Base64?

@jgilmour

It looks like it's encrypting it in base64, using a combination of your password for the account, the string "String" and the string "Class". I'm not too great at decompiling, but the LoginTask.class files and SecurityUtilities.class shed some light on it.

@hesyar

hesyar commented Dec 2, 2014

Does this trick still work?

@hellboy81

The same question. Can I download books with a Safari Free Trial account?

@dreambit

Did anyone figure out how to decode the html files?

@chenditc

chenditc commented Jun 8, 2016

You can try this chrome extension: https://github.com/chenditc/safari-download

@dreambit

chenditc, thx, it's a nice alternative.

@ricardojimmy254

Bro, this is gonna be really tricky, but can you use this for the "Safari" that is provided by universities? Like I'm sure you might be aware that a good majority of colleges and universities these days in the U.S. and Canada have a safari package given free to students but it only has e-books and just like the normal company site you can save em to desktop or pdf ;(

@markojak

Anyone know if this still works? I really want to be able to view my safari online books on multiple devices, and their format is quite annoying.
