@ssddanbrown
Last active October 14, 2024 17:02
BookStack-Export-Books

Requirements

You will need PHP (~7.1+) installed on the machine you want to run this script on. You will also need BookStack API credentials (TOKEN_ID & TOKEN_SECRET) at the ready.

Note

This has quickly been thrown together and is not an officially supported part of BookStack.

Running

# Downloading the script
curl https://gist.githubusercontent.com/ssddanbrown/45acb913a7b873240b2d89781e74a7a4/raw/export-books.php > export-books.php

# Setup
# ALTERNATIVELY: Open the script and edit the variables at the top.
export BS_URL=https://bookstack.example.com # Set to be your BookStack base URL
export BS_TOKEN_ID=abc123 # Set to be your API token_id
export BS_TOKEN_SECRET=123abc # Set to be your API token_secret

# Running the script
php export-books.php <format> <output_dir>

# Examples

## Export as plaintext to an existing "out" directory
php export-books.php plaintext ./out

## Export as pdf to the current directory
php export-books.php pdf ./

## Export as HTML to an existing "html" directory
php export-books.php html ./html
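
The examples above expect the output directory to exist and the `BS_*` variables to be exported already. As a rough sketch of a pre-flight check, assuming the environment-variable route rather than hard-coded values in the script, a small wrapper could look like the following. The function names `missing_bs_vars` and `safe_export` are hypothetical and not part of the gist.

```shell
# Hypothetical pre-flight helpers (not part of the gist).

# Print the names of required BS_* variables that are unset or empty.
missing_bs_vars() {
    missing=""
    for name in BS_URL BS_TOKEN_ID BS_TOKEN_SECRET; do
        # Indirect lookup of the variable named in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    # Strip the leading space before printing.
    printf '%s\n' "${missing# }"
}

# Create the output directory if needed, then run the export
# only when all three variables are present.
safe_export() {
    format=$1
    out_dir=$2
    missing=$(missing_bs_vars)
    if [ -n "$missing" ]; then
        echo "Missing environment variables: $missing" >&2
        return 1
    fi
    mkdir -p "$out_dir" || return 1
    php export-books.php "$format" "$out_dir"
}
```

With these defined in the current shell, `safe_export html ./html` would stand in for the last example. The script resolves the output directory with `realpath()`, which fails for a directory that does not exist, so creating it first avoids that case.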
#!/usr/bin/env php
<?php
// API Credentials
// You can either provide them as environment variables
// or hard-code them in the empty strings below.
$apiUrl = getenv('BS_URL') ?: ''; // http://bookstack.local/
$clientId = getenv('BS_TOKEN_ID') ?: '';
$clientSecret = getenv('BS_TOKEN_SECRET') ?: '';
// Export Format & Location
// Can be provided as arguments when calling the script
// or be hard-coded as strings below.
$exportFormat = $argv[1] ?? 'pdf';
$exportLocation = $argv[2] ?? './';
// Script logic
////////////////
$books = getAllBooks();
$outDir = realpath($exportLocation);
$extensionByFormat = [
    'pdf' => 'pdf',
    'html' => 'html',
    'plaintext' => 'txt',
];
foreach ($books as $book) {
    $id = $book['id'];
    $extension = $extensionByFormat[$exportFormat] ?? $exportFormat;
    $content = apiGet("api/books/{$id}/export/{$exportFormat}");
    $outPath = $outDir . "/{$book['slug']}.{$extension}";
    file_put_contents($outPath, $content);
}
/**
 * Get all books from the system API.
 */
function getAllBooks() {
    $count = 100;
    $offset = 0;
    $total = 0;
    $allBooks = [];
    do {
        $endpoint = 'api/books?' . http_build_query(['count' => $count, 'offset' => $offset]);
        $resp = apiGetJson($endpoint);
        // Only set total on first request, due to API bug:
        // https://github.com/BookStackApp/BookStack/issues/2043
        if ($offset == 0) {
            $total = $resp['total'] ?? 0;
        }
        $newBooks = $resp['data'] ?? [];
        array_push($allBooks, ...$newBooks);
        $offset += $count;
    } while ($offset < $total);
    return $allBooks;
}
/**
 * Make a simple GET HTTP request to the API.
 */
function apiGet(string $endpoint): string {
    global $apiUrl, $clientId, $clientSecret;
    $url = rtrim($apiUrl, '/') . '/' . ltrim($endpoint, '/');
    $opts = ['http' => ['header' => "Authorization: Token {$clientId}:{$clientSecret}"]];
    $context = stream_context_create($opts);
    return file_get_contents($url, false, $context);
}
/**
 * Make a simple GET HTTP request to the API &
 * decode the JSON response to an array.
 */
function apiGetJson(string $endpoint): array {
    $data = apiGet($endpoint);
    return json_decode($data, true);
}
/**
 * DEBUG: Dump out the given variables and exit.
 */
function dd(...$args) {
    foreach ($args as $arg) {
        var_dump($arg);
    }
    exit(1);
}
@Karl-Qlogic

I couldn't figure out how to use PHP; I kept getting errors. But with the help of ChatGPT I managed to get this PowerShell script, which downloads all books as HTML. Unfortunately, I wasn't able to encode it in a way where characters like "ÅÄÖ" are displayed correctly.

# Define the BookStack API URL
$apiUrl = "XXXX"

# Define the API credentials
$clientId = "XXXX"
$clientSecret = "XXX"

# Define the export format and location
$exportFormat = "html"
$exportLocation = "C:\BookstackBackup"

# Get all books from the BookStack API
function Get-AllBooks {
    $count = 100
    $offset = 0
    $allBooks = @()

    do {
        $endpoint = "api/books?count=$count&offset=$offset"
        $resp = (Invoke-WebRequest -Uri "$apiUrl/$endpoint" -Headers @{Authorization = "Token $($clientId):$($clientSecret)" }).Content | ConvertFrom-Json


        $newBooks = $resp.data
        $allBooks += $newBooks
        $offset += $count
    } while ($offset -lt $resp.total)

    return $allBooks
}


# Get all books from the BookStack API
$allBooks = Get-AllBooks

# Loop through each book and export it
foreach ($book in $allBooks) {
    # Define the book ID and file extension
    $id = $book.id
    $extension = "html"

    # Get the book content from the API
    $content = [System.Text.Encoding]::GetEncoding(28591).GetString((Invoke-WebRequest -Uri "$apiUrl/api/books/$id/export/$exportFormat" -Headers @{Authorization = "Token $($clientId):$($clientSecret)" }).Content)


    # Create the output file path
    $outPath = "$exportLocation\$($book.slug).$extension"

    # Write the book content to the output file
    Set-Content -Path $outPath -Value $content
}


@cdrfun

cdrfun commented Jan 2, 2024

Thanks for the PowerShell script, @Karl-Qlogic; it was a good template for my use case. Your issue can be solved by using UTF-8 both when getting the content and when saving it:

$content = [System.Text.Encoding]::UTF8.GetString((Invoke-WebRequest -Uri "$apiUrl/api/books/$id/export/$exportFormat" -TimeoutSec 60 -Headers @{Authorization = "Token $($clientId):$($clientSecret)" }).Content)
Set-Content -Path $outPath -Value $content -Encoding UTF8

@FaySmash

This is great and should be natively integrated into bookstack!

@NiklasPieper

Hello everyone,

A few months ago, I set up this method for our BookStack server, and it worked perfectly until recently.

A few days ago, I switched our BookStack server to HTTPS, using a self-signed certificate since we only need internal access.

I have already updated the BS_URL in the script accordingly and added our certificate to the trusted certificates on my system. However, when I run the script, I get the following error:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2409  100  2409    0     0    158      0  0:00:15  0:00:15 --:--:--   580
PHP Warning: Module "curl" is already loaded in Unknown on line 0
PHP Warning: file_get_contents(): SSL operation failed with code 1. OpenSSL Error messages: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed in /home/adminl/Export_BookStack/export-books.php on line 72
PHP Warning: file_get_contents(): Failed to enable crypto in /home/adminl/Export_BookStack/export-books.php on line 72
PHP Warning: file_get_contents(https://bookstack.mvg-online.de/api/books?count=100&offset=0): Failed to open stream: operation failed in /home/adminl/Export_BookStack/export-books.php on line 72
PHP Fatal error: Uncaught TypeError: apiGetJson(): Return value must be of type array, null returned in /home/adminl/Export_BookStack/export-books.php:81
Stack trace:
#0 /home/adminl/Export_BookStack/export-books.php(48): apiGetJson()
#1 /home/adminl/Export_BookStack/export-books.php(20): getAllBooks()
#2 {main}
  thrown in /home/adminl/Export_BookStack/export-books.php on line 81

Can anyone explain what adjustments I need to make to my BookStack server to get this function working again?
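
One possible direction, offered as a sketch rather than a confirmed fix: the "certificate verify failed" warning suggests that PHP's OpenSSL layer does not trust the self-signed certificate, which can happen even when the system trust store does. PHP has an `openssl.cafile` ini directive that names the CA file used for peer verification, and `php -d` sets an ini value for a single run. The wrapper name `export_with_ca` and the certificate path in the usage note are hypothetical.

```shell
# Hypothetical wrapper (not part of the gist): run the export with an
# explicit CA file so PHP can verify a self-signed server certificate.
# Usage: export_with_ca <ca_file> <format> <output_dir>
export_with_ca() {
    ca_file=$1
    format=$2
    out_dir=$3
    # Fail early if the certificate file cannot be read.
    if [ ! -r "$ca_file" ]; then
        echo "CA file not readable: $ca_file" >&2
        return 1
    fi
    # -d sets the openssl.cafile ini directive for this run only.
    php -d openssl.cafile="$ca_file" export-books.php "$format" "$out_dir"
}
```

For example, `export_with_ca /path/to/bookstack-cert.pem pdf ./out`, where the PEM file is the server's self-signed certificate. This assumes the certificate's subject name matches the host in `BS_URL`; a hostname mismatch would still fail verification.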
