@umidjons
Last active December 6, 2023 21:41
Creating PDF thumbnails in PHP

Install Ghostscript

Download and install the version of Ghostscript that matches your PHP architecture. In my case PHP was an x86 build, so I downloaded Ghostscript 9.14 for Windows (32 bit).
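
One way to see which build you are running is to ask PHP itself. This is a minimal sketch, assuming PHP 7 or later (64-bit Windows builds of PHP 5 still used 32-bit integers, so the check is unreliable there). `phpArchitecture` is a hypothetical helper name, not part of PHP.

```php
<?php
// Hypothetical helper: reports whether this PHP build uses 32-bit or 64-bit integers.
// Assumption: PHP 7+, where PHP_INT_SIZE is 8 on 64-bit builds and 4 on 32-bit builds.
function phpArchitecture()
{
    return PHP_INT_SIZE === 8 ? 'x64' : 'x86';
}

echo 'PHP ' . PHP_VERSION . ' (' . phpArchitecture() . ')' . PHP_EOL;
```

The output of `phpinfo()` also lists the architecture on Windows builds.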

Enable ImageMagick

Check that the imagick extension is available and loaded.

This line should be present in your php.ini:

extension=php_imagick.dll

Also check that php_imagick.dll exists in PHP's ext directory.
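
The same check can be scripted. The sketch below reports whether the extension is loaded and whether the underlying ImageMagick build lists PDF among its formats. `imagickPdfStatus` is a hypothetical helper name. A "yes" here does not prove that Ghostscript is installed or that a security policy permits PDF reading, only that the format is known.

```php
<?php
// Hypothetical helper: summarises whether Imagick is loaded and knows the PDF format.
function imagickPdfStatus()
{
    $loaded = extension_loaded('imagick');
    // Imagick::queryFormats() lists the formats this ImageMagick build knows about.
    // The && short-circuits, so Imagick is only touched when the extension is loaded.
    $pdf = $loaded && in_array('PDF', Imagick::queryFormats('PDF'), true);
    return ['loaded' => $loaded, 'pdf' => $pdf];
}

$status = imagickPdfStatus();
echo 'imagick loaded: ' . ($status['loaded'] ? 'yes' : 'no') . PHP_EOL;
echo 'PDF format known: ' . ($status['pdf'] ? 'yes' : 'no') . PHP_EOL;
```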

PHP function

<?php
	function genPdfThumbnail($source, $target)
	{
		//$source = realpath($source);
		$target = dirname($source).DIRECTORY_SEPARATOR.$target;
		$im     = new Imagick($source."[0]"); // 0-first page, 1-second page
		$im->setImageColorspace(255); // prevent image colors from inverting
		$im->setImageFormat("jpeg");
		$im->thumbnailImage(160, 120); // exact width and height in pixels
		$im->writeImage($target);
		$im->clear();
		$im->destroy();
	}

Call that function:

<?php
genPdfThumbnail('/uploads/my.pdf','my.jpg'); // generates /uploads/my.jpg
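
Note that a fixed 160x120 target stretches any page that is not 4:3. Imagick's thumbnailImage() takes an optional third `bestfit` argument that keeps the aspect ratio instead. The sketch below shows the arithmetic behind a best-fit resize. `fitWithin` is a hypothetical helper, and its rounding may differ by a pixel from what ImageMagick produces.

```php
<?php
// Hypothetical helper: largest width/height that fits inside a box while
// keeping the source aspect ratio.
function fitWithin($srcW, $srcH, $maxW, $maxH)
{
    $scale = min($maxW / $srcW, $maxH / $srcH);
    return [
        max(1, (int) round($srcW * $scale)),
        max(1, (int) round($srcH * $scale)),
    ];
}

// A portrait A4 page rendered at 72 dpi is roughly 595x842 pixels.
list($w, $h) = fitWithin(595, 842, 160, 120);
echo "{$w}x{$h}" . PHP_EOL; // 85x120
```

With Imagick itself the equivalent call would be `$im->thumbnailImage(160, 120, true);`.
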
@lsiden

lsiden commented Oct 17, 2016

This is very sweet. Thank you. Nevertheless, if the original PDF file is large (I have one around 36 MB), this method gets excruciatingly slow. I think the problem is that ImageMagick still parses the entire PDF, even if you tell it that you want only the first page. Fortunately, not all the files on our server are that large. I'm wondering if there is a way to tell ImageMagick to stop reading after it has found the first page (or whatever page you request).

@chriswheeler

chriswheeler commented Jan 8, 2018

@lsiden - Did you find a way to do this? I've just come across the same issue with a 668 page PDF killing my image preview script...

@PhilipClarke

Use PHP's external program execution functions to call pdftoppm, which can select pages and so avoids the whole "loading everything into memory" problem. It is in the poppler-utils package on Ubuntu.

example:

pdftoppm -l 1 -scale-to 150 -jpeg example.pdf > thumb.jpg

This translates to: stop at page 1 (-l sets the last page to convert; -f can set the first page), scale the longest side to 150 pixels, and output a JPEG.

See https://linux.die.net/man/1/pdftoppm for the options. If Windows is required, the Poppler utilities are listed as containing pdftocairo, which uses the same arguments and has some Windows-specific printing options.
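
Calling pdftoppm from PHP could look roughly like the sketch below. It assumes pdftoppm is on the PATH and that exec() is allowed, which is often not the case on shared hosting. Both function names are hypothetical. Instead of redirecting stdout, it uses the -singlefile option so that pdftoppm writes the file itself as "<outRoot>.jpg".

```php
<?php
// Hypothetical helper: builds a pdftoppm command that renders one page as a JPEG.
// $outRoot is the output path without extension.
function buildPdftoppmCommand($pdf, $outRoot, $page = 1, $size = 150)
{
    $page = max(1, (int) $page); // pdftoppm page numbers start at 1
    $size = max(1, (int) $size); // longest side in pixels
    return sprintf(
        'pdftoppm -f %d -l %d -scale-to %d -jpeg -singlefile %s %s',
        $page,
        $page,
        $size,
        escapeshellarg($pdf),    // quote paths so spaces and shell characters are safe
        escapeshellarg($outRoot)
    );
}

// Hypothetical helper: runs the command and returns the thumbnail path, or false.
function pdftoppmThumbnail($pdf, $outRoot, $page = 1, $size = 150)
{
    exec(buildPdftoppmCommand($pdf, $outRoot, $page, $size), $output, $exitCode);
    $target = $outRoot . '.jpg';
    return ($exitCode === 0 && is_file($target)) ? $target : false;
}

echo buildPdftoppmCommand('example.pdf', 'thumb') . PHP_EOL;
```

Keeping the command builder separate from the exec() call makes it easy to log or inspect the exact command when a conversion fails.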

@2x2p

2x2p commented Apr 9, 2019

2019-04-09 adaptation from https://gist.github.com/umidjons/11037635

added error capture: returns FALSE when the source is not a PDF or not a file path, otherwise returns the new file path.
added optional size and page number; defaults to 256 pixels, taken from the first page of the PDF.
added page orientation handling: the longer side gets the given size and the shorter side is scaled in proportion.
added the option of putting the output JPEG into a subfolder, creating the directory if it does not already exist.
added a correction for black zones where the PDF has transparency.

to call my version of the function, use it for example like this:

$pdf_thumb = genPdfThumbnail('/nfs/vsp/servername/u/username/public_html/any.pdf','thumbs/any.jpg');

to get page 2 as a 150 pixel JPEG, call it as below.
however, if the PDF has only one page, errors will occur.

$pdf_thumb = genPdfThumbnail('/nfs/vsp/servername/u/username/public_html/any.pdf','any_p02.pdf.jpeg',150,'02');

to see the result of the function afterwards, use this:

if($pdf_thumb){ echo $pdf_thumb; }else{ echo 'pdf source error'; }

the function itself:

<?php
function genPdfThumbnail($source, $target, $size = 256, $page = 1)
{
	// The source path must exist and must not be a directory.
	if (!file_exists($source) || is_dir($source)) {
		return false;
	}
	// The source must be detected as a PDF file.
	if (mime_content_type($source) !== 'application/pdf') {
		return false;
	}

	$target = dirname($source) . '/' . $target; // '/' as separator for NFS on Linux
	$size   = intval($size);                    // longer side in pixels, default 256
	$page   = max(0, intval($page) - 1);        // page 1 is index 0; no negative values

	$img = new Imagick($source . "[$page]");    // [0] = first page, [1] = second page

	$imH = max(1, $img->getImageHeight());      // use 1 if the page reports no height
	$imW = max(1, $img->getImageWidth());       // use 1 if the page reports no width

	// Pixel length of the shorter side, keeping the page's aspect ratio.
	$sizR = (int) round($size * min($imW, $imH) / max($imW, $imH));

	$img->setImageColorspace(255);              // prevent image colors from inverting
	$img->setImageBackgroundColor('white');     // background for transparent areas
	// flattenImages() is deprecated in newer Imagick releases; this call replaces it.
	$img = $img->mergeImageLayers(Imagick::LAYERMETHOD_FLATTEN); // prevents black zones
	$img->setImageFormat('jpeg');

	if ($imH <= $imW) {
		$img->thumbnailImage($size, $sizR);     // landscape or square page
	} else {
		$img->thumbnailImage($sizR, $size);     // portrait page
	}

	if (!is_dir(dirname($target))) {
		mkdir(dirname($target), 0777, true);    // create the target directory if missing
	}

	$img->writeImage($target);
	$img->clear();
	$img->destroy();

	// Return the path to the new file, or false if Imagick did not create it.
	return file_exists($target) ? $target : false;
}
?>
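
The PDF check in this function relies on mime_content_type(), which comes from the fileinfo extension and is not enabled on every host. As a fallback, the file signature can be checked directly. This is a rough sketch that assumes the file begins with the `%PDF-` marker; some otherwise valid PDFs carry a few bytes before it, and this check would reject them. `looksLikePdf` is a hypothetical name.

```php
<?php
// Hypothetical helper: true when the file starts with the PDF signature "%PDF-".
function looksLikePdf($path)
{
    if (!is_file($path) || !is_readable($path)) {
        return false;
    }
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }
    $header = fread($handle, 5); // only the first five bytes are needed
    fclose($handle);
    return $header === '%PDF-';
}

// Demonstration with a temporary file that only carries the signature.
$tmp = tempnam(sys_get_temp_dir(), 'pdf');
file_put_contents($tmp, "%PDF-1.4\n");
var_dump(looksLikePdf($tmp)); // bool(true)
unlink($tmp);
```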

@2x2p

2x2p commented Apr 9, 2019

Two questions about my own approach above:

A. Is it technically possible for a provided PDF to contain zero pages? If so, how can we test for this?
B. Can the image obtained from the PDF have a width or height of 0? That would cause a division-by-zero (NaN) error in the ratio calculation.
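
On question A, one possible guard is to count the pages before rendering and clamp the requested page to that range. This is a sketch, assuming the imagick extension. pingImage() reads image metadata without loading the pixel data, although for PDFs it probably still goes through Ghostscript and may be slow on large files. `pdfPageCount` and `pageIndex` are hypothetical helper names.

```php
<?php
// Hypothetical helper: number of pages in a PDF, or null if it cannot be read.
function pdfPageCount($path)
{
    if (!class_exists('Imagick')) {
        return null; // imagick extension not loaded
    }
    try {
        $im = new Imagick();
        $im->pingImage($path);           // metadata only, no pixel data
        $count = $im->getNumberImages(); // Imagick exposes each PDF page as one image
        $im->clear();
        return $count;
    } catch (ImagickException $e) {
        return null; // missing file, blocked by policy, or not a readable PDF
    }
}

// Hypothetical helper: converts a 1-based page number into a valid 0-based index,
// or returns null when the document has no pages (or the count is unknown).
function pageIndex($page, $pageCount)
{
    if ($pageCount < 1) {
        return null;
    }
    return min(max((int) $page, 1), $pageCount) - 1;
}

var_dump(pageIndex(2, 1)); // int(0): page 2 was requested but only one page exists
```

A caller could then treat a null from either helper the same way as the existing "pdf source error" case.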

@sanzhardanybayev

You're the best! Exactly what I need! I was in a tough situation where I could only use shared hosting. I can't express how thankful I am!

@JeRoNZ

JeRoNZ commented Sep 15, 2020

Thanks for this - proved very useful.

@inglesuniversal

NOTES:

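Ghostscript is used to extract the image from the PDF. On some systems the ImageMagick security policy blocks PDF conversion:

https://stackoverflow.com/questions/52998331/imagemagick-security-policy-pdf-blocking-conversion

First check that Ghostscript is installed and up to date (the linked answer asks for version 9.24 or newer):

$ gs --version

If it is, remove the whole following section from the policy file: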

$ sudo vim /etc/ImageMagick-6/policy.xml

<--- REMOVE THESE LINES --->
<policy domain="coder" rights="none" pattern="PS" />
<policy domain="coder" rights="none" pattern="PS2" />
<policy domain="coder" rights="none" pattern="PS3" />
<policy domain="coder" rights="none" pattern="EPS" />
<policy domain="coder" rights="none" pattern="PDF" />
<policy domain="coder" rights="none" pattern="XPS" />


$ sudo reboot 
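
Before deleting anything, it can help to confirm which coders the policy actually blocks. The sketch below scans the text of a policy file for coder entries with rights="none". `blockedCoders` is a hypothetical helper, the regular expressions assume the simple `<policy ... />` form shown above, and the path is the one used in this comment.

```php
<?php
// Hypothetical helper: returns the pattern of every coder policy set to rights="none".
function blockedCoders($policyXml)
{
    $blocked = [];
    // Drop XML comments so that commented-out rules are not reported.
    $policyXml = preg_replace('/<!--.*?-->/s', '', $policyXml);
    preg_match_all('/<policy\b[^>]*>/i', $policyXml, $elements);
    foreach ($elements[0] as $element) {
        // Collect the attributes of this element regardless of their order.
        preg_match_all('/([\w-]+)\s*=\s*"([^"]*)"/', $element, $pairs, PREG_SET_ORDER);
        $attrs = [];
        foreach ($pairs as $pair) {
            $attrs[$pair[1]] = $pair[2];
        }
        if (($attrs['domain'] ?? '') === 'coder'
            && ($attrs['rights'] ?? '') === 'none'
            && isset($attrs['pattern'])) {
            $blocked[] = $attrs['pattern'];
        }
    }
    return $blocked;
}

$path = '/etc/ImageMagick-6/policy.xml';
if (is_readable($path)) {
    echo implode(', ', blockedCoders(file_get_contents($path))) . PHP_EOL;
}
```

A narrower change that is often suggested is to set rights to read|write for the PDF pattern only, rather than removing every line.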

@MostafaAttia

MostafaAttia commented Dec 3, 2021

Why not accomplish that directly with ImageMagick?

<?php

$im = new imagick('file.pdf[0]');

$im->setImageFormat('jpg');

header('Content-Type: image/jpeg');

echo $im;

?>

@inglesuniversal

@MostafaAttia let's give it a try...

@PEM-FR

PEM-FR commented Mar 21, 2022

Last famous words... :)
