Update in 2024
So in 2024 I'm actually having a good time using mupdf.
About my experience with mupdf:
It's written in C, but you can use it in many ways: command line, Python lib (pymupdf), JS, and others.
I still haven't run into the issues I had with other tools on weird, broken PDFs, so I'd say it's pretty good.
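To give an idea of what the Python route looks like, here's a minimal sketch that writes one SVG per page with pymupdf. The function names (`svg_path_for`, `pdf_to_svgs`) and the output naming scheme are my own assumptions for illustration, not something mupdf prescribes, and it assumes pymupdf is installed (`pip install pymupdf`).

```python
from pathlib import Path


def svg_path_for(pdf_path, page_number, out_dir):
    """Hypothetical naming scheme: <pdf name>-<1-based page, zero padded>.svg"""
    return Path(out_dir) / f"{Path(pdf_path).stem}-{page_number + 1:04d}.svg"


def pdf_to_svgs(pdf_path, out_dir):
    """Write every page of pdf_path as an SVG file and return the paths."""
    import fitz  # PyMuPDF; imported here so the helper above works without it

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            target = svg_path_for(pdf_path, page.number, out_dir)
            # get_svg_image() returns the page as an SVG string
            target.write_text(page.get_svg_image(), encoding="utf-8")
            written.append(target)
    return written
```

Something like `pdf_to_svgs("input.pdf", "out")` should then leave `out/input-0001.svg`, `out/input-0002.svg` and so on.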
Update: as pointed out in the comments, it seems jpdf2html5 is now BuildVu: https://www.idrsolutions.com/buildvu/
(Maybe in the future I will explain better what happens when you convert your document to PDF. For now, just keep the originals safe.)
Depending on the PDFs, you could just extract the text with simple tools like pdftotext, which comes with Poppler Tools.
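If you want to script the text extraction, a rough sketch could look like this. It assumes `pdftotext` from Poppler is on your PATH; the function names are made up for the example. The `-layout` flag asks pdftotext to keep the physical layout, and `-` as the output file sends the text to stdout.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path, keep_layout=True):
    """Build the pdftotext command line; '-' means 'write text to stdout'."""
    cmd = ["pdftotext"]
    if keep_layout:
        cmd.append("-layout")
    cmd += [str(pdf_path), "-"]
    return cmd


def extract_text(pdf_path, keep_layout=True):
    """Run pdftotext and return the extracted text as a string."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install Poppler first")
    result = subprocess.run(
        pdftotext_command(pdf_path, keep_layout),
        check=True,
        capture_output=True,
    )
    # Broken PDFs sometimes produce odd bytes, so decode defensively
    return result.stdout.decode("utf-8", errors="replace")
```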
Want to convert a simple page? You could simply load it into some vector editing software that accepts PDF as input and export it as SVG. Or maybe use an online converter. For a single document that's not too large, you could probably choose this option as well.
If you want to make software that needs to process lots of PDFs, then you want something else.
I've tested a lot of software and I won't remember everything. I should have thought of writing something down back when I spent days looking for the perfect converter.
If your PDF is simple, not malformed, clean, and was created by trusted software, you could try simply using something like Inkscape. It can convert from the command line, and automating the work is easy. Some people claim it's OK; for me it didn't work. But that's because... weird-malformed-complex-pdfs...
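For the Inkscape route, automating it could be sketched roughly like this. I'm assuming Inkscape 1.x here, where `--export-filename` picks the output format from the file extension (older 0.9x releases used different flags), and the function names are just for illustration. It only handles the page Inkscape imports by default.

```python
import subprocess
from pathlib import Path


def inkscape_command(pdf_path, svg_path):
    """Command line for Inkscape 1.x (assumption): PDF in, SVG out."""
    return ["inkscape", f"--export-filename={svg_path}", str(pdf_path)]


def convert_folder(src_dir, dst_dir):
    """Convert every *.pdf in src_dir; return the PDFs that failed."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    failed = []
    for pdf in sorted(Path(src_dir).glob("*.pdf")):
        svg = dst / (pdf.stem + ".svg")
        result = subprocess.run(inkscape_command(pdf, svg), capture_output=True)
        # Treat a non-zero exit code or a missing output file as a failure
        if result.returncode != 0 or not svg.exists():
            failed.append(pdf)
    return failed
```

Keeping the list of failures around is handy, because those are usually the files you end up throwing at another tool.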
There are other tools that work better when you can't choose how your PDFs were created: PDFs with lots of tables, graphic components, embedded fonts, missing fonts and so on.
It's worth mentioning that if you have malformed files, missing fonts, and other defects, you could try to correct those errors with the pdftocairo tool. cpdf, from Coherent PDF tools, is great too.
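One way to try that cleanup step is to let pdftocairo rewrite the file as a fresh PDF before converting it. This is only a sketch: it assumes `pdftocairo` (Poppler) is on your PATH, the function names and the `-fixed` suffix are my own invention, and a rewrite is not guaranteed to fix every defect.

```python
import subprocess
from pathlib import Path


def repair_command(pdf_path, repaired_path):
    """pdftocairo with -pdf re-renders the input into a new PDF file."""
    return ["pdftocairo", "-pdf", str(pdf_path), str(repaired_path)]


def repair_pdf(pdf_path, out_dir):
    """Return the path of the rewritten PDF, or None if pdftocairo failed."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    repaired = out_dir / (Path(pdf_path).stem + "-fixed.pdf")
    result = subprocess.run(repair_command(pdf_path, repaired), capture_output=True)
    if result.returncode != 0 or not repaired.exists():
        return None
    return repaired
```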
- jpdf2html5 Despite the name, it does conversion to SVG too.
- I've tested it and it does a very good job. I kind of don't even need to optimize the SVG files after the conversion.
- But ... Java. It's command line, but it's a .jar, so if you were looking for a beautiful compiled binary, that's not the case.
- The other thing: it's expensive. I know I said it was great, but dude, conversion to SVG is usually just one of many steps you'll take to create a final product. Do you know what I mean? You'll need other software and will have other costs. Anyway, if you think their price is good for you, that's OK.
- pdf2svg I know, not a big project, not a lot of contributors.
- It just converts! xD
- I work with shitty PDFs and only a handful failed to convert, basically PDFs with copy protection that "scrambles words".
- Files are big, but optimizing SVGs is not a very difficult job, check this out.
- If you want to build it, and know your way around Docker, take a look at my Dockerfiles.
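For the "lots of PDFs" case, a small batch wrapper around pdf2svg could be sketched like this. It assumes the `pdf2svg` binary is on your PATH; the function names and the worker count are arbitrary choices of mine. Passing `all` as the page argument with `%d` in the output name makes pdf2svg write one SVG per page.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def pdf2svg_command(pdf_path, out_dir):
    """pdf2svg <in.pdf> <out-%d.svg> all  ->  one SVG file per page."""
    pattern = Path(out_dir) / f"{Path(pdf_path).stem}-%d.svg"
    return ["pdf2svg", str(pdf_path), str(pattern), "all"]


def convert_one(pdf_path, out_dir):
    """Convert a single PDF; return (path, True/False for success)."""
    result = subprocess.run(pdf2svg_command(pdf_path, out_dir), capture_output=True)
    return pdf_path, result.returncode == 0


def convert_many(pdf_paths, out_dir, workers=4):
    """Convert PDFs in parallel and return the ones that failed."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # Threads are enough here: the real work happens in the pdf2svg processes
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda p: convert_one(p, out_dir), pdf_paths))
    return [path for path, ok in results if not ok]
```

The returned list of failures is a good place to start when you hunt for the copy-protected or otherwise stubborn files.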
Thanks douglas for the epic documentation right there
It's truly wonderful how google can find these gists and point users to it
very useful