@jone
Created March 14, 2014 10:46
Tika tracebacks for password-protected files.
Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@7c6572b
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:141)
    at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:417)
    at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:111)
Caused by: java.lang.NullPointerException
    at org.apache.poi.poifs.crypt.Decryptor.hashPassword(Decryptor.java:102)
    at org.apache.poi.poifs.crypt.AgileDecryptor.verifyPassword(AgileDecryptor.java:66)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:227)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:161)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    ... 5 more

Exception in thread "main" org.apache.tika.exception.TikaException: Unable to extract PDF content
    at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:88)
    at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:154)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:141)
    at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:417)
    at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:111)
Caused by: org.apache.pdfbox.exceptions.WrappedIOException: Error decrypting document, details:
    at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:327)
    at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:72)
    ... 7 more
Caused by: org.apache.pdfbox.exceptions.CryptographyException: Error: The supplied password does not match either the owner or user password in the document.
    at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.prepareForDecryption(StandardSecurityHandler.java:262)
    at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.decryptDocument(StandardSecurityHandler.java:154)
    at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1504)
    at org.apache.pdfbox.pdmodel.PDDocument.decrypt(PDDocument.java:914)
    at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:323)
    ... 8 more
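
Both traces above surface as a generic TikaException, and the password problem only becomes visible further down the cause chain: a NullPointerException thrown from org.apache.poi.poifs.crypt in the Office case, and a CryptographyException mentioning the password in the PDF case. A caller that wants to tell these failures apart from other parse errors could walk the cause chain, roughly as sketched below. The class name EncryptedFailureCheck and the method looksPasswordRelated are hypothetical, and the matching rules are heuristics inferred from these two traces only; they are not part of Tika, POI, or PDFBox.

```java
import java.util.Locale;

// Hypothetical helper, not part of Tika: guesses whether a parse failure
// was caused by a password-protected document, based on the two traces above.
final class EncryptedFailureCheck {

    // Upper bound on chain length, so a cyclic cause chain cannot loop forever.
    private static final int MAX_DEPTH = 32;

    private EncryptedFailureCheck() {}

    static boolean looksPasswordRelated(Throwable failure) {
        int depth = 0;
        for (Throwable c = failure; c != null && depth < MAX_DEPTH; c = c.getCause(), depth++) {
            // PDF case: PDFBox reports a CryptographyException (assumed marker).
            if (c.getClass().getSimpleName().equals("CryptographyException")) {
                return true;
            }
            // Any cause whose message mentions a password.
            String message = c.getMessage();
            if (message != null && message.toLowerCase(Locale.ROOT).contains("password")) {
                return true;
            }
            // Office case: the NullPointerException has no message, but its
            // stack frames sit in POI's crypt package.
            for (StackTraceElement frame : c.getStackTrace()) {
                if (frame.getClassName().startsWith("org.apache.poi.poifs.crypt.")) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

A caller might wrap the parse call in a try/catch for the exception it throws and pass the caught exception to EncryptedFailureCheck.looksPasswordRelated, logging "password protected, skipped" instead of a full traceback when it returns true. String matching of this kind is brittle across library versions, so it is best treated as a stopgap.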