@literalplus
Last active January 21, 2024 02:25
% ref: https://www.mathstat.dal.ca/~selinger/pdfa/
% inspired by: https://github.com/op3/latex-pdfa-howto/tree/master
% inspired by: https://gitlab.com/ThomasAUZINGER/vutinfth/-/merge_requests/3/diffs
% VALIDATION TOOL TO CHECK - https://verapdf.org/software/
% Adding graphics, fonts, etc. may break PDF/A, please re-check before submitting
% PACKAGE DEPENDENCIES
\usepackage{hyperxmp} % Write hyperref props to XMP instead of needing separate file / weird workarounds as with pdfx
\usepackage{colorprofiles} % Provides the ICC colour profiles that are embedded below as raw PDF streams
% Provide RGB colour profile; This fixes the warning about DeviceRGB
% https://github.com/op3/latex-pdfa-howto/blob/master/pdfacode.tex
\immediate\pdfobj stream attr{/N 3^^J/Alternate/DeviceRGB} file{sRGB.icc}
\pdfcatalog{
/PageMode /UseNone
/OutputIntents [
<<
/Type /OutputIntent
/S /GTS_PDFA1
/DestOutputProfile \the\pdflastobj\space 0 R
/OutputConditionIdentifier (sRGB IEC61966-2.1 (Equivalent to www.srgb.com 1998 HP profile))
/Info(sRGB IEC61966-2.1 (Equivalent to www.srgb.com 1998 HP profile))
/RegistryName (http://www.color.org/)
>>
]
}%
% https://tex.stackexchange.com/questions/576/how-to-generate-pdf-a-and-pdf-x/349521#349521
\pdfobjcompresslevel=0 % compression can cause issues with PDF/A
% If positive, \pdfinclusioncopyfonts forces pdfTeX to include fonts from a PDF file loaded with \pdfximage, even if those fonts are available on disk.
% Bigger files might be created, but included PDF files are sure to be embedded with the adequate fonts; the fonts on disk might differ from the embedded ones, and glyphs might be missing.
\pdfinclusioncopyfonts=1
% Copy & paste from the PDF with correct Unicode ... no more dots before the u
\input{glyphtounicode.tex} % Doesn't seem to be necessary any more in recent LaTeX versions, but shouldn't hurt https://github.com/latex3/latex2e/issues/465
\input{glyphtounicode-cmr.tex} % ^ same
\pdfgentounicode=1 % ^ same
%\pdfinterwordspaceon % DO NOT ADD THIS - CAUSES "For every font embedded in a conforming file and used for rendering, the glyph width information in the font dictionary and in the embedded font program shall be consistent." - This *should* maybe work out of the box in recent LaTeX versions, but it's not clearly documented anywhere.
% insert CMYK colour profile
% https://tex.stackexchange.com/questions/227429/pdf-a-with-cmyk-how
% https://tex.stackexchange.com/questions/464079/pdfx-cmyk-color-profile-in-pdf-a
% " I just realised that the name of the profile file given in the pdfx doc is actually wrong. The file installed by package colorprofiles is FOGRA39L_coated.icc, not coated_FOGRA39L_argl.icc. After correcting this, the file compiles flawlessly. But preflight does indeed complain although the output profile is now set correctly. "
\immediate\pdfobj stream attr{/N 4} file{FOGRA39L_coated.icc}
\pdfcatalog{%
/OutputIntents [ <<
/Type /OutputIntent
/S/GTS_PDFA1
/DestOutputProfile \the\pdflastobj\space 0 R
/OutputConditionIdentifier (Coated FOGRA39)
/Info(FOGRA39L)
>> ]
}

Base-level PDF/A compliance for LaTeX files apparently isn't as simple as everyone says. This hack file, pieced together from various internet sources, seems to fix it for me with TeX Live 2023 on Arch Linux.

Attribution is provided in the source code comments. Since all borrowed content is from StackExchange, it is licensed CC-BY-SA. This means that anything I added, plus the curation (i.e. the entire pdfacompliance.tex), must also carry that license. The parts that are not related to the StackExchange code may be used freely as specified by the Unlicense.

Place the file in your working directory and include it in your LaTeX document, as demonstrated in example.tex.

Having the raw colour profile streams floating around in your document seems weird, and I feel like there must be a better solution, but I wasn't able to find one.

This results in a PDF/A-2u compliant file, as verified by VeraPDF, a dedicated open source validation tool for PDF/A. To claim PDF/A-2b instead, adjust the conformance level set in \hypersetup. PDF/A-2a additionally requires a tagged document structure, which this file does not set up, so claiming it is unlikely to pass validation.
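
As a minimal sketch of adjusting the claimed conformance level: the snippet below assumes the `pdfapart` and `pdfaconformance` keys used in example.tex, and that pdfacompliance.tex has already been loaded. Claiming PDF/A-2b instead of PDF/A-2u would then look roughly like this:

```latex
% Place after \input{pdfacompliance} in the preamble.
% Only the claimed level in the metadata changes; the rest of the setup stays as it is.
\hypersetup{
  pdfapart=2,        % still part 2 of the standard
  pdfaconformance=B, % claim level B (basic) instead of U (Unicode)
}
```

Only the 2u setting is reported as verified here, so re-run the validation with the matching VeraPDF profile after changing the level.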

Note that if you use graphics, fonts, etc., the conformance may break ... so make sure to verify and re-verify with VeraPDF.

\documentclass[draft,final]{vutinfth} % https://gitlab.com/ThomasAUZINGER/vutinfth/-/tree/master ... or any common class, I just happened to use this one
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[usenames,dvipsnames,table,hyperref,cmyk]{xcolor}
\usepackage[pdfa]{hyperref}
% ... rest of your preamble
\input{pdfacompliance} % PDF/A compliance
\hypersetup{
unicode,
pdfapart=2, % PDF/A "part of the standard" (This setup is compliant with PDF/A-2)
pdfaconformance=U, % PDF/A "conformance level" - For part 2 (in ascending order of strictness): B - basic (reliable visual reproduction), U - B plus Unicode mapping for all text (not possible for scanned files), A - U plus tagged structure for accessibility
% empty lines are illegal here, and weird errors happen if you don't comment them
% the following section is optional but recommended (at least title and author)
pdftitle={Your thing},
pdfauthor={Your name},
pdflang={en-GB},
pdfkeywords={Keyword, Wow, Multiple},
pdfpublisher={Your value here},
% v the rest here you shouldn't need, but makes the links more normal
colorlinks,
breaklinks,
linkcolor={red!50!black},
citecolor={red!60!black},
urlcolor={blue!40!black},
bookmarksnumbered,
linkbordercolor = {Melon},
hypertexnames=false,% use guessable names for links
}
\begin{document}
PDF/A compliance!
\end{document}
<report>
<buildInformation>
<releaseDetails id="core" version="1.24.1" buildDate="2023-06-22T10:38:00+02:00"/>
<releaseDetails id="validation-model" version="1.24.1" buildDate="2023-06-22T11:37:00+02:00"/>
<releaseDetails id="gui" version="1.24.1" buildDate="2023-06-22T14:19:00+02:00"/>
</buildInformation>
<jobs>
<job>
<item size="140793">
<name>/home/lit/ideap/TUW-DA/thesis-prefix-crab/myexample.pdf</name>
</item>
<validationReport jobEndStatus="normal" profileName="PDF/A-2U validation profile" statement="PDF file is compliant with Validation Profile requirements." isCompliant="true">
<details passedRules="144" failedRules="0" passedChecks="886" failedChecks="0"/>
</validationReport>
<duration start="1705803204421" finish="1705803204440">00:00:00.019</duration>
</job>
</jobs>
<batchSummary totalJobs="1" failedToParse="0" encrypted="0" outOfMemory="0" veraExceptions="0">
<validationReports compliant="1" nonCompliant="0" failedJobs="0">1</validationReports>
<featureReports failedJobs="0">0</featureReports>
<repairReports failedJobs="0">0</repairReports>
<duration start="1705803204418" finish="1705803204442">00:00:00.024</duration>
</batchSummary>
</report>