@ctrl-freak
Created May 19, 2026 02:47
Parse Australian Immunisation Register Immunisation History Statement from PDF to CSV
/*
Step 1 (run before this script): convert each PDF to text with Ghostscript's txtwrite device.
"C:\Program Files\gs\gs10.03.1\bin\gswin64c.exe" -dBATCH -dNOPAUSE -sDEVICE=txtwrite -r300 -dUseCropBox -sOutputFile="{3}\{2}.txt" "{1}\{2}"
Placeholders in the command above:
{1} Process Property 'PDFs Location' from 'Vaccinations'
{2} Document Property - Disk - File Name
{3} Process Property 'Text Location' from 'Vaccinations'
*/
// Read "{3}\{2}.txt" into outData
def contents = outData.toString();
def matchesEmployee = contents =~ /(?m)For:\r\n(.+)/;
def matchesVaccs = contents =~ /([0-9][0-9]\s[a-zA-Z]{3}\s[0-9]{4})\t([a-zA-Z0-9\- ]+)\t([a-zA-Z \-]+)/;
def matchesAsAt = contents =~ /(?m)As at:\r\n([0-9][0-9]\s[a-zA-Z]+\s[0-9]{4})/;
def matchesDOB = contents =~ /(?m)Date of birth:\r\n([0-9][0-9]\s[a-zA-Z]+\s[0-9]{4})/;
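// ---------------------------------------------------------------------------
// Illustrative sketch, not part of the original script: how the vaccination
// row pattern above behaves on a made-up sample. The sample text, the brand
// name "Examplevax" and the names sampleText / sampleRows are hypothetical;
// real txtwrite output may differ in spacing or line endings.
def sampleText = "For:\r\nJANE CITIZEN\r\n" +
    "01 Feb 2020\tInfluenza\tExamplevax\r\n";
def sampleRows = sampleText =~ /([0-9][0-9]\s[a-zA-Z]{3}\s[0-9]{4})\t([a-zA-Z0-9\- ]+)\t([a-zA-Z \-]+)/;
assert sampleRows.size() == 1;               // one tab-separated row found
assert sampleRows[0][1] == "01 Feb 2020";    // group 1: date given
assert sampleRows[0][2] == "Influenza";      // group 2: immunisation
assert sampleRows[0][3] == "Examplevax";     // group 3: brand
// ---------------------------------------------------------------------------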
def lastDate;
def dateGiven;
// DEBUG
// contents += matchesEmployee.size()+LINE_SEPARATOR;
// contents += matchesAsAt.size()+LINE_SEPARATOR;
// contents += matchesDOB.size()+LINE_SEPARATOR;
def output = [["AsAt", "Employee", "DOB", "Date", "Immunisation", "Brand"]];
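for (matchesVacc in matchesVaccs) {
    // Only emit a row when the name, "As at" date and date of birth were all found.
    if (matchesEmployee.size()
        && matchesVacc.size()
        && matchesAsAt.size()
        && matchesDOB.size()
    ) {
        // Rows carrying the placeholder date '00 MMM 0000' reuse the previous row's date.
        if (matchesVacc[1] == '00 MMM 0000') {
            dateGiven = lastDate;
        } else {
            dateGiven = matchesVacc[1];
        }
        // add() appends on every Groovy version; push() prepends from Groovy 2.5 onward,
        // which would reverse the rows and move the header to the end.
        output.add([
            matchesAsAt[0][1],
            matchesEmployee[0][1],
            matchesDOB[0][1],
            dateGiven,
            matchesVacc[2],
            matchesVacc[3].trim()
        ]);
        lastDate = dateGiven;
    }
}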
// Rebuild contents as CSV: every field double-quoted, one row per line.
contents = "";
for (line in output) {
    contents += "\"" + line.join("\",\"") + "\"" + LINE_SEPARATOR;
}
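// ---------------------------------------------------------------------------
// Illustrative sketch, not part of the original script: the join above does
// not escape double quotes inside a value. The hypothetical helper csvField
// doubles embedded quotes, which is the usual CSV convention (RFC 4180).
// It is shown standalone here and is not wired into the loop above.
def csvField = { value -> "\"" + String.valueOf(value).replace("\"", "\"\"") + "\"" };
assert csvField('Hep "B"') == '"Hep ""B"""';
assert ["a", "b"].collect(csvField).join(",") == '"a","b"';
// ---------------------------------------------------------------------------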