Birkin James Diana birkin

@birkin
birkin / script_hack_django_sqlite.sh
Last active December 12, 2024 22:55
hack to get newish version of django to work with old version of sqlite.
#!/bin/bash
## Purpose:
## This script hacks Django's sqlite package to enable it to work with an old version
## of sqlite3 on an old version of RedHat.
##
## Flow:
## It checks if django and pysqlite3 and pysqlite3-binary are installed in the virtual environment.
## If the required packages are not found, the script alerts the user and exits.
## If the packages are found, it proceeds to update Django's `sqlite3/base.py` file.
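The flow described in the comments above can be sketched in Python. This is a minimal sketch rather than the gist's actual script: the preview does not show what edit is made to `sqlite3/base.py`, so the import swap below (stdlib `sqlite3` replaced by `pysqlite3`) is an assumption, and the names `REQUIRED`, `missing_packages`, and `patch_base_source` are hypothetical.

```python
import importlib.metadata

# Packages the gist's comments say must be present in the virtual environment.
REQUIRED = ("django", "pysqlite3", "pysqlite3-binary")

# Assumption: the edit swaps Django's stdlib sqlite3 import for pysqlite3.
# The gist preview does not show the actual change.
OLD_IMPORT = "from sqlite3 import dbapi2 as Database"
NEW_IMPORT = "from pysqlite3 import dbapi2 as Database"


def missing_packages(names=REQUIRED):
    """Return the distribution names not installed in the active environment."""
    missing = []
    for name in names:
        try:
            importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            missing.append(name)
    return missing


def patch_base_source(source: str) -> str:
    """Return the source text with the sqlite3 import swapped.

    The text is returned unchanged if the old import line is absent,
    so applying the patch twice is harmless.
    """
    return source.replace(OLD_IMPORT, NEW_IMPORT)
```

A caller would first check `missing_packages()` and exit with a message if the list is non-empty, then read `base.py`, pass its text through `patch_base_source`, and write the result back.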
@birkin
birkin / render_xml_via_template.py
Created November 15, 2024 19:23
example of rendering xml via django template
"""
Generates XML output using Django's template system.
Accepts command-line arguments to specify a person's name and role.
Usage:
$ uv run "./render_xml_via_template.py" --person "Birkin" --role "cheerleader"
Output:
<?xml version="1.0" encoding="UTF-8"?>
<root>
@birkin
birkin / pdf_conformance_suggestions.md
Created October 21, 2024 19:54
pdf conformance ChatGPT suggestions

My October-2024 question

(continuation from irrelevant code-thread)...

If I had a PDF that conformed to this standard, and wanted to indicate that in metadata, what would be a common digital-repository metadata XML spec where I'd list this conformance? And please provide an example or two of what the entry might look like.


ChatGPT response...

@birkin
birkin / pdf_edit_via_pikepdf.py
Created October 21, 2024 18:35
edit PDF DecodeParms dict
import pikepdf
def edit_decodeparms(pdf_path, output_path):
    with pikepdf.open(pdf_path) as pdf:
        for page_num, page in enumerate(pdf.pages, start=1):
            resources = page.get('/Resources', {})
            xobjects = resources.get('/XObject', {})
            for xobj_name, xobj_ref in xobjects.items():
                xobj = xobj_ref  # Use the object directly
                filters = xobj.get('/Filter', [])
@birkin
birkin / pdf_check_via_pikepdf.py
Last active October 25, 2024 02:05
inspect PDF DecodeParms dict
"""
Checks for invalid keys in the DecodeParms dictionary of images in a PDF file.
-------
Usage:
- setup venv
- install pikepdf
% python ./pdf_check_via_pikepdf.py --pdf_path "/path/to/the.pdf"
...or, just...
@birkin
birkin / llm_summarization_notes.md
Last active October 25, 2024 02:52
IPLC Discovery-Day 2024-October notes
@birkin
birkin / make_tsv.py
Created August 27, 2024 19:13
convert two lines of data to tsv
### convert the two rows of data to a tsv file using the python csv module
import csv
original_two_lines = [
    ['id', 'jacket_id', 'firstname', 'lastname', 'shortID', 'title', 'pub_date', 'image', 'role', 'dept', 'dept2', 'dept3', 'active', 'created_at', 'updated_at'],
    ['10', '10', 'First', 'Last', 'flast', 'the long title', '2015', 'flast.jpg', 'author', 'Political Science', 'International and Public Affairs', 'y', '', '', '']
]
with open('data.tsv', 'w', newline='') as tsvfile:
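The preview above is cut off at the `with` statement. As a minimal sketch of the technique the gist names (writing rows as tab-separated values with the `csv` module), the helper below builds the TSV text in memory. The function name `rows_to_tsv` is hypothetical and does not appear in the gist.

```python
import csv
import io


def rows_to_tsv(rows):
    """Serialize an iterable of rows (lists of strings) to TSV text."""
    buffer = io.StringIO()
    # delimiter="\t" selects tab separation; lineterminator="\n" avoids
    # the csv module's default "\r\n" line endings.
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()
```

To write a file instead, the same `csv.writer` call can be given a file object opened with `newline=''`, which is what the `csv` documentation recommends so that line endings are not translated a second time.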
@birkin
birkin / check_xml_catalog.py
Last active August 19, 2024 18:15
check xml catalog
"""
Validates against internal mods xml-schema, indicating whether xmlcatalog is used.
"""
import os, unittest
from lxml import etree
import requests
class StrictCatalogResolver(etree.Resolver):
@birkin
birkin / validate_xmlschema_no_network.py
Created July 10, 2024 11:25
code to validate xml against XSD schema with no network access.
import os
from lxml import etree
def validate_xml_with_schema( xml_filepath: str, xsd_filepath: str ) -> None:
    """
    Validates an XML file against an XSD schema, without network access.
    Confirms that:
    - xmlcatalog is routing the schema location to the local file system
    - the C `libxml2` library used by lxml does auto-default to the standard server's `xml/catalog` file.
"""
To add to server run_tests to ensure xmlcatalog is properly configured, and properly being called.
A MODS file would likely be the best candidate for the `xml_filepath`.
"""
import os
from lxml import etree
def validate_xml_with_schema( xml_filepath: str, xsd_filepath: str ) -> None: