Skip to content

Instantly share code, notes, and snippets.

View joewiz's full-sized avatar

Joe Wicentowski joewiz

  • Arlington, Virginia
View GitHub Profile
@joewiz
joewiz / system-properties.xq
Created March 10, 2022 18:32
Display all system properties and their values - for eXist-db
xquery version "3.1";
(: @see https://exist-db.org/exist/apps/fundocs/index.html?q=util:system-property
: @see https://github.com/eXist-db/exist/blob/develop/exist-core/src/main/resources-filtered/org/exist/system.properties
:)
let $exist-properties :=
(
"vendor",
"vendor-url",
"product-name",
@joewiz
joewiz / merge-fragments.xq
Last active January 5, 2022 17:03
Merge XML fragments into a composite document
xquery version "3.1";
(:~ Merge XML fragments with the same root node into a composite document.
: Based on a question from the exist-open mailing list.
:
: @see https://markmail.org/message/gdrhscaobyya44we
:)
(: Compare the fragment to the composite document. If the node names match,
@joewiz
joewiz / Second Tuesdays.md
Last active November 18, 2021 15:59
Second Tuesdays of the year, with XQuery

Today in exist-open, Slav asked:

Hi, anyone have function for calculate all second Tuesdays for year?

(That is, Patch Tuesdays.)

Below are two approaches—one that solves the problem directly (finding only 2nd Tuesdays) and one more generally (finding any combination of week of the month and day of the week, e.g., 5th Sundays).

@joewiz
joewiz / chmod-recursive.xq
Created June 3, 2021 20:54
Batch change permissions on resources and collections in eXist-db
xquery version "3.1";
import module namespace dbutil="http://exist-db.org/xquery/dbutil";
dbutil:scan(
xs:anyURI("/db/apps/airlock-data"),
function($col, $res) {
if ($res) then
(: Set permissions on resources here :)
(
@joewiz
joewiz / list-packages-with-dependency.xq
Created June 2, 2021 15:50
Find which EXPath packages declare a particular dependency, in eXist-db
xquery version "3.1";
declare namespace pkg="http://expath.org/ns/pkg";
array {
for $app in xmldb:get-child-collections("/db/apps")
let $package-metadata := doc("/db/apps/" || $app || "/expath-pkg.xml")
where $package-metadata//pkg:dependency[@package eq "http://exist-db.org/apps/shared"]
order by $app
return
@joewiz
joewiz / fib-exist.xq
Last active October 10, 2020 16:51 — forked from apb2006/fib.xq
XQuery tail recursive Fibonacci function, with timing, for eXist-db
xquery version "3.1";
(: forked from https://gist.github.com/apb2006/4eef5889017be4a50685a467b2754d27
: with tests returned in the style of https://so.nwalsh.com/2020/10/09-fib :)
declare function local:fib($n as xs:integer, $a as xs:integer, $b as xs:integer){
switch ($n)
case 0 return $a
case 1 return $b
default return local:fib($n - 1, $b, $a + $b)
@joewiz
joewiz / check-text-for-ocr-typo-patterns.xq
Last active April 9, 2021 03:30
Check a text for OCR typo patterns, using XQuery
xquery version "3.1";
(:~
: Find possible OCR errors in a text by checking for patterns that an OCR
: process is known to misread, e.g., "day" misread as "clay", or "France"
: misread as "Prance." If the OCR engine just misread some instances of these
: words but got other instances correct, then this query will highlight
: candidates for correction.
:
: The query lets you configure a source text and define pattern sets to be used.
@joewiz
joewiz / generate-xconfs.xq
Created April 27, 2020 19:49
Generate eXist facet definitions for xconf index configuration files programmatically
xquery version "3.1";
(: WIP! :)
declare boundary-space preserve;
let $xconfs :=
array {
map {
"qname": "tei:div",
@joewiz
joewiz / migrating-from-old-to-new-indexes.md
Last active June 13, 2025 18:22
Converting an eXist application from old-style fields to new, Lucene-based facets and fields

Converting an eXist application from old-style fields to new, Lucene-based facets and fields

This article walks through the process of migrating an eXist application from using old-style fields to using the new, Lucene-based facets and fields. For more information, see the eXist documentation's Lucene article.

Old-style approach

In the old-style approach to fields, fields were constructed and maintained manually via the ft:index() function. To add or update fields for a document, a <doc> element containing <field> elements was passed to this function, along with the URI of the resource to be indexed.

For example, in one application, fields were constructed with in the hsa/modules/index.xq library module, whose index:index-one-document() function constructed the <field> elements and passed them to the ft:index() function:

@joewiz
joewiz / tokenize-sentences-nlp.xq
Last active April 20, 2022 11:23
Split (or "tokenize") a string into "sentences", with XQuery. See https://gist.github.com/joewiz/5889711
xquery version "3.1";
(: Use the eXist Stanford NLP package for sentence tokenization.
: Compared to my original "naïve" approach, this approach takes a quarter the number of lines of XQuery code.
: See https://gist.github.com/joewiz/5889711 :)
import module namespace nlp="http://exist-db.org/xquery/stanford-nlp";
declare function local:tokenize-sentences($text as xs:string) {
local:tokenize-sentences($text, map{})