@ellispritchard
Last active May 12, 2016 12:15
MarkLogic XQuery timezone conversion support: functions to convert times between named zones using the IANA Time Zone Database. Converts compiled tz database files for use with the conversion function.

MarkLogic Timezone support

Usage

timezone.xqy converts times between named timezones, using tz:TimeZoneInfo documents produced from UNIX zoneinfo files. For example, it lets you look up the equivalent time in New York for a given time in London:

tz:adjust-dateTime-to-timezone(xs:dateTime('2013-05-09T10:34:00+01:00'),'America/New_York')

=> xs:dateTime('2013-05-09T05:34:00-04:00')
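
As a fuller sketch, the same call can be run as a main module. The import path for timezone.xqy is an assumption (it mirrors the path used for parse-zoneinfo.xqy in that module's header comment), so adjust it to wherever the library is installed. It also assumes that TimeZoneInfo documents for America/New_York have already been loaded.

```xquery
xquery version "1.0-ml";

(: Assumption: timezone.xqy is installed at this path; adjust to your modules layout. :)
import module namespace tz = 'http://pressassociation.com/std/lib/timezone'
  at '/lib/marklogic-commons/timezone.xqy';

(: 10:34 in London during British Summer Time (UTC+01:00) :)
let $london := xs:dateTime('2013-05-09T10:34:00+01:00')
return tz:adjust-dateTime-to-timezone($london, 'America/New_York')
```

If no TimeZoneInfo document covers the requested zone at that instant, the library raises a tz:TIMEZONE-NOT-FOUND error rather than returning a value.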

The module also applies heuristics to convert a local time (a dateTime without a timezone offset) to the equivalent time carrying the offset in force in a given timezone:

tz:adjust-dateTime-to-timezone(xs:dateTime('2013-05-09T10:34:00'),'America/New_York')

=> xs:dateTime('2013-05-09T10:34:00-04:00')
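
Once the applicable offset has been looked up, the library hands off to the standard fn:adjust-dateTime-to-timezone function, which treats the two cases differently. The sketch below shows that behaviour in isolation; the New York daylight-saving offset of -04:00 is hard-coded as an assumption, standing in for the database lookup.

```xquery
xquery version "1.0-ml";

(: Assumption: -04:00 (EDT) stands in for the offset the library would look up. :)
let $edt := xs:dayTimeDuration('-PT4H')
return (
  (: input has an offset: the instant is preserved and the local time shifts :)
  fn:adjust-dateTime-to-timezone(xs:dateTime('2013-05-09T10:34:00+01:00'), $edt),
  (: input has no offset: the local time is kept and the offset is simply attached :)
  fn:adjust-dateTime-to-timezone(xs:dateTime('2013-05-09T10:34:00'), $edt)
)
```

The first expression moves 10:34 to 05:34, while the second leaves 10:34 unchanged and only adds -04:00.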

parse-zoneinfo.xqy parses local UNIX zoneinfo files to produce documents suitable for timezone.xqy. For example, to produce and store the documents needed to support the Europe/Paris timezone:

zi:insert-timezoneinfo(zi:parse-zoneinfo('Europe/Paris'))
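
To load several zones in one go, the same two calls can be wrapped in a loop. This is a minimal sketch using the import path given in the header comment of parse-zoneinfo.xqy; the list of zone names is only an example.

```xquery
xquery version "1.0-ml";

import module namespace zi = 'http://pressassociation.com/std/lib/timezone/parse'
  at '/lib/marklogic-commons/parse-zoneinfo.xqy';

(: Example zone names; each must exist as a compiled file under the zoneinfo directory. :)
for $zone in ('Europe/London', 'Europe/Paris', 'America/New_York')
return zi:insert-timezoneinfo(zi:parse-zoneinfo($zone))
```

The query returns the URIs of the inserted documents, one per offset period in each zone.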

Requirements

  • tz:TimeZoneInfo documents must be loaded into the content database for every time-zone you wish to use.

    • To produce tz:TimeZoneInfo documents automatically using parse-zoneinfo.xqy, compiled IANA timezone information files (man 5 tzfile) must be available on the local file-system. On Linux and Mac OS X they live in /usr/share/zoneinfo or /usr/lib/zoneinfo; the module reads from /usr/share/zoneinfo/ unless $g_fsZoneInfoPath is changed. On non-UNIX systems such as Windows, you will need to download and compile these files from http://www.iana.org/time-zones
  • Two element range indexes are required on the database that holds the tz:TimeZoneInfo documents:

    {http://pressassociation.com/std/lib/timezone}Starts as xs:dateTime
    {http://pressassociation.com/std/lib/timezone}Ends as xs:dateTime
  • The timezone.xsd schema must also be loaded into the appropriate schemas database.
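
The two range indexes can be created through the Admin UI or scripted. The following is a sketch using MarkLogic's Admin API, assuming the query is evaluated against the database that holds the TimeZoneInfo documents and by a user with admin privileges:

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

let $ns := "http://pressassociation.com/std/lib/timezone"
let $config := admin:get-configuration()
(: Assumption: the current database is the one holding the TimeZoneInfo documents. :)
let $db := xdmp:database()
let $indexes := (
  (: dateTime indexes take no collation, hence the empty string :)
  admin:database-range-element-index("dateTime", $ns, "Starts", "", fn:false()),
  admin:database-range-element-index("dateTime", $ns, "Ends", "", fn:false())
)
return admin:save-configuration(
  admin:database-add-range-element-index($config, $db, $indexes)
)
```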

parse-zoneinfo.xqy

xquery version "1.0-ml";
module namespace zi = 'http://pressassociation.com/std/lib/timezone/parse';
(:
  Compiles a local UNIX zoneinfo file into TimeZoneInfo documents suitable for use by the timezone.xqy library.

  To generate TimeZoneInfo documents:
    import module namespace zi = 'http://pressassociation.com/std/lib/timezone/parse' at '/lib/marklogic-commons/parse-zoneinfo.xqy';
    zi:parse-zoneinfo('Asia/Tokyo')

  To generate *and* insert TimeZoneInfo documents into the current database with unique URIs (e.g. for use by timezone.xqy):
    zi:insert-timezoneinfo(zi:parse-zoneinfo('Asia/Tokyo'))
:)
declare namespace tz = 'http://pressassociation.com/std/lib/timezone';
declare variable $g_fsZoneInfoPath := '/usr/share/zoneinfo/'; (: input zoneinfo filesystem path :)
declare variable $g_dbZoneInfoURI := '/zoneinfo/'; (: output db URI prefix :)
declare function zi:parse-zoneinfo($timezone-name as xs:string) as element(tz:TimeZoneInfo)* {
  let $zoneinfo := xdmp:external-binary(fn:concat($g_fsZoneInfoPath,$timezone-name))
  let $version := zi:unsigned-byte-at($zoneinfo,5)
  (: only parse files with the 'TZif' magic and a version byte of NUL (0) or ASCII '2' (50) :)
  where xdmp:binary-decode(xdmp:subbinary($zoneinfo,1,4),'ISO-8859-1') = 'TZif' and $version = (0,50)
  return
    (: header counts are 4-byte big-endian integers; positions are 1-based byte offsets :)
    (: let $tzh_ttisgmtcnt := zi:unsigned-int-at($zoneinfo,21) - ref-only - not required :)
    (: let $tzh_ttisstdcnt := zi:unsigned-int-at($zoneinfo,25) - ref-only - not required :)
    (: let $tzh_leapcnt := zi:unsigned-int-at($zoneinfo,29) - ref-only - not required :)
    let $tzh_timecnt := zi:unsigned-int-at($zoneinfo,33)
    let $tzh_typecnt := zi:unsigned-int-at($zoneinfo,37)
    (: let $tzh_charcnt := zi:unsigned-int-at($zoneinfo,41) - ref-only - not required :)
    let $changeTimes :=
      for $intTime in zi:int-sequence-starting-at($zoneinfo,45,$tzh_timecnt)
      return zi:int-to-dateTime($intTime)
    let $localTimeIndex := zi:unsigned-byte-sequence-starting-at($zoneinfo,45+$tzh_timecnt*4,$tzh_timecnt)
    let $ttinfoStart := 45+$tzh_timecnt*4+$tzh_timecnt
    let $ttinfos := zi:parse-ttinfo($zoneinfo,$ttinfoStart,$tzh_typecnt)
    return (
      for $i in (1 to $tzh_timecnt)
      let $ttinfo := $ttinfos[$localTimeIndex[$i]+1]
      return element tz:TimeZoneInfo {
        element tz:TimeZone { $timezone-name },
        element tz:Name { fn:data($ttinfo/tt_abbr) },
        element tz:Abbreviation { fn:data($ttinfo/tt_abbr) },
        element tz:DST { fn:data($ttinfo/tt_isdst) },
        element tz:OffsetToUTC {
          (: offset in seconds as a duration, e.g. -18000 becomes -PT18000S :)
          xs:dayTimeDuration(fn:concat(if (xs:integer($ttinfo/tt_gmtoff) lt 0) then '-' else '','PT',fn:abs(xs:integer($ttinfo/tt_gmtoff)),'S'))
        },
        element tz:DateTimeRange {
          element tz:Starts {
            $changeTimes[$i]
          },
          element tz:Ends {
            (: NB last value in zoneinfo lasts until the end of (UNIX 32-bit) time; important where there is no DST, e.g. Asia/Tokyo :)
            if(fn:exists($changeTimes[$i+1])) then $changeTimes[$i+1] else $g_endOfTime
          }
        }
      }
    )
};
(: inserts given TimeZoneInfo documents into the database; returns document URIs :)
declare function zi:insert-timezoneinfo($timezone-info as element(tz:TimeZoneInfo)*) as xs:string* {
  zi:insert-timezoneinfo($timezone-info, xdmp:default-permissions())
};
declare function zi:insert-timezoneinfo($timezone-info as element(tz:TimeZoneInfo)*, $permissions as element(sec:permission)*) as xs:string* {
  for $tzi in $timezone-info
  let $timezone-name := fn:data($tzi/tz:TimeZone)
  (: URI: <prefix><zone name>/<last name segment>-<start date>-<abbreviation>.xml :)
  let $uri := fn:concat(
    $g_dbZoneInfoURI,$timezone-name,'/',
    fn:string-join((
      fn:tokenize($timezone-name,'/')[fn:last()],
      fn:string(xs:date($tzi/tz:DateTimeRange/tz:Starts)),
      $tzi/tz:Abbreviation/fn:string()
    ),'-'),
    '.xml')
  return ($uri,xdmp:document-insert($uri,$tzi,$permissions,'fixture'))
};
(: PRIVATE :)
(: signed 32-bit int conversion constants :)
declare private variable $g_mask32 := xdmp:hex-to-integer('7FFFFFFF');
declare private variable $g_signedBit32 := xdmp:hex-to-integer('80000000');
(: time conversion constants :)
declare private variable $g_wallclockMultiplier64 := 10000000;
declare private variable $g_utcOffset := xs:dayTimeDuration('PT0H');
declare private variable $g_epoch := xs:dateTime('1970-01-01T00:00:00Z');
declare private variable $g_endOfTime := xs:dateTime('2038-01-19T03:14:07Z'); (: end of signed 32-bit UNIX time, in UTC like the other range values :)
(: reads the 4 bytes at 1-based position $pos as an unsigned big-endian integer :)
declare private function zi:unsigned-int-at($binary as binary(), $pos as xs:double) as xs:unsignedInt {
  xdmp:hex-to-integer(xs:string(xdmp:subbinary($binary,$pos,4)))
};
(: reads the 4 bytes at $pos as a signed (two's complement) big-endian integer :)
declare private function zi:int-at($binary as binary(), $pos as xs:double) as xs:int {
  let $v := xdmp:hex-to-integer(xs:string(xdmp:subbinary($binary,$pos,4)))
  return if(xdmp:and64($v,$g_signedBit32)) then -(xdmp:and64(xdmp:not64($v),$g_mask32)+1) else $v
};
(: reads the single byte at $pos as an unsigned integer :)
declare private function zi:unsigned-byte-at($binary as binary(), $pos as xs:double) as xs:unsignedByte {
  xdmp:hex-to-integer(xs:string(xdmp:subbinary($binary,$pos,1)))
};
(: reads $count consecutive signed 4-byte integers starting at $start :)
declare private function zi:int-sequence-starting-at($binary as binary(), $start as xs:double, $count as xs:integer) as xs:int* {
  for $int in (1 to $count)
  return zi:int-at($binary,$start + ($int - 1) * 4)
};
(: reads $count consecutive unsigned bytes starting at $start :)
declare private function zi:unsigned-byte-sequence-starting-at($binary as binary(), $start as xs:double, $count as xs:integer) as xs:unsignedByte* {
  for $byte in (1 to $count)
  return zi:unsigned-byte-at($binary,$start + ($byte - 1))
};
(: converts seconds since the UNIX epoch (possibly negative) to a UTC dateTime :)
declare private function zi:int-to-dateTime($int as xs:int) as xs:dateTime {
  if($int gt 0) then (
    fn:adjust-dateTime-to-timezone(xdmp:timestamp-to-wallclock($int * $g_wallclockMultiplier64),$g_utcOffset)
  ) else (
    $g_epoch - xs:dayTimeDuration(fn:concat('PT',fn:abs($int),'S'))
  )
};
(: parses $count 6-byte ttinfo records starting at $start: a signed 4-byte UTC offset in seconds, a 1-byte DST flag and a 1-byte index into the abbreviation characters that follow the records :)
declare private function zi:parse-ttinfo($binary as binary(), $start as xs:double, $count as xs:integer) as element(ttinfo)* {
  let $charstart := $start + 6*$count
  for $i in (1 to $count)
  let $pos := $start + ($i - 1) * 6
  let $tt_gmtoff := zi:int-at($binary,$pos)
  let $tt_isdst := zi:unsigned-byte-at($binary,$pos+4)
  let $tt_abbrind := zi:unsigned-byte-at($binary,$pos+5)
  return element ttinfo {
    element tt_gmtoff { $tt_gmtoff },
    element tt_isdst { fn:boolean($tt_isdst) },
    element tt_abbr {
      xdmp:binary-decode(xdmp:subbinary($binary,$charstart+$tt_abbrind),'UTF-8')
    }
  }
};

timezone.xqy

xquery version '1.0-ml';
(: Limited time-zone support: principally for mapping times between UTC and other time-zones, depending on what is available in the database. :)
module namespace tz = 'http://pressassociation.com/std/lib/timezone';
declare variable $UTC-offset := xs:dayTimeDuration('PT0H');
(:
converts the given dateTime to the equivalent dateTime in the given timezone; if dateTime does not have
an offset to UTC, applies heuristics to obtain applicable offset assuming the time is local to
that time-zone.
e.g.
f('2013-04-04T11:25:00Z','Europe/London') => '2013-04-04T12:25:00+01:00' (UTC offset)
f('2013-04-04T11:25:00+01:00','Europe/London') => '2013-04-04T11:25:00+01:00' (BST offset)
f('2013-04-04T11:25:00','Europe/London') => '2013-04-04T11:25:00+01:00' (no offset)
NB requires the following range indexes:
{http://pressassociation.com/std/lib/timezone}Starts as xs:dateTime
{http://pressassociation.com/std/lib/timezone}Ends as xs:dateTime
:)
declare function tz:adjust-dateTime-to-timezone($dateTime as xs:dateTime, $timezone-name as xs:string) as xs:dateTime {
  let $timezone := tz:get-timezone-info-at-dateTime($timezone-name, $dateTime)
  (: cast explicitly: the schema types OffsetToUTC as xs:duration, but fn:adjust-dateTime-to-timezone expects xs:dayTimeDuration :)
  let $timezone-offset := xs:dayTimeDuration($timezone/tz:OffsetToUTC)
  return fn:adjust-dateTime-to-timezone($dateTime,$timezone-offset)
};
(:
gets the TimeZoneInfo document for the given timezone at the given dateTime; if dateTime does not have an offset to UTC,
applies heuristics to obtain applicable time-zone assuming the time is local to that time-zone.
:)
declare function tz:get-timezone-info-at-dateTime($timezone-name as xs:string, $dateTime as xs:dateTime) as element(tz:TimeZoneInfo) {
  if(fn:empty(fn:timezone-from-dateTime($dateTime))) then (
    tz:get-timezone-info-at-dateTime-with-no-offset($timezone-name,$dateTime)
  ) else (
    let $timezone-info := cts:search(/tz:TimeZoneInfo,cts:and-query((
      cts:element-value-query(xs:QName('tz:TimeZone'),$timezone-name),
      cts:element-range-query(xs:QName('tz:Starts'),'<=',$dateTime),
      cts:element-range-query(xs:QName('tz:Ends'),'>',$dateTime)
    )))
    return tz:exactly-one-timezone($timezone-info,$timezone-name,$dateTime)
  )
};
(:
works out what timezone-info should be applicable when a time has no offset to UTC;
potentially inaccurate during hour of switch-over e.g. between 1AM and 2AM,
which is why we have offsets in the first place!
:)
declare function tz:get-timezone-info-at-dateTime-with-no-offset($timezone-name as xs:string, $dateTime as xs:dateTime) as element(tz:TimeZoneInfo) {
  (: UTC = local time - offset, and offsets range from -12:00 to +14:00,
     so the true instant lies between local - 14h and local + 12h :)
  let $dateTime-floor := $dateTime - xs:dayTimeDuration('PT14H')
  let $dateTime-ceil := $dateTime + xs:dayTimeDuration('PT12H')
  let $possible-timezones :=
    cts:search(/tz:TimeZoneInfo,cts:and-query((
      cts:element-value-query(xs:QName('tz:TimeZone'),$timezone-name),
      cts:or-query((
        cts:and-query((
          cts:element-range-query(xs:QName('tz:Starts'),'<=',$dateTime-floor),
          cts:element-range-query(xs:QName('tz:Ends'),'>',$dateTime-floor)
        )),
        cts:and-query((
          cts:element-range-query(xs:QName('tz:Starts'),'<=',$dateTime-ceil),
          cts:element-range-query(xs:QName('tz:Ends'),'>',$dateTime-ceil)
        ))
      ))
    )))
  (: more than one matches (because no offset specified): see which one is active at given time, when its offset is applied :)
  let $active-timezones :=
    for $timezone in $possible-timezones
    let $offset := xs:dayTimeDuration($timezone/tz:OffsetToUTC)
    let $dateTime := fn:adjust-dateTime-to-timezone($dateTime,$offset)
    where $timezone/tz:DateTimeRange/tz:Starts le $dateTime and $timezone/tz:DateTimeRange/tz:Ends gt $dateTime
    order by $timezone/tz:DateTimeRange/tz:Starts
    return $timezone
  (: if time versions end up sitting in different periods, return 1st (only happens during hour of DST change-over, this is why we need offsets!!) :)
  return tz:exactly-one-timezone($active-timezones[1],$timezone-name,$dateTime)
};
declare function tz:exactly-one-timezone($timezone-info as element(tz:TimeZoneInfo)*, $timezone-name as xs:string, $dateTime as xs:dateTime) as element(tz:TimeZoneInfo) {
  if(fn:count($timezone-info) eq 1) then
    $timezone-info
  else if(fn:empty($timezone-info)) then
    fn:error(xs:QName('tz:TIMEZONE-NOT-FOUND'),fn:concat('TimeZoneInfo not found for ',$timezone-name,' at ',$dateTime))
  else
    fn:error(xs:QName('tz:TIMEZONE-INVALID'),fn:concat('Too many TimeZoneInfo found for ',$timezone-name,' at ',$dateTime,' : invalid configuration?'))
};

timezone.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"
    targetNamespace="http://pressassociation.com/std/lib/timezone"
    xmlns:tz="http://pressassociation.com/std/lib/timezone">
  <xs:element name="TimeZoneInfo">
    <xs:annotation>
      <xs:documentation>Represents a particular time-offset value in a TimeZone which is applicable for a particular time range.</xs:documentation>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name="TimeZone" type="tz:timezone-name-type"/>
        <xs:element name="Name" type="xs:string"/>
        <xs:element name="Abbreviation" type="xs:string"/>
        <xs:element name="DST" type="xs:boolean" default="true" minOccurs="0"/>
        <xs:element name="OffsetToUTC" type="xs:duration"/>
        <xs:element ref="tz:DateTimeRange"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="DateTimeRange">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Starts" type="xs:dateTime"/>
        <xs:element name="Ends" type="xs:dateTime"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:simpleType name="timezone-name-type">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
</xs:schema>