@hoehrmann
hoehrmann / normalize.pl
Created May 26, 2013 10:55
The attached script contains a simple Perl implementation of the NFC and NFD normalization forms, using data derived directly from the UCD (Unicode Character Database). It includes tests against the NormalizationTest.txt file. Originally http://lists.w3.org/Archives/Public/www-archive/2009Feb/0015.html
#!perl -w
use strict;
use warnings;
use DBI;
use Data::Dumper;
use Storable qw(retrieve nstore);
use IO::File;
# Look up the canonical combining class as in the UCD
our %CCC;
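The %CCC table above maps code points to canonical combining classes. One place a normalizer needs that lookup is the canonical ordering step shared by NFD and NFC: within each run of combining marks, marks are sorted by class, and marks of equal class keep their relative order. The following is a minimal C sketch of that step only, not taken from normalize.pl. The names ccc_of and canonical_reorder are hypothetical, and the three-entry table is a stand-in for real UCD data (every other code point is assumed to have class 0, which is not true in general).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup: a tiny stand-in for the UCD combining class data.
   Assumption for this sketch: anything not listed has class 0 (a starter). */
static int ccc_of(uint32_t cp) {
    switch (cp) {
    case 0x0301: return 230; /* COMBINING ACUTE ACCENT */
    case 0x0323: return 220; /* COMBINING DOT BELOW */
    case 0x0327: return 202; /* COMBINING CEDILLA */
    default:     return 0;
    }
}

/* Canonical ordering: a stable insertion sort of each run of marks with a
   non-zero class. A mark never moves across a class-0 code point, because
   0 is never greater than the class of the mark being inserted. */
void canonical_reorder(uint32_t *s, size_t n) {
    assert(s != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        int c = ccc_of(s[i]);
        if (c == 0)
            continue; /* starters stay where they are */
        uint32_t cp = s[i];
        size_t j = i;
        /* Strict '>' keeps marks of equal class in their original order. */
        while (j > 0 && ccc_of(s[j - 1]) > c) {
            s[j] = s[j - 1];
            j--;
        }
        s[j] = cp;
    }
}
```

Calling canonical_reorder on the sequence U+0061 U+0301 U+0323 swaps the two marks, since class 220 sorts before class 230. A full normalizer would run this after decomposition and, for NFC, before composition.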
@hoehrmann
hoehrmann / ucdxml2sqlite.pl
Created May 26, 2013 10:56
Unicode Character Database XML dump to SQLite database. The attached script takes the ucd.all.flat.xml file and turns it into a SQLite database. Originally http://lists.w3.org/Archives/Public/www-archive/2009Feb/0014.html
#!perl -w
use strict;
use warnings;
use DBI;
use IO::File;
use XML::Parser;
my %fields;
my @fields;
@hoehrmann
hoehrmann / listglyphs.cpp
Created May 27, 2013 12:48
The attached file is a simple program that uses the Windows GDI routine EnumFontFamiliesEx and the Uniscribe procedure ScriptGetCMap to enumerate all installed TrueType fonts and, for each font, list all the glyphs it supports in the BMP (Basic Multilingual Plane). Works only on NT-based Windows. Originally http://lists.w3.org/Archives/Public/www-archive/2007May/0072.html
////////////////////////////////////////////////////////////////////
//
// Copyright (C) 2007 Bjoern Hoehrmann <[email protected]>
//
// This program is free software; you can redistribute it and/or
// modify it under the terms of the GNU General Public License
// as published by the Free Software Foundation; either version 2
// of the License, or (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
@hoehrmann
hoehrmann / cssanno.c
Created May 27, 2013 12:57
Annotate XML documents with CSS declarations. Originally http://lists.w3.org/Archives/Public/www-archive/2004Jun/0001.html
/*
cssanno.c -- annotate XML documents with CSS declarations
Usage:
cssanno file.xml file.css
cssanno takes an XML document and a CSS style sheet and
annotates the elements with new attributes representing
the style declarations for the current element. Example:
@hoehrmann
hoehrmann / RFC3986.pm
Created May 27, 2013 15:04
Implementation of the parsing and resolution algorithms in RFC 3986. Originally http://lists.w3.org/Archives/Public/www-archive/2011Aug/0001.html as amended by http://lists.w3.org/Archives/Public/www-archive/2011Aug/0003.html
package RFC3986;
use strict;
use warnings;
sub parse {
$_[0] =~ m|^(([^:/?\#]+):)?(//([^/?\#]*))?
([^?\#]*)(\?([^\#]*))?(\#(.*))?$|x;
# $_[0] =~ m|^(([a-zA-Z0-9+.-]+):)?(//([^/?\#]*))?
# ([^?\#]*)(\?([^\#]*))?(\#(.*))?$|x;
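The pattern in the preview above is the component-splitting regular expression from Appendix B of RFC 3986. As a rough illustration of the same split in C, the sketch below feeds that pattern to the POSIX regex API (regcomp and regexec with REG_EXTENDED). It is not a port of RFC3986.pm and covers parsing only, not reference resolution. The uri_parts struct and the uri_split, uri_free, and dup_group functions are hypothetical names introduced here, and an allocation failure is not distinguished from an undefined component.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the five components; NULL means "undefined". */
typedef struct {
    char *scheme, *authority, *path, *query, *fragment;
} uri_parts;

/* Copy capture group g of s into a fresh string; NULL if the group is unset. */
static char *dup_group(const char *s, const regmatch_t *m, int g) {
    if (m[g].rm_so < 0)
        return NULL;
    size_t n = (size_t)(m[g].rm_eo - m[g].rm_so);
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL; /* sketch: not distinguished from "undefined" */
    memcpy(out, s + m[g].rm_so, n);
    out[n] = '\0';
    return out;
}

/* Split uri into components. Returns 0 on success, -1 on regex failure. */
int uri_split(const char *uri, uri_parts *out) {
    /* RFC 3986 Appendix B pattern, written as a POSIX extended regex. */
    static const char *pattern =
        "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?$";
    regex_t re;
    regmatch_t m[10];

    assert(uri != NULL && out != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    int rc = regexec(&re, uri, 10, m, 0);
    regfree(&re);
    if (rc != 0)
        return -1;

    /* Group numbers follow RFC 3986 Appendix B. */
    out->scheme    = dup_group(uri, m, 2);
    out->authority = dup_group(uri, m, 4);
    out->path      = dup_group(uri, m, 5);
    out->query     = dup_group(uri, m, 7);
    out->fragment  = dup_group(uri, m, 9);
    return 0;
}

/* Release everything uri_split allocated. */
void uri_free(uri_parts *p) {
    free(p->scheme);
    free(p->authority);
    free(p->path);
    free(p->query);
    free(p->fragment);
}
```

A caller would declare a uri_parts value, pass its address to uri_split along with the reference string, read the fields it needs, and then call uri_free. Keeping undefined components as NULL rather than as empty strings preserves the distinction between, say, a missing query and an empty one, which the resolution algorithm in RFC 3986 depends on.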
@hoehrmann
hoehrmann / urlcrack.c
Last active December 17, 2015 18:59
This command-line utility prints how WinINet, the library used by Internet Explorer, processes URLs. It calls InternetCombineUrl, which combines a base reference with a relative reference, and InternetCrackUrl, which splits a URL into its components. Originally http://lists.w3.org/Archives/Public/www-archive/2011Aug/0000.html
#undef UNICODE
#include <Windows.h>
#include <WinInet.h>
#include <stdio.h>
#pragma comment(lib, "wininet.lib")
int
main(int argc, char *argv[]) {
@hoehrmann
hoehrmann / Tidydotnet.cpp
Created May 27, 2013 16:49
The attached file contains a preliminary outline of an HTML Tidy wrapper for .NET applications, written in Managed C++ in February 2003. Originally http://lists.w3.org/Archives/Public/www-archive/2004Jan/0115.html
// This is the main DLL file.
#include "stdafx.h"
#include "Tidydotnet.h"
#include "t:/tidylib2/include/tidy.h"
#include "t:/tidylib2/include/buffio.h"
using namespace System;
using namespace System::Xml;
using namespace System::Text;
@hoehrmann
hoehrmann / W3C-WD.xml
Created May 29, 2013 22:10
This parses CSS style sheets into an abstract syntax tree and serializes the tree as an XML document, preserving information such as the line and column position of individual style sheet parts, so that existing XML validation utilities, like RELAX NG schemas, can be used to validate style sheets. Originally http://lists.w3.org/Archives/Public/www-…
<?xml version="1.0"?>
<c2x:stylesheet xmlns:c2x="http://xmlns.bjoern.hoehrmann.de/css2xml/0.1/">
<c2x:ruleset>
<c2x:selectors>
<c2x:selector byte_offset="283" line="11" column="1">
<c2x:simple byte_offset="283" line="11" column="1" combinator="0" type_mask="2"/>
</c2x:selector>
</c2x:selectors>
<c2x:property name="padding" important="false" byte_offset="292" line="12" column="3">
<c2x:terms>
@hoehrmann
hoehrmann / commatools.pl
Created May 31, 2013 11:01
Perl script implementing "Comma tools" that can be chained. Originally http://lists.w3.org/Archives/Public/public-qa-dev/2004May/0023.html
#!perl -w
use strict;
use warnings;
use URI::Escape 'uri_escape';
sub tool
{
my $req = shift;
return unless defined $req;
@hoehrmann
hoehrmann / validate.pl
Created May 31, 2013 11:23
HTML Validation script using only CPAN modules. Originally http://www.w3.org/mid/[email protected]
#!perl -w
BEGIN
{
$ENV{SP_CHARSET_FIXED} = 1;
$ENV{SP_ENCODING} = "UTF-8";
$ENV{SP_BCTF} = "UTF-8";
}
sub ErrorHandler::new {bless {p=>$_[1]}, shift}