Yorick Peterse yorickpeterse

%%machine example;
%%{
cdata_start = '<![CDATA[';
cdata_end = ']]>';
cdata := |*
# Everything that is not "]]>" (must match consecutive characters)
# Sadly this seems to result in "foo]]" being matched for input
# "<![CDATA[foo]]>".
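The comment above describes the difficulty of matching "everything up to `]]>`" one character at a time. As a point of comparison only, here is a minimal sketch of the same extraction in plain Ruby using `StringScanner` from the standard library rather than Ragel. The method name `cdata_body` is hypothetical and is not part of the original grammar.

```ruby
require 'strscan'

# Hypothetical helper (not from the original Ragel grammar): returns the text
# between "<![CDATA[" and the first "]]>", or nil when either marker is missing.
def cdata_body(input)
  scanner = StringScanner.new(input)

  # The input must start with the CDATA opening marker.
  return nil unless scanner.scan(/<!\[CDATA\[/)

  # scan_until consumes everything up to and including the first "]]>".
  matched = scanner.scan_until(/\]\]>/)
  return nil unless matched

  # Drop the three characters of the closing marker.
  matched[0...-3]
end
```

Because the terminator is matched as a whole, a lone `]]` inside the body does not end the section early.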
require 'oga'
document = Oga.parse_xml <<-EOF
<pricing_information>
<cents>
10
</cents>
<currency>
EUR
</currency>
<link rel="shortcut icon" href="/favicon2.gif">
<html>
<body></body>
<body background="/Test_2.gif"></body>
<body onload="blink();">
<div align="center"></div>
<TITLE>Site Name</TITLE>
</body>
yorickpeterse / 0001-Added-a-simple-Maybe-monad.patch
Created March 12, 2015 10:20
Simple Maybe monad in Ruby (copied straight from Git, too lazy to gemify for now)
From a31eb2863043937881e51cffb544ef9a641a62af Mon Sep 17 00:00:00 2001
From: Yorick Peterse <SNIP>
Date: Thu, 12 Mar 2015 11:08:46 +0100
Subject: [PATCH] Added a simple Maybe monad.
This is a rather simple implementation of the Maybe monad [1] / Option type [2].
It only provides the absolute bare minimum instead of providing a full-blown
functional programming library.
Using this class allows you to turn code such as this:
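The patch's own before/after example is cut off in this listing. As a rough illustration of the idea only, the sketch below shows what a bare-minimum Maybe wrapper might look like in Ruby. The method names `map` and `value_or`, and the helpers `city_plain` and `city_maybe`, are assumptions made for this example and are not taken from the patch.

```ruby
# Hypothetical sketch of a minimal Maybe wrapper; NOT the class from the patch.
class Maybe
  def initialize(value)
    @value = value
  end

  # Applies the block to the wrapped value and wraps the result.
  # When the wrapped value is nil the block is skipped entirely.
  def map
    @value.nil? ? self : Maybe.new(yield(@value))
  end

  # Returns the wrapped value, or the given default when it is nil.
  def value_or(default = nil)
    @value.nil? ? default : @value
  end
end

# Without Maybe: each step needs an explicit nil check.
def city_plain(user)
  address = user[:address]
  city    = address ? address[:city] : nil

  city ? city.upcase : 'UNKNOWN'
end

# With Maybe: the nil checks are folded into map.
def city_maybe(user)
  Maybe.new(user[:address])
    .map { |address| address[:city] }
    .map { |city| city.upcase }
    .value_or('UNKNOWN')
end
```

Both helpers return the same result for any hash input; the second simply moves the nil handling into one place.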
Starting program: /home/yorickpeterse/.rubies/rbx-git/bin/ruby test.rb
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x7fffd75ff700 (LWP 350)]
[New Thread 0x7fffd71fe700 (LWP 351)]
[New Thread 0x7ffff45bc700 (LWP 348)]
[New Thread 0x7ffff48f3700 (LWP 347)]
[New Thread 0x7ffff49f4700 (LWP 345)]
[New Thread 0x7ffff4af5700 (LWP 344)]
[New Thread 0x7ffff4bf6700 (LWP 343)]
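class Text
  # Stores the raw (still encoded) text and resets the decoded cache.
  def text=(value)
    @raw_text = value
    @text     = nil
  end

  # Decodes entities lazily on first access and caches the result.
  def text
    unless @text
      @text     = Entities.decode(@raw_text)
      @raw_text = nil
    end

    @text
  end
end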
# Based on the JSON output as listed at
# http://www.w3.org/TR/html5/syntax.html#named-character-references
DECODE_MAPPING = {
"&Aacute;" => [193].pack('U'),
"&Aacute" => [193].pack('U'),
"&aacute;" => [225].pack('U'),
"&aacute" => [225].pack('U'),
"&Abreve;" => [258].pack('U'),
"&abreve;" => [259].pack('U'),
"&ac;" => [8766].pack('U'),
require 'benchmark/ips'
module Enumerable
def each_cons(num)
n = Rubinius::Type.coerce_to_collection_index num
raise ArgumentError, "invalid size: #{n}" if n <= 0
return to_enum(:each_cons, num) {
if (enum_size = enumerator_size).nil?
nil
git clone git@github.com:YorickPeterse/oga.git
git checkout ruby-ll # if this complains, use git checkout -b ruby-ll origin/ruby-ll
bundle install
gem install oga
rake generate fixtures
# To benchmark the old parser, open benchmark/benchmark_helper.rb
# and replace the require_relative on line 5 with require 'oga'
# Then to benchmark it:
Starting program: /home/yorickpeterse/.rubies/rbx-git/bin/ruby -e Rubinius::VariableScope.new.locals
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x7ffff7f8f700 (LWP 15561)]
[New Thread 0x7ffff4cf7700 (LWP 15562)]
[New Thread 0x7ffff4bf6700 (LWP 15563)]
[New Thread 0x7ffff4af5700 (LWP 15564)]
[New Thread 0x7ffff49f4700 (LWP 15565)]