@cabo
cabo / mensa.rb
Created November 29, 2011 11:10
Clean up an awful web page
#!/opt/local/bin/ruby1.9
require 'rubygems'
require 'nokogiri'
require 'open-uri'
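# Fetch the raw HTML of the page
txt = open("http://www.studentenwerk.bremen.de/files/main_info/essen/plaene/uniessen.php").read
# Replace literal << and >> with guillemet entities before parsing.
# ">>>" is handled first: it is a tag's closing ">" followed by ">>".
txt.gsub!(/<</, "&laquo;")
txt.gsub!(/>>>/, ">&raquo;")
txt.gsub!(/>>/, "&raquo;")
doc = Nokogiri::HTML(txt)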
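The three substitutions above can be tried without fetching the page. The following is a minimal sketch that applies them, in the same order, to a made-up HTML fragment; the helper name `escape_guillemets` and the sample string are assumptions for illustration and are not part of the original script.

```ruby
# Hypothetical helper: applies the same three substitutions as the script,
# in the same order, but returns a new string instead of mutating its input.
def escape_guillemets(html)
  html.gsub(/<</, "&laquo;")    # "<<" becomes a left guillemet entity
      .gsub(/>>>/, ">&raquo;")  # a tag's closing ">" followed by ">>"
      .gsub(/>>/, "&raquo;")    # any remaining ">>"
end

# Made-up sample fragment, not taken from the real page.
sample = "<td>>>Essen 1<<</td>"
puts escape_guillemets(sample)
```

The order seems to matter: if the plain `>>` rule ran before the `>>>` rule, the `>` that closes `<td>` would be swallowed into the entity and the tag would be left unterminated.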
cabo / fnsorter.rb
Created March 3, 2011 18:40
Sort filenames in a natural way
# convert a filename into an array suitable for sorting
def filename_to_sortable(fn)
  # convert numeric parts of the string given into actual numbers.
  # Keep a string (third element) to disambiguate
  fn.scan(/(\D*)(\d*)/).map{ |alpha, numeric| [alpha, numeric.to_i, numeric]}
end
# sort filenames in a "natural" way, keeping numeric parts ascending numerically
def sort_filenames(a)
  a.sort_by{ |fn| filename_to_sortable(fn) }
end
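A short usage sketch may help show what "natural" ordering means here. It repeats the two helpers from the snippet above so that it runs on its own; the file names are made up for illustration.

```ruby
# The two helpers from the snippet above, repeated so this block is self-contained.
def filename_to_sortable(fn)
  fn.scan(/(\D*)(\d*)/).map { |alpha, numeric| [alpha, numeric.to_i, numeric] }
end

def sort_filenames(a)
  a.sort_by { |fn| filename_to_sortable(fn) }
end

# Hypothetical file names, chosen to show the difference.
names = ["page10.txt", "page2.txt", "page1.txt"]

p names.sort             # plain string sort: ["page1.txt", "page10.txt", "page2.txt"]
p sort_filenames(names)  # natural sort:      ["page1.txt", "page2.txt", "page10.txt"]
```

The third element of each triple keeps the original digit string, so names that differ only in zero padding (for example `a01` and `a1`) still get distinct sort keys.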
cabo / catdocx.rb
Created January 10, 2011 20:43
quickly extract the text from a MOOX ("OOXML") .docx file
#!/opt/local/bin/ruby1.9
require 'zip/zipfilesystem'
require 'nokogiri'
MS_S = "http://schemas.openxmlformats.org/"
MS_W = MS_S + "wordprocessingml/2006/main"
MS_W_OD = MS_S + "officeDocument/2006/relationships/officeDocument"
ARGV.each do |fn|
  Zip::ZipFile.open(fn) do |zf|
cabo / cleandups.rb
Created October 9, 2010 10:01
Find and clean duplicate files (Ruby)
#!/opt/local/bin/ruby1.9
require 'digest/md5'
require 'shellwords'
# argument processing -- goes through Dir[], so can use '**/*' etc.
ARGV[0] ||= '.'
filenames = ARGV.map do |dirn|
  Dir[if File.directory?(dirn)
        "#{dirn}/*"
      else
        dirn
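The preview stops before the duplicate detection itself. As a rough sketch of how such a step could look, and not the author's actual code, the block below expands an argument the same way the preview begins to and then groups files by their MD5 digest. The names `expand_argument` and `duplicate_groups` are hypothetical, and the sketch only reports duplicates; it does not delete anything.

```ruby
require 'digest/md5'

# Expand one command-line argument as the preview above starts to do:
# a directory becomes "dir/*", anything else is passed to Dir[] as a glob.
# (Hypothetical helper name.)
def expand_argument(arg)
  Dir[File.directory?(arg) ? "#{arg}/*" : arg]
end

# Hypothetical helper: group regular files by the MD5 digest of their
# content and keep only the groups with more than one member.
def duplicate_groups(paths)
  paths.select { |path| File.file?(path) }
       .group_by { |path| Digest::MD5.file(path).hexdigest }
       .values
       .select { |group| group.size > 1 }
end
```

A caller could then run `duplicate_groups(expand_argument('.'))` and decide what to do with each group. Comparing digests is only an assumption about the approach, suggested by the `require 'digest/md5'` line in the preview.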
cabo / gist:605385
Created September 30, 2010 21:50
Making Emacs German-friendly
(set-language-info-alist
"German8" '((tutorial . "TUTORIAL.de")
(charset unicode)
(coding-system utf-8 iso-latin-1 iso-latin-9)
(coding-priority utf-8 iso-latin-1)
(nonascii-translation . iso-8859-1)
(input-method . "german-postfix")
(unibyte-display . iso-latin-1)
(sample-text . "\
German (Deutsch Nord) Guten Tag
cabo / ipv6-address-re.rb
Created February 11, 2010 12:54
IPv6 address RE written in an intention-revealing way
# IPv6 address regular expression done right
# [email protected] 2010-02-11
# from a NANOG thread:
# (corrected version of) http://gist.github.com/294476
# Use the tests in that gist if you actually want to change this
ORIGINAL_IPV6_REGEX = /^\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f