Mark H. Butler butlermh

@butlermh
butlermh / hadoop-lzo-build.sh
Created December 23, 2011 13:31
Snippet from Hadoop-LZO build script
env CFLAGS=-m64 CXXFLAGS=-m64 C_INCLUDE_PATH=$NATIVE/include \
LIBRARY_PATH=$NATIVE/lib LD_LIBRARY_PATH=$NATIVE/lib \
JAVA_LIBRARY_PATH=$NATIVE/lib ant -Dtest.junit.output.format=xml \
-Dtest.output=yes -Dversion=$HADOOP_LZO_VERSION test package published
@butlermh
butlermh / forrest.properties
Created December 23, 2011 14:07
Changes to forrest.properties so Hadoop will build with JDK 1.6
forrest.validate.sitemap=false
forrest.validate.skins.stylesheets=false
@butlermh
butlermh / hadoop-build.sh
Created December 23, 2011 14:15
Snippet from Hadoop build script
# note you need to set the following environment variables
# NATIVE - the native LZO library location
# ANT17 - your Ant 1.7 installation
# FORREST_HOME - your Apache Forrest 0.8 installation
# HADOOP_VERSION - the version you are giving Hadoop
# JAVA_HOME - your Java installation
env CFLAGS=-m64 CXXFLAGS=-m64 C_INCLUDE_PATH=$NATIVE/include \
LIBRARY_PATH=$NATIVE/lib LD_LIBRARY_PATH=$NATIVE/lib
@butlermh
butlermh / lucene4cosine.java
Created January 30, 2013 12:32
Using Lucene 4 to calculate cosine similarity
import java.io.IOException;
import java.util.*;
import java.util.Map;
import java.util.Set;
import org.apache.commons.math3.linear.*;
import org.apache.lucene.analysis.*;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.store.*;
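For reference, the cosine similarity this gist computes is the dot product of two term-frequency vectors divided by the product of their lengths. The following is a minimal standard-library Python sketch of that formula only, not the gist's Lucene code: `term_frequencies` is a hypothetical stand-in for a Lucene analyzer and simply splits on whitespace, and the vectors are kept sparse as counters rather than built with Commons Math.

```python
import math
from collections import Counter


def term_frequencies(text):
    """Count lowercase whitespace-separated tokens.

    Hypothetical stand-in for a real analyzer; no stemming or stop words.
    """
    return Counter(text.lower().split())


def cosine_similarity(freqs_a, freqs_b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Only terms present in both documents contribute to the dot product.
    shared = set(freqs_a) & set(freqs_b)
    dot = sum(freqs_a[t] * freqs_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in freqs_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freqs_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty document has no direction to compare
    return dot / (norm_a * norm_b)
```

Identical documents score 1, documents with no terms in common score 0, and everything else falls in between.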
@butlermh
butlermh / romeBOMTest.java
Created January 30, 2013 12:38
Using Apache Commons IO to fix BOM problems when using the Rome RSS parser library
package rss;
import org.xml.sax.InputSource;
import java.io.*;
import java.net.*;
import com.sun.syndication.io.*;
import org.apache.commons.io.IOUtils;
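The idea behind the fix is to drop a leading byte order mark before the feed reaches the XML parser. The sketch below shows that step in standard-library Python rather than with Rome and Commons IO, so it illustrates the technique and is not the gist's code. The helper names and the `channel/title` lookup are assumptions chosen for the example, and only the UTF-8 BOM is handled.

```python
import codecs
import xml.etree.ElementTree as ET


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def parse_feed_title(data: bytes) -> str:
    """Parse an RSS 2.0 document and return its channel title.

    Hypothetical helper: assumes the root element is <rss> with a <channel> child.
    """
    root = ET.fromstring(strip_utf8_bom(data))
    return root.findtext("channel/title")
```

Stripping at the byte level keeps the XML declaration as the first thing the parser sees, whatever the rest of the feed contains.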
@butlermh
butlermh / MinimalTest.java
Last active December 17, 2015 01:39
A minimal test case that shows the problem I encountered when parsing quoted strings with regular expressions in Parboiled.
import static org.junit.Assert.*;
import static org.parboiled.support.ParseTreeUtils.printNodeTree;
import org.junit.Test;
import org.parboiled.BaseParser;
import org.parboiled.Parboiled;
import org.parboiled.Rule;
import org.parboiled.annotations.BuildParseTree;
import org.parboiled.buffers.IndentDedentInputBuffer;
import org.parboiled.errors.ErrorUtils;
@butlermh
butlermh / gist:7260966
Last active December 27, 2015 03:39
Using YUI to display data.taipei.gov.tw dataset
<!DOCTYPE html>
<title>Displaying a remote JSON DataSource in a DataTable</title>
<body class="yui3-skin-sam">
<div id="datatable"></div>
<script src="http://yui.yahooapis.com/3.5.0/build/yui/yui-min.js"></script>
<script>
YUI().use('datatable', 'datasource-get', 'datasource-io', 'datasource-jsonschema', function (Y) {
var src = new Y.DataSource.Get({
source: 'http://data.taipei.gov.tw/opendata/apply/json/NzRBNTc0NDUtMjMxMi00RTk1LTkxMjgtNzgzMzU5MEQzRDc3'
});
@butlermh
butlermh / gist:7704077
Created November 29, 2013 10:41
Python script to convert Apple forum data set http://sifaka.cs.uiuc.edu/~wang296/Data/index.html to GML so it can be imported into Gephi.
import glob
import os
import networkx as nx
from unidecode import unidecode
from dateutil.parser import parse
''' This file requires the networkx, unidecode and python-dateutil packages
to be installed, e.g.
easy_install networkx
easy_install unidecode
easy_install python-dateutil'''
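To show what the GML output that Gephi imports looks like, here is a small hand-rolled writer using only the standard library. It is a sketch under assumptions, not the gist's method: the gist uses networkx, the `to_gml` function is hypothetical, and the input is taken to be a list of user labels plus (source, target) reply pairs, since the data set's layout is not shown here. Labels are assumed to be ASCII already.

```python
def to_gml(nodes, edges, directed=True):
    """Serialize a graph as GML text.

    nodes: iterable of string labels (duplicates are ignored).
    edges: iterable of (source_label, target_label) pairs.
    """
    # Assign consecutive integer ids in first-seen order.
    ids = {label: i for i, label in enumerate(dict.fromkeys(nodes))}
    lines = ["graph [", f"  directed {1 if directed else 0}"]
    for label, node_id in ids.items():
        # Double quotes inside a GML string are written as an entity.
        safe = label.replace('"', "&quot;")
        lines += ["  node [", f"    id {node_id}", f'    label "{safe}"', "  ]"]
    for source, target in edges:
        lines += [
            "  edge [",
            f"    source {ids[source]}",
            f"    target {ids[target]}",
            "  ]",
        ]
    lines.append("]")
    return "\n".join(lines) + "\n"
```

For example, `to_gml(["alice", "bob"], [("alice", "bob")])` produces one `graph` block with two `node` entries and one `edge` entry.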
@butlermh
butlermh / fft01.m
Created January 6, 2014 22:28
Matlab / Octave scripts from Coursera's Computational Methods for Data Analysis week 1
% clear everything
clear all; close all; clc;
%create a gaussian
L=20;
n=128;
x2=linspace(-L/2,L/2,n+1);
x=x2(1:n);
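The grid above has n+1 points and then discards the last one, which is the usual setup for a discrete Fourier transform: the transform treats the domain as periodic, so the right endpoint would duplicate the left. A standard-library Python sketch of the same construction follows. The Gaussian `exp(-x^2)` is an assumption, since the gist's formula is not shown, and `naive_dft` is a slow direct transform written for illustration, not a replacement for `fft`.

```python
import cmath
import math


def periodic_grid(length, n):
    """Return n evenly spaced points on [-length/2, length/2).

    Equivalent to taking n+1 points including both ends and dropping the last.
    """
    step = length / n
    return [-length / 2 + k * step for k in range(n)]


def naive_dft(samples):
    """Direct O(n^2) discrete Fourier transform with a negative exponent."""
    n = len(samples)
    return [
        sum(samples[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


L = 20
n = 128
x = periodic_grid(L, n)
u = [math.exp(-xi * xi) for xi in x]  # assumed Gaussian profile
ut = naive_dft(u)  # ut[0] is the sum of all samples
```

Choosing n as a power of two, as the gist does, matters for the speed of a real FFT but not for this direct version.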
@butlermh
butlermh / multiModule.sbt
Created January 28, 2014 16:53
Example sbt build file for multi-module project
import sbt._
import Keys._
import sbtassembly.Plugin._
import AssemblyKeys._
import sbt.Package.ManifestAttributes
import sbt.Logger
import com.typesafe.sbt.SbtAtmos.{Atmos, atmosSettings, AtmosKeys}
object HelloSbtBuild extends Build {