John Foley jjfiv

🦀
RiiR
View GitHub Profile
@jjfiv
jjfiv / gist:5068356
Created March 1, 2013 22:15
Generic update or insert of a key within a C++ map, efficiently
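#include <map>
#include <functional> // C++11
#include <utility> // std::make_pair
// general method stolen from: http://stackoverflow.com/questions/97050/stdmap-insert-or-stdmap-find
template <class K, class V>
void setOrUpdate(std::map<K, V> &tree, const K& key, const V& amount, std::function<void (V&)> updater) {
  // lower_bound returns the first element whose key is not less than `key`
  auto ptr = tree.lower_bound(key);
  if(ptr != tree.end() && ptr->first == key) {
    // key already exists
    updater(ptr->second);
  } else {
    // key is absent: insert it, passing ptr as a position hint
    // (this branch is inferred from the gist title; the preview is cut off here)
    tree.insert(ptr, std::make_pair(key, amount));
  }
}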
jjfiv / tsv_to_csv.py
Created November 15, 2013 21:04
TSV to CSV
import sys
with open(sys.argv[1],'r') as fp:
    for line in fp:
        if "\t" not in line:
            continue
        cols = [col.strip('"').strip() for col in line.rstrip().split("\t")]
        print(','.join(cols))
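The script above joins columns with a bare comma, so a field that itself contains a comma would be split into two columns by a CSV reader. A minimal sketch of the same conversion using the standard `csv` module, which quotes such fields, might look like this. The function name `tsv_to_csv` and the choice to work on a string rather than a file are assumptions made for illustration, not part of the gist.

```python
import csv
import io


def tsv_to_csv(tsv_text: str) -> str:
    """Convert tab-separated text to CSV, quoting fields only where needed."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in tsv_text.splitlines():
        # Mirror the original script: skip lines that contain no tab at all.
        if "\t" not in line:
            continue
        # Same cleanup as the original: drop surrounding quotes and whitespace.
        cols = [col.strip('"').strip() for col in line.rstrip().split("\t")]
        writer.writerow(cols)
    return out.getvalue()
```

Calling `tsv_to_csv(open(path).read())` and printing the result would stand in for the original loop.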
package edu.umass.ciir.proteus.athena.experiment.iter;
import org.lemurproject.galago.core.retrieval.iterator.*;
import org.lemurproject.galago.core.retrieval.processing.ScoringContext;
import org.lemurproject.galago.core.retrieval.query.AnnotatedNode;
import org.lemurproject.galago.core.retrieval.query.NodeParameters;
import org.lemurproject.galago.tupleflow.Parameters;
import java.io.IOException;
jjfiv / pom.xml
Created February 13, 2015 16:46
Making an uberjar with Maven
<plugin>
<artifactId>maven-shade-plugin</artifactId>
<version>2.3</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
jjfiv / JSONUtil.java
Created February 27, 2015 21:41
JSON escaping and unescaping that really works, no dependencies.
// BSD License (http://lemurproject.org/galago-license)
package org.lemurproject.galago.utility.json;
public class JSONUtil {
public static String escape(String input) {
StringBuilder output = new StringBuilder();
for(int i=0; i<input.length(); i++) {
char ch = input.charAt(i);
int chx = (int) ch;
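The Java preview stops before the escaping rules appear, so the following is only a sketch of dependency-free JSON string escaping in Python, not a port of `JSONUtil`. The names `escape_json_string` and `round_trips` are hypothetical, and the rules used here (short escapes for quote, backslash and the common control characters, `\uXXXX` for other characters below U+0020) come from the JSON grammar rather than from the gist.

```python
import json


def escape_json_string(text: str) -> str:
    """Escape text for use inside a JSON string literal (quotes not included)."""
    short = {
        '"': '\\"', "\\": "\\\\", "\b": "\\b", "\f": "\\f",
        "\n": "\\n", "\r": "\\r", "\t": "\\t",
    }
    out = []
    for ch in text:
        if ch in short:
            out.append(short[ch])
        elif ord(ch) < 0x20:
            # Remaining control characters must be written as \uXXXX.
            out.append("\\u%04x" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


def round_trips(text: str) -> bool:
    """Check the escaped form against the standard library's JSON parser."""
    return json.loads('"' + escape_json_string(text) + '"') == text
```

`round_trips` is there only as a sanity check against a known-good parser; the escaping itself uses no libraries.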
jjfiv / annotate_in_reasonable_time.java
Created July 7, 2015 18:59
Limit Stanford NLP to a duration.
@SuppressWarnings("deprecation")
public static boolean annotateTimed(Annotation ann) {
// Don't time the initial classifier setup.
nlp.get();
Thread annotateThread = new Thread(() -> {
nlp.get().annotate(ann);
});
annotateThread.start();
// Spend at most 4 seconds annotating any document; after this it might as well be hanging!
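The preview ends before it shows how the thread is awaited or stopped. As a rough Python sketch of the same idea (start the work on a separate thread, wait a bounded time, report whether it finished), assuming a hypothetical helper name `run_with_timeout`:

```python
import threading
import time


def run_with_timeout(task, timeout_seconds: float) -> bool:
    """Run task() on a daemon thread; return True if it finished in time."""
    worker = threading.Thread(target=task, daemon=True)
    worker.start()
    # join() returns after the thread ends or the timeout elapses, whichever is first.
    worker.join(timeout_seconds)
    return not worker.is_alive()


if __name__ == "__main__":
    print(run_with_timeout(lambda: time.sleep(0.5), 0.05))  # task is still running
    print(run_with_timeout(lambda: None, 1.0))              # task finishes at once
```

Python cannot forcibly stop a thread, so a task that overruns keeps running in the background here; the daemon flag only ensures it does not keep the process alive at exit.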
from bs4 import BeautifulSoup # HTML/XML parsing library
# can change this to do a different book:
bookId = 'cu31924020438929'
# Read in the book through the XML library:
xml = None
with open(bookId+'_djvu.xml') as inputFile:
    xml = BeautifulSoup(inputFile.read(), 'lxml')
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import java.io.File;
import java.io.IOException;
jjfiv / webpack.config.js
Created June 20, 2016 14:44
Webpack cfg I use with babel for react...
module.exports = {
context: __dirname + "/app",
entry: {
javascript: ["./app.js"],
html: ["./index.html", "./style.css"],
},
output: {
filename: "app.js",
path: __dirname + "/dist",
@Override
public double evaluate(double[] baseline, double[] treatment) {
double[] boostedBaseline = multiply(baseline, boost);
double baseMean = mean(boostedBaseline);
double treatmentMean = mean(treatment);
double difference = treatmentMean - baseMean;
int batch = 10000;
final int maxIterationsWithoutMatch = 1000000;
long iterations = 0;
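The fragment is cut off before its loop, so what it goes on to compute is not shown. The iteration counters suggest, but do not confirm, a randomization (permutation) test on the difference of means. Under that assumption, a minimal paired version could be sketched in Python as follows; the function name, the one-sided comparison, the sign-flipping scheme and the fixed seed are all illustrative choices, and the `boost` factor from the Java code is left out.

```python
import random


def paired_randomization_test(baseline, treatment, iterations=10000, seed=0):
    """One-sided p-value for mean(treatment) > mean(baseline) on paired scores."""
    if len(baseline) != len(treatment) or not baseline:
        raise ValueError("inputs must be non-empty and of equal length")
    rng = random.Random(seed)
    diffs = [t - b for b, t in zip(baseline, treatment)]
    observed = sum(diffs) / len(diffs)
    at_least_as_large = 0
    for _ in range(iterations):
        # Under the null hypothesis each pair's labels are exchangeable,
        # so the sign of each per-pair difference is flipped at random.
        permuted = sum(d if rng.random() < 0.5 else -d for d in diffs) / len(diffs)
        if permuted >= observed:
            at_least_as_large += 1
    return at_least_as_large / iterations
```

A small returned value means few random relabelings matched or beat the observed difference; identical inputs give 1.0 because every relabeling ties the observed difference of zero.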