String xmlFile = "Wikipedia-Category-GlobalWarming-20090919043526.xml";
InputStream is = WikipediaSeederTest.class.getClassLoader().getResourceAsStream(xmlFile);
A="jar $JOB_JAR com.attinteractive.ar.hadoopjobs.examplejob.Main $1 $2"
B="mvn exec:exec -Dexec.args="
exec $B"$A"
# instant.rake
# Rake rules for compiling and running trivial Java programs
#
# Usage: rake com.example.MonkeyShines
# Source goes under ./src
# Classes end up under ./target
require 'rake/clean'
libs = FileList["lib/*"]
# I don't have a Rails app handy, but I think this works if you have dependent destroy set for the child relationship
post.comments.delete(@some_comment)
package play;
import org.jruby.embed.PathType;
import org.jruby.embed.ScriptingContainer;
public class ClassUseSample {
    public ScriptingContainer container;
    public String file;
    public ClassUseSample(String filename) {
import org.jruby.embed.PathType;
import org.jruby.embed.ScriptingContainer;
public class Embedder {
    public ScriptingContainer container;
    public String file;
    public Embedder(String filename) {
        file = filename;
        container = new ScriptingContainer();
# Use JRuby to read Hadoop sequence files
def load_libs(libs)
  Dir.glob(File.join(libs,"*.jar")).each { |f|
    require f
  }
end
load_libs ENV["HADOOP_HOME"]
load_libs File.join(ENV["HADOOP_HOME"], 'lib')
engine = Moonstone::Engine.new
engine.index(@some_docs)
engine.reader do |r|
  r.terms.for_field('name').sort.should == %w{ burger king depeche mode }.sort
end
#!/bin/bash
# Back up your delicious bookmarks.
# password.txt is expected to hold the credentials in curl's user:password form.
curl -k --user "$(cat password.txt)" -o backup.xml 'https://api.del.icio.us/v1/posts/all'
Yes, sorry.
Basically, what we are trying to do is constrain the effect of the raw frequency (saturate the frequency).
In Lucene this is done by taking the square root of the frequency; another classical approach
is to use the log. Both approaches avoid giving a linear 'importance' to the frequency.
BM25 is a bit trickier: it parametrises the 'saturation' of the frequency with a parameter k1, using the
equation weight(t)/(weight(t)+k1). Usually k1 is set to 2, but it can be tuned per collection.
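
To make the three damping schemes concrete, here is a rough Ruby sketch that compares them side by side. It is only an illustration, not Lucene's or any BM25 library's actual code: the function names are made up, the raw frequency stands in for weight(t), the `1 + log(freq)` form is just one common log variant, and the length normalization that full BM25 applies is left out.

```ruby
# Three ways to damp a raw term frequency so its influence does not grow linearly.
# All names here are hypothetical and chosen for illustration only.

# Square-root damping, the approach described above for Lucene.
def sqrt_saturation(freq)
  Math.sqrt(freq)
end

# Logarithmic damping; 1 + log(freq) is an assumed variant, with 0 for absent terms.
def log_saturation(freq)
  freq.zero? ? 0.0 : 1.0 + Math.log(freq)
end

# BM25-style saturation as written above: weight / (weight + k1).
# The result approaches 1 as the weight grows; k1 controls how quickly.
def bm25_saturation(weight, k1 = 2.0)
  weight.to_f / (weight + k1)
end

[1, 2, 10, 100].each do |freq|
  puts format("freq=%-4d sqrt=%.3f log=%.3f bm25=%.3f",
              freq, sqrt_saturation(freq), log_saturation(freq), bm25_saturation(freq))
end
```

Note that the square root and the log keep growing (slowly) with the frequency, while the BM25 form is bounded by 1, which is what makes k1 a true saturation parameter.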