Fields legacyFields = LegacySchema.NARROW_FIELDS.append( LegacyDOBSchema.NARROW_FIELDS );
Pipe joinedPipe = new CoGroup( legacyPipe, LegacySchema.PERSON_ID, legacyDOBPipe, LegacyDOBSchema.LEGACY_PERSON_ID, legacyFields, new LeftJoin() );
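For readers less familiar with Cascading: the `LeftJoin` joiner passed to `CoGroup` above keeps every tuple from the left-hand pipe and leaves the right-hand fields null when no matching key exists. The plain-Ruby sketch below illustrates only that behaviour. It is not Cascading code; `left_join`, the sample rows, and the key names are hypothetical, and the two sides are assumed to have distinct field names.

```ruby
# Hypothetical helper illustrating left-join semantics on arrays of hashes.
def left_join(left_rows, right_rows, left_key, right_key)
  # Index the right side by its join key so each lookup is a hash access.
  right_index = right_rows.group_by { |row| row[right_key] }
  right_fields = right_rows.flat_map(&:keys).uniq

  left_rows.flat_map do |left|
    matches = right_index[left[left_key]]
    if matches
      # One output row per matching right-hand row.
      matches.map { |right| left.merge(right) }
    else
      # No match: keep the left row and pad the right-hand fields with nil.
      [left.merge(right_fields.map { |field| [field, nil] }.to_h)]
    end
  end
end

people = [{ person_id: 1, name: "Ann" }, { person_id: 2, name: "Bob" }]
dobs   = [{ legacy_person_id: 1, dob: "1970-01-01" }]

joined = left_join(people, dobs, :person_id, :legacy_person_id)
```

Here `joined` keeps both people; the row for person 2 has `nil` for the date-of-birth fields because no right-hand row matches.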
# A bunch of stats functions for quick use on arrays, cobbled together from my own code and a bunch of blogs...
module Stats
  def sumitup
    if is_a?(Hash)
      sum = inject(0) { |total, item| total + (item[1] || 0) }
    end
    if is_a?(Array)
      sum = inject(0) { |total, item| total + (item || 0) }
    end
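The snippet above is cut off, so the following is only a sketch of how such a summing mixin might look when finished, assuming `nil` entries count as zero and that a Hash is summed over its values. The module name `SumSketch` is hypothetical and is not the author's `Stats` module.

```ruby
# Hypothetical mixin: sums an Array's elements or a Hash's values, treating nil as 0.
module SumSketch
  def sumitup
    # For a Hash take the values; for anything else fall back to to_a.
    items = is_a?(Hash) ? values : to_a
    items.inject(0) { |total, item| total + (item || 0) }
  end
end

# Extend individual objects instead of reopening Array and Hash globally.
scores = [1, nil, 2.5].extend(SumSketch)
counts = { "a" => 3, "b" => nil }.extend(SumSketch)
```

Calling `scores.sumitup` and `counts.sumitup` then returns a single number in both cases, which avoids the separate `if` branches each assigning `sum`.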
[jasonamster] wordcount [master*]$ java -cp build/wordcount.jar:/usr/local/src/hadoop-0.20.2/hadoop-0.20.2-core.jar:/usr/local/src/hadoop-0.20.2/lib/log4j-1.2.15.jar:/projects/cascading/build/cascading-1.1.2-wip-dev.jar:/projects/cascading/build/cascading-core-1.1.2-wip-dev.jar:/projects/cascading/build/cascading-test-1.1.2-wip-dev.jar:/projects/cascading/build/cascading-xml-1.1.2-wip-dev.jar:/projects/cascading/lib/janino-2.5.16.jar:/projects/cascading/lib/jgrapht-jdk1.6.jar:/usr/local/src/hadoop-0.20.2/libcommons-cli-1.2.jar:/usr/local/src/hadoop-0.20.2/libcommons-codec-1.3.jar:/usr/local/src/hadoop-0.20.2/libcommons-el-1.0.jar:/usr/local/src/hadoop-0.20.2/libcommons-httpclient-3.0.1.jar:/usr/local/src/hadoop-0.20.2/libcommons-logging-1.0.4.jar:/usr/local/src/hadoop-0.20.2/libcommons-logging-api-1.0.4.jar:/usr/local/src/hadoop-0.20.2/libcommons-net-1.4.1.jar:/usr/local/src/hadoop-0.20.2/libcore-3.1.1.jar:/usr/local/src/hadoop-0.20.2/libhsqldb-1.8.0.10.LICENSE.txt:/usr/local/src/hadoop-0.20.2/libhsqldb-1.8.0
[jasonamster] wordcount [master*]$ java -cp build/wordcount.jar:/usr/local/src/hadoop-0.20.2/hadoop-0.20.2-core.jar wordcount.Main data/url+page.200.txt output local
Exception in thread "main" java.lang.NoClassDefFoundError: cascading/scheme/Scheme
Caused by: java.lang.ClassNotFoundException: cascading.scheme.Scheme
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
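The second invocation above puts only `build/wordcount.jar` and the Hadoop core jar on the classpath. The plain `java` launcher does not load jars nested inside another jar, so `cascading.scheme.Scheme` is not found. Several entries in the longer classpath also read `.../libcommons-...`, which may mean a `/` was dropped when the list was assembled by hand. One way to avoid both problems is to generate the classpath from the jar directories. The sketch below assumes that approach; `build_classpath` is a hypothetical helper, not part of Hadoop or Cascading.

```ruby
# Hypothetical helper: collect every .jar in the given directories into one classpath string.
def build_classpath(jar_dirs, extra_entries = [])
  jars = jar_dirs.flat_map do |dir|
    # File.join supplies the "/" between directory and file name;
    # globbing "*.jar" also skips non-jar files such as LICENSE.txt.
    Dir.glob(File.join(dir, "*.jar")).sort
  end
  # Explicit entries (for example the application jar) come first.
  (extra_entries + jars).join(File::PATH_SEPARATOR)
end

# Example call, using directories that appear in the commands above:
# classpath = build_classpath(
#   ["/usr/local/src/hadoop-0.20.2/lib", "/projects/cascading/build", "/projects/cascading/lib"],
#   ["build/wordcount.jar", "/usr/local/src/hadoop-0.20.2/hadoop-0.20.2-core.jar"]
# )
```

The resulting string can be passed to `java -cp`.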
META-INF/
META-INF/MANIFEST.MF
lib/
wordcount/
lib/cascading-core-1.2-wip-dev.jar
log4j.properties
wordcount/Main$ImportCrawlDataAssembly.class
wordcount/Main$WordCountSplitAssembly.class
wordcount/Main.class
lib/cascading-1.1.2-wip-dev.jar
[jasonamster] lib [master*]$ hadoop jar build/wordcount.jar data/url+page.200.txt output local
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/util/RunJar
module MyModule
  def print_hello
    puts "hello"
  end
end
class MyClass
  include MyModule
end
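As a quick check of what `include` does in the snippet above: it adds the module's methods as instance methods of the including class. The sketch below uses a similar module under the hypothetical names `Greeter`, `Person`, and `Robot` (returning the string instead of printing it, so the result is easy to inspect) and contrasts `include` with `extend`.

```ruby
module Greeter
  def hello
    "hello"
  end
end

class Person
  include Greeter   # instances of Person gain #hello
end

class Robot
  extend Greeter    # the Robot class object itself gains .hello
end

person_greeting = Person.new.hello   # instance method, available via include
robot_greeting  = Robot.hello        # singleton method on the class, via extend
```

With `extend`, instances of `Robot` do not respond to `hello`; only the class does.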
1) Just getting the install process down...
   a) Bootstrapping a server is really confusing
      - you need to install a bunch of libraries
      - you need to install a bunch of gems
      - you need to install chef-solo
      - you need to tell chef-solo to turn itself into a chef-server
   b) validation.pem
      - basically, you have to realize that this is a temporary key, yet you still have to install it manually on every new machine. Is there no better way to do this?
2) Cookbooks: how do you manage them?
   - I had difficulty understanding that chef-server is just a process (or a group of them) that you then give cookbooks to work with
# Taken and Modified from => \
# http://t-a-w.blogspot.com/2010/05/very-simple-parallelization-with-ruby.html
require "rubygems"
require "active_support"
require 'thread'
def Exception.ignoring_exceptions
  begin
    yield
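The snippet above is cut off. Judging by the linked post's title and the `ignoring_exceptions` helper, the idea seems to be to run a block over a collection on several threads while ignoring failures in individual items. The sketch below shows one possible shape of that idea. It is not the blog's or the author's code; `each_in_parallel` and its worker-count parameter are hypothetical.

```ruby
require "thread"

# Hypothetical helper: call the block for each item using `workers` threads.
# Errors raised for one item are swallowed so the remaining items still run.
def each_in_parallel(items, workers = 4)
  queue = Queue.new
  items.each { |item| queue << item }

  threads = Array.new(workers) do
    Thread.new do
      loop do
        item = begin
          queue.pop(true)        # non-blocking pop raises ThreadError once empty
        rescue ThreadError
          break                  # nothing left to do, so stop this worker
        end
        begin
          yield item
        rescue StandardError
          # Deliberately ignored, in the spirit of ignoring_exceptions.
        end
      end
    end
  end

  threads.each(&:join)           # wait for every worker to finish
end
```

A caller collects results through a thread-safe structure such as another `Queue`, since the block runs on several threads at once.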
# Remove unused frameworks and specify which plugins to load
config.frameworks -= [:action_controller, :action_view, :action_mailer, :active_resource]
config.plugins = [:acts_as_state_machine, :masochism, :resque]