Nate Busa natbusa

Director of AI at Neom | AI @Stanford | CTO Program @wharton | Technology Strategy, Innovation, Product Development

natbusa / request.curl

Created February 10, 2014 21:58

request

	> GET /api/v1/breakfast?eggs=2&strips=4&slices=1&juices=1&coffee=2 HTTP/1.1
	> User-Agent: curl/7.29.0
	> Host: localhost:8888
	> Accept: /
	> Content-Type: application/json
	>

natbusa / response.curl

Created February 10, 2014 21:59

	< HTTP/1.1 200 OK
	< Server: spray-can/1.2-RC2
	< Date: Mon, 10 Feb 2014 19:36:43 GMT
	< Content-Type: application/json; charset=UTF-8
	< Content-Length: 247
	<
	{
	"main": {
	"eggs": {
	"num": 2

natbusa / wc.bash

Last active August 29, 2015 13:57

word count: linux

cat lorem.txt | tr [:upper:] [:lower:] | sed -E 's/[^[:alpha:]]+/\n/g' | sort | uniq -c | | awk '{ print($2,"\t",$1) }'

natbusa / wc.scala

Last active August 29, 2015 13:57

word count: scala

	//read in
	val text = scala.io.Source.fromFile("lorem.txt").mkString

	val wc = text.
	toLowerCase.
	split("\\W+").
	groupBy(identity).
	mapValues( _.length )

	//writeout

natbusa / wc.py

Created March 19, 2014 14:40

word count: python

	#readin and filter
	txt = [c for c in open('lorem.txt').read().lower() if c.isalpha() or c==' ']

	#groupBy in a dictionary
	wc = dict()
	for w in ''.join(txt).split():
	wc[w] = wc.setdefault(w, 0) + 1

	#output
	for k,v in wc.iteritems():

natbusa / wc.R

Last active August 29, 2015 13:57

Word count in R

	#read in and clean
	x = scan("lorem.txt", what="", sep=" ")
	x.clean = gsub("\\W+","",tolower(x))

	#word count
	wc = table(x.clean)

	#output to stdout
	print(as.data.frame(wc), row.names=rep("", nrow(wc)))

natbusa / wc.hadoop.examples.sh

Last active August 29, 2015 13:57

word count: hadoop example wordcount

	$HADOOP_HOME/bin/hadoop \
	jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.3.0.jar \
	wordcount wordcount-input wordcount-output

	$HADOOP_HOME/bin/hadoop fs -cat wordcount-output

natbusa / wc.mapreduce.sh

Last active August 29, 2015 13:57

word count: mapreduce in java

	#beam it up to the hdfs
	hadoop fs -mkdir wordcount-input/
	hadoop fs -copyFromLocal lorem.txt wordcount-input/

	# Maven project in https://github.com/natalinobusa/wordcount/hadoop-mapreduce-java

	$HADOOP_HOME/bin/hadoop jar target/wordcount-mapreduce-java-1.0-SNAPSHOT.jar \
	com.natalinobusa.WordCount wordcount-input wordcount-mapreduce-java-output

	$HADOOP_HOME/bin/hadoop fs -cat wordcount-mapreduce-java-output

natbusa / wc.mapreduce.java

Last active August 29, 2015 13:57

word count in hadoop: ol' school map reduce in plain java and mpreduce core libraries, in 59 lines of java

	package com.natalinobusa;

	import org.apache.hadoop.fs.Path;
	import org.apache.hadoop.io.IntWritable;
	import org.apache.hadoop.io.LongWritable;
	import org.apache.hadoop.io.Text;
	import org.apache.hadoop.mapreduce.Job;
	import org.apache.hadoop.mapreduce.Mapper;
	import org.apache.hadoop.mapreduce.Reducer;
	import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

natbusa / wc.HiveQL.sql

Last active August 29, 2015 13:57

word count: hadoop hive using later views and string operators

	-- Hive queries for Word Count
	drop table if exists doc;

	-- 1) create table to load whole file
	create table doc(
	text string
	) row format delimited fields terminated by '\n' stored as textfile;

	--2) loads plain text file
	--if file is .csv then in replace '\n' by ',' in step no 1 (creation of doc table)