var all = {
  //screen
  'screen.width'       : screen.width,
  'screen.height'      : screen.height,
  'screen.availWidth'  : screen.availWidth,
  'screen.availHeight' : screen.availHeight,
  'screen.colorDepth'  : screen.colorDepth,
  'screen.pixelDepth'  : screen.pixelDepth,
  //location
  'location.href'      : location.href
};
@app.route('/word/<keyword>')
def fetch_word(keyword):
    db = get_cassandra()
    pages = []
    results = db.fetchWordResults(keyword)
    for hit in results:
        pages.append(db.fetchPageDetails(hit["url"]))
    return Response(json.dumps(pages), status=200,
                    mimetype='application/json')
class AnalyticsActor extends Actor {
  def actorRefFactory = context
  val dataActor = actorRefFactory.actorOf(Props[NoSqlActor], "cassandra-client")
  val statActor = actorRefFactory.actorOf(Props[StatActor], "statistical-engine")
  def receive = {
    case (a: String, c: String, ctx: RequestContext) =>
      val f: Future[Result] =
class WordCount(args : Args) extends Job(args) {
  TextLine(args("input"))
    .read
    .flatMap('line -> 'word){ line : String => line.split("\\s") }
    .groupBy('word){ group => group.size }
    .write(Tsv(args("output")))
}
A = load 'wordcount-input/lorem.txt' as (line:chararray);
B = foreach A generate FLATTEN(TOKENIZE(line)) as word;
C = foreach B generate LOWER(REPLACE(word,'\\W+','')) as word;
D = group C by word;
E = foreach D generate group, COUNT(C);
store E into 'wordcount-pig-output';
#!/bin/sh
# check if the directory exists on hdfs
$HADOOP_HOME/bin/hadoop fs -ls wordcount-input
if [ $? -ne 0 ]
then $HADOOP_HOME/bin/hadoop fs -mkdir wordcount-input/
fi
# check if lorem.txt exists on hdfs; upload it if missing
$HADOOP_HOME/bin/hadoop fs -ls wordcount-input/lorem.txt
if [ $? -ne 0 ]
then $HADOOP_HOME/bin/hadoop fs -put lorem.txt wordcount-input/
fi
class ScrubFunction extends BaseOperation implements Function
{
  public ScrubFunction( Fields fieldDeclaration )
  {
    super( 1, fieldDeclaration );
  }

  public void operate( FlowProcess flowProcess, FunctionCall functionCall )
  {
    TupleEntry argument = functionCall.getArguments();
#!/usr/bin/env python
import sys

# input comes from STDIN (standard input)
for line in sys.stdin:
    # clean the line and split it into words
    linechars = [c for c in line.lower() if c.isalpha() or c == ' ']
    words = ''.join(linechars).strip().split()
    # emit the key-value pairs, one "word<TAB>1" line per word
    for word in words:
        print('%s\t%s' % (word, 1))
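The mapper above emits one `word<TAB>1` pair per word. A matching Streaming reducer might look like the sketch below; it assumes Hadoop has already sorted the pairs by key between the map and reduce phases, and the `reduce_counts` helper name is introduced here for illustration:

```python
#!/usr/bin/env python
import sys

def reduce_counts(lines):
    """Sum consecutive counts per word; yields (word, total) pairs.

    Relies on the input being sorted by word, so all counts for a
    given word arrive as one consecutive run."""
    current_word, current_count = None, 0
    for line in lines:
        word, _, count = line.rstrip('\n').partition('\t')
        if word == current_word:
            current_count += int(count)
        else:
            # a new word begins: flush the finished run, start a new one
            if current_word is not None:
                yield current_word, current_count
            current_word, current_count = word, int(count)
    if current_word is not None:
        yield current_word, current_count

if __name__ == '__main__':
    for word, total in reduce_counts(sys.stdin):
        print('%s\t%d' % (word, total))
```

Because the logic lives in a generator, the same file works both as a Streaming reducer (reading stdin) and as an importable function for local testing.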
-- Hive queries for Word Count
drop table if exists doc;
-- 1) create a table to load the whole file, one line per row
create table doc(
  text string
) row format delimited fields terminated by '\n' stored as textfile;
-- 2) load the plain text file (path assumes the HDFS dir created above)
-- if the file is a .csv, replace '\n' with ',' in step 1 (creation of the doc table)
load data inpath 'wordcount-input/lorem.txt' into table doc;
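Once the file is loaded, the count itself fits in one query. This is a sketch assuming the `doc` table and its `text` column from the snippet above, using Hive's built-in `split` and `explode`; the whitespace pattern is an assumption about the input:

```
-- 3) split each line on whitespace, explode the words into rows, group
select word, count(1) as cnt
from (select explode(split(text, '\\s')) as word from doc) w
group by word;
```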
package com.natalinobusa;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;