Bahadir Cambel bcambel

The Anatomy of an Onyx Program

In this tutorial, we'll take an in-depth look at what happens when you execute a simple Onyx program. All of the code can be found in the Onyx Starter repository if you'd like to follow along. The code uses the development environment with HornetQ and ZooKeeper running in memory, so you don't need any additional dependencies to run the example on your own machine.

The Workflow

At the core of the program is the workflow: the flow of data that we ingest, transform, and send to an output for storage. In this program, we're going to ingest some sentences from an input source, split each sentence into individual words, play with capitalization, and add a suffix. Finally, we'll send the transformed data to an output.
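The data flow described above can be sketched in plain Python. This is not Onyx code and does not use the Onyx API; it only illustrates the order of the transformations. The function names, the choice of upper-casing for "capitalization", and the `"!"` suffix are all assumptions made for illustration.

```python
# Plain-Python illustration of the described data flow (not Onyx code).
# All names below are hypothetical; they do not come from the Onyx Starter repository.

def split_sentence(sentence):
    """Split one sentence into individual words on whitespace."""
    return sentence.split()


def capitalize(word):
    """Assumption: 'play with capitalization' is modeled as upper-casing."""
    return word.upper()


def add_suffix(word, suffix="!"):
    """Append a suffix to a word; the default suffix is an assumption."""
    return word + suffix


def run_workflow(sentences, suffix="!"):
    """Ingest sentences, transform each word, and collect the output."""
    output = []
    for sentence in sentences:
        for word in split_sentence(sentence):
            output.append(add_suffix(capitalize(word), suffix))
    return output


result = run_workflow(["hello onyx world"])
# result == ['HELLO!', 'ONYX!', 'WORLD!']
```

Each step is a small function that takes a value and returns a new one, which is the shape that makes a chain of transformations easy to rearrange.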

Let's examine the workflow pictorially:

bcambel / run.clj
Created February 28, 2016 11:06
Running pace calculator
(defn pace [i]
  (cond
    ;; After 30K, walk 1K after 4K of running
    (and (> i 30) (= 0 (mod i 4))) 10
    ;; every 10K stop 3 minutes
    (and (> i 20) (= 0 (mod i 15))) 8.5
    :else 6.5))
bcambel / roman.py
Created February 12, 2016 22:52
Convert numbers to Roman numerals
import math
def digit(digits, num, n, upper, outlier, lower):
    multip = math.pow(10, n)
    if num >= multip:
        hundreds = int(num / multip)
        if hundreds == 9:
            digits.append(lower+upper)
            num -= (9 * multip)
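The preview above is cut off, so the rest of the author's approach is not visible. For comparison, here is a minimal sketch of a different, table-driven technique for the same task. It is not the author's method; the names `ROMAN_SYMBOLS` and `to_roman` are hypothetical, and the supported range of 1 to 3999 is an assumption.

```python
# Table-driven conversion: a different technique from the digit() helper above.
# ROMAN_SYMBOLS and to_roman are hypothetical names used for illustration.
ROMAN_SYMBOLS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(num):
    """Convert an integer in the assumed range 1..3999 to a Roman numeral."""
    if not 0 < num < 4000:
        raise ValueError("expected an integer between 1 and 3999")
    parts = []
    for value, symbol in ROMAN_SYMBOLS:
        # Take as many copies of this symbol as fit, then continue with the rest.
        count, num = divmod(num, value)
        parts.append(symbol * count)
    return "".join(parts)


numeral = to_roman(1994)
# numeral == 'MCMXCIV'
```

Listing the subtractive pairs (900, 400, 90, 40, 9, 4) in the table removes the need for special cases such as the `hundreds == 9` branch.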
bcambel / atoi.py
Created February 9, 2016 17:25
Custom String to Integer
import math
def atoi(s):
    return "".join(map(lambda x: str(ord(x)), s))
print([ord(str(i)) for i in range(1,10)])
bcambel / move_to_end.py
Created February 9, 2016 16:57
Move all the zeros in between to the end of the array
def move(arr):
    i = 0
    idx = 0
    while i < len(arr) and idx < 12:
        if arr[i] == 0:
            arr.append(0)
            arr.pop(i)
            i -= 1
        i += 1
        idx += 1
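The preview above mutates the list in place and caps the loop at a fixed number of steps. As a point of comparison, here is a minimal sketch of another way to get the same result by building a new list. The name `move_zeros_to_end` is hypothetical and this is not the author's implementation.

```python
def move_zeros_to_end(arr):
    """Return a new list: non-zero items in their original order, zeros last."""
    non_zeros = [x for x in arr if x != 0]
    # Pad with as many zeros as were filtered out, so the length is unchanged.
    return non_zeros + [0] * (len(arr) - len(non_zeros))


moved = move_zeros_to_end([1, 0, 2, 0, 3])
# moved == [1, 2, 3, 0, 0]
```

Because it never modifies the list it is iterating over, this version needs no index adjustment and no iteration cap.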

import json
import requests
import tqdm
PROJ_URL = "http://server/p/"
ES_URL = 'http://es/hackersome/github_project/'
def update(proj):
    d = requests.get(PROJ_URL + proj + "?json=1").json()
    resp = requests.put(ES_URL+str(d['id']), json.dumps(d))
0: jdbc:drill:zk=local> SELECT count(CONVERT_FROM(row_key, 'UTF8')) AS clickid,
. . . . . . . . . . . > CONVERT_FROM(clicks.clickinfo.url, 'UTF8') AS url
. . . . . . . . . . . > FROM hbase.clicks
. . . . . . . . . . . > GROUP BY clicks.clickinfo.url;
+----------+-------------------------+
| clickid  | url                     |
+----------+-------------------------+
| 3        | http://www.google.com   |
| 2        | http://www.ask.com      |
| 3        | http://www.amazon.com   |
bcambel / first_query.sql
Last active December 1, 2015 20:45
Apache Drill query to group by in Apache HBase
0: jdbc:drill:zk=local> SELECT CONVERT_FROM(row_key, 'UTF8') AS clickid,
. . . . . . . . . . . > CONVERT_FROM(clicks.clickinfo.studentid, 'UTF8') AS studentid,
. . . . . . . . . . . > CONVERT_FROM(clicks.clickinfo.`time`, 'UTF8') AS `time`,
. . . . . . . . . . . > CONVERT_FROM(clicks.clickinfo.url, 'UTF8') AS url
. . . . . . . . . . . > FROM hbase.clicks WHERE clicks.clickinfo.url LIKE '%google%';
+----------+------------+---------------------------+------------------------+
| clickid  | studentid  | time                      | url                    |
+----------+------------+---------------------------+------------------------+
| click1   | student1   | 2014-01-01 12:01:01.0001  | http://www.google.com  |
| click3   | null       | 2014-01-01 01:02:01.0001  | http://www.google.com  |
bcambel / hbase.txt
Last active December 1, 2015 20:36
create 'clicks','clickinfo','iteminfo'
put 'clicks','click1','clickinfo:studentid','student1'
put 'clicks','click1','clickinfo:url','http://www.google.com'
put 'clicks','click1','clickinfo:time','2014-01-01 12:01:01.0001'
put 'clicks','click1','iteminfo:itemtype','image'
put 'clicks','click1','iteminfo:quantity','1'
put 'clicks','click2','clickinfo:url','http://www.amazon.com'
put 'clicks','click2','clickinfo:time','2014-01-01 01:01:01.0001'
put 'clicks','click2','iteminfo:itemtype','image'
put 'clicks','click2','iteminfo:quantity','1'
bcambel / drill_setup.sh
Created November 29, 2015 22:18
Download and start an embedded Drill instance
#!/bin/bash
wget http://getdrill.org/drill/download/apache-drill-1.3.0.tar.gz
tar -xvzf apache-drill-1.3.0.tar.gz
cd apache-drill-1.3.0
./bin/drill-embedded