Aliaksei Belablotski belablotski

  • Microsoft
  • Bellevue, WA
@belablotski
belablotski / AvbTabulator.scala
Last active December 14, 2015 19:45
Spark DataFrame tabular representation
/**
* Tabular representation of a Spark dataset.
* The idea and initial implementation are from http://stackoverflow.com/questions/7539831/scala-draw-table-to-console.
* Usage:
* 1. Import the source into spark-shell:
* set HADOOP_HOME=D:\Java\extra_conf
* cd D:\Java\spark-1.4.1-bin-hadoop2.6\bin
* spark-shell.cmd --master local[2] --packages com.databricks:spark-csv_2.10:1.3.0 -i /path/to/AvbTabulator.scala
* 2. Tabulator usage:
* import org.apache.spark.sql.hive.HiveContext
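The preview above is cut off before the tabulator code itself. As a rough illustration of the console-table idea it refers to, here is a minimal sketch in plain Scala. The names `TabulatorSketch` and `format` are hypothetical and are not taken from the gist; the sketch assumes the first row is the header and that every row has the same number of cells.

```scala
// Hypothetical sketch, not the gist's implementation.
object TabulatorSketch {
  /** Renders rows (first row = header) as a bordered text table. */
  def format(table: Seq[Seq[Any]]): String =
    if (table.isEmpty) ""
    else {
      // Convert every cell to text; null becomes the literal "null".
      val cells: Seq[Seq[String]] = table.map(_.map {
        case null  => "null"
        case other => other.toString
      })
      // Width of each column is the length of its longest cell.
      // Assumption: all rows have equal length, otherwise transpose throws.
      val widths: Seq[Int] = cells.transpose.map(_.map(_.length).max)
      val separator: String = widths.map("-" * _).mkString("+", "+", "+")

      // Left-align each cell by padding it with spaces to the column width.
      def line(row: Seq[String]): String =
        row.zip(widths).map { case (cell, w) => cell.padTo(w, ' ') }.mkString("|", "|", "|")

      val body: Seq[String] = cells.tail.map(line)
      (Seq(separator, line(cells.head), separator) ++ body ++ Seq(separator)).mkString("\n")
    }
}
```

With a Spark DataFrame `df`, the input could plausibly be built as `df.columns.toSeq +: df.collect().map(_.toSeq).toSeq`, which pulls all rows to the driver and is therefore only suitable for small results.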
belablotski / AutoMpgSparkParserLoader.scala
Last active December 12, 2015 04:21
Parse the classic "auto-mpg" stats dataset, convert it to JSON, parallelize it into a Spark RDD, and then create a Spark DataFrame
package com.beloblotskiy.scalascratchpad.temp
/**
* Convert text to JSON.
* Parses the classic "auto-mpg" dataset: https://archive.ics.uci.edu/ml/datasets/Auto+MPG
* @author Aliaksei Belablotski
*/
object TextToJson {
val text = """18.0 8 307.0 130.0 3504. 12.0 70 1 "chevrolet chevelle malibu"
15.0 8 350.0 165.0 3693. 11.5 70 1 "buick skylark 320"
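The preview stops before the conversion logic. A minimal sketch of how one such line might be turned into a JSON object is shown below. The object name `AutoMpgLineToJson`, the method names, and the JSON key spellings are hypothetical and not taken from the gist; the column order follows the UCI Auto MPG attribute list (mpg, cylinders, displacement, horsepower, weight, acceleration, model year, origin, car name). Tokens that do not parse as finite numbers are emitted as `null`, and only backslashes are escaped in the name.

```scala
import scala.util.Try

// Hypothetical sketch, not the gist's implementation.
object AutoMpgLineToJson {
  // Key spellings are this sketch's choice; the order follows the UCI attribute list.
  val numericFields: Seq[String] = Seq(
    "mpg", "cylinders", "displacement", "horsepower",
    "weight", "acceleration", "model_year", "origin")

  // Eight whitespace-separated tokens followed by a double-quoted car name.
  private val Line = """^\s*((?:\S+\s+){8})"([^"]*)"\s*$""".r

  // Values like "3504." are not valid JSON numbers, so they are re-rendered via Double.
  // Unparseable or non-finite tokens become JSON null.
  private def jsonNumber(token: String): String =
    Try(token.toDouble).toOption
      .filter(d => !d.isNaN && !d.isInfinity)
      .map(_.toString)
      .getOrElse("null")

  /** Returns Some(json) for a well-formed line, None otherwise. */
  def toJson(line: String): Option[String] = line match {
    case Line(numbers, name) =>
      val tokens = numbers.trim.split("\\s+")
      val numericPairs = numericFields.zip(tokens).map {
        case (key, value) => "\"" + key + "\":" + jsonNumber(value)
      }
      // The regex already excludes double quotes from the name; escape backslashes only.
      val namePair = "\"car_name\":\"" + name.replace("\\", "\\\\") + "\""
      Some((numericPairs :+ namePair).mkString("{", ",", "}"))
    case _ => None
  }
}
```

Each resulting string is a single-line JSON document, which is the shape Spark's JSON reader expects when it is given an RDD of strings.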
<!DOCTYPE html>
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta charset="utf-8">
<!-- Based on https://github.com/cpettitt/dagre-d3/blob/master/demo/etl-status.html -->
<!-- Sample data are here: https://gist.github.com/beloblotskiy/06583a59c3005d6225835084be35641b -->
<title>Order-Customer-Inventory Jobs Graph</title>
<script src="./js/d3/d3.min.js" charset="utf-8"></script>
// Sample data for https://gist.github.com/beloblotskiy/190be806cb26316b3185a891075380e2
var JobGraph_workers_ORDER_CUSTOMER_INVENTORY = {
"ORDER_DATA_EXTRACTOR_1": {
"execModule": "MSSQL Ext",
"consumers": 1, /* not used */
"jobDesc": "Cyclic, interval: 15 min",
"isWarn": false,
"parentJobs": [],
"inputThroughput": 50 /* not used */ },