Programming / Best Practices:
http://www.slideshare.net/JenAman/rearchitecting-spark-for-performance-understandability-63065166
http://www.slideshare.net/MaksudIbrahimov/spark-performance-tuning-maksud-ibrahimov
http://www.slideshare.net/SparkSummit/spark-summit-eu-talk-by-qifan-pu
http://www.slideshare.net/julesdamji/jump-start-with-apache-spark-20-on-databricks-70214386
https://robertovitillo.com/2015/06/30/spark-best-practices/
https://github.com/beeva/beeva-best-practices/blob/master/big_data/spark/README.md
http://spark.apache.org/docs/latest/tuning.html
https://www.gitbook.com/book/databricks/databricks-spark-knowledge-base/details
https://www.linkedin.com/pulse/9-tips-best-practices-apache-spark-kumar-chinnakali
http://127.0.0.1:4040/api/v1/applications/local-1485042930899/stages/24
case class Person(name: String, age: Int)
val people = List(Person("Guilherme", 35), Person("Isabela", 6), Person("Daniel", 3))
val rdd = sc.parallelize(people)
val df = rdd.toDF
val ds = rdd.toDS
//count letters
rdd.flatMap(p => p.name.toUpperCase.groupBy(n => n).mapValues(_.size)).reduceByKey(_ + _).foreach(println)
rdd.flatMap(p => p.name.toUpperCase).map(c => (c,1)).reduceByKey(_ + _).foreach(println)
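The two RDD pipelines above should give the same per-letter totals. A minimal sketch for cross-checking them locally, assuming plain Scala collections instead of a SparkContext; the `LetterCount` object and `countLetters` helper are hypothetical names used only for illustration:

```scala
// Local cross-check of the letter count, with no SparkContext required.
// Assumption: plain Scala collections stand in for the RDD.
case class Person(name: String, age: Int)

object LetterCount {
  // Upper-case every name, flatten to characters, and count each character.
  def countLetters(people: Seq[Person]): Map[Char, Int] =
    people
      .flatMap(_.name.toUpperCase)
      .groupBy(identity)
      .map { case (c, cs) => c -> cs.size }

  def main(args: Array[String]): Unit = {
    val people = List(Person("Guilherme", 35), Person("Isabela", 6), Person("Daniel", 3))
    // Sort by letter so the printed order is stable, unlike RDD.foreach.
    countLetters(people).toSeq.sortBy(_._1).foreach(println)
  }
}
```

Sorting before printing makes the result easier to compare, since `foreach(println)` on an RDD prints in no guaranteed order.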
defaults write -g InitialKeyRepeat -int 15
defaults write -g KeyRepeat -int 45
defaults write -g ApplePressAndHoldEnabled -bool false
defaults write NSGlobalDomain KeyRepeat -int 2
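import org.apache.spark.sql.types.{StructField, StructType}

sc.setLogLevel("ERROR")
//rename columns
// option 1: pass the full list of new names to toDF
val tmpDf = df.toDF(df.columns.map(x => x.toUpperCase): _*)
// option 2: rename one column at a time with a fold
val dfNew = df.columns.foldLeft(df)((acc, c) => acc.withColumnRenamed(c, c + "x"))
// option 3: rebuild the DataFrame from its RDD with a renamed schema
val newSchema = StructType(df.schema.map(c => StructField(c.name+"xx", c.dataType, c.nullable)))
val dfNew2 = spark.createDataFrame(df.rdd, newSchema)
//add id to columns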
alias beep='afplay /System/Library/Sounds/Ping.aiff -v 100;echo "I beeped!"'
chmod 400 ~/.ssh/field.pem
cat >> ~/.ssh/config << EOF
Host *.field.xxxx.com
  IdentityFile ~/.ssh/field.pem
  CheckHostIP=no
  StrictHostKeyChecking=no
  User centos
  UserKnownHostsFile=/dev/null
  ## Automatically restore a connection if reconnected within 5 minutes (in case the VPN drops)
"//*[local-name()='IdentdEmissor']/text()"
select
  *,
  row_number() over (partition by site,zone,location,prod_code,pallet_sequence order by year,week) as pallet_age
from
(
  select
    *,
    sum(broken_sequence) over (partition by site,zone,location,prod_code order by yearweek) as pallet_sequence
  from
  (
...
STORED AS ORC
TBLPROPERTIES
(
  'orc.create.index'='true',
  'orc.bloom.filter.columns'='field1,field2'
);