Angel Ortega (joyoyoyoyoyo)
🏴 supporting my community
  • InterMedia Advertising
  • SoCal / Remote / LA / IE
@joyoyoyoyoyo
joyoyoyoyoyo / KVMetricsSource.java
Created September 17, 2019 11:32 — forked from ambud/KVMetricsSource.java
Spark Custom Metrics Source
/**
* Copyright 2017 Ambud Sharma
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
@joyoyoyoyoyo
joyoyoyoyoyo / SparkStatsLogger.scala
Created September 17, 2019 11:30 — forked from petrovg/SparkStatsLogger.scala
Logging listener for Spark
import java.io.PrintWriter
import java.util.Properties
import org.apache.spark.SparkContext
import org.apache.spark.scheduler._
import org.apache.spark.storage.RDDInfo
/**
* Created by petrovg on 22/04/2017.
*/
@joyoyoyoyoyo
joyoyoyoyoyo / GitHub protocol comparison.md
Created August 29, 2019 12:03
A comparison of protocols offered by GitHub (for #git on Freenode).

Primary differences between SSH and HTTPS. This post is specifically about accessing Git repositories on GitHub.

Protocols to choose from when cloning:

plain Git, aka git://github.com/

  • Does not add security beyond what Git itself provides. The server is not verified.

If you clone a repository over git://, you should verify that the latest commit's hash matches one obtained over a trusted channel (for example, the repository page served over HTTPS), since the transport itself verifies nothing.
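That verification step amounts to comparing `git rev-parse HEAD` against a hash published over a trusted channel. A minimal sketch, using a throwaway local repository as a stand-in for a `git://` clone (all names and the trusted-hash source are illustrative):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q cloned-repo && cd cloned-repo
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"

# The hash of the commit we actually received over the wire...
received=$(git rev-parse HEAD)
# ...must equal the hash shown by a trusted source (here we reuse the same
# value as a stand-in for the one displayed on the HTTPS web UI).
trusted="$received"
[ "$received" = "$trusted" ] && echo "hash verified"
```

Over SSH or HTTPS this comparison is a belt-and-suspenders check; over git:// it is the only integrity guarantee you get.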

@joyoyoyoyoyo
joyoyoyoyoyo / clean_code.md
Created August 29, 2019 11:59 — forked from wojteklu/clean_code.md
Summary of 'Clean code' by Robert C. Martin

Code is clean if it can be understood easily – by everyone on the team. Clean code can be read and enhanced by a developer other than its original author. With understandability comes readability, changeability, extensibility and maintainability.


General rules

  1. Follow standard conventions.
  2. Keep it simple, stupid (KISS). Simpler is always better. Reduce complexity as much as possible.
  3. Boy scout rule. Leave the campground cleaner than you found it.
  4. Always find root cause. Always look for the root cause of a problem.

Design rules

@joyoyoyoyoyo
joyoyoyoyoyo / animalexample.thrift
Created August 29, 2019 03:28 — forked from myhrvold/animalexample.thrift
Thrift Example: Service-Oriented Architecture at Uber Engineering
struct Animal {
  1: i32 id
  2: string name
  3: string sound
}

exception NotFoundException {
  1: i32 what
  2: string why
}
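The struct and exception above are only type definitions; in Thrift they are typically consumed by a service interface. A hypothetical service using these types might look like:

```thrift
service AnimalService {
  // Look up an animal by id; a missing record is signalled
  // with the typed exception rather than a null value.
  Animal getAnimal(1: i32 id) throws (1: NotFoundException notFound)
}
```

The Thrift compiler would then generate client and server stubs for this interface in each target language.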
@joyoyoyoyoyo
joyoyoyoyoyo / docker-compose.yml
Created August 27, 2019 07:28 — forked from noemi-dresden/docker-compose.yml
Complete compose file for monitoring spark on prometheus
version: "3.1"
services:
  spark-master:
    image: bde2020/spark-master:2.4.0-hadoop2.7
    container_name: spark-master
    ports:
      - "8080:8080"
      - "7077:7077"
    environment:
      - INIT_DAEMON_STEP=setup_spark
@joyoyoyoyoyo
joyoyoyoyoyo / .gitconfig
Created August 17, 2019 08:34 — forked from berngp/.gitconfig
dot.gitconfig
[user]
  name = Your Name
  email = [email protected]
[color]
  ui = true
[core]
  excludesfile = ~/.gitignore_global
  editor = /usr/local/bin/mvim -f
@joyoyoyoyoyo
joyoyoyoyoyo / spark-yarn-emr-client.rb
Created August 11, 2019 15:13 — forked from tamizhgeek/spark-yarn-emr-client.rb
Chef Recipe for remote spark-submit setup to YARN running on Amazon EMR
# setup for the hadoop + spark env for airflow
remote_file "/home/airflow/spark.tgz" do
  source "/remote/spark/download/url"
  owner "airflow"
  group "airflow"
  mode '0755'
  # ::File.exist? replaces the deprecated File.exists? alias
  not_if { ::File.exist?("/home/airflow/spark.tgz") }
end
@joyoyoyoyoyo
joyoyoyoyoyo / helper.gradle
Created July 31, 2019 11:22 — forked from matthiasbalke/helper.gradle
Gradle resolveDependencies Task
// found here: http://jdpgrailsdev.github.io/blog/2014/10/28/gradle_resolve_all_dependencies.html
task resolveDependencies {
    doLast {
        project.rootProject.allprojects.each { subProject ->
            subProject.buildscript.configurations.each { configuration ->
                resolveConfiguration(configuration)
            }
        }
    }
}
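The task body calls a `resolveConfiguration` helper that falls outside this preview. A sketch of what such a helper could look like (hypothetical; newer Gradle versions require checking that a configuration is resolvable before calling `resolve()`):

```groovy
def resolveConfiguration(configuration) {
    // Skip configurations that Gradle marks as not resolvable
    // (the canBeResolved flag exists only on newer Gradle versions).
    if (configuration.hasProperty('canBeResolved') && !configuration.canBeResolved) {
        return
    }
    configuration.resolve()
}
```

Resolving every configuration like this is commonly used to pre-populate the dependency cache for offline builds.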
package net.atos.sparti.pub
import java.io.PrintStream
import java.net.Socket
import org.apache.commons.pool2.impl.{DefaultPooledObject, GenericObjectPool}
import org.apache.commons.pool2.{ObjectPool, PooledObject, BasePooledObjectFactory}
import org.apache.spark.streaming.dstream.DStream
class PooledSocketStreamPublisher[T](host: String, port: Int)