Joshua Buss (chicagobuss)
@chicagobuss
chicagobuss / build_file
Created March 23, 2016 17:56
ignition build file
python_binary(
  name='ignition',
  dependencies=[
    '3rdparty/python:Flask',
    '3rdparty/python:apache-libcloud',
    '3rdparty/python:gevent',
    '3rdparty/python:pycrypto',
  ],
  resources=globs('static/*', 'templates/*'),
  source='ignition.py',
)
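A hedged usage sketch: with the Pants 1.x build tool, a python_binary target like this is normally packaged into a PEX with the command below; the target path src/python/ignition is an assumption, not something stated in the gist.

./pants binary src/python/ignition:ignition   # writes dist/ignition.pex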
@chicagobuss
chicagobuss / amazingapp.conf
Last active April 4, 2016 14:35
How to make a self-signed ssl cert
[ req ]
default_bits = 2048
default_keyfile = keyfile.pem
distinguished_name = req_distinguished_name
prompt = no
req_extensions = x509v3_extensions
[ req_distinguished_name ]
C = US
ST = Illinois
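The config is cut off before the [x509v3_extensions] section it references. Assuming the full amazingapp.conf defines that section, a self-signed cert and key can be generated from it with a standard openssl invocation like the one below (output file names are arbitrary):

openssl req -new -x509 -nodes -days 365 -config amazingapp.conf -keyout keyfile.pem -out amazingapp.crt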
@chicagobuss
chicagobuss / full_error
Last active April 11, 2016 16:41
jupyter kernel config
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-3-65fd6de06602> in <module>()
1 from pyspark.sql import SQLContext
----> 2 df = sqlContext.createDataFrame([("test", 1)])
/usr/lib/spark/python/pyspark/sql/context.py in createDataFrame(self, data, schema, samplingRatio)
428 rdd, schema = self._createFromLocal(data, schema)
429 jrdd = self._jvm.SerDeUtil.toJavaArray(rdd._to_java_object_rdd())
--> 430 jdf = self._ssql_ctx.applySchemaToPythonRDD(jrdd.rdd(), schema.json())
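The gist description mentions a jupyter kernel config, but only the traceback is visible in this preview. For context, a PySpark kernel spec from this Spark 1.6 era usually looks roughly like the sketch below; every path, version, and option here is an assumption rather than content from the gist.

{
  "display_name": "PySpark",
  "language": "python",
  "argv": ["python", "-m", "ipykernel", "-f", "{connection_file}"],
  "env": {
    "SPARK_HOME": "/usr/lib/spark",
    "PYTHONPATH": "/usr/lib/spark/python:/usr/lib/spark/python/lib/py4j-0.9-src.zip",
    "PYSPARK_SUBMIT_ARGS": "--master yarn-client pyspark-shell"
  }
}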
@chicagobuss
chicagobuss / test.json
Created April 15, 2016 16:55
packer json
{
  "variables": {
    "my_key": "{{ env `HOME`}}/.ssh/id_dsa",
    "aws_key": "{{ env `HOME`}}/.ssh/id_aws-test_dsa"
  },
  "builders": [{
    "type": "amazon-ebs",
    "region": "us-east-1",
    "source_ami": "ami-6d1c2007",
    "instance_type": "t2.micro",
@chicagobuss
chicagobuss / core-site.xml
Created April 26, 2016 13:37
spark-s3a-error
<configuration>
  <property>
    <name>fs.s3a.access.key</name>
    <value>S3_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>S3_SECRET_KEY</value>
  </property>
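The gist title points at an s3a error from Spark. With the two keys above set in core-site.xml, a quick smoke test from the PySpark shell looks like the sketch below; the bucket and path are made up. If the underlying error is a missing org.apache.hadoop.fs.s3a.S3AFileSystem class, the hadoop-aws jar and a matching aws-java-sdk jar also have to be on Spark's classpath.

# hypothetical bucket/path, just to exercise the s3a filesystem
lines = sc.textFile("s3a://some-bucket/some/prefix/data.csv")
print(lines.count())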
@chicagobuss
chicagobuss / error.out
Created May 11, 2016 13:48
spark code and file not found error
---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
<ipython-input-12-956d648fbc30> in <module>()
----> 1 df.count()
/usr/lib/spark/python/pyspark/sql/dataframe.py in count(self)
267 2
268 """
--> 269 return int(self._jdf.count())
270
%pyspark
from pyspark.sql import SQLContext
from pyspark.sql.types import *
sqlContext = SQLContext(sc)
lines = sc.textFile("s3n://mahbucket/test/1k.csv")
parts = lines.map(lambda l: l.split(","))
data = parts.map(lambda p: (int(p[0]), float(p[1])))
schemaString = "id value"
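The snippet is cut off after schemaString. The usual continuation of this pattern builds a StructType and applies it, roughly as below; the field types are assumptions inferred from the int/float parsing above, and df.count() is the call that raises the Py4JJavaError shown at the top of the traceback.

# assumed continuation of the truncated snippet, not taken from the gist
fields = [StructField("id", IntegerType(), True), StructField("value", FloatType(), True)]
schema = StructType(fields)
df = sqlContext.createDataFrame(data, schema)
df.count()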
@chicagobuss
chicagobuss / tunnels.sh
Last active September 11, 2017 20:25
SSH Tunnel Examples
# simple local port forwarding example for vault
# - exposes localhost:8200 to 10.20.30.40:8200 via bastion.internal.company
# - bastion.internal.company has to be able to reach 10.20.30.40:8200
ssh -M -S http_vault -fnNT -L 8200:10.20.30.40:8200 <user>@bastion.internal.company
alias tunnel_http='ssh -L <host_a>:<port_a>:<host_c>:<port_c> -i ~/.ssh/id_rsa <host_b>'
# where host_a:port_a is the address you're actually trying to hit from your local box,
# host_b is the host you're able to ssh into from host_a,
# and host_c:port_c is the application you're trying to reach (accessible via this ip/port from host_b)
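A filled-in example of the alias pattern with made-up hostnames, plus the teardown for the control-socket tunnel opened above:

# hypothetical hosts: reach grafana.internal:3000 through jump.example.com, exposed on local port 3000
ssh -L 3000:grafana.internal:3000 -i ~/.ssh/id_rsa jump.example.com

# close the vault tunnel via its control socket when done
ssh -S http_vault -O exit <user>@bastion.internal.company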
@chicagobuss
chicagobuss / job_rec.txt
Created February 15, 2017 23:23
job_rec
Data Engineering
Purpose of the role:
Be part of a new Data Engineering team tasked with building the next generation of data ingestion, processing, and storage frameworks at Citadel. Data engineers will work directly with business customers to understand their processing needs and build solutions that can be repurposed across the entire organization. Team members will come up with creative solutions, on tight deadlines, to real-world business problems that have a significant impact on the business's success. As such, individual engineers will see and feel significant impact and responsibility.
Citadel has unique challenges of scale, and of the rate at which it scales, across compute, data volumes, and user experience. If you enjoy pushing the boundaries of what is possible and building creative solutions, this is the place and role for you.
Key job responsibilities include:
Design, build and support Citadel's data processing platforms
@chicagobuss
chicagobuss / clean_docker.sh
Created July 12, 2017 21:15
Docker cleanup script
#!/bin/bash
echo "Removing exited containers"
for i in $(sudo docker ps -a | grep Exit | awk '{print $1}')
do
sudo docker rm ${i}
done
echo "Removing unused images"
for i in $(sudo docker images | grep -v REPOS | awk '{print $3}')
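On Docker 1.13 and later the same cleanup is available as built-in prune commands; a hedged alternative to the loops above, not part of the original script:

sudo docker container prune -f   # remove all stopped containers
sudo docker image prune -a -f    # remove images not used by any container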