I hereby claim:
- I am chicagobuss on github.
- I am chicagobuss (https://keybase.io/chicagobuss) on keybase.
- I have a public key ASBeYneyoRUzB6cOq47eSqgO3GOxBSF6PL6XqZxo6GWbggo
To claim this, I am signing this object:
# If you're having problems creating an admin role binding in k8s on GKE,
# first make sure you're a project owner in the project the cluster is in.
# Then you can do this:
~$ kubectl create clusterrolebinding foo-cluster-admin-binding --clusterrole=cluster-admin --user="[email protected]"
clusterrolebinding "foo-cluster-admin-binding" created
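# To check ownership first, something like this should list roles/owner for
# your account (the project ID is a placeholder; the email matches the binding above):
~$ gcloud projects get-iam-policy my-gke-project --flatten="bindings[].members" --format="table(bindings.role)" --filter="bindings.members:user:[email protected]"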
#!/bin/bash
LIFECYCLE_POLICY='{"rules":[
  {"rulePriority":10,"description":"keeps 50 latest tagged images","selection":{"tagStatus":"tagged","countType":"imageCountMoreThan","countNumber":50,"tagPrefixList":["v"]},"action":{"type":"expire"}},
  {"rulePriority":20,"description":"keeps 5 latest untagged images","selection":{"tagStatus":"untagged","countType":"imageCountMoreThan","countNumber":5},"action":{"type":"expire"}},
  {"rulePriority":30,"description":"keeps latest 20 numeric-tagged images","selection":{"tagStatus":"tagged","countType":"imageCountMoreThan","tagPrefixList":["0","1","2","3","4","5","6","7","8","9"],"countNumber":20},"action":{"type":"expire"}},
  {"rulePriority":40,"description":"keeps latest 20 a-f tagged images","selection":{"tagStatus":"tagged","countType":"imageCountMoreThan","tagPrefixList":["a","b","c","d","e","f"],"countNumber":20},"action":{"type":"expire"}}
]}'
# Quote the policy so the JSON (which contains spaces) survives word splitting
aws ecr put-lifecycle-policy --region "${AWS_REGION}" --repository-name "${REPO}" --lifecycle-policy-text "${LIFECYCLE_POLICY}" || echo "Failed to apply lifecycle policy to ${REPO}"
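# If you want to confirm the policy landed, the matching read-back call
# (assuming the same REPO and AWS_REGION variables) is:
aws ecr get-lifecycle-policy --region "${AWS_REGION}" --repository-name "${REPO}"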
#!/bin/bash
#
# Author: Joshua Buss (@chicagobuss)
# Usage: elevated.sh [-d]
# Example:
#   ./elevated.sh | grep 'is p'
#   ./elevated.sh -d | grep 'is not'
for id in $(docker ps -a | grep "Up" | awk '{print $1}'); do
  name=$(docker ps -a | grep "${id}" | awk '{print $2}')
  # check whether the container was started with --privileged
  if [ "$(docker inspect --format '{{.HostConfig.Privileged}}' "${id}")" = "true" ]; then
    echo "${name} (${id}) is privileged"
  else
    echo "${name} (${id}) is not privileged"
  fi
done
# Vault forwarding
http_port 8200 accel defaultsite=vault
cache_peer 10.10.10.41 parent 8200 0 proxy-only name=vault1
cache_peer 10.10.50.52 parent 8200 0 proxy-only name=vault2
acl localnet src 10.0.0.0/8   # RFC1918 possible internal network
http_access allow localnet
http_access allow localhost
http_access deny all
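# A quick smoke test of the accelerator from a localnet client (the squid
# hostname is a placeholder; /v1/sys/health is Vault's standard health endpoint):
~$ curl http://squid-host.internal:8200/v1/sys/health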
# First I created a working directory
~$ mkdir s3-kafka
# Then I installed the dependency I needed in that directory with pip
~$ cd s3-kafka
~$ pip install kafka-python -t $(pwd)
# Then I put my code into a file called s3-kafka.py
~$ vi s3-kafka.py
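A minimal sketch of what s3-kafka.py might contain, assuming the goal is to
push lines from an S3 object into a Kafka topic with the kafka-python producer
installed above; the broker address, topic name, and the use of boto3 are all
assumptions (boto3 ships with AWS Lambda but isn't vendored by the pip step):

# s3-kafka.py -- sketch: stream lines from an S3 object into Kafka
import boto3
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="kafka-broker:9092")  # placeholder broker
s3 = boto3.client("s3")

obj = s3.get_object(Bucket="mahbucket", Key="test/1k.csv")  # placeholder object
for line in obj["Body"].iter_lines():
    producer.send("s3-events", line)  # placeholder topic; line is already bytes

producer.flush()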
#!/bin/bash
echo "Removing exited containers"
for i in $(sudo docker ps -a | grep Exit | awk '{print $1}')
do
  sudo docker rm ${i}
done
echo "Removing unused images"
for i in $(sudo docker images | grep -v REPOS | awk '{print $3}')
do
  sudo docker rmi ${i}
done
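# On Docker 1.13 and newer, roughly the same cleanup is built in
# (it prompts before deleting anything):
~$ sudo docker system prune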
Data Engineering

Purpose of the role:
Be part of a new Data Engineering team tasked with building the next generation of data ingestion, processing, and storage frameworks at Citadel. Data engineers will work directly with business customers to understand their processing needs and build solutions that can be repurposed across the entire organization. Team members will come up with creative solutions, on tight deadlines, to real-world business problems that have a significant impact on the business's success. As such, individual engineers will see and feel significant impact and responsibility.

Citadel faces unique challenges of scale, and of rate of scaling, across compute, data volumes, and user experience. If you enjoy pushing the boundaries of what is possible and building creative solutions, this is the place and role for you.

Key job responsibilities include:
Design, build, and support Citadel's data processing platforms
# simple local port forwarding example for vault
# - makes 10.20.30.40:8200 reachable at localhost:8200 via bastion.internal.company
# - bastion.internal.company has to be able to reach 10.20.30.40:8200
ssh -M -S http_vault -fnNT -L 8200:10.20.30.40:8200 [email protected]
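# Because -M/-S created a control socket named http_vault, the backgrounded
# tunnel can later be torn down cleanly with:
ssh -S http_vault -O exit [email protected]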
alias tunnel_http='ssh -L <host_a>:<port_a>:<host_c>:<port_c> -i ~/.ssh/id_rsa <host_b>'

where host_a:port_a is the address you're actually trying to hit from your local box,
host_b is the host you're able to ssh into from host_a,
and host_c:port_c is the application you're trying to reach (and accessible via this ip/port from host_b)
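Plugging in the vault example above as a concrete (hypothetical) reading of the
placeholders, host_a:port_a = localhost:8200, host_b = bastion.internal.company,
and host_c:port_c = 10.20.30.40:8200, which expands to:

~$ ssh -L localhost:8200:10.20.30.40:8200 -i ~/.ssh/id_rsa [email protected]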
%pyspark
from pyspark.sql import SQLContext
from pyspark.sql.types import *

sqlContext = SQLContext(sc)
lines = sc.textFile("s3n://mahbucket/test/1k.csv")
parts = lines.map(lambda l: l.split(","))
data = parts.map(lambda p: (int(p[0]), float(p[1])))

# turn "id value" into typed columns (id is parsed as int, value as float above)
schemaString = "id value"
fields = [StructField(name, typ, True)
          for name, typ in zip(schemaString.split(), [IntegerType(), FloatType()])]
schema = StructType(fields)
df = sqlContext.createDataFrame(data, schema)
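Once the schema is applied, the Spark 1.x-era sqlContext used above can also
query the result with SQL; the temp-table name here is a placeholder:

df.registerTempTable("csv_data")
sqlContext.sql("SELECT id, value FROM csv_data WHERE value > 0.5").show()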