Matt Stump mstump

  • San Francisco, CA
def getMultiSegmentRdd(
sc: SparkContext,
sqlContext: CassandraSQLContext,
keyspace: String,
table: String,
tenantId: Int,
segments: Array[String],
columns: Array[String] = Array()) :
SchemaRDD = {
val source = sc.parallelize(segments).map(Tuple2(tenantId, _))
@steveash
steveash / 00_input.conf
Last active May 6, 2022 03:41
ELK configuration for aggregating cassandra and spark logs
input {
lumberjack {
# The port to listen on
port => 5043
# The paths to your ssl cert and key
ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder/logstash-forwarder.crt"
ssl_key => "/etc/pki/tls/private/logstash-forwarder/logstash-forwarder.key"
# default type, but this will already be set by logstash-forwarder anyways
@beauzeaux
beauzeaux / sqlite2parquet.py
Created April 23, 2015 19:58
sqlite2parquet
import sqlite3
import os
import argparse
try:
import pyspark
import pyspark.sql
except ImportError:
import sys
import os
@niftynei
niftynei / gist:9865193
Created March 30, 2014 00:11
Java Example using Apache's OLTU OAuth2 library
/// GENERAL METHODS FOR OAUTH'ING
/// using the Apache Oltu library
/// https://cwiki.apache.org/confluence/display/OLTU/Index
public String getAuthUrl() {
OAuthClientRequest request = null;
try {
request = OAuthClientRequest
.authorizationLocation("https://www.hackerschool.com/oauth/authorize")
@stuart11n
stuart11n / gist:9628955
Created March 18, 2014 20:34
rename git branch locally and remotely
git branch -m old_branch new_branch # Rename branch locally
git push origin :old_branch # Delete the old branch
git push --set-upstream origin new_branch # Push the new branch, set local branch to track the new remote
# If you're looking into the C10M problem (10 million concurrent connections)
# you might want to play with DPDK (originally proprietary Intel, now open source)
#
# C10M: http://c10m.robertgraham.com/
# DPDK: http://dpdk.org/
#
# This is a quick summary of how to install DPDK on Ubuntu
# running inside VirtualBox on a Mac
# On my Mac:
@chanks
chanks / gist:7585810
Last active July 22, 2025 01:00
Turning PostgreSQL into a queue serving 10,000 jobs per second

RDBMS-based job queues have been criticized recently for being unable to handle heavy loads. And they deserve it, to some extent, because the queries used to safely lock a job have been pretty hairy. SELECT FOR UPDATE followed by an UPDATE works fine at first, but then you add more workers, and each is trying to SELECT FOR UPDATE the same row (and maybe throwing NOWAIT in there, then catching the errors and retrying), and things slow down.

On top of that, they have to actually update the row to mark it as locked, so the rest of your workers are sitting there waiting while one of them propagates its lock to disk (and the disks of however many servers you're replicating to). QueueClassic got some mileage out of the novel idea of randomly picking a row near the front of the queue to lock, but I still can't seem to get more than an extra few hundred jobs per second out of it under heavy load.
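
To make the contention described above concrete, here is a minimal in-memory sketch in Python. It is not PostgreSQL and not the author's code: a `threading.Lock` per job stands in for the row lock, a non-blocking acquire stands in for `NOWAIT`, and the names `Job`, `locked_by`, and `run_workers` are hypothetical, chosen only for illustration. Every worker goes after the same row at the front of the queue, so most attempts fail and are retried.

```python
import threading


class Job:
    """One queue row. All names here are illustrative assumptions."""

    def __init__(self, job_id):
        self.job_id = job_id
        self.row_lock = threading.Lock()  # stands in for the row-level lock
        self.locked_by = None             # stands in for a "locked" column


def run_workers(num_jobs, num_workers):
    """Let workers claim jobs from the front of the queue.

    Returns (claims, total_retries), where claims is a list of
    (worker_id, job_id) pairs.
    """
    jobs = [Job(i) for i in range(num_jobs)]
    claims = []
    claims_lock = threading.Lock()
    retries = [0] * num_workers

    def worker(worker_id):
        while True:
            # Like "SELECT ... ORDER BY id LIMIT 1": every worker sees
            # the same unclaimed row at the front of the queue.
            candidate = next((j for j in jobs if j.locked_by is None), None)
            if candidate is None:
                return  # queue drained

            # Like "FOR UPDATE NOWAIT": fail at once if the row is held.
            if not candidate.row_lock.acquire(blocking=False):
                retries[worker_id] += 1
                continue
            try:
                if candidate.locked_by is not None:
                    # Another worker claimed the row before we got the lock.
                    retries[worker_id] += 1
                    continue
                # Like the follow-up UPDATE that marks the row as locked.
                candidate.locked_by = worker_id
            finally:
                candidate.row_lock.release()

            with claims_lock:
                claims.append((worker_id, candidate.job_id))

    threads = [
        threading.Thread(target=worker, args=(w,)) for w in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return claims, sum(retries)
```

Calling `run_workers(50, 8)` claims each of the 50 jobs exactly once. The retry count varies from run to run and depends on thread scheduling, so it is only a rough stand-in for the wasted lock attempts described above; it does not model the cost of writing the lock to disk or to replicas.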

So, many developers have started going straight t

@stucchio
stucchio / basic_income_monte_carlo.py
Last active August 28, 2021 01:42
Monte Carlo simulation of basic income/basic job calculations, from blog.
from pylab import *
from scipy.stats import *
num_adults = 227e6
basic_income = 7.25*40*50
labor_force = 154e6
disabled_adults = 21e6
current_wealth_transfers = 3369e9
def jk_rowling(num_non_workers):
@elmer-garduno
elmer-garduno / RMQReceiver.scala
Created September 6, 2013 03:01
RabbitMQ Actor with Receiver
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
@hoelzro
hoelzro / lua-set-path.c
Created July 3, 2013 15:10
Override package.path from C
int
luaR_set_package_path(lua_State *L, const char *path)
{
lua_getglobal(L, "package");
lua_pushstring(L, path);
lua_setfield(L, -2, "path");
lua_pop(L, 1);
return 0; /* should check for errors */
}