Kamil Sindi ksindi

@ksindi
ksindi / update.py
Created June 12, 2016 17:33
Recursively merge or update dict-like objects
import collections


def update(d, other):
    """Recursively merge or update dict-like objects.

    >>> from pprint import pprint
    >>> pprint(update({'k1': {'k2': 2}}, {'k1': {'k2': {'k3': 3}}, 'k4': 4}))
    {'k1': {'k2': {'k3': 3}}, 'k4': 4}
    >>> pprint(update({'k1': {'k2': 2}}, {'k1': {'k3': 3}}))
    {'k1': {'k2': 2, 'k3': 3}}
    >>> pprint(update({'k1': {'k2': 2}}, dict()))
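The preview stops before the function body. The sketch below is one way to get the behavior the doctests describe. `merge_nested` is a hypothetical name, and the body is an assumption, not the gist's actual code.

```python
from collections.abc import Mapping


def merge_nested(base, other):
    """Merge `other` into `base` in place and return `base`."""
    for key, value in other.items():
        if isinstance(value, Mapping) and isinstance(base.get(key), Mapping):
            # Both sides are dict-like: merge one level down.
            merge_nested(base[key], value)
        else:
            # Otherwise the value from `other` replaces whatever was there.
            base[key] = value
    return base


merged = merge_nested({'k1': {'k2': 2}}, {'k1': {'k3': 3}})
print(merged)  # {'k1': {'k2': 2, 'k3': 3}}
```

This version mutates `base`; pass a deep copy if the original must be preserved. The sketch imports `Mapping` from `collections.abc` because the `collections.Mapping` alias was removed in Python 3.10.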
ksindi / bokeh_utils.py
Created May 28, 2016 17:56 — forked from robintw/bokeh_utils.py
Bokeh Utils
from bokeh.plotting import figure, ColumnDataSource
from bokeh.models import HoverTool


def scatter_with_hover(df, x, y,
                       fig=None, cols=None, name=None, marker='x',
                       fig_width=500, fig_height=500, **kwargs):
    """
    Plots an interactive scatter plot of `x` vs `y` using bokeh, with automatic
    tooltips showing columns from `df`.
ksindi / spark-csv.ipynb
Created April 26, 2016 02:49 — forked from parente/spark-csv.ipynb
Use spark-csv from Jupyter Notebook
ksindi / spark_notes.md
Created April 25, 2016 21:46 — forked from robcowie/spark_notes.md
Apache Spark Notes

Install Apache Spark (OSX)

$ brew install apache-spark

Run the Spark Python shell

A Python shell with a preconfigured SparkContext (available as sc). It is

ksindi / graceful_int_handler.py
Created March 30, 2016 18:11 — forked from nonZero/graceful_int_handler.py
GracefulInterruptHandler
import signal


class GracefulInterruptHandler(object):

    def __init__(self, sig=signal.SIGINT):
        self.sig = sig

    def __enter__(self):
        self.interrupted = False
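The preview ends inside `__enter__`. The sketch below shows how a context manager of this kind could record a signal and restore the previous handler on exit. `InterruptFlag` and `_on_signal` are hypothetical names, and the rest of the original class may differ. It assumes the code runs in the main thread, where `signal.signal` is allowed.

```python
import signal


class InterruptFlag:
    """Record a signal as a flag instead of interrupting the block."""

    def __init__(self, sig=signal.SIGINT):
        self.sig = sig

    def __enter__(self):
        self.interrupted = False
        # Install our handler and remember the one it replaces.
        self._previous = signal.signal(self.sig, self._on_signal)
        return self

    def _on_signal(self, signum, frame):
        # Only set a flag; the body of the `with` block decides when to stop.
        self.interrupted = True

    def __exit__(self, exc_type, exc_value, traceback):
        # Restore the earlier handler (None means it was not set from Python).
        if self._previous is not None:
            signal.signal(self.sig, self._previous)
        return False
```

A loop inside `with InterruptFlag() as flag:` can check `flag.interrupted` after each unit of work and break cleanly instead of being cut off by `KeyboardInterrupt`.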
ksindi / s3.py
Created March 30, 2016 12:19
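Get JSON data from S3 with boto3
import json
import boto3

# `bucket` (bucket name) and `key` (object key) must be defined before this runs.
s3 = boto3.resource('s3')
obj = s3.Object(bucket, key)
data = obj.get()['Body'].read()  # raw bytes of the object
d = json.loads(data)  # parse the bytes as JSON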
ksindi / iquery.py
Created February 21, 2016 19:20
Yields rows in chunks.
def chunk_query(sql, cursor, chunksize=10):
    """Yields rows in chunks."""
    cursor.execute(sql)
    while True:
        nextrows = cursor.fetchmany(chunksize)
        if not nextrows:
            break
        yield nextrows


def iquery(sql, cursor, chunksize=10):
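The body of `iquery` is not visible in the preview. The sketch below shows the chunked-fetch pattern end to end against an in-memory SQLite database, plus one plausible companion that flattens the chunks into single rows. `fetch_in_chunks`, `iter_rows`, and the `numbers` table are hypothetical, and the flattening behavior is an assumption about what `iquery` might do.

```python
import sqlite3


def fetch_in_chunks(cursor, sql, chunksize=10):
    """Yield lists of rows, at most `chunksize` rows per list."""
    cursor.execute(sql)
    while True:
        rows = cursor.fetchmany(chunksize)
        if not rows:
            break
        yield rows


def iter_rows(cursor, sql, chunksize=10):
    """Yield rows one at a time while still fetching them in chunks."""
    for rows in fetch_in_chunks(cursor, sql, chunksize):
        yield from rows


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE numbers (n INTEGER)')
conn.executemany('INSERT INTO numbers VALUES (?)', [(i,) for i in range(5)])
cur = conn.cursor()

query = 'SELECT n FROM numbers ORDER BY n'
chunk_sizes = [len(chunk) for chunk in fetch_in_chunks(cur, query, chunksize=2)]
print(chunk_sizes)  # [2, 2, 1]
```

Because both functions are generators, at most one chunk of rows is held in Python at a time, which is the point of calling `fetchmany` instead of `fetchall`.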
ksindi / mysqldb_query_generator.py
Created February 20, 2016 17:56 — forked from robcowie/mysqldb_query_generator.py
Memory-efficient, streaming query generator with MySQLdb
from MySQLdb.cursors import SSDictCursor


def iterate_query(query, connection, arraysize=1):
    c = connection.cursor(cursorclass=SSDictCursor)
    c.execute(query)
    while True:
        nextrows = c.fetchmany(arraysize)
        if not nextrows:
            break
ksindi / cumulative_return_ranking.py
Created December 28, 2015 19:36
Show cumulative return and rank
import pandas as pd

df = pd.DataFrame([['2012', 'A', 1], ['2012', 'B', 4], ['2011', 'A', 5], ['2011', 'B', 4]],
                  columns=['year', 'manager', 'return_pct'])
df['total_return'] = (df
                      .groupby('manager')['return_pct']
                      .transform(lambda group: (1 + group / 100.).cumprod().iat[-1])) - 1
df['ranking'] = df.total_return.rank(ascending=False, method='dense')
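To check the arithmetic by hand, the same compounding and dense ranking can be sketched with the standard library only. It reuses the rows from the snippet above; the variable names are illustrative and this is not part of the original gist.

```python
from collections import defaultdict

rows = [('2012', 'A', 1), ('2012', 'B', 4), ('2011', 'A', 5), ('2011', 'B', 4)]

# Compound each manager's yearly returns: (1 + r1) * (1 + r2) * ...
growth = defaultdict(lambda: 1.0)
for _year, manager, return_pct in rows:
    growth[manager] *= 1 + return_pct / 100.0

# Subtract 1 to turn the growth factor back into a return.
total_return = {manager: factor - 1 for manager, factor in growth.items()}

# Dense ranking: the highest return gets rank 1, ties share a rank, no gaps.
distinct = sorted(set(total_return.values()), reverse=True)
ranking = {manager: distinct.index(value) + 1
           for manager, value in total_return.items()}

print({manager: round(value, 4) for manager, value in total_return.items()})
# {'A': 0.0605, 'B': 0.0816}
print(ranking)  # {'A': 2, 'B': 1}
```

Manager A compounds 1% and 5% to 1.01 × 1.05 − 1 = 6.05%, while B compounds 4% twice to 1.04² − 1 = 8.16%, so B ranks first.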
ksindi / reactor.py
Created December 26, 2015 20:46 — forked from jpanganiban/reactor.py
A Very Simple Reactor Pattern Implementation in Python (with Gevent)
#!/usr/bin/env python
from gevent import monkey
monkey.patch_all()  # Patch everything
import gevent
import time


class Hub(object):
    """A simple reactor hub... In async!"""