
Matthew Schmohl schmohlio

@schmohlio
schmohlio / EmbeddedFunctionHandler.py
Created May 24, 2014 01:52
An error handler for functions that are embedded within parallelized functions
"""
EmbeddedFunctionHandler.py
A class that decorates functions that are embedded in, or called from,
other map|reduce functions. Logs each error to a log file, recording
the type of error and the function's *args. Useful given Python's
dynamic typing.
"""
@schmohlio
schmohlio / test1.py
Created May 25, 2014 01:14
test EmbeddedFunctionHandler 1
import numpy as np
from multiprocessing import Pool
def add_one(x):
    return x+1
p = Pool(4)
l = np.arange(0,1000000)
%timeit p.map(add_one, l)
@schmohlio
schmohlio / test2.py
Created May 25, 2014 01:53
test EmbeddedFunctionHandler part 2
EmbeddedFunctionHandler.init_log('file.log', logging.DEBUG)
MAX_ERRORS = 2
@EmbeddedFunctionHandler("catch errors adding 1", MAX_ERRORS)
def add_one2(x):
    return x+1
%timeit map(add_one2,l)
# 1 loops, best of 3: 1.13 s per loop
@schmohlio
schmohlio / test3.py
Created May 25, 2014 02:17
testing EmbeddedFunctionHandler part 3
dirty_ls = list(l)
dirty_ls[600], dirty_ls[20000] = ("kittens","cats")
res = map(add_one2,dirty_ls)
# ***Repl Closed***
@schmohlio
schmohlio / gist:44b36146a54800334e77
Last active August 29, 2015 14:19
flatten nested array, where each level of the tree represents a different category to search by, and preserve traversal metadata.
/**
* @keys n-length list of "labels", where @keys[i] is the category label of the ith level of @dat
* @dat a nested associative array, tree-like, with n+1 levels, where 1..n levels of the tree represent categories
* and the n+1 level is the actual request data.
* returns: list of associative arrays, each representing rows to be inserted into database.
*
* flattens nested array into a list of "rows", where the |rows| == |leaves in tree|.
* each "row" contains the data within the leaves, plus n additional key-value pairs represented by
* label => category (i.e. [...country => 'US', type => 'tablet'] when @keys = ['country', 'type'], and @dat has n+1==3 levels.)
**/
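The flattening described in the comment above can be sketched recursively. This is an illustrative Python version, not the gist's own code; the function name `flatten` and the sample data are assumptions. It also assumes that leaf keys do not collide with the category labels, since a label would overwrite a leaf key of the same name.

```python
def flatten(keys, dat):
    """Flatten a nested dict with len(keys) category levels into a list of rows."""
    if not keys:
        # No category levels left: dat is the leaf payload itself.
        return [dict(dat)]
    label, rest = keys[0], keys[1:]
    rows = []
    for category, subtree in dat.items():
        for row in flatten(rest, subtree):
            # Tag every row from this branch with the category it sits under.
            row[label] = category
            rows.append(row)
    return rows


# Hypothetical sample: two category levels plus one leaf level.
nested = {
    "US": {"tablet": {"requests": 10}, "phone": {"requests": 7}},
    "CA": {"tablet": {"requests": 3}},
}
rows = flatten(["country", "type"], nested)
```

Each leaf produces exactly one row, so the number of rows equals the number of leaves in the tree.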
@schmohlio
schmohlio / two_largest
Last active August 29, 2015 14:19
finding sum of two largest values in list (unordered).
val sample = List(1,2,3,3,5,4,6,7,4)
// a good default if sample is always positive integers.
val START = (-1, -1)
// this is what I meant by default foldLeft (or scanLeft) value
val two_largest = sample.foldLeft(START) { (acc: (Int, Int), n: Int) =>
  val (smaller, larger) = acc
  if (n > larger)
    (larger, n)
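The preview above is cut off, so here is a complete sketch of the same single-pass idea, written in Python rather than Scala. The function name `sum_two_largest` is an assumption, and it starts from negative infinity instead of the gist's `(-1, -1)` default so that it does not depend on the values being positive.

```python
def sum_two_largest(values):
    """Return the sum of the two largest values in an unordered sequence."""
    values = list(values)
    if len(values) < 2:
        raise ValueError("need at least two values")
    smaller, larger = float("-inf"), float("-inf")
    for n in values:
        if n > larger:
            # New maximum: the old maximum becomes the runner-up.
            smaller, larger = larger, n
        elif n > smaller:
            # Not a new maximum, but it beats the current runner-up.
            smaller = n
    return smaller + larger


sample = [1, 2, 3, 3, 5, 4, 6, 7, 4]
total = sum_two_largest(sample)
```

This makes one pass over the list and keeps only two values of state, so it runs in O(n) time without sorting.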
@schmohlio
schmohlio / gist:fe200a77628e28355bb4
Last active August 29, 2015 14:19
finding max number of meeting rooms
meetings_sample = [(0, 30), (60, 90), (20, 65)]
[(0,True),(30,False),(60,True),(90,False),(20,True),(65,False)]
def count_rooms(meeting_events):
    # O(n log n). Sort by time; at equal times, end events (False) come before start events (True).
    meeting_events.sort(key=lambda event: (event[0], event[1]))
    ongoing_meetings = [] # pretend stack
    max_num_meetings = 0 # initialize
    stack_size = 0
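Since the preview stops before the counting loop, the following is one possible complete version of the event-sweep idea. It is a sketch, not the gist's code: the name `max_rooms_needed` is an assumption, and it takes `(start, end)` pairs and builds the `(time, is_start)` events itself.

```python
def max_rooms_needed(meetings):
    """Return the maximum number of meetings that overlap at any moment."""
    events = []
    for start, end in meetings:
        events.append((start, True))   # a meeting begins
        events.append((end, False))    # a meeting ends
    # Tuples sort by time first; at equal times False sorts before True,
    # so a room is freed before it is claimed again.
    events.sort()
    ongoing = 0
    most = 0
    for _, is_start in events:
        if is_start:
            ongoing += 1
            most = max(most, ongoing)
        else:
            ongoing -= 1
    return most


meetings_sample = [(0, 30), (60, 90), (20, 65)]
rooms = max_rooms_needed(meetings_sample)
```

A plain counter replaces the "pretend stack", because only the number of ongoing meetings matters, not which meetings they are.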
#!/usr/bin/env python
'''
JumbleSorter.py
sorts a list of strings and integers, but keeps types at nth element
in list constant in result.
only implement for stdin for now, i.e.,
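The stdin format is cut off in this preview and is not reconstructed here. Independently of that, the core idea in the docstring (sort the values, but keep an integer wherever an integer was and a string wherever a string was) might be sketched as below. The function name `jumble_sort` is an assumption, and the sketch assumes the list holds only `int` and `str` values.

```python
def jumble_sort(items):
    """Sort ints and strs separately, keeping each position's original type."""
    ints = iter(sorted(x for x in items if isinstance(x, int)))
    strs = iter(sorted(x for x in items if isinstance(x, str)))
    # Walk the original list and draw the next value of the matching type.
    return [next(ints) if isinstance(x, int) else next(strs) for x in items]


jumbled = [3, "pear", 1, "apple", 2]
sorted_jumble = jumble_sort(jumbled)
```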
@schmohlio
schmohlio / gist:f3d6866b9b3174f1fb1a
Created June 3, 2015 02:17
copy all missing servers based on instructions
#!/usr/bin/env python
'''
DataSync
Makes instructions to copy datasets to servers missing backups
based on input data.
- Ensure that each data center has a copy of every data set.
- Every dataset is included in at least 1 data center.
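The input format and the rest of the script are not visible in this preview. As a rough sketch of the stated goal only, assume the input is already parsed into a dict mapping each data center id to the set of dataset ids it holds. The function name `copy_instructions`, the `(dataset, source, destination)` tuple shape, and the rule of copying from the lowest-numbered center that holds the dataset are all assumptions.

```python
def copy_instructions(centers):
    """Return (dataset, source_center, destination_center) copy steps."""
    # Every dataset is held by at least one center, so the union is complete.
    all_datasets = set().union(*centers.values())
    instructions = []
    for dataset in sorted(all_datasets):
        # Assumption: copy from the lowest-numbered center that has the dataset.
        source = min(c for c, held in centers.items() if dataset in held)
        for center in sorted(centers):
            if dataset not in centers[center]:
                instructions.append((dataset, source, center))
    return instructions


# Hypothetical input: center id -> dataset ids held there.
centers = {1: {1, 3, 4}, 2: {1, 2}, 3: {2, 3}}
steps = copy_instructions(centers)
```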
@schmohlio
schmohlio / EventMachines.md
Created January 8, 2016 16:19 — forked from eulerfx/EventMachines.md
The relationship between state machines and event sourcing

A state machine is defined as follows:

  • Input - a set of inputs
  • Output - a set of outputs
  • State - a set of states
  • S0 ∈ State - an initial state
  • T : Input * State -> Output * State - a transition function
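To make the definition concrete, here is a small Python sketch of a state machine driven by a transition function of the shape `Input * State -> Output * State`. The `run` helper and the counter example are hypothetical illustrations and do not come from the original note.

```python
def run(transition, initial_state, inputs):
    """Feed inputs through a transition function, collecting the outputs."""
    state = initial_state
    outputs = []
    for item in inputs:
        # T : Input * State -> Output * State
        output, state = transition(item, state)
        outputs.append(output)
    return outputs, state


def counter_transition(command, count):
    """Hypothetical machine: State is an int, Input and Output are strings."""
    if command == "increment":
        return "incremented", count + 1
    return "ignored", count


outputs, final_state = run(counter_transition, 0, ["increment", "noop", "increment"])
```

Here `0` plays the role of S0, and the caller is responsible for supplying the state and keeping the resulting state, which is the state-management concern the note goes on to discuss.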

If you model your services (aggregates, projections, process managers, sagas, whatever) as state machines, one issue to address is management of State. There must be a mechanism to provide State to the state machine, and to persist the resulting State for subsequent retrieval. One way to address this is by storing State in a key-value store. Another way is to use a SQL database. Yet another way is event sourcing. The benefit of event sourcing is that you never need to store State itself. Instead, you rely on the Output of a service to reconstitute State. In order to do that, the state machine transition function needs to be factored into two functions as follows: