I hereby claim:
- I am corey on github.
- I am coreyr (https://keybase.io/coreyr) on keybase.
- I have a public key ASB8VlTKBAJeOdUzWFC-FhjDQgnl_FsisFq2vyVQeSzwewo
To claim this, I am signing this object:
A friend asked me for a few pointers to interesting, mostly recent papers on data warehousing and "big data" database systems, with an eye towards real-world deployments. I figured I'd share the list. It's biased and rather incomplete, but maybe of interest to someone. While many are obvious choices (I've omitted several, like MapReduce), I think there are a few underappreciated gems.
### Dataflow Engines:
Dryad--general-purpose distributed parallel dataflow engine
http://research.microsoft.com/en-us/projects/dryad/eurosys07.pdf
Spark--in-memory dataflow
http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf
# config/initializers/extensions/active_record.rb
module ActiveRecord
  class Base
    class << self
      delegate :pluck, to: :scoped
    end
  end
  class CollectionProxy
    delegate :pluck, to: :scoped
# Have you ever had to sleep() in Capybara-WebKit to wait for AJAX and/or CSS animations?
describe 'Modal' do
  should 'display login errors' do
    visit root_path
    click_link 'My HomeMarks'
    within '#login_area' do
      fill_in 'email', with: '[email protected]'
      fill_in 'password', with: 'test'
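
The snippet above is cut off before it shows how it avoids `sleep()`, so the following is only a minimal sketch of one common alternative, not the author's approach: poll a condition until it holds or a deadline passes. The helper name `wait_until` and its `timeout:` and `interval:` parameters are hypothetical and not part of Capybara.

```ruby
# Hypothetical helper (not a Capybara API): poll the block until it returns
# a truthy value, or raise once the timeout has elapsed.
def wait_until(timeout: 2.0, interval: 0.05)
  # Use the monotonic clock so wall-clock adjustments cannot skew the deadline.
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "condition not met within #{timeout}s"
    end
    sleep interval
  end
end
```

In a test, this would be called with a block that checks the page state, for example `wait_until { page.has_css?('.error') }`, so the test waits only as long as the AJAX call or animation actually takes.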
# activerecord/lib/active_record/associations/builder/belongs_to.rb automagically creates
# the private methods for you if you include counter_cache in your belongs_to association.
# We simply override the basic behavior of these with our own conditions.
#
# For more information, check out:
# https://github.com/rails/rails/blob/733bfa63f5d8d3b963202b6d3e9f00b4db070b91/activerecord/lib/active_record/associations/builder/belongs_to.rb
# Lines 23 - 44
class Inventory < ActiveRecord::Base
  belongs_to :user, counter_cache: true
(ns gist.globhfs
  (:import [cascading.tap GlobHfs]))
;; ### Bucket to Cluster
;;
;; To get tuples back out of our directory structure on S3, we employ
;; Cascading's [GlobHFS](http://goo.gl/1Vwdo) tap, along with an
;; interface tailored for datasets stored in the MODIS sinusoidal
;; projection. For details on the globbing syntax, see
;; [here](http://goo.gl/uIEzu).
# unicorn_rails -c /data/github/current/config/unicorn.rb -E production -D
rails_env = ENV['RAILS_ENV'] || 'production'
# 16 workers and 1 master
worker_processes (rails_env == 'production' ? 16 : 4)
# Load rails+github.git into the master before forking workers
# for super-fast worker spawn times
preload_app true
# If your workers are inactive for a long period of time, they'll lose
# their MySQL connection.
#
# This hack ensures we re-connect whenever a connection is
# lost. Because, really, why not?
#
# Stick this in RAILS_ROOT/config/initializers/connection_fix.rb (or somewhere similar)
#
# From:
# http://coderrr.wordpress.com/2009/01/08/activerecord-threading-issues-and-resolutions/
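
The body of the fix is not shown here, so the following is a rough sketch of the general idea only (reconnect and retry when a connection is lost), not the original patch. It deliberately avoids ActiveRecord: `FlakyConnection`, `ConnectionLost`, and `with_reconnect` are hypothetical names, standing in for a real adapter and its lost-connection error.

```ruby
# Hypothetical stand-in for a database connection that can drop.
class FlakyConnection
  class ConnectionLost < StandardError; end

  attr_reader :reconnects

  def initialize(failures:)
    @failures = failures # number of queries that fail before one succeeds
    @reconnects = 0
  end

  def query(sql)
    if @failures > 0
      @failures -= 1
      raise ConnectionLost, 'server has gone away'
    end
    "ok: #{sql}"
  end

  def reconnect!
    @reconnects += 1
  end
end

# Run the block; on a lost connection, reconnect and retry,
# giving up after max_retries reconnect attempts.
def with_reconnect(conn, max_retries: 1)
  attempts = 0
  begin
    yield conn
  rescue FlakyConnection::ConnectionLost
    raise if attempts >= max_retries
    attempts += 1
    conn.reconnect!
    retry
  end
end
```

A caller would wrap each query, for example `with_reconnect(conn) { |c| c.query('SELECT 1') }`. Capping the retries keeps a genuinely dead server from turning into an infinite loop.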
# Author: Pieter Noordhuis
# Description: Simple demo to showcase Redis PubSub with EventMachine
#
# Requirements:
# - rubygems: eventmachine, thin, cramp, sinatra, yajl-ruby
# - a browser with WebSocket support
#
# Usage:
# ruby redis_pubsub_demo.rb
#