@jvns
jvns / blogs.md
Last active April 16, 2020 09:34
Tech blogs I subscribe to
@debasishg
debasishg / gist:8172796
Last active October 19, 2025 00:47
A collection of links for streaming algorithms and data structures

General Background and Overview

  1. Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
  2. Models and Issues in Data Stream Systems
  3. Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that revisits some of the historical perspectives and how it all began with Flajolet.
  4. Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
  5. [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t
@andrew-michael-lee
andrew-michael-lee / se-install-collectd
Last active December 28, 2015 21:29
A snippet detailing the installation and configuration of collectd on Ubuntu 12.04 with openjdk-7.
# client install
sudo apt-get install libcurl4-gnutls-dev ruby cmake openjdk-7-jdk openjdk-7-source libgcrypt11-dev python-dev git
# server install
sudo apt-get install libcurl4-gnutls-dev ruby cmake openjdk-7-source libgcrypt11-dev librrd4 librrd-dev
# install yajl
git clone https://github.com/lloyd/yajl
cd yajl
./configure
@thcipriani
thcipriani / chef_server_setup
Last active December 26, 2015 05:49
Chef Server Setup
Chef Server Setup
=============
- Ubuntu 12.04 x86_64 (only other option is RHEL 5 or 6)
- Hostname setup:
echo "parabola" > /etc/hostname
hostname -F /etc/hostname
```/etc/hosts
127.0.0.1 fqdn hostname localhost
# (e.g., 127.0.0.1 parabola.tylercipriani.com parabola localhost)
# -*- mode: ruby -*-
# vi: set ft=ruby :
# Vagrant plug-ins in use:
# vagrant-vbguest to ensure all VirtualBox VMs have guest additions
# vagrant-hostmanager to manipulate /etc/hosts on the guest VMs and host machine.
# vagrant-proxyconf to configure an HTTP proxy for apt [requires instructor VM to be booted on an accessible IP]
#
# If vagrant-hostmanager isn't installed edit /etc/hosts on your laptop and place these entries in it.
# 172.16.1.10 web
@obazoud
obazoud / Graphite - Relay - Collectd
Last active March 16, 2017 20:04
Install Graphite / Collectd in Ubuntu 12.04 (precise 64)
ssh -p 2222 -R 2204:192.168.7.176:2204 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o LogLevel=ERROR -o IdentitiesOnly=yes -i ~/.vagrant.d/insecure_private_key [email protected]
@logikal
logikal / monitorama_scatterbrain_notes.md
Created August 16, 2013 17:06
Just some scatterbrained notes I wrote at the end of monitorama.

We need 3 things from our monitoring systems:

- Log aggregation and analysis tools (for deep-dive info)
- Data visualization tools (at-a-glance information, data correlation/causation, pattern identification, easier anomaly detection)
- Non-simple error reporting (to let us know when things are actually going wrong, e.g. rollups, multi-variable alerts, alerts that include more data than 'I passed a threshold')

If I were starting from scratch, this is the architecture I'd build for monitoring.

Logstash -> Riemann and/or Flapjack-> (dataviz) Statsd -> Graphite -> Tasseo & Descartes
                                   |
                                   |--> (alerting) Sensu -> Pagerduty
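
As a rough illustration of the Statsd -> Graphite leg of the diagram above, here is a minimal Python sketch of emitting a counter over UDP in the StatsD line format (`name:value|c`). The helper names `format_counter` and `send_counter`, the metric name, and the default host and port (8125 is the usual StatsD port) are assumptions for illustration and are not part of the original notes.

```python
import socket


def format_counter(name, value=1):
    """Build a StatsD counter line of the form <name>:<value>|c."""
    return "%s:%d|c" % (name, value)


def send_counter(name, value=1, host="127.0.0.1", port=8125):
    """Send one counter increment to a StatsD daemon over UDP.

    host and port are assumed defaults; point them at your own daemon.
    UDP is fire-and-forget, so this does not confirm delivery.
    """
    payload = format_counter(name, value).encode("ascii")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()


if __name__ == "__main__":
    # Hypothetical metric name, chosen only for this example.
    send_counter("deploys.web")
```

Because the transport is UDP, a missing or slow StatsD daemon does not block the application, which is one reason it fits well at the front of a metrics pipeline like this one.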
@kcd83
kcd83 / encrypt_data_bag.rb
Created August 14, 2013 03:29
Standalone script for encrypting a JSON data bag file into an encrypted data bag for Opscode Chef.
#!/usr/bin/env ruby
if ARGV.length < 2
  puts "usage: #{$0} databag.json new_encrypted_databag.json [encrypted_data_bag_secret]"
  exit(1)
end
databag_file = ARGV[0]
out_file = ARGV[1]
if ARGV.length >= 3
@temoto
temoto / helpers_data.py
Last active September 10, 2024 20:12
Part of py-helpers. Gzip compression shortcuts. Encoding. Database helpers. Retry decorator.
def namedlist(typename, field_names):
    """Returns a new subclass of list with named fields.
    >>> Point = namedlist('Point', ('x', 'y'))
    >>> Point.__doc__ # docstring for the new class
    'Point(x, y)'
    >>> p = Point(11, y=22) # instantiate with positional args or keywords
    >>> p[0] + p[1] # indexable like a plain list
    33
    >>> x, y = p # unpack like a regular list
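
The preview is cut off here, so the author's implementation is not shown. What follows is a separate minimal sketch that assumes only the behaviour visible in the doctest lines above (a `list` subclass, positional or keyword construction, a `Typename(fields)` docstring). The name `namedlist_sketch` and the attribute-style access via properties are assumptions for illustration, not the original helper.

```python
def namedlist_sketch(typename, field_names):
    """Return a list subclass whose items can also be read and set by name.

    Hypothetical stand-in written from the doctest alone.
    """
    field_names = tuple(field_names)

    def __init__(self, *args, **kwargs):
        if len(args) > len(field_names):
            raise TypeError("too many positional arguments")
        # Positional values fill the fields in declaration order.
        values = dict(zip(field_names, args))
        for name, value in kwargs.items():
            if name not in field_names:
                raise TypeError("unexpected field %r" % name)
            if name in values:
                raise TypeError("duplicate value for field %r" % name)
            values[name] = value
        missing = [n for n in field_names if n not in values]
        if missing:
            raise TypeError("missing fields: %s" % ", ".join(missing))
        list.__init__(self, [values[n] for n in field_names])

    namespace = {
        "__doc__": "%s(%s)" % (typename, ", ".join(field_names)),
        "__init__": __init__,
    }
    for index, name in enumerate(field_names):
        # Bind index as a default argument so each property keeps its own slot.
        namespace[name] = property(
            lambda self, i=index: self[i],
            lambda self, value, i=index: self.__setitem__(i, value),
        )
    return type(typename, (list,), namespace)


Point = namedlist_sketch("Point", ("x", "y"))
p = Point(11, y=22)
x, y = p      # unpacks like a regular list
p.x = 5       # unlike a namedtuple, fields are mutable
```

Building the class with `type()` keeps the sketch short; the trade-off is that nothing stops a caller from appending to or deleting from the underlying list, after which the named fields no longer line up.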