@hivefans
hivefans / nested_example.bash
Last active March 17, 2020 02:04 — forked from dedico/nested_example.bash
|-|{"files":{"nested_example.bash":{"env":"plain"}},"tag":"bigdata"}
# setup
# delete index
curl -XDELETE 'http://localhost:9200/hotels/'
# create index
curl -XPOST 'http://localhost:9200/hotels/'
# create mapping
curl -XPOST localhost:9200/hotels/nested_hotel/_mapping -d '{
"nested_hotel":{
"properties":{
"rooms": {
@hivefans
hivefans / elasticsearch.yml
Last active March 17, 2020 02:04 — forked from reyjrar/elasticsearch.yml
|-|{"files":{"elasticsearch.yml":{"env":"plain"}},"tag":"bigdata"}
##################################################################
# /etc/elasticsearch/elasticsearch.yml
#
# Base configuration for a write heavy cluster
#
# Cluster / Node Basics
cluster.name: logng
# Node can have arbitrary attributes we can use for routing
@hivefans
hivefans / out_reloadable_copy.rb
Last active March 17, 2020 02:03 — forked from bash0C7/out_reloadable_copy.rb
|-|{"files":{"out_reloadable_copy.rb":{"env":"plain"}},"tag":"bigdata"}
#
# Fluent
#
# Copyright (C) 2011 FURUHASHI Sadayuki
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
@hivefans
hivefans / sync_offsets.rb
Last active March 17, 2020 02:03 — forked from killme2008/sync_offsets.rb
|-|{"files":{"sync_offsets.rb":{"env":"plain"}},"tag":"bigdata"}
####
# Description: a Ruby script to sync consumer offsets with broker offsets.
# Requirements: zookeeper
# sudo gem install zookeeper
#
#####
require 'rubygems'
require 'zookeeper'
require 'socket'
@hivefans
hivefans / scaladays2014.md
Last active March 17, 2020 02:04 — forked from kevinwright/scaladays2014.md
|-|{"files":{"scaladays2014.md":{"env":"plain"}},"tag":"bigdata"}

As compiled by Kevin Wright a.k.a @thecoda

(executive producer of the movie, and I didn't even know it... clever huh?)

Please, please, please: if you know of any slides/code/whatever not on here, then ping me on Twitter or comment on this Gist!

This gist will be updated as and when I find new information. So it's probably best not to fork it, or you'll miss the updates!

Monday June 16th

@hivefans
hivefans / install-parallel-centos-5.sh
Last active March 17, 2020 02:03
|-|{"files":{"install-parallel-centos-5.sh":{"env":"plain"}},"tag":"bigdata"}
#!/bin/bash
# Install parallel on CentOS 5.
# Assumes you are root. Prefix w/ sudo if not.
cd /etc/yum.repos.d/
wget http://download.opensuse.org/repositories/home:tange/CentOS_CentOS-5/home:tange.repo
yum install parallel
@hivefans
hivefans / rb-zookeeper.rb
Last active March 17, 2020 02:03 — forked from isterin/gist:1328677
ruby access zookeeper lock|-|{"files":{"rb-zookeeper.rb":{"env":"plain"}},"tag":"bigdata"}
require 'zookeeper'
class Lock
def initialize(host, root="/my-app")
@zk = Zookeeper.new(host)
@root = root
end
def with_lock(app, timeout, timeout_callback, &block)
new_lock_res = @zk.create(:path => "#{@root}/#{app}-", :sequence => true, :ephemeral => true)
@hivefans
hivefans / kafka.md
Last active March 17, 2020 02:04 — forked from ashrithr/kafka.md
|-|{"files":{"kafka.md":{"env":"plain"}},"tag":"bigdata"}

Introduction to Kafka

Kafka acts as a kind of write-ahead log (WAL): it records messages to a persistent store (disk), and subscribers read those messages and apply the changes to their own stores, each in a time frame appropriate to its own system.

Terminology:

  • Producers send messages to brokers
  • Consumers read messages from brokers
  • Messages are sent to a topic
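
The terminology above can be illustrated with a toy model. This is a minimal in-memory sketch, not Kafka's client API: the `Broker` and `Consumer` classes and the `events` topic are hypothetical names invented for illustration, and the log lives in a Python list rather than on disk. It only shows the idea that a broker keeps an append-only log per topic and that each consumer tracks its own read position, so consumers can catch up at their own pace.

```python
from collections import defaultdict


class Broker:
    """Toy stand-in for a broker: one append-only log per topic (in memory)."""

    def __init__(self):
        self._logs = defaultdict(list)

    def append(self, topic, message):
        """Producer side: append a message and return its offset in the log."""
        log = self._logs[topic]
        log.append(message)
        return len(log) - 1

    def read(self, topic, offset):
        """Consumer side: return every message at or after the given offset."""
        return self._logs[topic][offset:]


class Consumer:
    """Toy consumer that remembers how far it has read in one topic."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.offset = 0

    def poll(self):
        """Fetch unread messages and advance this consumer's own offset."""
        messages = self.broker.read(self.topic, self.offset)
        self.offset += len(messages)
        return messages


broker = Broker()
broker.append("events", "a")
broker.append("events", "b")

fast = Consumer(broker, "events")
slow = Consumer(broker, "events")

first_batch = fast.poll()    # fast reads what exists so far
broker.append("events", "c")
second_batch = fast.poll()   # fast reads only the new message
slow_batch = slow.poll()     # slow catches up on the whole log in one go
```

Because the log is never modified by reads, the slow consumer sees the same messages in the same order as the fast one; only the timing differs.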
@hivefans
hivefans / slideshare-dl.py
Last active April 17, 2022 13:36 — forked from julionc/slideshare-dl.py
|-|{"files":{"slideshare-dl.py":{"env":"plain"}},"tag":"bigdata"}
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
slideshare-dl.py
~~~~~~~~~~~~~~~~
slideshare-dl is a small command-line program
for downloading slides from SlideShare.net
@hivefans
hivefans / tornadocrud.py
Last active March 17, 2020 02:03 — forked from cjgiridhar/tornadocrud.py
|-|{"files":{"tornadocrud.py":{"env":"plain"}},"tag":"bigdata"}
import tornado.ioloop
import tornado.web
from datetime import datetime
import urlparse
from bson.json_util import dumps
import pymongo
from pymongo import Connection
class Home(tornado.web.RequestHandler):