Here is an essay version of my class notes from Class 1 of CS183: Startup. Errors and omissions are my own. Credit for good stuff is Peter’s entirely.
CS183: Startup—Notes Essay—The Challenge of the Future
Purpose and Preamble
#!/usr/bin/python
# coding=utf-8
# Python version of Zach Holman's "spark"
# https://github.com/holman/spark
# by Stefan van der Walt <[email protected]>
"""
USAGE:
#!/bin/bash
# Herein we back up our indexes! This script should run in the evening, after logstash
# rotates to a new ES index and there's no new data coming in to the old one. We grab the
# metadata, compress the data files, create a restore script, and push it all up to S3.
TODAY=$(date +"%Y.%m.%d")
INDEXNAME="logstash-$TODAY" # this had better match the index name in ES
INDEXDIR="/usr/local/elasticsearch/data/logstash/nodes/0/indices/"
BACKUPCMD="/usr/local/backupTools/s3cmd --config=/usr/local/backupTools/s3cfg put"
BACKUPDIR="/mnt/es-backups/"
YEARMONTH=$(date +"%Y-%m")
// convenient Spring JDBC RowMapper for when you want the flexibility of Jackson's TreeModel API
// Note: Jackson can also serialize standard Java Collections (Maps and Lists) to JSON: if you don't need JsonNode,
// it's simpler and more portable to have Spring JDBC simply return a Map or List<Map>.
package org.springframework.jdbc.core;

import java.math.BigDecimal;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
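The snippet above breaks off before the RowMapper implementation itself, but the alternative its header recommends is easy to show. A minimal sketch, assuming a DataSource configured elsewhere and a purely illustrative accounts table:

import java.util.List;
import java.util.Map;
import javax.sql.DataSource;

import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.jdbc.core.JdbcTemplate;

public class RowsToJson {
    // queryForList returns List<Map<String, Object>>, which Jackson can
    // serialize directly -- no custom RowMapper or JsonNode needed.
    // The DataSource and the "accounts" table are assumptions for illustration.
    public static String accountsAsJson(DataSource dataSource) throws Exception {
        JdbcTemplate jdbc = new JdbcTemplate(dataSource);
        List<Map<String, Object>> rows = jdbc.queryForList("SELECT id, balance FROM accounts");
        return new ObjectMapper().writeValueAsString(rows);
    }
}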
# Build and install Python 2.7.3 from source on a Debian/Ubuntu system
sudo apt-get install build-essential libsqlite3-dev zlib1g-dev libncurses5-dev libgdbm-dev libbz2-dev libreadline5-dev libssl-dev libdb-dev
wget http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tgz
tar -xzf Python-2.7.3.tgz
cd Python-2.7.3
./configure --prefix=/usr --enable-shared   # shared libpython, installed over the system prefix
make
sudo make install
cd ..
language: java
env:
  global:
    - SONATYPE_USERNAME=yourusername
    # the secure value is the output of `travis encrypt SONATYPE_PASSWORD=pass`
    - secure: "your encrypted SONATYPE_PASSWORD=pass"
after_success:
  - python addServer.py
  - mvn clean deploy --settings ~/.m2/mySettings.xml
import spark.streaming.StreamingContext._
import spark.streaming.{Seconds, StreamingContext}
import spark.SparkContext._
import spark.storage.StorageLevel
import spark.streaming.examples.twitter.TwitterInputDStream
import com.twitter.algebird.HyperLogLog._
import com.twitter.algebird._

/**
 * Example of using HyperLogLog monoid from Twitter's Algebird together with Spark Streaming's
 * TwitterInputDStream.
 */
Kafka acts as a kind of write-ahead log (WAL): it records messages to a persistent store (disk) and lets subscribers read and apply those changes to their own stores in a time frame appropriate to their own systems.
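As a minimal sketch of that subscriber pattern, using the standard Java consumer client; the "changelog" topic, the group id, and the in-memory Map standing in for the subscriber's own store are all hypothetical:

import java.time.Duration;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ChangelogFollower {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumption: local broker
        props.put("group.id", "follower-1");                // hypothetical group id
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        Map<String, String> localStore = new HashMap<>();   // the subscriber's own store
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("changelog"));  // hypothetical topic
            while (true) {
                // the log is durable, so each subscriber polls at its own pace
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> r : records) {
                    localStore.put(r.key(), r.value());     // apply the change to the local store
                }
            }
        }
    }
}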
Terminology:
public static Random random = new Random(DateTime.Now.Millisecond);

public int chooseWithChance(params int[] args)
{
    /*
     * Randomly chooses an index, weighted by each value's share of the total.
     * e.g.
     * chooseWithChance(1,99) will most probably (99%) return 1, since the index of 99 is 1
     * chooseWithChance(99,1) will most probably (99%) return 0, since the index of 99 is 0
     * chooseWithChance(0,100) will always return 1.
     */
    int total = 0;
    foreach (int chance in args) total += chance;
    int pick = random.Next(total);              // uniform draw in [0, total)
    for (int i = 0, cumulative = 0; i < args.Length; i++)
    {
        cumulative += args[i];
        if (pick < cumulative) return i;        // the draw landed in this slot
    }
    throw new ArgumentException("total chance must be positive");
}
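The cumulative walk above is the standard way to sample a weighted discrete distribution: draw a uniform integer below the total weight, then return the first index whose running total exceeds the draw. One caveat on the seed choice: DateTime.Now.Millisecond yields only 1,000 distinct seeds, so the parameterless new Random() is usually the safer default.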