
Dreamsome onesuper

@onesuper
onesuper / KafkaProducer.java
Created February 24, 2017 02:34 — forked from yaroncon/KafkaProducer.java
Kafka producer, with Kafka-Client and Avro
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.Encoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.EnableAutoConfiguration;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPoolConfig;

public class RedisClient {
    private JedisPool pool;

    @Inject
    public RedisClient(Settings settings) {
        try {
            pool = new JedisPool(new JedisPoolConfig(), settings.get("redis.host"), settings.getAsInt("redis.port", 6379));
        } catch (SettingsException e) {
            // deliberately ignored: the pool stays null when Redis settings are missing
        }
    }
}
@onesuper
onesuper / spark-env.sh
Created October 11, 2016 06:16 — forked from berngp/spark-env.sh
Spark env shell for YARN - Vagrant Hadoop 2.3.0 cluster, pseudo-distributed mode.
#!/usr/bin/env bash
# This file contains environment variables required to run Spark. Copy it as
# spark-env.sh and edit that to configure Spark for your site.
#
# The following variables can be set in this file:
# - SPARK_LOCAL_IP, to set the IP address Spark binds to on this node
# - MESOS_NATIVE_LIBRARY, to point to your libmesos.so if you use Mesos
# - SPARK_JAVA_OPTS, to set node-specific JVM options for Spark. Note that
# we recommend setting app-wide options in the application's driver program.
@onesuper
onesuper / backup.sh
Created October 9, 2016 03:22 — forked from nherment/backup.sh
Backup and restore an Elasticsearch index (shamelessly copied from http://tech.superhappykittymeow.com/?p=296)
#!/bin/bash
# Herein we back up our indexes! Run this around 6pm or so, after logstash
# rotates to a new ES index and there's no new data coming in to the old one. We grab the metadata,
# compress the data files, create a restore script, and push it all up to S3.
TODAY=$(date +"%Y.%m.%d")
INDEXNAME="logstash-$TODAY" # this had better match the index name in ES
INDEXDIR="/usr/local/elasticsearch/data/logstash/nodes/0/indices/"
BACKUPCMD="/usr/local/backupTools/s3cmd --config=/usr/local/backupTools/s3cfg put"
BACKUPDIR="/mnt/es-backups/"
YEARMONTH=$(date +"%Y-%m")
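
The script's date-based index naming (`INDEXNAME="logstash-$TODAY"`, with `TODAY` in `%Y.%m.%d` format) can be sketched in Python; the helper name below is illustrative, not part of the original script:

```python
from datetime import date

# Mirror the script's naming: INDEXNAME="logstash-$TODAY" with TODAY=$(date +"%Y.%m.%d")
def index_name(day: date) -> str:
    return "logstash-" + day.strftime("%Y.%m.%d")

print(index_name(date(2016, 10, 9)))  # logstash-2016.10.09
```

The zero-padded `%Y.%m.%d` form matters: it must match the index name Logstash actually created, or the backup grabs nothing.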
@onesuper
onesuper / yg-env.sh
Last active September 18, 2016 01:15
yg-env.sh
#!/usr/bin/env sh
#
# $ echo 'source path_to_env.sh' >> ~/.bashrc
# $ yg_install
KAFKA_HOST=bj2-storm03:9092
ES_HOST=bj2-storm03:9200
ZK_HOST=bj2-storm03:2181,bj2-storm04:2181,bj2-storm05:2181
STORM_LOG_HOME='/usr/local/storm-default/logs/'
TODAY=$(date '+%Y-%m-%d')
@onesuper
onesuper / 词性标记.md
Created August 10, 2016 02:22 — forked from luw2007/词性标记.md
POS tagging: covers the ICTPOS3.0 POS tag set, the ICTCLAS Chinese POS tag set, the POS tags appearing in the jieba dictionary, and the POS tags that simhash can ignore

Word classes

  • Content words (实词): nouns, verbs, adjectives, status words, distinguishing words, numerals, measure words, pronouns
  • Function words (虚词): adverbs, prepositions, conjunctions, particles, onomatopoeia, interjections.

ICTPOS3.0 POS tag set

n noun

nr person name
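
The preview shows only the first two tags (`n`, `nr`). In the ICTPOS convention, subcategory tags extend their parent class's one-letter prefix, so a prefix check distinguishes noun subtags; this is an illustrative sketch, not the full tag set:

```python
# ICTPOS-style tags: subcategories extend the one-letter class prefix,
# e.g. "n" (noun) -> "nr" (person name). Sketch only; the real set has many more tags.
def is_noun_tag(tag: str) -> bool:
    return tag.startswith("n")

print([t for t in ["n", "nr", "v", "d"] if is_noun_tag(t)])  # ['n', 'nr']
```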

@onesuper
onesuper / tuning_storm_trident.asciidoc
Created July 5, 2016 07:41 — forked from mrflip/tuning_storm_trident.asciidoc
Notes on Storm+Trident tuning

Tuning Storm+Trident

Tuning a dataflow system is easy:

The First Rule of Dataflow Tuning:
* Ensure each stage is always ready to accept records, and
* Deliver each processed record promptly to its destination
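
The First Rule can be made concrete with a toy bounded-queue pipeline (a pure-Python sketch, not the Storm/Trident API): a stage that is "always ready to accept records" is one whose input queue never stays full, and "delivering promptly" means the downstream stage drains records as fast as they arrive:

```python
import queue
import threading

# Toy two-stage pipeline: a bounded queue provides natural backpressure --
# if the downstream stage stalls, put() blocks and the upstream slows down.
def run_pipeline(records):
    q = queue.Queue(maxsize=4)   # small buffer: both stages must keep up
    out = []

    def consumer():
        while True:
            item = q.get()
            if item is None:      # sentinel: no more records
                break
            out.append(item * 2)  # "process" the record promptly

    t = threading.Thread(target=consumer)
    t.start()
    for r in records:
        q.put(r)                  # blocks if the consumer falls behind
    q.put(None)
    t.join()
    return out

print(run_pipeline([1, 2, 3]))  # [2, 4, 6]
```

The bounded `maxsize` is the point: an unbounded buffer hides a slow stage until memory runs out, while a small one surfaces it immediately as upstream blocking.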
@onesuper
onesuper / kafka.md
Created June 30, 2016 08:46 — forked from ashrithr/kafka.md
kafka introduction

Introduction to Kafka

Kafka acts as a kind of write-ahead log (WAL): it records messages to a persistent store (disk) and lets subscribers read and apply those changes to their own stores at a pace appropriate to their systems.

Terminology:

  • Producers send messages to brokers
  • Consumers read messages from brokers
  • Messages are sent to a topic
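
The WAL analogy and the terminology above can be illustrated with a toy in-memory log (a sketch of the model, not the Kafka API): producers append to a topic, and each consumer tracks its own read offset, so subscribers replay the log at their own pace:

```python
# Toy model of a Kafka topic: an append-only log plus per-consumer offsets.
class TopicLog:
    def __init__(self):
        self.log = []            # ordered; durable (on disk) in real Kafka
        self.offsets = {}        # consumer name -> next index to read

    def produce(self, message):
        self.log.append(message)

    def consume(self, consumer):
        start = self.offsets.get(consumer, 0)
        self.offsets[consumer] = len(self.log)
        return self.log[start:]  # each consumer reads from its own offset

topic = TopicLog()
topic.produce("a")
topic.produce("b")
print(topic.consume("c1"))   # ['a', 'b']
topic.produce("c")
print(topic.consume("c1"))   # ['c'] -- only the new messages
print(topic.consume("c2"))   # ['a', 'b', 'c'] -- a new consumer replays everything
```

Because the broker only stores the log and offsets, a slow consumer never blocks a fast one; each reads from where it left off.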
@onesuper
onesuper / add_del_blank.rb
Created December 7, 2015 06:19
Add or delete blanks between Chinese and English words.
#!/usr/bin/env ruby
# Usage: deal_blanks.rb input.txt >out.txt
def isEn(char)
  /\w/.match(char) != nil
end

File.open(ARGV[0], "r") do |file|
  blanks_del = 0
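
The Ruby preview above is cut off; as a hedged Python sketch of the same idea (the exact behavior of the original gist is unknown), a pair of regex substitutions can insert a space wherever a CJK character and an ASCII word character are adjacent:

```python
import re

# Insert a space between adjacent CJK and ASCII word characters, in both directions.
# \u4e00-\u9fff covers the common CJK Unified Ideographs block.
def add_blanks(text: str) -> str:
    text = re.sub(r'([\u4e00-\u9fff])([A-Za-z0-9])', r'\1 \2', text)
    text = re.sub(r'([A-Za-z0-9])([\u4e00-\u9fff])', r'\1 \2', text)
    return text

print(add_blanks("用Python写脚本"))  # 用 Python 写脚本
```

Two passes are needed because a single pattern only catches one ordering of the CJK/ASCII boundary.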
import random

class Pool:
    def __init__(self, names):
        self.names = names

    def pick(self):
        # Guarded random pick: empty pool yields None
        if len(self.names) == 0:
            return None
        return random.choice(self.names)