v5tech

This gist covers a simple Hive GenericUDF in Java that mimics Oracle's NVL2 function.
NVL2 handles nulls by conditional substitution: it returns one value when the input is not null and another when it is.
Included:
1. Input data
2. Expected results
3. UDF code in Java
4. Hive query to demo the UDF
5. Output
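The NVL2 behaviour described above can be sketched in a few lines. This is only an illustration of the semantics, not the gist's Java UDF; the function name nvl2, its parameter names, and the sample rows are assumptions made for the example.

```python
def nvl2(expr1, value_if_not_null, value_if_null):
    """Return value_if_not_null when expr1 is not None, otherwise value_if_null."""
    return value_if_not_null if expr1 is not None else value_if_null


# Hypothetical input rows: (employee, commission). None stands in for SQL NULL.
rows = [("alice", 500), ("bob", None), ("carol", 0)]

# Note that 0 is a real value, not a null, so "carol" counts as having a commission.
labels = [
    (name, nvl2(commission, "has commission", "no commission"))
    for name, commission in rows
]
```

The check is written as "is not None" rather than a truthiness test so that values such as 0 or an empty string are not mistaken for nulls in this sketch.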
package com.jteso.hadoop.contrib.inputformat;
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
This gist includes Oozie workflow components to run a Pig Latin script that parses
(Syslog-generated) log files using regex;
Use case: Count the number of occurrences of processes that got logged, by month,
day and process.
Pictorial overview of workflow:
-------------------------------
http://hadooped.blogspot.com/2013/07/apache-oozie-part-7-oozie-workflow-with_3.html
Includes:
This gist includes a Pig Latin script to parse Syslog-generated log files through a
Java MapReduce program that uses regex;
Use case: Count the number of occurrences of processes that got logged, by month,
day and process.
Related gist that covers the Java code - https://gist.github.com/airawat/5915374
Pig version: 0.10.0
This gist includes a Pig Latin script to parse Syslog-generated log files using regex;
Use case: Count the number of occurrences of processes that got logged, by month,
day and process.
Includes:
---------
Sample data and structure: 01-SampleDataAndStructure
Data and script download: 02-DataAndScriptDownload
Data load commands: 03-HdfsLoadCommands
Pig script: 04-PigLatinScript
This gist includes HiveQL scripts to create an external partitioned table for
Syslog-generated log files using the regex SerDe;
Use case: Count the number of occurrences of processes that got logged, by year, month,
day and process.
Includes:
---------
Sample data and structure: 01-SampleDataAndStructure
Data download: 02-DataDownload
Data load commands: 03-DataLoadCommands
This gist includes a mapper, reducer and driver in Java that can parse log files using
regex; the combiner uses the same code as the reducer.
Use case: Count the number of occurrences of processes that got logged, inception to date.
Includes:
---------
Sample data and scripts for download:01-ScriptAndDataDownload
Sample data and structure: 02-SampleDataAndStructure
Mapper: 03-LogEventCountMapper.java
Reducer: 04-LogEventCountReducer.java
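Reusing the reducer as the combiner works for this kind of count because summation is associative and commutative, so pre-aggregating each split does not change the final totals. A minimal sketch of that property follows; it is not the gist's Java code, and the function name sum_counts and the sample pairs are hypothetical.

```python
from collections import defaultdict


def sum_counts(pairs):
    """Sum the counts per key; serves as both combiner and reducer in this sketch."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


# Hypothetical mapper output from two input splits: one (process, 1) pair per log line.
split_a = [("sshd", 1), ("kernel", 1), ("sshd", 1)]
split_b = [("sshd", 1), ("ntpd", 1)]

# Without a combiner, the reducer sees every raw pair.
without_combiner = sum_counts(split_a + split_b)

# With a combiner, each split is pre-aggregated and the reducer sums the partial totals.
partials = list(sum_counts(split_a).items()) + list(sum_counts(split_b).items())
with_combiner = sum_counts(partials)
```

Both paths yield the same totals; the combiner only reduces the number of pairs that would be shuffled to the reducer.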
This gist includes a mapper and reducer in Python that can parse log files using
regex; Use case: Count the number of occurrences of processes that got logged, by month.
Includes:
---------
Sample data
Review of log data structure
Sample data and scripts for download
Mapper
Reducer
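A regex-based count of logged processes by month could look roughly like the sketch below. It is not the gist's mapper or reducer: the log layout ("Mon DD HH:MM:SS host process[pid]: message"), the pattern LOG_LINE, the function names, and the sample lines are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed syslog layout: "Mon DD HH:MM:SS host process[pid]: message"; the pid is optional.
LOG_LINE = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<process>[^\s\[:]+)(?:\[\d+\])?:"
)


def map_line(line):
    """Mapper step: emit ((month, process), 1) for each line that matches the pattern."""
    match = LOG_LINE.match(line)
    if match:
        yield (match.group("month"), match.group("process")), 1


def count_events(lines):
    """Reducer step: sum the emitted ones per (month, process) key."""
    counts = Counter()
    for line in lines:
        for key, one in map_line(line):
            counts[key] += one
    return counts


# Hypothetical sample lines; the last one does not match and is skipped.
sample = [
    "May  3 11:52:54 host1 sshd[1234]: Accepted password for user",
    "May  3 11:53:31 host1 kernel: usb 1-1: new device",
    "May  4 09:10:00 host1 sshd[1300]: Connection closed",
    "Jun 12 08:00:01 host2 sshd[99]: Connection closed",
    "not a syslog line",
]
counts = count_events(sample)
```

In a Hadoop Streaming job the mapper and reducer would instead read from standard input and write tab-separated key/value lines; the in-memory Counter here only stands in for that shuffle-and-sum step.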
v5tech / app.js
Created August 9, 2014 14:37 — forked from auser/app.js
Rapid chrome app development with angular. http://www.ng-newsletter.com/posts/chrome-apps-on-angular.html
angular.module('myApp', ['ngRoute'])
  .provider('Weather', function() {
    var apiKey = "";
    this.getUrl = function(type, ext) {
      return "http://api.wunderground.com/api/" +
        this.apiKey + "/" + type + "/q/" +
        ext + '.json';
    };
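# This rule configuration is for reference only; it applies to Surge Mac (1.0.7) and later versions.
# Includes the Proxy Group and URL Rewrite features.
# Includes Reject rules for blocking ads, behavior analytics, and usage statistics.
# Blocks Hao123 and Baidu Search; allows Baidu Maps, Baidu Waimai, Baidu Music, Baidu Cloud Drive, and Baidu Baike.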
[General]
# warning, notify, info, verbose
skip-proxy = 192.168.0.0/16, 10.0.0.0/8, 172.16.0.0/12, 100.64.0.0/10, localhost, *.local
bypass-tun = 0.0.0.0/8, 192.168.0.0/16, 10.0.0.0/8, 172.16.0.0/12
loglevel = notify