@airawat
airawat / 00-LogParser-Hive-Regex
Last active October 4, 2020 01:56
Log parser in Hive using regex serde
This gist includes HiveQL scripts that create an external partitioned table for
syslog-generated log files using the regex SerDe.
Use case: Count the number of occurrences of processes that got logged, by year, month,
day and process.
Includes:
---------
Sample data and structure: 01-SampleDataAndStructure
Data download: 02-DataDownload
Data load commands: 03-DataLoadCommands
@airawat
airawat / 00-LogParser-JavaMapReduce-Regex
Last active September 18, 2016 09:36
00-JavaMapperReducerUsingRegex
This gist includes a mapper, reducer and driver in Java that can parse log files using
regex. The code for the combiner is the same as for the reducer.
Use case: Count the number of occurrences of processes that got logged, inception to date.
Includes:
---------
Sample data and scripts for download: 01-ScriptAndDataDownload
Sample data and structure: 02-SampleDataAndStructure
Mapper: 03-LogEventCountMapper.java
Reducer: 04-LogEventCountReducer.java
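The note that the combiner shares the reducer's code holds because summing counts is associative and commutative, so partial sums per input split can be summed again without changing the totals. A minimal sketch of that property follows, written in Python rather than the gist's Java; the function name sum_counts and the split contents are made up for illustration and are not taken from the gist.

```python
from collections import defaultdict


def sum_counts(pairs):
    """Sum the counts per key; usable both as combiner and as reducer."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return sorted(totals.items())


# Hypothetical mapper output from two input splits (process name, 1).
split_a = [("sshd", 1), ("init", 1), ("sshd", 1)]
split_b = [("sshd", 1), ("kernel", 1)]

# Reducer only: every (process, 1) pair is sent to the reducer.
without_combiner = sum_counts(split_a + split_b)

# Combiner per split, then the same function again as the reducer.
with_combiner = sum_counts(sum_counts(split_a) + sum_counts(split_b))

print(without_combiner == with_combiner)
```

Reusing one function this way only works for operations like sum, min or max; an average, for example, would need a different combiner.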
@airawat
airawat / 00-LogParser-PythonMR-UsingRegex
Last active December 19, 2015 06:59
Mapper and Reducer in python for log parsing using python regex
This gist includes a mapper and reducer in Python that can parse log files using
regex. Use case: Count the number of occurrences of processes that got logged, by month.
Includes:
---------
Sample data
Review of log data structure
Sample data and scripts for download
Mapper
Reducer
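The use case above (count of logged processes by month) can be sketched as follows. This is not the gist's own mapper or reducer: the regex assumes the common syslog layout "Mon DD HH:MM:SS host process[pid]: message", and the names LOG_LINE, map_line, reduce_pairs and the sample lines are all hypothetical.

```python
import re
from collections import Counter

# Assumed syslog layout: "Mon DD HH:MM:SS host process[pid]: message".
LOG_LINE = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<process>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?:"
)


def map_line(line):
    """Emit ("<month>-<process>", 1), or None if the line does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    return (f"{match.group('month')}-{match.group('process')}", 1)


def reduce_pairs(pairs):
    """Sum the counts for each "<month>-<process>" key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


# Invented sample lines, for illustration only.
sample = [
    "May  3 11:52:54 host01 init: example message",
    "May  3 11:53:31 host01 kernel: example message",
    "May  4 08:00:00 host01 init: example message",
    "Jun  1 09:00:00 host01 sshd[4321]: example message",
    "not a syslog line",
]

pairs = [pair for pair in map(map_line, sample) if pair is not None]
counts = reduce_pairs(pairs)
print(counts)
```

In a Hadoop Streaming job the mapper and reducer would instead read lines from standard input and write tab-separated key/value pairs to standard output; the in-memory version here only shows the parsing and counting logic.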