Created
September 3, 2012 02:34
[urchin] logformat conf for Amazon S3
#-----------------------------------------------------------------------------------------------
# Urchin Logformat Map - Custom Format
#
# Urchin uses this file to determine which fields are contained in the log file.
# The file contains name/value pairs which affect the parsing of the data fields.
# The log file can contain lines with up to two different formats which are
# denoted as Primary and Secondary in the name/value pairs below.
#
# Lines beginning with a '#' are ignored. Fields in this file are separated by
# whitespace (spaces or tabs). Any fields that have whitespace in them must be
# surrounded by quotes. The ID numbers used in the PrimaryPositions and
# SecondaryPositions are defined in the fieldlist.txt file. Values entered as "-"
# are considered to be empty. To enter a literal "-" character, escape it with
# a backslash. To enter a literal backslash, escape it with a backslash.
#
# Name:                 Available Value(s)
# -----                 ------------------
# PrimaryPositions      AUTO or comma separated list of field IDs from fieldlist.txt
#                       Use 0 as the field ID for unused fields in your log file.
# PrimaryKey            - or character that distinguishes this line from the Secondary format
# PrimaryContent        Hit, Item, or Transaction
# SecondaryPositions    AUTO, -, or comma separated list of field IDs from fieldlist.txt
#                       Use 0 as the field ID for unused fields in your log file.
# SecondaryKey          - or character that distinguishes this line from the Primary format
# SecondaryContent      Hit, Item, Transaction, or -
# CommentKey            - or character that signals the line in the log file is a comment line
# FieldSeparator[1,2]   - or character used to separate the fields (space = \s, tab = \t)
# QuotesEscapeSep       Yes or No   Specifies to ignore field separators when inside quotes
# BracketsEscapeSep     Yes or No   Specifies to ignore field separators when inside brackets
# MergeSuccessiveSep    Yes or No   Specifies to interpret successive separators as one
# CleanWhiteSpace       Yes or No   Specifies to remove white space from the ends of the field
# StatusRequired        Yes or No   Specifies whether hits must have a valid status value
# CustomDateFormat      - or format used by strptime (%Y = 4 digit year, %m = month, %d = day)
# CustomTimeFormat      - or format used by strptime (%H = 00-23 hour, %M = minutes, %S = seconds)
# TimeZoneOffset        0 or +/-HHMM offset from GMT in which the date/time is recorded. Set to 0
#                       for timestamps in GMT or for timestamps that contain timezone offsets
#-----------------------------------------------------------------------------------------------
PrimaryPositions: "201,202,3,12,203,204,205,206,6,10,207,11,208,209,210,15,13,211"
SecondaryPositions: -
PrimaryKey: -
SecondaryKey: -
PrimaryContent: HIT
SecondaryContent: -
CommentKey: #
FieldSeparator1: \s
FieldSeparator2: \t
QuotesEscapeSep: YES
BracketsEscapeSep: YES
MergeSuccessiveSep: NO
CleanWhiteSpace: NO
StatusRequired: YES
CustomDateFormat: "%m/%d/%y"
CustomTimeFormat: "%H:%M:%S"
TimeZoneOffset: 0
#-----------------------------------------------------------------------------------------------
# This file can also be used to specify custom filters. These filters are
# specified using the format listed below. The field definitions are as follows:
#
# Field:      Definition/Value(s)
# ------      -------------------
# ID          ID number of field to store the data in
# TYPE        CALC (represents custom calculated field)
# NAME        User-friendly name for the field specified in ID
# SRC-A       ID number of field (1-300)
# EXP-A       Regular expression used to capture data from SRC-A
# SRC-B       ID number of field (1-300)
# EXP-B       Regular expression used to capture data from SRC-B
# CONSTRUCT   Format string that specifies which parts to combine from SRC-A and SRC-B. Matched
#             pieces of the regular expressions are specified by the format $A1, where A is the
#             source field and 1 is the first match. For example, "$A1|$B1" specifies to put the
#             first matched part from A together with a '|' character and then the first matched
#             part from B.
# REQUIRE     A, B, Both, Either, or -   Specifies which fields must have data before creating the
#             output
# OVERRIDE    Yes or No   Specifies to overwrite data in the ID field if it already contains data
# CASE        Yes or No   Specifies whether filters are case sensitive (Default is No).
#
# Custom Calculated Fields (#226-300)
#ID   TYPE  NAME                SRC-A  EXP-A  SRC-B  EXP-B  CONSTRUCT  REQUIRE  OVERRIDE  CASE
#-----------------------------------------------------------------------------------------------
#226  CALC  custom_calc_field1  10     (.*)   11     (.*)   $A1|$B1    BOTH     YES       NO
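
The CONSTRUCT, REQUIRE, and CASE columns described above can be illustrated with a small Python sketch. This is not Urchin's implementation, only an approximation of the documented behaviour. The function name `apply_construct` and the sample field values are hypothetical, and treating an empty string as "no data" is an assumption.

```python
import re


def apply_construct(src_a, exp_a, src_b, exp_b, construct,
                    require="BOTH", case=False):
    """Approximate a CALC filter: capture from two source values, then combine.

    Hypothetical helper; an empty source value is assumed to mean "no data".
    """
    flags = 0 if case else re.IGNORECASE  # CASE defaults to No (insensitive)
    match_a = re.search(exp_a, src_a, flags) if src_a else None
    match_b = re.search(exp_b, src_b, flags) if src_b else None
    has_a, has_b = match_a is not None, match_b is not None

    # REQUIRE decides which sources must have produced data.
    satisfied = {
        "A": has_a,
        "B": has_b,
        "BOTH": has_a and has_b,
        "EITHER": has_a or has_b,
        "-": True,
    }[require.upper()]
    if not satisfied:
        return None

    def substitute(token):
        # $A1 -> first capture group of EXP-A, $B2 -> second group of EXP-B, ...
        match = match_a if token.group(1) == "A" else match_b
        index = int(token.group(2))
        if match is None or index > match.re.groups:
            return ""
        return match.group(index) or ""

    return re.sub(r"\$([AB])(\d+)", substitute, construct)


# Mirrors the commented-out example row: (.*) on both sources, "$A1|$B1", BOTH.
combined = apply_construct("200", "(.*)", "1024", "(.*)", "$A1|$B1")
print(combined)
```

With `require="BOTH"` the sketch returns `None` when either source is empty, which corresponds to the filter producing no output for that log line.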