Created
June 11, 2014 13:11
lbjay/dced1c6f5c24a3d28922
Hackish workaround to wrongly encoded event data in the logstash pipeline
require "logstash/filters/base"
require "logstash/namespace"

# This filter is a mildly hackish workaround for wrongly character encoded event data
# and the issues it causes in the logstash pipeline. (See: LOGSTASH-1443,LOGSTASH-1308
# LOGSTASH-1353, etc)
#
# Simply list the event fields that are causing you problems, along with a tag to
# attach for offending events, like so:
#
#   force_encoding {
#     fields => ["path","qstring","message","referrer"]
#     tag => "_argh_wtf-8_encoding!"
#   }
#
# Any wrongly encoded data in the fields listed will, by default, be forced to ASCII-8BIT.
# This will, of course, result in a modicum of lost data, but that seems way better than
# logstash falling over, amirite?
class LogStash::Filters::ForceEncoding < LogStash::Filters::Base
  config_name "force_encoding"
  milestone 1

  # what fields to operate on
  config :fields, :validate => :array, :required => true

  # what charset to use for invalid encodings
  config :to_charset, :validate => :string, :default => 'ASCII-8BIT'

  config :tag, :validate => :string, :default => '_forcedencoding'

  public
  def register
    # nothing to do
  end # def register

  public
  def filter(event)
    # return nothing unless there's an actual filter event
    return unless filter?(event)
    @fields.each do |field|
      next unless event.include?(field)
      unless event[field].valid_encoding?
        event['tags'] ||= []
        event['tags'] |= [@tag]
        event[field].encode!(@to_charset, :invalid => :replace, :undef => :replace, :replace => '').force_encoding(@to_charset)
      end
    end
    # filter_matched should go in the last line of our successful code
    filter_matched(event)
  end # def filter
end
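To try the core of the filter outside Logstash, here is a minimal sketch in plain Ruby. The `force_field_encoding` helper and the plain Hash standing in for the event are assumptions made for illustration; they are not part of the filter above or of Logstash's API.

```ruby
# Hypothetical stand-in for the filter body: `event` is a plain Hash here,
# not a LogStash::Event, and `force_field_encoding` is an illustrative name.
def force_field_encoding(event, fields, to_charset = "ASCII-8BIT", tag = "_forcedencoding")
  fields.each do |field|
    next unless event.key?(field)
    value = event[field]
    next if value.valid_encoding?

    # Record that this event needed fixing, without duplicating the tag.
    event["tags"] ||= []
    event["tags"] |= [tag]

    # Drop bytes that are invalid or have no equivalent in the target
    # charset, then label the result with that charset.
    event[field] = value.encode(to_charset, :invalid => :replace, :undef => :replace, :replace => "")
                        .force_encoding(to_charset)
  end
  event
end

# "\xE9" is Latin-1 for "é"; on its own it is not valid UTF-8.
event = {
  "message" => "caf\xE9 au lait".dup.force_encoding("UTF-8"),
  "path"    => "/index.html"
}
force_field_encoding(event, ["message", "path", "referrer"])

puts event["message"].valid_encoding?
puts event["tags"].inspect
```

The offending byte is removed rather than converted, which is the "modicum of lost data" the header comment mentions. Fields that are already valid, or missing from the event, are left alone and do not trigger the tag.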
Hi guys
After suffering from this for months with no visible solution (at least in 1.4.5), I have found what seems to be a definitive fix. Early testing seems to confirm that applying it solves the problem.
All credit goes to someone (Chinese, I guess) whose post I found while googling for occurrences of this problem:
http://www.n0tr00t.com/2015/04/18/dataminding-logstash.html
The problem, as stated in the logstash error file, is at line 148 of /opt/logstash/lib/logstash/event.rb
So here is the edit:
Original function, commented out:
  #public
  #def to_json(*args)
  #  return @data.to_json(*args)
  #end # def to_json
Replacement:
  public
  def to_json(*args)
    begin
      return @data.to_json(*args)
    rescue
      @data = {}
      return @data.to_json()
    end
  end
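To see what the `rescue` branch does without patching Logstash, here is a minimal standalone sketch. `SketchEvent` and `ExplodingData` are hypothetical names made up for this example (the latter only simulates data whose serialization fails); only the shape of `to_json` mirrors the patch.

```ruby
require "json"

# Hypothetical stand-in for data that cannot be serialized, for example
# because it contains wrongly encoded strings. It always raises on to_json.
class ExplodingData
  def to_json(*args)
    raise ArgumentError, "simulated encoding failure"
  end
end

# Hypothetical minimal event wrapper; this is not the real LogStash::Event.
class SketchEvent
  def initialize(data)
    @data = data
  end

  def to_json(*args)
    @data.to_json(*args)
  rescue StandardError
    # Same fallback as the patch: throw the data away and emit an empty
    # object, so one bad event cannot stop the pipeline.
    @data = {}
    @data.to_json
  end
end

good = SketchEvent.new({ "message" => "hello" }).to_json
bad  = SketchEvent.new(ExplodingData.new).to_json

puts good
puts bad
```

Note the trade-off: when serialization fails, the event's contents are discarded and an empty JSON object is emitted, so the offending data is lost rather than repaired.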
I'm not a Ruby programmer, but it seems to me that the "rescue" statement handles the situation (it looks like a kind of exception handling).
Logstash now runs fine, and no longer stays stuck for a whole day until log rotation!
Regards.