@mannharleen
Created September 20, 2017 04:05
avro source to file channel to hdfs sink (avro with snappy codec)
#avro source to file channel to hdfs sink (with avro snappy codec)
agent4.sources = source1
agent4.channels = channel1
agent4.sinks = sink1
# Source configuration
agent4.sources.source1.type = avro
agent4.sources.source1.port = 11112
agent4.sources.source1.bind = localhost
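# Sink configuration
agent4.sinks.sink1.type = hdfs
agent4.sinks.sink1.hdfs.path = /user/cloudera/flumesink/
# %D expands to mm/dd/yy, so its slashes create nested directories under hdfs.path; %t is the epoch time in milliseconds
agent4.sinks.sink1.hdfs.filePrefix = data_%D_%t
agent4.sinks.sink1.hdfs.fileSuffix = .avro
# DataStream leaves the output format to the serializer below, which writes an Avro container file
agent4.sinks.sink1.hdfs.fileType = DataStream
agent4.sinks.sink1.serializer = avro_event
agent4.sinks.sink1.serializer.compressionCodec = snappy
# Use the agent's local time (not a timestamp header on the event) to resolve %D and %t
agent4.sinks.sink1.hdfs.useLocalTimeStamp = true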
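# Channel configuration (file channel with its default checkpoint and data directories)
agent4.channels.channel1.type = file
# Bind the source and the sink to the channel
agent4.sources.source1.channels = channel1
agent4.sinks.sink1.channel = channel1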
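
The `hdfs.filePrefix` value in the configuration above mixes two escape sequences, and the result is easy to misread. The following is a minimal Python sketch of how `data_%D_%t` could expand; it is not Flume's own code. The helper name `expand_prefix` is hypothetical, the UTC-7 offset is an assumption inferred from the date in the output file name in this gist, and the `<prefix>.<counter><suffix>` layout is likewise inferred from that file name.

```python
import posixpath
from datetime import datetime, timedelta, timezone


def expand_prefix(prefix: str, epoch_ms: int, tz: timezone) -> str:
    """Expand the two escape sequences used in this gist (hypothetical helper, not Flume code)."""
    local = datetime.fromtimestamp(epoch_ms / 1000, tz)
    return (
        prefix
        .replace("%D", local.strftime("%m/%d/%y"))  # %D is the date as mm/dd/yy
        .replace("%t", str(epoch_ms))               # %t is the epoch time in milliseconds
    )


# Assumption: the agent host ran at UTC-7; the numbers come from the output file name in this gist.
HOST_TZ = timezone(timedelta(hours=-7))
prefix = expand_prefix("data_%D_%t", 1505880098394, HOST_TZ)

# Assumed layout: <hdfs.path>/<prefix>.<counter><suffix>
full_path = posixpath.join("/user/cloudera/flumesink/", f"{prefix}.1505880098400.avro")

print(prefix)
print(posixpath.dirname(full_path))
```

Because the expanded date contains two slashes, HDFS treats them as directory separators: the file ends up two directory levels below `hdfs.path` rather than directly inside it. A slash-free pattern such as `%Y%m%d` would presumably avoid that, if a flat directory is what is wanted.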
#####################################################################
#Commands:
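# Start the agent (-n agent name, -c configuration directory, -f agent configuration file)
flume-ng agent -n agent4 -c . -f agent4.conf
# Send each line of file1 to the avro source as one event
flume-ng avro-client -H localhost -p 11112 -F file1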
#Output file:
hdfs dfs -cat /user/cloudera/flumesink/data_09/19/17_1505880098394.1505880098400.avro
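Obj avro.codec snappy avro.schema {"type":"record","name":"Event","fields":[{"name":"headers","type":{"type":"map","values":"string"}},{"name":"body","type":"bytes"}]}
[binary data: 16-byte sync marker, then a Snappy-compressed block in which the text "41 First line" is readable, then the sync marker again]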
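
The raw output above is hard to read because `hdfs dfs -cat` dumps a binary Avro object container file. As a rough way to confirm the codec and schema without extra tooling, the header can be decoded with the Python standard library alone. This is a sketch based on the container layout in the Avro specification (magic bytes, a metadata map, a 16-byte sync marker); the function names `read_long`, `read_bytes` and `read_header` are illustrative and are not part of Flume or of any Avro library.

```python
import io
import json

MAGIC = b"Obj\x01"  # first four bytes of every Avro object container file


def read_long(stream) -> int:
    """Decode one Avro long: a zigzag-encoded, variable-length integer."""
    shift = 0
    accum = 0
    while True:
        raw = stream.read(1)
        if not raw:
            raise EOFError("stream ended inside a variable-length integer")
        byte = raw[0]
        accum |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)  # undo zigzag encoding


def read_bytes(stream) -> bytes:
    """Read a length-prefixed byte string (Avro 'bytes' and 'string' share this layout)."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("stream ended inside a byte string")
    return data


def read_header(stream):
    """Return (metadata dict, sync marker) from the start of a container file."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:  # an empty block terminates the metadata map
            break
        if count < 0:   # a negative count is followed by the block size in bytes
            count = -count
            read_long(stream)
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    sync = stream.read(16)
    return meta, sync
```

One possible use is to pipe the file into a small script that calls `read_header(sys.stdin.buffer)` and prints `meta["avro.codec"]` together with `json.loads(meta["avro.schema"])`. For the file in this gist, the readable parts of the raw output suggest the codec would be `snappy` and the schema the `Event` record. Decoding the data blocks themselves would additionally need a Snappy decompressor, which is outside the standard library.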