
Flume from Kafka to Elasticsearch config

Flume config

# Name the components on this agent
a1.sources = kafka-source-1
a1.sinks = k1
a1.channels = c1

# Use a channel which buffers events in memory
a1.channels.c1.type=memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000

# for kafka settings
a1.sources.kafka-source-1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.kafka-source-1.zookeeperConnect = 10.1.0.12:2181
a1.sources.kafka-source-1.topic = logstash
a1.sources.kafka-source-1.channels = c1

# Describe the sink ES
a1.sinks.k1.type = com.frontier45.flume.sink.elasticsearch2.ElasticSearchSink
a1.sinks.k1.hostNames = 172.17.0.2:9300
a1.sinks.k1.indexName = flume
a1.sinks.k1.indexType = logs
a1.sinks.k1.clusterName = elastic
a1.sinks.k1.batchSize = 500
a1.sinks.k1.ttl = 5d
a1.sinks.k1.serializer = com.frontier45.flume.sink.elasticsearch2.ElasticSearchDynamicSerializer
a1.sinks.k1.indexNameBuilder = com.frontier45.flume.sink.elasticsearch2.TimeBasedIndexNameBuilder
a1.sinks.k1.channel = c1
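
Once the agent is running (see the steps below), you can check the pipeline by pushing a test message into the logstash topic with the console producer. A minimal sketch; the broker host/port and the Kafka install path are assumptions, since the config above only names the ZooKeeper address:

    $ /usr/local/kafka/bin/kafka-console-producer.sh \
          --broker-list 10.1.0.12:9092 --topic logstash

Every line typed into the producer becomes one Flume event.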

Before we start the flume-ng agent with this config, we need to get the .jar files for the Kafka source and the Elasticsearch sink.

For the Kafka source, download zookeeper-3.4.5-cdh5.4.4.jar from

http://grepcode.com/snapshot/repository.cloudera.com/content/repositories/releases/org.apache.zookeeper/zookeeper/3.4.5-cdh5.4.4
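
If you prefer the command line, the jar can be pulled with wget. A sketch assuming the standard Maven layout on the Cloudera releases repository; the exact path is an assumption, check the page above:

    $ wget https://repository.cloudera.com/content/repositories/releases/org/apache/zookeeper/zookeeper/3.4.5-cdh5.4.4/zookeeper-3.4.5-cdh5.4.4.jar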

For the Elasticsearch sink, clone and build the .jars from

https://github.com/lucidfrontier45/ElasticsearchSink2
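
A clone-and-build sketch; the repository's README has the authoritative build steps, so the build itself is only hinted at here:

    $ git clone https://github.com/lucidfrontier45/ElasticsearchSink2.git
    $ cd ElasticsearchSink2
    # build as described in the project's README, then collect the produced .jars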

and copy all the .jars to /usr/local/flume/lib/:

    $ sudo cp -rf <YOUR_DOWNLOAD_FOLDER>/* /usr/local/flume/lib/

Or use my libpack, which includes the .jars for the kafka-source, hdfs-sink and elasticsearch-sink.

LINK TO MY LIBPACK
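
With the jars in place, the agent can be started. A minimal sketch, assuming the config above is saved as /usr/local/flume/conf/flume.conf:

    $ /usr/local/flume/bin/flume-ng agent \
          --conf /usr/local/flume/conf \
          --conf-file /usr/local/flume/conf/flume.conf \
          --name a1 \
          -Dflume.root.logger=INFO,console

Note that --name must match the agent name used in the config (a1).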

Useful links

http://www.tutorialspoint.com/apache_kafka/apache_kafka_basic_operations.htm

https://medium.com/@omallassi/elasticsearch-kibana-flume-a1e20649b2ae#.vudg49vk2

http://stackoverflow.com/questions/33732193/configure-sink-elasticsearch-apache-flume

Elasticsearch config (for a Docker container)

	$ docker ps -a
	$ sudo docker exec -i -t <container_name> /bin/bash # command to enter the container

Install nano

	$ apt-get update && apt-get install nano
	$ export TERM=xterm # run this every time you want to use nano
	$ mv /etc/elasticsearch/elasticsearch.yml /usr/share/elasticsearch/config/
	$ nano /usr/share/elasticsearch/config/elasticsearch.yml

Set the host and the cluster name (for example):

network.host: 0.0.0.0
cluster.name: elastic
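
For the new settings to take effect, restart Elasticsearch inside the container. A sketch, assuming the image ships a service wrapper; how your container starts Elasticsearch may differ:

    $ service elasticsearch restart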

Find out the IP address of the Elasticsearch container from inside the container:

	$ nano /etc/hosts

	127.0.0.1 localhost
	...
	172.17.0.2 1cb87303f067
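
A quick check that Elasticsearch answers on the REST port, using the address found in /etc/hosts:

    $ curl http://172.17.0.2:9200

It should return a short JSON document containing the cluster name ("elastic") and the version.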

kibana config

	$ sudo nano /opt/kibana/config/kibana.yml

Set the IP Kibana will listen on; to make it reachable from your network, set it to 0.0.0.0. Also set the Elasticsearch URL.

server.port: 5601
server.host: "0.0.0.0"
elasticsearch.url: "http://172.17.0.2:9200"
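
After editing kibana.yml, restart Kibana so it picks up the settings. A sketch, assuming Kibana was installed as a service (the service name is an assumption for this /opt/kibana install):

    $ sudo service kibana restart
    $ curl -I http://localhost:5601 # should answer with HTTP headers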

#!!! Attention !!! in the Flume settings we write to Elasticsearch on port 9300 (the transport port)

a1.sinks.k1.hostNames = 172.17.0.2:9300

while in Kibana we read from Elasticsearch on port 9200 (the REST port)

elasticsearch.url: "http://172.17.0.2:9200"
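
Once events flow through the pipeline, the sink creates a time-based index per day (indexName = flume plus the TimeBasedIndexNameBuilder should give names like flume-yyyy-MM-dd). You can list them over the 9200 REST port:

    $ curl 'http://172.17.0.2:9200/_cat/indices?v'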