@makeittotop
Created August 29, 2016 08:37
apache-spark-find-malicious-ips-apache-log
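>>> from __future__ import print_function  # lets print be passed as a function on Python 2
>>> # Load the Apache access log; each RDD element is one log line.
>>> log_lines = sc.textFile("/vagrant/access_log-20160828")
>>> # Keep only the requests that mention xmlrpc.php.
>>> xmlrpc_attack = log_lines.filter(lambda x: "xmlrpc.php" in x)
>>> # Drop lines containing 12.0.1.8, then take the first field (the client IP).
>>> attacking_ips = xmlrpc_attack.filter(lambda x: '12.0.1.8' not in x).map(lambda x: x.split()[0])
>>> # Count requests per IP.
>>> counts = attacking_ips.map(lambda ip: (ip, 1))
>>> total = counts.reduceByKey(lambda x, y: x + y)
>>> total.foreach(print)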
(u'185.106.122.248', 57)
(u'163.172.149.145', 2376)
(u'89.248.174.4', 4)
(u'163.172.179.35', 741)
(u'163.172.175.207', 443)
(u'163.172.174.255', 610)
(u'163.172.177.210', 8336)
(u'185.158.112.38', 2)
(u'212.47.243.235', 1016)
(u'12.0.2.101', 64)
(u'163.172.174.28', 1439)
(u'163.172.177.30', 250)
(u'163.172.179.147', 928)
(u'163.172.179.127', 102)
(u'163.172.176.64', 1053)
(u'163.172.174.64', 894)
(u'80.82.78.57', 296)
(u'154.16.199.74', 2967)
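
The same filter, map, and count logic can be checked on a handful of lines without a Spark context. The following is a minimal sketch using only the Python standard library, not part of the original session: the function name `count_xmlrpc_hits`, the `excluded` parameter, and the sample log lines are all hypothetical, and the sample addresses come from reserved documentation ranges rather than the real log. Unlike the Spark output above, it returns the pairs sorted by descending count.

```python
from collections import Counter

# Hypothetical sample lines in Apache access-log style; these are made up
# for illustration and do not come from the real access_log-20160828 file.
SAMPLE_LOG = [
    '192.0.2.10 - - [28/Aug/2016:06:25:01 +0000] "POST /xmlrpc.php HTTP/1.1" 200 370',
    '192.0.2.10 - - [28/Aug/2016:06:25:02 +0000] "POST /xmlrpc.php HTTP/1.1" 200 370',
    '198.51.100.7 - - [28/Aug/2016:06:25:03 +0000] "POST /xmlrpc.php HTTP/1.1" 200 370',
    '198.51.100.7 - - [28/Aug/2016:06:25:04 +0000] "GET /index.php HTTP/1.1" 200 5120',
    '12.0.1.8 - - [28/Aug/2016:06:25:05 +0000] "POST /xmlrpc.php HTTP/1.1" 200 370',
]


def count_xmlrpc_hits(lines, excluded="12.0.1.8"):
    """Count xmlrpc.php requests per client IP, skipping lines that contain `excluded`."""
    hits = Counter()
    for line in lines:
        # Same substring tests as the two Spark filters.
        if "xmlrpc.php" in line and excluded not in line:
            # The first whitespace-separated field is the client address.
            hits[line.split()[0]] += 1
    # most_common() returns (ip, count) pairs sorted by descending count.
    return hits.most_common()


if __name__ == "__main__":
    for ip, n in count_xmlrpc_hits(SAMPLE_LOG):
        print(ip, n)
```

Like the Spark version, this matches on plain substrings, so a line is also dropped if `12.0.1.8` appears anywhere in it, for example inside a longer address or a referrer field.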