Long story short: I'm totally open to supporting more rubies if possible. Details follow.
Related issue: http://code.google.com/p/logstash/issues/detail?id=37
Summary:
- ruby core and stdlib change violently, without notice and without backwards compatibility. I want nothing of that.
- need a cross-ruby date library that isn't part of stdlib (see previous point) and is also good.
- need an easy way to use multiple cpus that is cross-ruby (threads are not it)
Details:
Mainly, the ruby core/stdlib API changes between ruby 1.8 and 1.9 are very poorly done. Some are documented while others are not. Some changes make sense, while others do not. That was the main reason for originally deciding to use jruby.
JRuby lets me use Java libraries in place of crappy ruby ones. For example, there are some undocumented changes to datetime between ruby 1.8 and 1.9, so the logstash 'date' filter uses Joda-Time instead of ruby's stdlib datetime.
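To make the Joda-Time point concrete, here is a minimal sketch of what such a platform switch could look like. It is not logstash's actual date filter: `parse_timestamp` and its parameters are hypothetical names, the JRuby branch assumes the Joda-Time jar is already on the classpath, and the fallback branch assumes a ruby 1.9.2+ stdlib.

```ruby
require "date"

# Hypothetical helper for illustration; this is not logstash's actual code.
# Returns a UTC Time parsed from text, which is assumed to carry no zone.
def parse_timestamp(text, joda_pattern, strptime_pattern)
  if RUBY_PLATFORM == "java"
    # JRuby: delegate to Joda-Time (assumes the jar is on the classpath).
    require "java"
    factory = org.joda.time.format.DateTimeFormat
    formatter = factory.forPattern(joda_pattern).withZoneUTC
    Time.at(formatter.parseMillis(text) / 1000.0).utc
  else
    # Other rubies: fall back to stdlib DateTime (assumes ruby 1.9.2+).
    DateTime.strptime(text, strptime_pattern).to_time.utc
  end
end

time = parse_timestamp("2011-04-19 10:30:00",
                       "yyyy-MM-dd HH:mm:ss",
                       "%Y-%m-%d %H:%M:%S")
puts time
```

Note that the two branches need different pattern syntaxes, which is part of why a single cross-ruby date library would be nicer than this kind of switch.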
Further, JRuby's performance options are currently much better than those of MRI or YARV. At worst, during benchmarks, JRuby performs on par with YARV 1.9.2, but since JRuby has actual threads, we can use more cpus more easily, and pretty much beat plain ruby.
Additionally, java debugging tools are quite excellent. jvisualvm, jstack, etc.
Lastly, I can very easily ship a single 'executable' that should work on most platforms with java - see the monolithic jar logstash releases. I can't easily do this with other rubies.
There are some parts of logstash that explicitly require java currently - the date filter, elasticsearch support, and thread support.
The code is also only tested under ruby 1.8.7, and the performance difference between JRuby and MRI 1.8.7 is pretty huge. It might get better if you try REE, but that's not really the same ruby everyone's going to have.
The date filter can be made ruby-friendly if someone writes a non-crappy date parsing library in ruby. The ones that ship with stdlib are not fast or safe to use (ruby core changes them wildly without notice).
ElasticSearch support is much faster in jruby/jvm than it was using pure ruby, because we are now using the Java API for elasticsearch. Previously we were using the HTTP/REST API via EventMachine and em-http-request, which had much lower throughput.
Lastly, jruby supports proper threading so logstash can process events on multiple CPU cores. MRI and YARV Ruby cannot do this without forking and message passing.
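As a rough illustration of the threading point, here is a minimal worker-pool sketch using only `Thread` and `Queue`. It is not logstash's pipeline code: `process_event` and `process_with_threads` are hypothetical names. The same code runs on any ruby, but only on JRuby would the workers actually run on several cores at once; on MRI and YARV they take turns.

```ruby
require "thread"

# Stand-in for real per-event work such as filtering (hypothetical).
def process_event(event)
  event.upcase
end

# Hypothetical helper: run process_event over events on worker_count threads
# and return the results in the original event order.
def process_with_threads(events, worker_count)
  input = Queue.new
  output = Queue.new
  events.each_with_index { |event, index| input << [index, event] }
  worker_count.times { input << :stop } # one stop marker per worker

  workers = Array.new(worker_count) do
    Thread.new do
      while (item = input.pop) != :stop
        index, event = item
        output << [index, process_event(event)]
      end
    end
  end
  workers.each { |worker| worker.join }

  # Drain the results and restore the original event order.
  results = []
  results << output.pop until output.empty?
  results.sort_by { |index, _| index }.map { |_, result| result }
end

puts process_with_threads(%w[a b c d e], 3).inspect
```

Getting the same parallelism on MRI or YARV would mean replacing the threads with forked processes and replacing the queues with pipes or sockets, which is the extra plumbing I'd like to avoid.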
The downside to using JRuby is a possibly higher memory footprint.
Again, I'm open to supporting non-JRuby rubies, but there need to be answers for some of the above.
With you on most of this; using jruby as well. One issue that you don't really address is memory usage. Currently this is not yet an issue for me, but I could imagine using cheap VMs where sacrificing a quarter or more of RAM to logstash is a non-starter. In my current setup, I could see myself needing a replacement for the bit of logstash plumbing that is responsible for gathering collectd data and logs from various files and delivering them to elasticsearch. I don't see why that should take more than a few MB of RAM instead of 0.5GB. For now it is a fair compromise, but it does mean I have to reconsider my architecture when we move to using a lot of cheap Amazon boxes for our frontend.