Replying to a Twitter conversation with @johnjoseph_code:
Our number one concern is helping our developers get better performance out of their app. Period.
We are recommending Unicorn so that developers will make use of more of their available RAM. Most Rails apps that I examined were using Thin - because we had recommended it in the past - and were not using anywhere near the full capacity of their dynos in terms of memory or CPU. Because of Unicorn's forking process model, switching is often trivial and gives an application a near-instant boost in performance by actually using all the RAM it has requested. With 2X dynos, developers can vertically scale pretty well, generally fitting 4-8 Unicorn workers on a dyno (or more if you set extremely tight conditions with https://github.com/kzk/unicorn-worker-killer).
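To make the switch concrete, here is a minimal sketch of the kind of config/unicorn.rb we have in mind - the worker count, timeout, and the commented unicorn-worker-killer limits are illustrative assumptions to tune per app, not official numbers:

    # config/unicorn.rb - illustrative sketch only; tune the numbers for your app
    worker_processes Integer(ENV["WEB_CONCURRENCY"] || 4)  # roughly 4-8 fit on a 2X dyno
    timeout 30
    preload_app true

    before_fork do |server, worker|
      # close the master's database connection so each forked worker opens its own
      defined?(ActiveRecord::Base) && ActiveRecord::Base.connection.disconnect!
    end

    after_fork do |server, worker|
      defined?(ActiveRecord::Base) && ActiveRecord::Base.establish_connection
    end

    # Optional, in config.ru: recycle workers that grow past a memory ceiling
    #   require "unicorn/worker_killer"
    #   use Unicorn::WorkerKiller::Oom, 192 * 1024**2, 256 * 1024**2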
We hope developers switching to Unicorn and 2X dynos will be able to scale down their apps from where they are today. You are correct that we sell in units of RAM (though in reality it is a container with an allocation of RAM and a share of CPU, IO, etc - 2X dynos have twice the share of CPU, for instance). And we hope Unicorn will let them use more of the resources they are paying for today.
As for Puma and Thin - both are great servers! However, both are almost guaranteed to require changes to the way the application is structured in order to fully realize their potential. We were simply not willing to say today, "Hey, we see you're having performance problems on Heroku. Go rewrite your app and you'll be fine." In general, I think an app built to properly harness the power of either of these servers is very likely to exceed the performance of the same app on Unicorn. Definitely taking your feedback to heart here. We can do a better job providing a recommendation. I'll start thinking about how we might best document the benefits and pitfalls associated with each server in a way that allows developers to make a more informed choice when they are just starting to develop their application.
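As one concrete example of what "structured to fully realize their potential" means, here is a hedged sketch of a Puma setup - the worker and thread counts are placeholders, not a recommendation. Puma serves requests from a thread pool inside each worker, so the app and every gem it loads must be thread-safe before those threads buy you anything:

    # config/puma.rb - illustrative only; assumes a thread-safe app and gems
    workers Integer(ENV["WEB_CONCURRENCY"] || 2)    # forked processes, like Unicorn
    threads 1, Integer(ENV["MAX_THREADS"] || 16)    # concurrent threads per worker
    preload_app!

    on_worker_boot do
      # each forked worker needs its own database connections
      defined?(ActiveRecord::Base) && ActiveRecord::Base.establish_connection
    end

If the app holds global mutable state or depends on non-thread-safe libraries, those threads are wasted - and removing that state is exactly the rewrite cost I'm describing.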
As for blocking IO, this is a definite Unicorn problem. In general, it will not impact most normal HTTP requests because we buffer 8K of headers and request body (https://devcenter.heroku.com/articles/http-routing#request-buffering). That's not good enough though. File uploads on Unicorn just plain suck without something in front of the dyno to buffer the full request. We're considering lots of options here, including some zany things like deploying nginx in front of all apps at the dyno level (which would be pretty neat!), as well as investigating Rainbows!, which I wasn't aware of until recently. Given the relative obscurity of Rainbows!, though, I am hesitant to recommend it without a lot more testing (the warnings on the homepage especially give me great pause). We do have some new internal clients on Zbatery (a Rainbows! offshoot), and it definitely looks promising.
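For reference, a Rainbows! config is essentially a Unicorn config plus a concurrency-model block - something like this untested sketch, where the model choice and numbers are assumptions on my part, not something we've vetted:

    # config/rainbows.rb - untested sketch; we have not vetted this for production
    Rainbows! do
      use :ThreadPool          # one of several concurrency models Rainbows! offers
      worker_connections 64    # concurrent clients each worker will service
    end
    worker_processes 2         # same directive as a plain Unicorn config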
I have not seen the email you mentioned, but I agree - it sounds sloppy. We have tried to be extremely careful not to miscommunicate our advice, but performance is just a tricky subject to make sweeping recommendations about. Would you mind forwarding the message to dominic [at] heroku?
Spawning Unicorn processes uses memory per process. Ruby 1.9.3's GC isn't copy-on-write friendly, so memory use grows linearly, or close to it, with the number of workers. So yes, it's possible that using Unicorn will increase utilization per dollar. But the overall number of processes serving requests won't go up (assuming the user already had enough dynos running to meet their needs). So this is merely money saving - which is nice, but not really "increasing concurrency," which seems to imply a performance improvement or a removal of interdependencies between processes.
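To put rough, made-up numbers on "linear or close to linear": if one booted Rails process is around 200 MB, four workers with no sharing cost roughly 4 x 200 MB = 800 MB, which fits a 1 GB 2X dyno but not a 512 MB 1X dyno. Preloading helps less than you'd hope on MRI 1.9.3:

    # config/unicorn.rb - the sharing caveat, with assumed numbers
    preload_app true    # fork workers from an already-booted master to share pages
    worker_processes 4  # expect ~4x the per-process RSS if no pages stay shared

    # MRI 1.9.3's mark-and-sweep GC writes mark bits into the objects it scans,
    # dirtying the shared pages, so most copy-on-write savings evaporate after a
    # few GC runs; Ruby 2.0's bitmap marking is what makes forking CoW-friendly.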
Now, Unicorn IS better at scheduling a queue than Heroku's router is. But unfortunately this is only relevant per dyno, so the random routing phenomenon that results in underutilization of dynos is still going to be present for apps with many dynos.
So, all of the above should be discussed either in the email or in a Heroku document, and these should be treated as the separate, somewhat-overlapping issues that they are.