node-server.js:

var http = require('http');
var fs = require('fs');
var util = require('util');

var fileCache;

// write the cached buffer out with an explicit Content-Length
var sendFile = function(conn, file) {
  conn.writeHead(200, {"Content-Type": "text/html", "Content-Length": file.length});
  conn.write(file);
  conn.end();
}

http.createServer(function (req, res) {
  if (fileCache == undefined) {
    // first request: read foo.html from disk and keep it in memory
    fs.readFile("foo.html", function(err, file) {
      fileCache = file;
      sendFile(res, fileCache);
    });
  } else {
    // every subsequent request is served from the in-memory cache
    sendFile(res, fileCache);
  }
}).listen(8080, 'localhost');
vertx-server.js:

load('vertx.js')
var fileCache;
var sendFile = function(req, file) {
  req.response.headers["Content-Length"] = file.length()
  req.response.headers["Content-Type"] = "text/html"
  req.response.end(file)
}
vertx.createHttpServer().requestHandler(function(req) {
  if (fileCache == undefined) {
    vertx.fileSystem.readFile("httpperf/foo.html", function(err, file) {
      fileCache = file;
      sendFile(req, fileCache);
    });
  } else {
    sendFile(req, fileCache);
  }
}).listen(8080, 'localhost');
There are many options for caching - even node itself provides a mechanism for this:
http://nodejs.org/docs/latest/api/fs.html#fs_fs_watchfile_filename_options_listener
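For instance, a rough sketch of how the node server above could keep its cache fresh with fs.watchFile (assuming the same foo.html file; the listener gets the current and previous fs.Stats):

var fs = require('fs');

var fileCache;

// refresh the in-memory copy whenever foo.html changes on disk;
// fs.watchFile polls the file and calls the listener with the
// current and previous fs.Stats objects
fs.watchFile("foo.html", function(curr, prev) {
  if (curr.mtime.getTime() !== prev.mtime.getTime()) {
    fs.readFile("foo.html", function(err, file) {
      if (!err) fileCache = file;
    });
  }
});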
This benchmark simply shows that node.js outperforms vertx when serving a cached file via http. Nothing more.
"I doubt the JVM is caching the file. Most probably..." -- spoken like a true computer scientist. lol.
About the BS called "nobody runs on a single core":
Whenever you rent hosted machines, you usually get a virtual machine: the host has 12 CPU cores, but the VM itself just shows 8000 MHz (so the maximum performance is limited per hosting package - my virtual machine can use 8000 MHz, but without seeing which core my load is processed on). In general, most hosting providers don't show the guest OS how many CPUs are available on their machines, and this is where node.js shines so bright.
@egeozcan I've not gone through what netty is doing exactly, but I believe the key difference is that nodejs is using read() and vertx is using mmap() at the OS level. There are modules out there to provide mmap bindings (https://github.com/bnoordhuis/node-mmap for example), and as @tipiirai mentioned, there are lots of caching options available as well. There are also static file serving solutions with all the problems worked out (https://github.com/cloudhead/node-static).
One thing to note however: if the difference is indeed that nodejs is using read() and vertx is using mmap() (I know for a fact nodejs uses read()), then nodejs will be more performant for one-off small file reads. So if you only need to read a file once, or rarely, and the files are not huge, nodejs file reading will get more performance. There is a cost to using memory-mapped OS file reads, so keep that in mind if you switch to using mmap bindings.
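For reference, a rough sketch of what reading a file through the node-mmap bindings could look like; I'm assuming the module's map() export follows the mmap(2) argument order (size, protection, flags, fd, offset), so check its README for the exact call:

var fs = require('fs');
var mmap = require('mmap'); // https://github.com/bnoordhuis/node-mmap

var fd = fs.openSync("foo.html", 'r');
var size = fs.fstatSync(fd).size;

// assumption: map() mirrors mmap(2); the kernel pages the file in and keeps
// it in the page cache, so reading the same file repeatedly stays cheap
var file = mmap.map(size, mmap.PROT_READ, mmap.MAP_SHARED, fd, 0);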
@fibric you make a good point. Node also still shines on multicore setups (https://gist.github.com/2657432) ;) It's somewhat of a myth that nodejs can't be set up to take advantage of multiple cores. The tests there were even using node cluster, which is known to be quite slow. I suspect haproxy + independent node instances would scale linearly with the cores (provided the processes are bound to individual cores).
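For what it's worth, the usual multicore setup is only a few lines with the cluster module - a rough sketch (the linked gist and the haproxy approach differ in the details):

var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // fork one worker per core; the workers share the listening socket
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  http.createServer(function (req, res) {
    res.writeHead(200, {"Content-Type": "text/html"});
    res.end("hello");
  }).listen(8080, 'localhost');
}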
Note that mmap is probably only used when readFile() is used. If you use streams I don't believe it uses memory mapping; also (as mentioned before), if you use sendFile(), vert.x will avoid copying through userspace altogether.
This is highly efficient, but you only really see the benefits for larger files (which is why I didn't use it in the benchmark).
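For comparison, a minimal sketch of the sendFile() variant of the vert.x server above (assuming the same httpperf/foo.html path):

load('vertx.js')

// hand the file to the kernel (sendfile) instead of copying the bytes
// through userspace; the benefit mostly shows up with larger files
vertx.createHttpServer().requestHandler(function(req) {
  req.response.sendFile("httpperf/foo.html");
}).listen(8080, 'localhost');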
@courtneycouch I didn't expect such a detailed explanation... thank you very much!
@tipiirai exactly. This is hardly a useful benchmark. It was merely meant to show that there's more to the story than the vert.x benchmarks show, which have everyone saying "vert.x is 5x faster than nodejs!" - which simply isn't true.
The moral of the story here is: there are many different ways to handle disk IO at the OS level. With nodejs you explicitly pick the bindings you want to use via modules, and you need to be aware of the pros and cons of the IO you are doing. The flaw in the original benchmarks that inspired me to do this is that the two servers were using two different IO methods at the OS level while trying to compare them. It was more a benchmark of read() vs mmap() reading a single small file over and over again than a benchmark of node and vertx. I don't think @purplefox realized that; I suspect he isn't familiar enough with how node handles IO to realize he was comparing apples and oranges.
There are plenty of benchmarks comparing read() and mmap() and when it's faster to use one or the other. Reading the same small file over and over again is a bad use case for read() and you can see that in the results on the original vert.x benchmarks.
You need to pick the IO bindings based on the type of data access. Don't assume a language will do the kind of IO you want just because you call its generic file-read method: find out what bindings it's really using at the OS level and whether those fit your scenario (that is, if you care about IO performance).
btw vertx-server.js has a missing comma at line 10, before the anonymous fn parameter.
@courtneycouch You can save a syscall in the node server if you do res.end(file) rather than res.write(file); res.end()
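Applied to the sendFile helper in node-server.js above, that tweak would look something like this:

var sendFile = function(conn, file) {
  conn.writeHead(200, {"Content-Type": "text/html", "Content-Length": file.length});
  // end(file) writes the body and finishes the response in one call,
  // instead of a separate write() followed by end()
  conn.end(file);
}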
Yeah, I didn't really spend any time on this.. just a few minutes changing the IO silliness (oh, and the other benchmark showing node on multiple cores). I really should have tried mmap as well, but I should be spending my time doing actual stuff instead of making benchmarks, hah. I think I made my point though: vertx isn't simply 5x faster than node, as is being propagated.
I found a blog post on my readitlater list about couchdb, but this blog post is about benchmarking:
http://jan.prima.de/~jan/plok/archives/175-Benchmarks-You-are-Doing-it-Wrong.html
I doubt the JVM is caching the file. Most probably it's using memory mapped files, so it's the OS handling the caching. I'd need to check the JVM source though.