I've been obsessed with the duration of Zipkin's POST endpoint, more than with how many bytes of memory are used while processing a POST (I obsess about that too, but it doesn't keep me up at night). The duration of an endpoint that receives telemetry data is the part of the caller's response time that you can actually control.
Callers of Zipkin's POST endpoint are usually little http loops inside an application. Even when these are done neatly, with bounds and so on, blocking those loops causes damage (lost spans), and it also adds overhead as the queues fill to capacity. Crazy, but true: sometimes people literally POST to zipkin inline (ad-hoc, or sometimes in php)! While we shouldn't optimize for that case, it's crazy how much impact endpoint duration has there.
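To make that "little http loop" concrete, here's a minimal sketch of a bounded reporter in plain Java (the queue size, batch size, URL, and class name are all made up for illustration; this isn't Zipkin's actual reporter code). The point it shows: when the POST endpoint is slow, the queue fills to its bound and new spans get dropped on the application's request path.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical reporter loop: application threads offer encoded spans into a
 * bounded queue; one background thread drains and POSTs them. If the POST
 * endpoint is slow, the queue fills and offer() starts returning false,
 * i.e. spans are lost.
 */
final class BoundedSpanReporter implements Runnable {
  final BlockingQueue<String> queue = new ArrayBlockingQueue<>(1000);
  final HttpClient client = HttpClient.newBuilder()
      .connectTimeout(Duration.ofMillis(500)) // fail fast on a dead node
      .build();
  final URI endpoint = URI.create("http://zipkin:9411/api/v2/spans");

  /** Called on the application's request path: must never block. */
  boolean report(String encodedSpan) {
    return queue.offer(encodedSpan); // false == dropped, queue is at capacity
  }

  @Override public void run() {
    List<String> batch = new ArrayList<>();
    while (!Thread.currentThread().isInterrupted()) {
      batch.clear();
      queue.drainTo(batch, 100); // up to 100 spans per POST
      if (batch.isEmpty()) {
        try {
          String first = queue.poll(1, TimeUnit.SECONDS);
          if (first == null) continue;
          batch.add(first);
        } catch (InterruptedException e) {
          return;
        }
      }
      HttpRequest request = HttpRequest.newBuilder(endpoint)
          .timeout(Duration.ofSeconds(1)) // a slow 2xx still backs up the queue
          .header("Content-Type", "application/json")
          .POST(HttpRequest.BodyPublishers.ofString("[" + String.join(",", batch) + "]"))
          .build();
      try {
        client.send(request, HttpResponse.BodyHandlers.discarding());
      } catch (Exception e) {
        // drop the batch; a real reporter might retry against another node
      }
    }
  }
}
```

The longer the POST takes, the more of the loop's time is spent inside send(), and the more often report() hits a full queue.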
For this reason, we need to succeed fast, and we also need to fail fast. We want these requests to clear or fail quickly (e.g. in case of failure, the client can try another node, right?). This "fast" must apply at a reasonable percentage of requests, because you don't want half your