The new serverless deployment routes workflow requests to CPU-only inference instances with remote execution enabled. Model blocks on these CPU hosts make HTTP requests to GPU instances for inference, and each GPU response includes an X-Processing-Time header. The downstream billing service already differentiates between CPU and GPU usage records, but it needs the GPU processing times surfaced in the CPU orchestrator's response to bill correctly: a fixed 100ms per frame for CPU plus the actual GPU time.
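The billing formula above can be sketched as a one-liner. This is purely illustrative; `billable_ms`, its signature, and the constant name are assumptions, not the billing service's actual code.

```python
CPU_MS_PER_FRAME = 100  # fixed CPU charge per frame, per the billing rule above

def billable_ms(frames: int, gpu_times_ms: list) -> float:
    """Total billable time: fixed CPU cost per frame plus actual GPU time."""
    return frames * CPU_MS_PER_FRAME + sum(gpu_times_ms)

# e.g. 3 frames with two GPU calls of 12.5ms and 40.0ms
print(billable_ms(3, [12.5, 40.0]))  # 3*100 + 52.5 = 352.5
```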
Currently the CPU orchestrator only returns its own wall-clock X-Processing-Time. We need to additionally return the remote GPU processing times collected during workflow execution, so the billing service has the full picture.
Key constraints:
- Don't change the usage collector (serves broader reporting purposes)
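One way to return the remote GPU times without touching the usage collector is to accumulate them in the orchestrator as each GPU response arrives, then emit them alongside the existing wall-clock header. The sketch below is a minimal illustration under stated assumptions: the `RemoteTimingCollector` class and the `X-Remote-Processing-Time` header name are hypothetical, and header handling is shown on plain dicts rather than a real HTTP client.

```python
class RemoteTimingCollector:
    """Accumulates GPU X-Processing-Time values seen during one workflow run.

    Hypothetical sketch: names and the extra header are assumptions,
    not the deployed orchestrator's API.
    """

    def __init__(self):
        self.gpu_times_ms = []

    def record(self, response_headers: dict) -> None:
        """Call with each GPU instance's response headers."""
        value = response_headers.get("X-Processing-Time")
        if value is not None:
            self.gpu_times_ms.append(float(value))

    def response_headers(self, cpu_wall_clock_ms: float) -> dict:
        """Build the orchestrator's outgoing headers.

        Keeps the existing CPU wall-clock X-Processing-Time and adds the
        collected GPU times as a comma-separated list for billing.
        """
        headers = {"X-Processing-Time": f"{cpu_wall_clock_ms:.1f}"}
        if self.gpu_times_ms:
            headers["X-Remote-Processing-Time"] = ",".join(
                f"{t:.1f}" for t in self.gpu_times_ms
            )
        return headers


# Example: two GPU calls observed during a workflow, then build the response.
collector = RemoteTimingCollector()
collector.record({"X-Processing-Time": "12.5"})
collector.record({"X-Processing-Time": "40.0"})
print(collector.response_headers(200.0))
```

Because the collection happens in the orchestrator's HTTP layer, the usage collector itself stays untouched, satisfying the constraint above.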