Objective:

For `Task<T>` and `ValueTask<T>` (`T=int` as a common non-trivial but inlineable case), compare async performance in the synchronous scenario (i.e. where data happens to be buffered, which is common in deserialization and similar code) for 3 implementations:

- using `await` throughout
- using synchronous code until incompleteness is detected (via `IsCompleted`); switch via a local `async` method (`Awaited`) if needed
- using synchronous code until incompleteness is detected (via `IsCompletedSuccessfully`); switch via a local `async` method (`Awaited`) if needed

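
The implementations listed above can be sketched roughly as follows. This is a minimal illustration, not the benchmark code from the linked repository: `Source.ReadAsync`, `Sum.AwaitThroughout` and `Sum.Manual` are hypothetical names invented here, and the `buffered` flag only exists to let the same code exercise both the synchronous and the genuinely asynchronous path. The `IsCompleted` variant would differ only in the property tested inside the loop.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical data source standing in for a buffered reader (not from the linked repo).
static class Source
{
    // Completes synchronously when "buffered"; otherwise does real async work.
    public static ValueTask<int> ReadAsync(int value, bool buffered)
        => buffered ? new ValueTask<int>(value) : new ValueTask<int>(SlowAsync(value));

    private static async Task<int> SlowAsync(int value)
    {
        await Task.Yield(); // force an incomplete task
        return value;
    }
}

static class Sum
{
    // Implementation 1: await throughout.
    public static async ValueTask<int> AwaitThroughout(int count, bool buffered)
    {
        int total = 0;
        for (int i = 0; i < count; i++)
            total += await Source.ReadAsync(i, buffered);
        return total;
    }

    // Implementations 2 and 3: stay synchronous until a read is not finished,
    // then hand the remaining work to a local async method.
    public static ValueTask<int> Manual(int count, bool buffered)
    {
        int total = 0;
        for (int i = 0; i < count; i++)
        {
            ValueTask<int> pending = Source.ReadAsync(i, buffered);
            if (!pending.IsCompletedSuccessfully) // the other variant tests IsCompleted here
                return Awaited(pending, total, i + 1, count, buffered);
            total += pending.Result; // safe: already completed successfully
        }
        return new ValueTask<int>(total);

        // All state is passed as arguments so the synchronous path captures nothing.
        async ValueTask<int> Awaited(ValueTask<int> incomplete, int runningTotal,
                                     int next, int limit, bool isBuffered)
        {
            runningTotal += await incomplete;
            for (int j = next; j < limit; j++)
                runningTotal += await Source.ReadAsync(j, isBuffered);
            return runningTotal;
        }
    }
}

static class Program
{
    static void Main()
    {
        // Each line prints the sum of 0..99 regardless of which path is taken.
        Console.WriteLine(Sum.AwaitThroughout(100, buffered: true).AsTask().GetAwaiter().GetResult());
        Console.WriteLine(Sum.Manual(100, buffered: true).AsTask().GetAwaiter().GetResult());
        Console.WriteLine(Sum.Manual(100, buffered: false).AsTask().GetAwaiter().GetResult());
    }
}
```

In the manual version the outer method is not `async`, so when every read is already complete no async state machine is involved at all; that is the "hoisting" the rest of this write-up measures.
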
Note:

These findings should not be read as a general statement about anything async. They apply specifically to the scenario where you expect many or most of your results to be available synchronously. They absolutely do not apply if your code is usually or always going to do actual async work.

Conclusions:

Holy shoot, `ValueTask<int>` is awesome for the "already completed" scenario, if you manually hoist it. The `await` implementation doesn't gain any performance over `Task<int>`, but it doesn't allocate, so that's great. But if you use the manual "try to keep sync, switch if needed" approach you can reap awesome performance boosts.

When baselining the manually implemented `ValueTask<T>` versions (`int/manual/valuetask/*`) against the synchronous version (`int/sync`), there is essentially no overhead on netcoreapp1.1 (scaled 0.98), and very minimal overhead on net462 (scaled 1.06 to 1.08).

Additional observation: in the synchronous case of `ValueTask<T>`, the `IsCompletedSuccessfully` approach is at least as good as `IsCompleted` (identical within noise on netcoreapp1.1, about 2% faster on net462); in the case of `Task<int>`, `IsCompleted` is clearly faster on both frameworks, presumably because of https://github.com/dotnet/corefx/issues/16745, which calls out that `Status` is heavy-weight and cannot be inlined.

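
To make the two `Task<int>` checks concrete, here is a small sketch of how they differ in behaviour. It assumes the "completed successfully" test on a `Task<int>` is written as `Status == TaskStatus.RanToCompletion`; that is an assumption made for illustration, not something confirmed from the benchmark source. `TaskChecks` and its methods are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

static class TaskChecks
{
    // Cheap test: true once the task has finished in ANY way
    // (success, fault, or cancellation).
    public static bool TryViaIsCompleted(Task<int> task, out int value)
    {
        if (task.IsCompleted)
        {
            value = task.Result; // throws AggregateException if faulted or cancelled
            return true;
        }
        value = 0;
        return false;
    }

    // Stricter test that goes through Status: true only for successful completion.
    public static bool TryViaStatus(Task<int> task, out int value)
    {
        if (task.Status == TaskStatus.RanToCompletion)
        {
            value = task.Result;
            return true;
        }
        value = 0;
        return false;
    }
}

static class TaskChecksDemo
{
    static void Main()
    {
        Task<int> done = Task.FromResult(42);
        Task<int> pending = new TaskCompletionSource<int>().Task;
        Task<int> faulted = Task.FromException<int>(new InvalidOperationException("boom"));

        Console.WriteLine(TaskChecks.TryViaIsCompleted(done, out int a) && a == 42);
        Console.WriteLine(TaskChecks.TryViaStatus(done, out int b) && b == 42);
        Console.WriteLine(TaskChecks.TryViaIsCompleted(pending, out _));
        Console.WriteLine(TaskChecks.TryViaStatus(pending, out _));

        // The two checks disagree only for faulted or cancelled tasks.
        Console.WriteLine(faulted.IsCompleted);
        Console.WriteLine(TaskChecks.TryViaStatus(faulted, out _));
    }
}
```

With `IsCompleted`, a faulted task is handled on the synchronous path and the exception surfaces from `Result`; with the status check, a faulted task falls through to the awaited path instead. Either is workable, so the cheaper test can be chosen on performance grounds.
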
Code:
https://github.com/mgravell/protobuf-net/tree/async/src/TheAwaitingGame
Data:
> dotnet run -c Release -f netcoreapp1.1
Method | Mean | StdDev | Op/s | Scaled | Scaled-StdDev | Gen 0 | Allocated |
--------------------------------------------- |--------------:|-----------:|-----------:|-------:|--------------:|-------:|----------:|
int/await/task | 4,808.1011 ns | 34.5216 ns | 207982.32 | 6.70 | 0.05 | 3.4438 | 21.61 kB |
int/manual/task/iscompleted | 2,490.7947 ns | 24.4512 ns | 401478.29 | 3.47 | 0.04 | 3.4786 | 21.82 kB |
int/manual/task/iscompletedsuccessfully | 2,987.2058 ns | 25.0176 ns | 334761 | 4.16 | 0.04 | 3.4781 | 21.82 kB |
int/await/valuetask | 5,395.7703 ns | 27.8437 ns | 185330.35 | 7.52 | 0.05 | - | 0 kB |
int/manual/valuetask/iscompleted | 702.5910 ns | 4.0544 ns | 1423303.24 | 0.98 | 0.01 | - | 0 kB |
int/manual/valuetask/iscompletedsuccessfully | 702.9448 ns | 2.4655 ns | 1422586.73 | 0.98 | 0.01 | - | 0 kB |
int/sync | 717.4125 ns | 3.0914 ns | 1393898.17 | 1.00 | 0.00 | - | 0 kB |
> dotnet run -c Release -f net462
Method | Mean | StdDev | Op/s | Scaled | Scaled-StdDev | Gen 0 | Allocated |
--------------------------------------------- |--------------:|-----------:|-----------:|-------:|--------------:|-------:|----------:|
int/await/task | 7,165.8281 ns | 60.4568 ns | 139551.21 | 10.85 | 0.11 | 3.8271 | 24.08 kB |
int/manual/task/iscompleted | 2,769.5637 ns | 18.9399 ns | 361067.7 | 4.19 | 0.04 | 3.8651 | 24.32 kB |
int/manual/task/iscompletedsuccessfully | 3,167.4105 ns | 23.8556 ns | 315715.31 | 4.79 | 0.04 | 3.8646 | 24.32 kB |
int/await/valuetask | 7,056.9673 ns | 20.9213 ns | 141703.93 | 10.68 | 0.07 | - | 0 kB |
int/manual/valuetask/iscompleted | 712.3976 ns | 2.2813 ns | 1403710.43 | 1.08 | 0.01 | - | 0 kB |
int/manual/valuetask/iscompletedsuccessfully | 698.7291 ns | 1.7101 ns | 1431169.82 | 1.06 | 0.01 | - | 0 kB |
int/sync | 660.6417 ns | 4.0141 ns | 1513679.87 | 1.00 | 0.00 | - | 0 kB |
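
For reading the tables, the derived columns can be reproduced from `Mean`. The sketch below assumes the usual meaning of these columns (operations per second as the reciprocal of the mean time, and `Scaled` as the ratio to the `int/sync` baseline of the same run); `Columns` is a hypothetical helper, and results match the tables only up to the rounding of the printed means.

```csharp
using System;

static class Columns
{
    // Operations per second from a mean time in nanoseconds.
    public static double OpsPerSecond(double meanNs) => 1e9 / meanNs;

    // Ratio of a benchmark's mean to the baseline's mean.
    public static double Scaled(double meanNs, double baselineNs) => meanNs / baselineNs;
}

static class ColumnsDemo
{
    static void Main()
    {
        const double syncCore = 717.4125;        // int/sync, netcoreapp1.1
        const double manualCore = 702.5910;      // int/manual/valuetask/iscompleted
        const double awaitTaskCore = 4808.1011;  // int/await/task

        Console.WriteLine(Columns.OpsPerSecond(syncCore).ToString("F0"));
        Console.WriteLine(Columns.Scaled(manualCore, syncCore).ToString("F2"));
        Console.WriteLine(Columns.Scaled(awaitTaskCore, syncCore).ToString("F2"));
    }
}
```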