WebGPU has certain design principles that lead to certain decisions. This document is my opinion of what those principles are and some of the conclusions they lead to.
- Be Safe
- Be Portable
- Be Fast
- Be Memory Efficient
## Be Safe

This one is pretty obvious. It means, for example, all buffer usages are checked against sizes and offsets to ensure nothing is used out of bounds. Similarly, WGSL out of bounds array access is required to have defined behavior. Those are 2 of the many ways WebGPU chooses safety.
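As a minimal sketch of the first point (assuming a `device` already exists; the names here are only for illustration), binding a range that extends past the end of a buffer produces a validation error rather than an out-of-bounds access:

```js
// A small 16-byte uniform buffer.
const buffer = device.createBuffer({
  size: 16,
  usage: GPUBufferUsage.UNIFORM,
});

const bindGroupLayout = device.createBindGroupLayout({
  entries: [
    { binding: 0, visibility: GPUShaderStage.FRAGMENT, buffer: {} },
  ],
});

// The requested binding size (32) is larger than the 16-byte buffer,
// so WebGPU rejects this bind group with a validation error instead of
// letting shaders read past the end of the buffer.
const bindGroup = device.createBindGroup({
  layout: bindGroupLayout,
  entries: [
    { binding: 0, resource: { buffer, size: 32 } },
  ],
});
```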
## Be Portable

This one is also relatively obvious. Not everything can be portable. For example, different GPUs support different texture compression formats. WebGPU can't paper over that issue. Similarly, different GPUs have different limits. But, in general, as much as possible, WebGPU tries to do the portable thing: find the intersection in behavior across the various APIs and expose it in a portable way.
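One way the non-portable pieces are handled is to make them optional features the app must explicitly check for and request. A rough sketch (assuming the standard feature names for the three compression families):

```js
const adapter = await navigator.gpu.requestAdapter();

// Different GPUs support different compressed texture formats, so each
// family is an optional feature rather than a guaranteed part of the API.
const compressionFeatures = [
  'texture-compression-bc',    // typical on desktop GPUs
  'texture-compression-etc2',  // typical on mobile GPUs
  'texture-compression-astc',  // typical on mobile GPUs
].filter((f) => adapter.features.has(f));

const device = await adapter.requestDevice({
  requiredFeatures: compressionFeatures,
});
```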
Examples of places this comes up:
- depth textures are not filterable

Many devices do not support filtering depth textures when sampling. Some other APIs silently ignore this, so your app may not be getting what it asked for. WebGPU disallows this because that's the portable solution. Note that all WebGPU devices support filtering on comparisons (`textureSampleCompare`, `textureSampleCompareLevel`), but not on `textureSample` and related functions.
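A rough sketch of what that looks like on the API side (assuming a `device`; the layout names are only for illustration): a depth texture can be bound for comparison sampling, or as an unfilterable float texture, but not as a filterable one.

```js
// Comparison sampling (textureSampleCompare / textureSampleCompareLevel):
// portable, so this layout validates on every WebGPU device.
const shadowLayout = device.createBindGroupLayout({
  entries: [
    { binding: 0, visibility: GPUShaderStage.FRAGMENT,
      texture: { sampleType: 'depth' } },
    { binding: 1, visibility: GPUShaderStage.FRAGMENT,
      sampler: { type: 'comparison' } },
  ],
});

// Reading a depth texture with textureSample: it must be bound as
// 'unfilterable-float' and paired with a non-filtering sampler.
// Asking for a filterable 'float' binding with a depth texture fails validation.
const readDepthLayout = device.createBindGroupLayout({
  entries: [
    { binding: 0, visibility: GPUShaderStage.FRAGMENT,
      texture: { sampleType: 'unfilterable-float' } },
    { binding: 1, visibility: GPUShaderStage.FRAGMENT,
      sampler: { type: 'non-filtering' } },
  ],
});
```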
## Be Fast

I could have called this something more descriptive, but a guiding principle of WebGPU is: do not introduce heavy workarounds. Instead, pass the core features on to the application so it can decide if it wants to opt into some heavy workaround or re-design its own usage to not need the workaround.

Some examples of where this comes up:
- copying a buffer to a texture or texture to a buffer requires `bytesPerRow` to be a multiple of 256

In WebGL you could read a 4x4 rgba8 texture like this:

```js
const data = new Uint8Array(4 * 4 * 4);
gl.readPixels(0, 0, 4, 4, gl.RGBA, gl.UNSIGNED_BYTE, data);
```

In WebGPU, given the alignment requirements, you can't read a 4x4 texture directly into 64 bytes. Instead you'd need a buffer of 784 bytes (3 rows of 256 bytes plus one row of 16 bytes).
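A hedged sketch of that readback in WebGPU (assuming `device` and a 4x4 rgba8 `texture` already exist) shows where the 784 bytes come from:

```js
const width = 4;
const height = 4;
const bytesPerPixel = 4;                                        // rgba8unorm
const unpaddedBytesPerRow = width * bytesPerPixel;              // 16
const bytesPerRow = Math.ceil(unpaddedBytesPerRow / 256) * 256; // 256

// 3 full padded rows plus one final row of 16 bytes = 784 bytes.
const bufferSize = bytesPerRow * (height - 1) + unpaddedBytesPerRow;

const readbackBuffer = device.createBuffer({
  size: bufferSize,
  usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ,
});

const encoder = device.createCommandEncoder();
encoder.copyTextureToBuffer(
  { texture },
  { buffer: readbackBuffer, bytesPerRow },
  [width, height],
);
device.queue.submit([encoder.finish()]);
```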
This limit comes from one or more of the underlying APIs (Metal/DirectX/Vulkan). WebGL (via ANGLE or similar) would work around this limit for you. That's convenient but slow.
The "Be Fast" principle means that WebGPU does not provide this slow workaround (which might be slow on some machines and not others). Instead, by following the principle of "Be Fast", WebGPU gives you the fast option (pad your rows). If you want to stay fast, then you adjust your code to expect the padding. If you want the slow option then you implement one. This way, you know what you're getting. There is no magic happening behind the scenes that may or may not be adding overhead.
There were quite a few of these types of workarounds in WebGL for which you'd pay an invisible speed penalty. WebGPU instead opts to give you the fast path, and you can then choose to opt into slower workarounds.
## Be Memory Efficient

This principle effectively means: do not pick a solution that requires WebGPU to make internal copies of buffers or textures.
We can take the same example of `copyTextureToBuffer`. One solution, if WebGPU chose to work around this limit, would be to allocate a temporary padded buffer, read the texture into that padded buffer, copy the padded data to the user's unpadded buffer, and then delete the temporary buffer. WebGL is full of these kinds of temporary allocations to work around driver limits.

Allocation can fail. Allocation is also slow. WebGPU, again, chooses not to use these types of solutions and instead passes them on to the app developer, who can best choose whether they want a solution that uses more memory or would rather design their app not to need one.
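For completeness, a sketch of what that app-side choice can look like, building on the hypothetical `readbackBuffer`, `bytesPerRow`, and sizes from the earlier sketch: the app maps the padded buffer and strips the padding itself, rather than WebGPU doing it invisibly.

```js
await readbackBuffer.mapAsync(GPUMapMode.READ);
const padded = new Uint8Array(readbackBuffer.getMappedRange());

// Compact the padded rows into a tight width * height * 4 array.
// This extra copy is the "workaround"; the app opted into it knowingly.
const tight = new Uint8Array(width * height * bytesPerPixel);
for (let y = 0; y < height; ++y) {
  tight.set(
    padded.subarray(y * bytesPerRow, y * bytesPerRow + unpaddedBytesPerRow),
    y * unpaddedBytesPerRow,
  );
}

readbackBuffer.unmap();
```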