Last active
March 5, 2026 12:50
| vkperf (0.99.5) tests various performance characteristics of Vulkan devices. | |
| Devices in the system: | |
| AMD Radeon Graphics (RADV RENOIR) | |
| NVIDIA GeForce RTX 4070 Ti SUPER | |
| llvmpipe (LLVM 19.1.7, 256 bits) | |
| Selected device: | |
| NVIDIA GeForce RTX 4070 Ti SUPER | |
| VendorID: 0x10de (Nvidia) | |
| DeviceID: 0x2705 | |
| Vulkan version: 1.4.303 | |
| Driver version: 570.133.7.0 (2392932800, 0x8ea141c0) | |
| Driver name: NVIDIA | |
| Driver info: 570.133.07 | |
| DriverID: NvidiaProprietary | |
| Driver conformance version: 1.4.1.0 | |
| GPU memory: 16GiB (16376MiB) | |
| Max memory allocations: 4294967295 | |
| Standard (non-sparse) buffer alignment: 16 | |
| Number of triangles for tests: 1000000 | |
| Sparse mode for tests: None | |
| Timestamp number of bits: 64 | |
| Timestamp period: 1ns | |
| Vulkan Instance version: 1.4.328 | |
| Operating system: < unknown, non-Windows > | |
| Processor: AMD Ryzen 7 5700G with Radeon Graphics | |
| Triangle throughput: | |
| Triangle list (triangle list primitive type, | |
| single per-scene vkCmdDraw() call, attributeless, | |
| constant VS output): 10.38 giga-triangles/s | |
| Indexed triangle list (triangle list primitive type, single | |
| per-scene vkCmdDrawIndexed() call, no vertices shared between triangles, | |
| attributeless, constant VS output): 10.38 giga-triangles/s | |
| Indexed triangle list that reuses two indices of the previous triangle | |
| (triangle list primitive type, single per-scene vkCmdDrawIndexed() call, | |
| attributeless, constant VS output): 20.34 giga-triangles/s | |
| Triangle strips of various lengths | |
| (per-strip vkCmdDraw() call, 1 to 1000 triangles per strip, | |
| attributeless, constant VS output): | |
| strip length 1: 302.1 mega-triangles/s | |
| strip length 2: 606.9 mega-triangles/s | |
| strip length 5: 1.521 giga-triangles/s | |
| strip length 8: 2.435 giga-triangles/s | |
| strip length 10: 3.042 giga-triangles/s | |
| strip length 20: 6.103 giga-triangles/s | |
| strip length 25: 7.629 giga-triangles/s | |
| strip length 40: 12.36 giga-triangles/s | |
| strip length 50: 15.50 giga-triangles/s | |
| strip length 100: 30.51 giga-triangles/s | |
| strip length 125: 26.39 giga-triangles/s | |
| strip length 1000: 28.72 giga-triangles/s | |
| Indexed triangle strips of various lengths | |
| (per-strip vkCmdDrawIndexed() call, 1-1000 triangles per strip, | |
| no vertices shared between strips, each index used just once, | |
| attributeless, constant VS output): | |
| strip length 1: 277.1 mega-triangles/s | |
| strip length 2: 555.4 mega-triangles/s | |
| strip length 5: 1.391 giga-triangles/s | |
| strip length 8: 2.229 giga-triangles/s | |
| strip length 10: 2.790 giga-triangles/s | |
| strip length 20: 5.580 giga-triangles/s | |
| strip length 25: 7.025 giga-triangles/s | |
| strip length 40: 11.22 giga-triangles/s | |
| strip length 50: 14.15 giga-triangles/s | |
| strip length 100: 28.72 giga-triangles/s | |
| strip length 125: 31.50 giga-triangles/s | |
| strip length 1000: 28.72 giga-triangles/s | |
| Primitive restart indexed triangle strips of various lengths | |
| (single per-scene vkCmdDrawIndexed() call, 1-1000 triangles per strip, | |
| no vertices shared between strips, each index used just once, | |
| attributeless, constant VS output): | |
| strip length 1: 1.903 giga-triangles/s | |
| strip length 2: 3.685 giga-triangles/s | |
| strip length 5: 8.346 giga-triangles/s | |
| strip length 8: 12.20 giga-triangles/s | |
| strip length 1000: 27.90 giga-triangles/s | |
| Primitive restart, each triangle is replaced by one -1 | |
| (single per-scene vkCmdDrawIndexed() call, | |
| no fragments produced): 2.077 giga-triangles/s | |
| Primitive restart, only zeros in the index buffer | |
| (single per-scene vkCmdDrawIndexed() call, | |
| no fragments produced): 30.51 giga-triangles/s | |
| Instancing throughput of vkCmdDraw() | |
| (one triangle per instance, constant VS output, one draw call, | |
| attributeless): 2.146 giga-triangles/s | |
| Instancing throughput of vkCmdDrawIndexed() | |
| (one triangle per instance, constant VS output, one draw call, | |
| attributeless): 2.077 giga-triangles/s | |
| Instancing throughput of vkCmdDrawIndirect() | |
| (one triangle per instance, one indirect draw call, | |
| one indirect record, attributeless: 2.141 giga-triangles/s | |
| Instancing throughput of vkCmdDrawIndexedIndirect() | |
| (one triangle per instance, one indirect draw call, | |
| one indirect record, attributeless: 2.077 giga-triangles/s | |
| vkCmdDraw() throughput | |
| (per-triangle vkCmdDraw() in command buffer, | |
| attributeless, constant VS output): 302.8 mega-triangles/s | |
| vkCmdDrawIndexed() throughput | |
| (per-triangle vkCmdDrawIndexed() in command buffer, | |
| attributeless, constant VS output): 276.9 mega-triangles/s | |
| VkDrawIndirectCommand processing throughput | |
| (per-triangle VkDrawIndirectCommand, one vkCmdDrawIndirect() call, | |
| attributeless): 187.8 mega-indirectRecords/s | |
| VkDrawIndirectCommand processing throughput with stride 32 | |
| (per-triangle VkDrawIndirectCommand, one vkCmdDrawIndirect() call, | |
| attributeless): 120.5 mega-indirectRecords/s | |
| VkDrawIndexedIndirectCommand processing throughput | |
| (per-triangle VkDrawIndexedIndirectCommand, | |
| 1x vkCmdDrawIndexedIndirect() call, | |
| attributeless): 143.6 mega-indirectRecords/s | |
| VkDrawIndexedIndirectCommand processing throughput with stride 32 | |
| (per-triangle VkDrawIndexedIndirectCommand, | |
| 1x vkCmdDrawIndexedIndirect() call, | |
| attributeless): 117.6 mega-indirectRecords/s | |
| Vertex and geometry shader throughput: | |
| VS throughput using vkCmdDraw() - minimal VS that just writes | |
| constant output position (per-scene vkCmdDraw() call, | |
| no attributes, no fragments produced): 31.16 giga-vertices/s | |
| VS throughput using vkCmdDrawIndexed() - minimal VS that just writes | |
| constant output position (per-scene vkCmdDrawIndexed() call, | |
| no attributes, no fragments produced): 31.16 giga-vertices/s | |
| VS producing output position from VertexIndex and InstanceIndex | |
| using vkCmdDraw() (single per-scene vkCmdDraw() call, | |
| attributeless, no fragments produced): 31.16 giga-vertices/s | |
| VS producing output position from VertexIndex and InstanceIndex | |
| using vkCmdDrawIndexed() (single per-scene vkCmdDrawIndexed() call, | |
| attributeless, no fragments produced): 31.16 giga-vertices/s | |
| GS one triangle in and no triangle out | |
| (empty VS, attributeless): 3.577 giga-invocations/s | |
| GS one triangle in and single constant triangle out | |
| (empty VS, attributeless): 3.577 giga-invocations/s | |
| GS one triangle in and two constant triangles out | |
| (empty VS, attributeless): 3.577 giga-invocations/s | |
| Attributes and buffers: | |
| One attribute performance - 1x vec4 attribute | |
| (attribute used, per-scene draw call): 30.83 giga-vertices/s | |
| One buffer performance - 1x vec4 buffer | |
| (1x read in VS, per-scene draw call): 30.83 giga-vertices/s | |
| One buffer performance - 1x vec3 buffer | |
| (1x read in VS, one draw call): 31.16 giga-vertices/s | |
| Two attributes performance - 2x vec4 attribute | |
| (both attributes used): 19.92 giga-vertices/s | |
| Two buffers performance - 2x vec4 buffer | |
| (both buffers read in VS): 19.66 giga-vertices/s | |
| Two buffers performance - 2x vec3 buffer | |
| (both buffers read in VS): 25.92 giga-vertices/s | |
| Two interleaved attributes performance - 2x vec4 | |
| (2x vec4 attribute fetched from the single buffer in VS | |
| from consecutive buffer locations: 19.79 giga-vertices/s | |
| Two interleaved buffers performance - 2x vec4 | |
| (2x vec4 fetched from the single buffer in VS | |
| from consecutive buffer locations: 20.92 giga-vertices/s | |
| Packed buffer performance - 1x buffer using 32-byte struct unpacked | |
| into position+normal+color+texCoord: 20.06 giga-vertices/s | |
| Packed attribute performance - 2x uvec4 attribute unpacked | |
| into position+normal+color+texCoord: 19.92 giga-vertices/s | |
| Packed buffer performance - 2x uvec4 buffers unpacked | |
| into position+normal+color+texCoord: 19.66 giga-vertices/s | |
| Packed buffer performance - 2x buffer using 16-byte struct unpacked | |
| into position+normal+color+texCoord: 19.66 giga-vertices/s | |
| Packed buffer performance - 2x buffer using 16-byte struct | |
| read multiple times and unpacked | |
| into position+normal+color+texCoord: 19.66 giga-vertices/s | |
| Four attributes performance - 4x vec4 attribute | |
| (all attributes used): 10.10 giga-vertices/s | |
| Four buffers performance - 4x vec4 buffer | |
| (all buffers read in VS): 10.53 giga-vertices/s | |
| Four buffers performance - 4x vec3 buffer | |
| (all buffers read in VS): 13.81 giga-vertices/s | |
| Four interleaved attributes performance - 4x vec4 | |
| (4x vec4 fetched from the single buffer | |
| on consecutive locations: 10.10 giga-vertices/s | |
| Four interleaved buffers performance - 4x vec4 | |
| (4x vec4 fetched from the single buffer | |
| on consecutive locations: 10.61 giga-vertices/s | |
| Four attributes performance - 2x vec4 and 2x uint attribute | |
| (2x vec4f32 + 2x vec4u8, 2x conversion from vec4u8 | |
| to vec4): 15.58 giga-vertices/s | |
| Transformations: | |
| Matrix performance - one matrix as uniform for all triangles | |
| (maxtrix read in VS, | |
| coordinates in vec4 attribute): 30.83 giga-vertices/s | |
| Matrix performance - per-triangle matrix in buffer | |
| (different matrix read for each triangle in VS, | |
| coordinates in vec4 attribute): 17.13 giga-vertices/s | |
| Matrix performance - per-triangle matrix in attribute | |
| (triangles are instanced and each triangle receives a different matrix, | |
| coordinates in vec4 attribute: 5.847 giga-vertices/s | |
| Matrix performance - one matrix in buffer for all triangles and 2x uvec4 | |
| packed attributes (each triangle reads matrix from the same place in | |
| the buffer, attributes unpacked): 19.92 giga-vertices/s | |
| Matrix performance - per-triangle matrix in the buffer and 2x uvec4 packed | |
| attributes (each triangle reads a different matrix from a buffer, | |
| attributes unpacked): 12.15 giga-vertices/s | |
| Matrix performance - per-triangle matrix in buffer and 2x uvec4 packed | |
| buffers (each triangle reads a different matrix from a buffer, | |
| packed buffers unpacked): 12.68 giga-vertices/s | |
| Matrix performance - GS reads per-triangle matrix from buffer and 2x uvec4 | |
| packed buffers (each triangle reads a different matrix from a buffer, | |
| packed buffers unpacked in GS): 9.212 giga-vertices/s | |
| Matrix performance - per-triangle matrix in buffer and four attributes | |
| (each triangle reads a different matrix from a buffer, | |
| 4x vec4 attribute): 7.609 giga-vertices/s | |
| Matrix performance - 1x per-triangle matrix in buffer, 2x uniform matrix and | |
| and 2x uvec4 packed attributes (uniform view and projection matrices | |
| multiplied with per-triangle model matrix and with unpacked attributes of | |
| position, normal, color and texCoord: 12.15 giga-vertices/s | |
| Matrix performance - 2x per-triangle matrix (mat4+mat3) in buffer, | |
| 3x uniform matrix (mat4+mat4+mat3) and 2x uvec4 packed attributes | |
| (full position and normal computation with MVP and normal matrices, | |
| all matrices and attributes multiplied): 9.668 giga-vertices/s | |
| Matrix performance - 2x per-triangle matrix (mat4+mat3) in buffer, | |
| 2x non-changing matrix (mat4+mat4) in push constants, | |
| 1x constant matrix (mat3) and 2x uvec4 packed attributes (all | |
| matrices and attributes multiplied): 9.668 giga-vertices/s | |
| Matrix performance - 2x per-triangle matrix (mat4+mat3) in buffer, 2x | |
| non-changing matrix (mat4+mat4) in specialization constants, 1x constant | |
| matrix (mat3) defined by VS code and 2x uvec4 packed attributes (all | |
| matrices and attributes multiplied): 9.511 giga-vertices/s | |
| Matrix performance - 2x per-triangle matrix (mat4+mat3) in buffer, | |
| 3x constant matrix (mat4+mat4+mat3) defined by VS code and | |
| 2x uvec4 packed attributes (all matrices and attributes | |
| multiplied): 9.574 giga-vertices/s | |
| Matrix performance - GS five matrices processing, 2x per-triangle matrix | |
| (mat4+mat3) in buffer, 3x uniform matrix (mat4+mat4+mat3) and | |
| 2x uvec4 packed attributes passed through VS (all matrices and | |
| attributes multiplied): 8.394 giga-vertices/s | |
| Matrix performance - GS five matrices processing, 2x per-triangle matrix | |
| (mat4+mat3) in buffer, 3x uniform matrix (mat4+mat4+mat3) and | |
| 2x uvec4 packed data read from buffer in GS (all matrices and attributes | |
| multiplied): 7.550 giga-vertices/s | |
| Textured Phong and Matrix performance - 2x per-triangle matrix | |
| in buffer (mat4+mat3), 3x uniform matrix (mat4+mat4+mat3) and | |
| four attributes (vec4f32+vec3f32+vec4u8+vec2f32), | |
| no fragments produced: 8.394 giga-vertices/s | |
| Textured Phong and Matrix performance - 1x per-triangle matrix | |
| in buffer (mat4), 2x uniform matrix (mat4+mat4) and | |
| four attributes (vec4f32+vec3f32+vec4u8+vec2f32), | |
| no fragments produced: 10.50 giga-vertices/s | |
| Textured Phong and Matrix performance - 1x per-triangle matrix | |
| in buffer (mat4), 2x uniform matrix (mat4+mat4) and 2x uvec4 packed | |
| attribute, no fragments produced: 12.15 giga-vertices/s | |
| Textured Phong and Matrix performance - 1x per-triangle row-major matrix | |
| in buffer (mat4), 2x uniform not-row-major matrix (mat4+mat4), | |
| 2x uvec4 packed attributes, | |
| no fragments produced: 12.15 giga-vertices/s | |
| Textured Phong and Matrix performance - 1x per-triangle mat4x3 matrix | |
| in buffer, 2x uniform matrix (mat4+mat4) and 2x uvec4 packed attributes, | |
| no fragments produced: 13.37 giga-vertices/s | |
| Textured Phong and Matrix performance - 1x per-triangle row-major mat4x3 | |
| matrix in buffer, 2x uniform matrix (mat4+mat4), 2x uvec4 packed | |
| attribute, no fragments produced: 13.43 giga-vertices/s | |
| Textured Phong and PAT performance - PAT v1 (Position-Attitude-Transform, | |
| performing translation (vec3) and rotation (quaternion as vec4) using | |
| implementation 1), PAT is per-triangle 2x vec4 in buffer, | |
| 2x uniform matrix (mat4+mat4), 2x uvec4 packed attributes, | |
| no fragments produced: 15.02 giga-vertices/s | |
| Textured Phong and PAT performance - PAT v2 (Position-Attitude-Transform, | |
| performing translation (vec3) and rotation (quaternion as vec4) using | |
| implementation 2), PAT is per-triangle 2x vec4 in buffer, | |
| 2x uniform matrix (mat4+mat4), 2x uvec4 packed attributes, | |
| no fragments produced: 14.94 giga-vertices/s | |
| Textured Phong and PAT performance - PAT v3 (Position-Attitude-Transform, | |
| performing translation (vec3) and rotation (quaternion as vec4) using | |
| implementation 3), PAT is per-triangle 2x vec4 in buffer, | |
| 2x uniform matrix (mat4+mat4), 2x uvec4 packed attributes, | |
| no fragments produced: 15.02 giga-vertices/s | |
| Textured Phong and PAT performance - constant single PAT v2 sourced from | |
| the same index in buffer (vec3+vec4), 2x uniform matrix (mat4+mat4), | |
| 2x uvec4 packed attributes, | |
| no fragments produced: 19.92 giga-vertices/s | |
| Textured Phong and PAT performance - indexed draw call, per-triangle PAT v2 | |
| in buffer (vec3+vec4), 2x uniform matrix (mat4+mat4), 2x uvec4 packed | |
| attribute, no fragments produced: 13.75 giga-vertices/s | |
| Textured Phong and PAT performance - indexed draw call, constant single | |
| PAT v2 sourced from the same index in buffer (vec3+vec4), | |
| 2x uniform matrix (mat4+mat4), 2x uvec4 packed attributes, | |
| no fragments produced: 17.75 giga-vertices/s | |
| Textured Phong and PAT performance - primitive restart, indexed draw call, | |
| per-triangle PAT v2 in buffer (vec3+vec4), 2x uniform matrix (mat4+mat4), | |
| 2x uvec4 packed attributes, | |
| no fragments produced: 5.710 giga-vertices/s | |
| Textured Phong and PAT performance - primitive restart, indexed draw call, | |
| constant single PAT v2 sourced from the same index in buffer (vec3+vec4), | |
| 2x uniform matrix (mat4+mat4), 2x uvec4 packed attributes, | |
| no fragments produced: 5.710 giga-vertices/s | |
| Textured Phong and double precision matrix performance - double precision | |
| per-triangle matrix in buffer (dmat4), double precision per-scene view | |
| matrix in uniform (dmat4), both matrices converted to single precision | |
| before computations, single precision per-scene perspective matrix in | |
| uniform (mat4), single precision vertex positions, packed attributes | |
| (2x uvec4), no fragments produced: 8.719 giga-vertices/s | |
| Textured Phong and double precision matrix performance - double precision | |
| per-triangle matrix in buffer (dmat4), double precision per-scene view | |
| matrix in uniform (dmat4), both matrices multiplied in double precision, | |
| single precision vertex positions, single precision per-scene | |
| perspective matrix in uniform (mat4), packed attributes (2x uvec4), | |
| no fragments produced: 5.203 giga-vertices/s | |
| Textured Phong and double precision matrix performance - double precision | |
| per-triangle matrix in buffer (dmat4), double precision per-scene view | |
| matrix in uniform (dmat4), both matrices multiplied in double precision, | |
| double precision vertex positions (dvec3), single precision per-scene | |
| perspective matrix in uniform (mat4), packed attributes (3x uvec4), | |
| no fragments produced: 5.415 giga-vertices/s | |
| Textured Phong and double precision matrix performance using GS - double | |
| precision per-triangle matrix in buffer (dmat4), double precision | |
| per-scene view matrix in uniform (dmat4), both matrices multiplied in | |
| double precision, double precision vertex positions (dvec3), single | |
| precision per-scene perspective matrix in uniform (mat4), packed | |
| attributes (3x uvec4), | |
| no fragments produced: 2.013 giga-vertices/s | |
| Fragment throughput: | |
| Single full-framebuffer quad, | |
| constant color FS: 135.0 giga-fragments/s | |
| 10x full-framebuffer quad, | |
| constant color FS: 202.5 giga-fragments/s | |
| Four smooth interpolators (4x vec4), | |
| 10x fullscreen quad: 164.6 giga-fragments/s | |
| Four flat interpolators (4x vec4), | |
| 10x fullscreen quad: 174.5 giga-fragments/s | |
| Four textured phong interpolators (vec3+vec3+vec4+vec2), | |
| 10x fullscreen quad: 200.4 giga-fragments/s | |
| Textured Phong, packed uniforms (four smooth interpolators | |
| (vec3+vec3+vec4+vec2), 4x uniform (material (56 byte) + | |
| globalAmbientLight (12 byte) + light (64 byte) + sampler2D), | |
| 10x fullscreen quad): 120.5 giga-fragments/s | |
| Textured Phong, not packed uniforms (four smooth interpolators | |
| (vec3+vec3+vec4+vec2), 4x uniform (material (72 byte) + | |
| globalAmbientLight (12 byte) + light (80 byte) + sampler2D), | |
| 10x fullscreen quad): 120.5 giga-fragments/s | |
| Simplified Phong, no texture, no specular (2x smooth interpolator | |
| (vec3+vec3), 3x uniform (material (vec4+vec4) + globalAmbientLight | |
| (vec3) + light (48 bytes: position+attenuation+ambient+diffuse)), | |
| 10x fullscreen quad): 198.5 giga-fragments/s | |
| Simplified Phong, no texture, no specular, single uniform | |
| (2x smooth interpolator (vec3+vec3), 1x uniform | |
| (material+globalAmbientLight+light (vec4+vec4+vec4 + 3x vec4), | |
| 10x fullscreen quad): 196.6 giga-fragments/s | |
| Constant color from uniform, 1x uniform (vec4) in FS, | |
| 10x fullscreen quad: 202.5 giga-fragments/s | |
| Constant color from uniform, 1x uniform (uint) in FS, | |
| 10x fullscreen quad: 202.5 giga-fragments/s | |
| Transfer throughput: | |
| Transfer of consecutive blocks: | |
| 4 bytes: 12.4224ns per transfer (0.299885 GiB/s) | |
| 4 bytes: 8.9664ns per transfer (0.415472 GiB/s) | |
| 8 bytes: 11.0016ns per transfer (0.677227 GiB/s) | |
| 16 bytes: 11.1648ns per transfer (1.33466 GiB/s) | |
| 32 bytes: 11.4688ns per transfer (2.59856 GiB/s) | |
| 64 bytes: 11.5456ns per transfer (5.16254 GiB/s) | |
| 128 bytes: 12.4512ns per transfer (9.57412 GiB/s) | |
| 256 bytes: 17.4805ns per transfer (13.6391 GiB/s) | |
| 512 bytes: 28.9609ns per transfer (16.4648 GiB/s) | |
| 1024 bytes: 53.5938ns per transfer (17.7945 GiB/s) | |
| 2048 bytes: 104.031ns per transfer (18.3344 GiB/s) | |
| 4096 bytes: 204.25ns per transfer (18.6766 GiB/s) | |
| 8192 bytes: 405.75ns per transfer (18.8032 GiB/s) | |
| 16384 bytes: 812ns per transfer (18.7916 GiB/s) | |
| 32768 bytes: 1623ns per transfer (18.8032 GiB/s) | |
| 65536 bytes: 3246ns per transfer (18.8032 GiB/s) | |
| 131072 bytes: 6494ns per transfer (18.7974 GiB/s) | |
| 262144 bytes: 12672ns per transfer (19.2661 GiB/s) | |
| 524288 bytes: 2048ns per transfer (238.419 GiB/s) | |
| 1048576 bytes: 3584ns per transfer (272.478 GiB/s) | |
| 2097152 bytes: 6144ns per transfer (317.891 GiB/s) | |
| Transfer of spaced blocks: | |
| 4 bytes: 8.9632ns per transfer (0.415621 GiB/s) | |
| 4 bytes: 8.9632ns per transfer (0.415621 GiB/s) | |
| 8 bytes: 8.96ns per transfer (0.831538 GiB/s) | |
| 16 bytes: 8.9632ns per transfer (1.66248 GiB/s) | |
| 32 bytes: 8.9696ns per transfer (3.32259 GiB/s) | |
| 64 bytes: 8.9792ns per transfer (6.63808 GiB/s) | |
| 128 bytes: 9.9168ns per transfer (12.0209 GiB/s) | |
| 256 bytes: 16.4648ns per transfer (14.4805 GiB/s) | |
| 512 bytes: 25.875ns per transfer (18.4285 GiB/s) | |
| 1024 bytes: 46.5156ns per transfer (20.5022 GiB/s) | |
| 2048 bytes: 90.0938ns per transfer (21.1707 GiB/s) | |
| 4096 bytes: 180.375ns per transfer (21.1487 GiB/s) | |
| 8192 bytes: 346.375ns per transfer (22.0264 GiB/s) | |
| 16384 bytes: 671.25ns per transfer (22.7319 GiB/s) | |
| 32768 bytes: 1381.5ns per transfer (22.0902 GiB/s) | |
| 65536 bytes: 2910ns per transfer (20.9743 GiB/s) | |
| 131072 bytes: 5838ns per transfer (20.9096 GiB/s) | |
| 262144 bytes: 10752ns per transfer (22.7065 GiB/s) | |
| 524288 bytes: 13312ns per transfer (36.6798 GiB/s) | |
| 1048576 bytes: 17440ns per transfer (55.9956 GiB/s) | |
| 2097152 bytes: 20480ns per transfer (95.3674 GiB/s) | |
| Measurement statistics: | |
| Triangle throughput measurement time: 10.5 seconds using 413 test rounds. | |
| Vertex throughput measurement time: 0.505 seconds using 413 test rounds. | |
| Attribute and Buffer measurement time: 1.37 seconds using 413 test rounds. | |
| Transformation measurement time: 4.6 seconds using 413 test rounds. | |
| Fragment throughput measurement time: 0.504 seconds using 413 test rounds. | |
| Transfer throughput measurement time: 1.58 seconds using 413 test rounds. | |
| Total device time: 18.5 seconds. | |
| Total real time: 20 seconds. |
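
The GiB/s figures in the transfer section appear to follow directly from the block size and the per-transfer time. The sketch below is not part of vkperf; it assumes that 1 GiB = 2^30 bytes and that the reported time covers exactly one block. The helper name `gib_per_second` is hypothetical, and the sample rows are copied from the "Transfer of consecutive blocks" list above.

```python
GIB = 1024 ** 3  # assumption: the log's "GiB" means 2**30 bytes


def gib_per_second(block_bytes: int, ns_per_transfer: float) -> float:
    """Convert a block size and a per-transfer time in nanoseconds into GiB/s."""
    bytes_per_second = block_bytes / (ns_per_transfer * 1e-9)
    return bytes_per_second / GIB


# (block size in bytes, ns per transfer, GiB/s as reported in the log)
samples = [
    (4096, 204.25, 18.6766),
    (65536, 3246.0, 18.8032),
    (2097152, 6144.0, 317.891),
]

for size, ns, reported in samples:
    computed = gib_per_second(size, ns)
    print(f"{size} bytes: computed {computed:.4f} GiB/s, reported {reported} GiB/s")
```

The same conversion can be applied to any other row, for example to compare the 262144-byte and 524288-byte consecutive transfers, where the reported time per transfer drops from 12672ns to 2048ns. The log itself does not state the cause of that jump.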