Visualization of the MobileNet-v3-large shapes (ordered similarly by decreasing time percentage, so the most important shapes come first).
# Training data consist of X=(str, str), y=float:
from sklearn.model_selection import train_test_split
X = [
    ["Hello World!", "Good morning!"],
    ["It is raining", "It is cold"],
    ["Beautiful city beside mountain", "Quiet street in downtown area"],
    ["AI is the future", "AI is just a tool"],
    ["This application is great", "software is the problem"],
    ["Hello World!", "Good morning!"],
    ["It is raining", "It is cold"],
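The listing is cut off before any labels appear. As a minimal sketch of how sentence-pair data of this shape could be split into training and test sets, the example below reuses the five distinct pairs and pairs them with made-up `y` scores (hypothetical values, not from the original):

```python
from sklearn.model_selection import train_test_split

# Sentence pairs taken from the listing; each sample is (str, str).
X = [
    ["Hello World!", "Good morning!"],
    ["It is raining", "It is cold"],
    ["Beautiful city beside mountain", "Quiet street in downtown area"],
    ["AI is the future", "AI is just a tool"],
    ["This application is great", "software is the problem"],
]
# Hypothetical float targets, one per pair (assumed for illustration only).
y = [0.8, 0.6, 0.3, 0.5, 0.1]

# Hold out 20% of the pairs; a fixed seed keeps the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
```

With five samples and `test_size=0.2`, one pair is held out and four remain for training.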
# Stick this in C:\Users\<login>\OneDrive\Documents\PowerShell\
# Test-Path -Path $PROFILE
# New-Item -ItemType File -Path $PROFILE -Force
# code $PROFILE
$vswhere = "${env:ProgramFiles(x86)}\Microsoft Visual Studio\Installer\vswhere.exe"
# Check if vswhere exists, if not you may need to adjust the path or install the VS installer
if (-not (Test-Path $vswhere)) {
    Write-Host "vswhere.exe not found at the expected location."
    # Provide a link to the official vswhere documentation if needed
}
hipblaslt-bench --api_method c -m 768 -n 77 -k 768 --lda 768 --ldb 768 --ldc 768 --ldd 768 --stride_a 0 --stride_b 0 --stride_c 0 --stride_d 0 --alpha 1.000000 --beta 0.000000 --transA T --transB N --batch_count 1 --scaleA 0 --scaleB 0 --bias_vector --bias_source d --a_type f32_r --b_type f32_r --c_type f32_r --d_type f32_r --scale_type f32_r --bias_type f32_r --compute_type f32_r --algo_method index --solution_index 1270 --activation_type none --any_stride --rotating 0 --cold_iters 0 --iters 0
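For orientation, the problem this command benchmarks can be written out as a plain NumPy reference. This is only a sketch of the arithmetic implied by the flags (`-m 768 -n 77 -k 768`, `--transA T`, `--transB N`, `--alpha 1`, `--beta 0`, `--bias_vector`, f32 types); it does not call hipBLASLt, ignores the column-major storage and leading dimensions, and assumes the bias vector holds one value per row of the output. The function name `gemm_bias` is hypothetical.

```python
import numpy as np

def gemm_bias(a, b, c, bias, alpha=1.0, beta=0.0):
    """Reference for D = alpha * (A^T @ B) + beta * C + bias.

    The bias is assumed to have length m and is broadcast across columns.
    """
    return alpha * (a.T @ b) + beta * c + bias[:, None]

m, n, k = 768, 77, 768
rng = np.random.default_rng(0)

a = rng.standard_normal((k, m), dtype=np.float32)  # k x m, used transposed (transA = T)
b = rng.standard_normal((k, n), dtype=np.float32)  # k x n, used as-is (transB = N)
c = np.zeros((m, n), dtype=np.float32)             # ignored in effect because beta = 0
bias = rng.standard_normal(m, dtype=np.float32)    # assumed: one bias value per output row

d = gemm_bias(a, b, c, bias, alpha=1.0, beta=0.0)  # m x n result in float32
```

A reference like this can be handy for checking a specific `--solution_index` against expected values on small inputs.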
Launch docker:
rocm/sgl-dev:v0.5.8.post1-rocm720-mi30x-20260211-preview
Inside container -
Install Sglang from source by following -
https://docs.sglang.io/platforms/amd_gpu.html#install-from-source
Install transformer from source -
The GFX1250→GFX942 cross-family ISA transpiler passes 18-19/20 tests. The remaining failure is the matmul_splitk test, where the inner loop exits prematurely and non-deterministically. This document details the exhaustive investigation.
The split-K matmul uses two kernels:
matmul_splitk_compute (236 GFX12 instructions): Each workgroup computes a partial matmul for a chunk of K. Uses blockIdx.y for the split index.
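The split-K scheme can be illustrated with a small NumPy reference. This is a sketch, not the kernels under test: the loop variable `split` stands in for `blockIdx.y`, and because the second kernel is not shown in this excerpt, the sketch assumes it simply sums the per-split partial results. The function name `matmul_splitk` and the chunking via `np.linspace` are illustrative choices.

```python
import numpy as np

def matmul_splitk(a, b, num_splits):
    """Compute a @ b by splitting the K dimension into num_splits chunks."""
    m, k = a.shape
    n = b.shape[1]
    # Chunk boundaries along K; chunks may differ in size if K is not divisible.
    bounds = np.linspace(0, k, num_splits + 1).astype(int)

    # Step 1 (compute): each split produces a partial m x n product
    # over its own slice of K.
    partials = np.zeros((num_splits, m, n), dtype=np.float32)
    for split in range(num_splits):  # plays the role of blockIdx.y
        lo, hi = bounds[split], bounds[split + 1]
        partials[split] = a[:, lo:hi] @ b[lo:hi, :]

    # Step 2 (assumed reduction): sum the partial products.
    return partials.sum(axis=0)

rng = np.random.default_rng(0)
a = rng.standard_normal((8, 64), dtype=np.float32)
b = rng.standard_normal((64, 5), dtype=np.float32)
result = matmul_splitk(a, b, num_splits=4)
```

If the inner loop over a K chunk exits early, the affected partial product is missing terms, so comparing each `partials[split]` against a reference slice product is one way to localize which split is wrong.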
Experimental effort to run ROCm + PyTorch on a gfx1201 GPU (AMD Radeon RX 9070 XT /
AI PRO R9700, RDNA4, PCI 0x1002:0x7551) through a shared, firmware-light lite::
ROCr backend that programs the GPU's compute queue directly from userspace, on three
OSes that lack the usual ROCm/KFD kernel path:
- macOS (Apple Silicon): AMD eGPU over Thunderbolt, via a DriverKit DEXT (no kernel driver).
- Linux (x86): amdgpu_lite minimal kernel shim + a userspace bring-up.
- Windows: userspace D3DKMTEscape backend over the production amdgpu_wddm KMD; compute shader dispatch working on gfx1201.
