@hitorilabs
Last active November 15, 2024 03:25

Since I don't know how computers work... I'm always running into issues and for some reason I tend to never read the errors or logs.

NOT TODAY!

This time I only had to think about it for a few moments. I was trying to figure out how to get ncu (the Nsight Compute CLI) to run:

ncu --set full -o output python3 prof.py

I already have ncu installed, but it spits out an error saying something like this

==ERROR== ERR_NVGPUCTRPERM - The user does not have permission to access NVIDIA GPU Performance Counters on the target device 0. For instructions on enabling permissions and to get more information see https://developer.nvidia.com/ERR_NVGPUCTRPERM

ok... so then I prepend sudo and then...

sudo: ncu: command not found

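(???) Well, the reason is quite obvious once I think about it: the CUDA paths are only added to $PATH in my non-root user's .bashrc, and sudo doesn't use that $PATH when it looks up commands (by default it typically swaps in its own restricted one), so of course it can't find ncu.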

Instead of relying on sudo's $PATH, we should just hand sudo the absolute path. $(which ncu) is expanded by my own shell, with my own $PATH, before sudo ever runs.

sudo $(which ncu) --set full -o output python3 prof.py
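If you want to see the mechanism without touching sudo or ncu, here is a minimal sketch. Everything in it is made up for illustration: hypothetical_tool is a throwaway script in a temp directory, and env PATH=/usr/bin:/bin is only a stand-in for the restricted PATH that sudo typically applies (the real value depends on your sudoers config).

```shell
# Create a throwaway tool in a directory that only *our* PATH knows about.
tooldir=$(mktemp -d)
printf '#!/bin/sh\necho "fake tool ran"\n' > "$tooldir/hypothetical_tool"
chmod +x "$tooldir/hypothetical_tool"
PATH="$tooldir:$PATH"

# Resolve the absolute path in the current shell (what $(which ...) does).
abs=$(command -v hypothetical_tool)

# Stand-in for sudo's restricted PATH: lookup by name fails...
lookup=$(env PATH=/usr/bin:/bin sh -c 'command -v hypothetical_tool || echo "not found"')

# ...but the absolute path still works, because no PATH lookup is needed.
direct=$(env PATH=/usr/bin:/bin "$abs")

echo "by name:          $lookup"
echo "by absolute path: $direct"

rm -rf "$tooldir"
```

The point is just that the name gets resolved before the environment changes, which is what happens when the shell expands $(which ncu) before calling sudo.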

Now run it again and I get

ModuleNotFoundError: No module named 'torch'                                                                    
==ERROR== The application returned an error code (1). 

Ok, now that I'm using my brain again, it's obvious what we need to do. I'm using a venv, but sudo runs things as the root user, so python3 resolves to root's Python (which has no torch installed) instead of the one in my venv. Just do the same thing as before, but for python:

sudo $(which ncu) --set full -o output $(which python3) prof.py
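If you end up typing this a lot, the same trick fits in a tiny helper. This is only a sketch: profile_cmd is a name I made up, and it just prints the command (a dry run) with both binaries resolved by the current shell, so you can eyeball it before running anything as root. It assumes ncu and python3 are on your PATH with the venv activated.

```shell
# Hypothetical helper: print the sudo command with ncu and python3
# resolved to absolute paths by the current (non-root) shell.
profile_cmd() {
    ncu_bin=$(command -v ncu) || { echo "ncu not found on PATH" >&2; return 1; }
    py_bin=$(command -v python3) || { echo "python3 not found on PATH" >&2; return 1; }
    # Same flags as the command above; "$@" is the script and its arguments.
    echo sudo "$ncu_bin" --set full -o output "$py_bin" "$@"
}
```

Calling `profile_cmd prof.py` prints the full command; if it looks right, paste it or run it with `eval "$(profile_cmd prof.py)"`.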

life is good.


NOTE:

The alternatives people usually suggest are more heavy-handed, like:

modprobe nvidia NVreg_RestrictProfilingToAdminUsers=0

Or even writing a .conf in /etc/modprobe.d/ to persist this option... IMO you shouldn't do too much if you are just profiling on your personal machines.
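For reference, the persistent version would look roughly like the fragment below. The file name is my own placeholder (any .conf name in that directory should do), and as far as I know it only takes effect once the nvidia module is reloaded, which in practice usually means a reboot. Treat it as a sketch and check NVIDIA's ERR_NVGPUCTRPERM page for your distro.

```
# /etc/modprobe.d/nvidia-profiling.conf  (hypothetical file name)
# Lets non-admin users access the GPU performance counters.
options nvidia NVreg_RestrictProfilingToAdminUsers=0
```

This opens the counters to every user on the machine, which is part of why the sudo route feels like the lighter touch on a personal box.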

@itsdaniele

saved my life