Last active
July 8, 2023 14:55
martindurant (Author) commented Apr 4, 2021 via email
Worth finding out what is getting measured by the worker.
…On April 4, 2021 9:56:02 AM EDT, alexis-intellegens ***@***.***> wrote:
How much memory should this consume? The array is around 4 GB; if it
weren't duplicated, my workers wouldn't crash, right?
If Dask asks each worker for the number of bytes held by a variable (something like sys.getsizeof), then the array will appear to count toward the memory footprint of every process, even though it isn't actually duplicated. I'm not sure how psutil's memory_info, which is what the nanny watches, handles shared memory, and I'm not certain how to tell the difference from what it reports. Since this would be a new way of doing things, it's worth finding out. Something similar must happen when multiple Dask processes share one GPU.
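To make the double-counting concern concrete, here is a minimal sketch using only the standard library's multiprocessing.shared_memory. It is not how Dask or the nanny measures memory; the helper naive_footprint and the 1 MiB block (a stand-in for the ~4 GB array) are assumptions made up for illustration. Two handles attach to the same block, and summing what each handle reports gives twice the memory that really exists.

```python
from multiprocessing import shared_memory

N_BYTES = 1024 * 1024  # 1 MiB stand-in for the ~4 GB array (illustrative only)


def naive_footprint(handles):
    """Hypothetical per-holder accounting: add up what each holder reports."""
    return sum(h.size for h in handles)


# One block of shared memory, created once.
owner = shared_memory.SharedMemory(create=True, size=N_BYTES)
try:
    # A second handle to the same block, attached by name, as another
    # process would do. No data is copied here.
    reader = shared_memory.SharedMemory(name=owner.name)
    try:
        owner.buf[0] = 42
        same_memory = reader.buf[0] == 42  # the write is visible via the other handle
        reported = naive_footprint([owner, reader])
        actual = owner.size
    finally:
        reader.close()
finally:
    owner.close()
    owner.unlink()  # free the block once every handle is closed

print(f"same memory: {same_memory}, reported: {reported}, actual: {actual}")
```

If the numbers the workers report look like `reported` rather than `actual`, the crashes could come from the accounting and not from real duplication, which is the thing to check.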
I see what you're saying. That could be happening. Are you able to reproduce the out-of-memory errors with code similar to mine?
I haven't yet tried, sorry