The RAPIDS cudf.pandas accelerator lets you use NVIDIA GPU acceleration in your pandas workflows. Scripts that use pandas can be run via the cudf.pandas module to accelerate your code with zero code changes.
python my_code.py # Uses the CPU
python -m cudf.pandas my_code.py # Same pandas code uses the GPU
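When changing the launch command is not convenient, cudf.pandas can also be enabled from inside Python by calling cudf.pandas.install() before pandas is first imported. The sketch below assumes cudf may or may not be available and falls back to ordinary CPU pandas when it is not. The enable_gpu_pandas helper and the tiny example dataframe are hypothetical, chosen only for illustration.

```python
def enable_gpu_pandas() -> bool:
    """Try to switch pandas over to the cudf.pandas accelerator.

    Returns True if the accelerator was installed, False if we fall back
    to CPU pandas. Call this before the first `import pandas`.
    """
    try:
        import cudf.pandas  # only importable in a RAPIDS environment

        cudf.pandas.install()
        return True
    except Exception:
        # cudf is not installed, or no usable GPU was found (assumed fallback policy)
        return False


accelerated = enable_gpu_pandas()

import pandas as pd  # imported after install() so the accelerator can take effect

# Hypothetical data, just to show that the pandas code itself is unchanged
df = pd.DataFrame({"state": ["NY", "NJ", "NY"], "fine": [50, 65, 115]})
totals = df.groupby("state")["fine"].sum()

print(f"GPU accelerated: {accelerated}")
print(totals)
```

The pandas code after the import is identical in both cases, which is the point of the accelerator: only the setup differs.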
But what if you don't have a GPU? That's where Coiled comes in. With the coiled run tool you can execute scripts from your local machine on a cloud VM with whatever hardware you choose. You are billed only for what you use, and the VM shuts down again when the script completes.
coiled run python my_code.py # Boots a VM on the cloud, runs the script, then shuts down again
We can tie these two tools together neatly to GPU accelerate your code on the cloud without having to change your code or where it lives.
To demonstrate this, let's run the attached pandas code, which loads some data and performs some standard dataframe operations including join, groupby, sort_values and count.
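The attached script is not reproduced here. As a rough idea of its shape, the following is a minimal sketch of a script that times the same kinds of operations and prints results in the same "took: N seconds" style. Everything in it is an assumption for illustration: the synthetic data, the column names (state, vehicle_type, issue_date), the regions lookup table, and the timed and make_violations helpers. The real script loads a dataset rather than generating one.

```python
import time
from contextlib import contextmanager

import numpy as np
import pandas as pd


@contextmanager
def timed(label: str):
    """Print how long the enclosed block took."""
    start = time.perf_counter()
    yield
    print(f"{label} took: {time.perf_counter() - start:.3f} seconds")


def make_violations(n_rows: int, seed: int = 0) -> pd.DataFrame:
    """Build synthetic violation records (hypothetical stand-in for real data)."""
    rng = np.random.default_rng(seed)
    day_offsets = pd.to_timedelta(rng.integers(0, 365, size=n_rows), unit="D")
    return pd.DataFrame(
        {
            "state": rng.choice(["NY", "NJ", "PA", "CT"], size=n_rows),
            "vehicle_type": rng.choice(["SUBN", "4DSD", "VAN"], size=n_rows),
            "issue_date": pd.Timestamp("2022-01-01") + day_offsets,
        }
    )


violations = make_violations(100_000)

# Hypothetical lookup table, used only to exercise a join
regions = pd.DataFrame(
    {"state": ["NY", "NJ", "PA", "CT"], "region": ["Mid-Atlantic"] * 3 + ["New England"]}
)

with timed("Calculate violations by state"):
    joined = violations.merge(regions, on="state", how="left")
    by_state = joined.groupby("state")["region"].count().sort_values(ascending=False)

with timed("Calculate violations by vehicle type"):
    by_vehicle = violations.groupby("vehicle_type")["state"].count()

with timed("Calculate violations by day of week"):
    weekday = violations["issue_date"].dt.day_name()
    by_weekday = violations.groupby(weekday)["state"].count()
```

Because the script uses only the regular pandas API, the same file can be run with python or with python -m cudf.pandas without modification.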
$ coiled run --gpu --name rapids-demo --keepalive 5m --container nvcr.io/nvidia/rapidsai/base:24.10-cuda12.5-py3.12 -- python cudf_pandas_coiled_demo.py
╭────────── Running python -m cudf.pandas cudf_pandas_coiled_demo.py ──────────╮
│ │
│ Details: https://cloud.coiled.io/clusters/xxxxxx?account=xxxxxxxx │
│ │
│ Ready ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ │
│ │
│ Environment: │
│ base_24_10-cuda12_5-py3_12-x86_64-xxxxxx │
│ Region: us-east-1 Uptime: 3m 19s │
│ VM Type: g4dn.xlarge Approx cloud cost: $0.53/hr │
│ │
╰──────────────────────────────────────────────────────────────────────────────╯
Output
------
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.download.nvidia.com/licenses/NVIDIA_Deep_Learning_Container_License.pdf
Calculate violations by state took: 25.128 seconds
Calculate violations by vehicle type took: 7.279 seconds
Calculate violations by day of week took: 22.253 seconds
In our coiled run command we specified --gpu to select a GPU VM type. This selected a g4dn.xlarge on AWS, but we could also specify the VM type manually.
We set --keepalive 5m to tell Coiled to keep our VM around for 5 minutes after the script completes. This makes it easy to reuse the VM by running another script.
We also explicitly specified the latest RAPIDS container with --container nvcr.io/nvidia/rapidsai/base:24.10-cuda12.5-py3.12. By default Coiled will sync your local software environment to the remote machine, but in this case we want a GPU software environment rather than our local one.
The first time we run this script it takes a couple of minutes to boot up the VM. After that, we can see that our pandas computations run in just under a minute (about 55 seconds in total).
We can run this script as many more times as we like in the next 5 minutes and it will reuse that same VM.
Next let's add the -m cudf.pandas flag to the python command to tell pandas to use the GPU.
$ coiled run --gpu --name rapids-demo --keepalive 5m --container nvcr.io/nvidia/rapidsai/base:24.10-cuda12.5-py3.12 -- python -m cudf.pandas cudf_pandas_coiled_demo.py
╭────────── Running python -m cudf.pandas cudf_pandas_coiled_demo.py ──────────╮
│ │
│ Details: https://cloud.coiled.io/clusters/xxxxxx?account=xxxxxxxx │
│ │
│ Ready ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ │
│ │
│ Environment: │
│ base_24_10-cuda12_5-py3_12-x86_64-xxxxxx │
│ Region: us-east-1 Uptime: 8m 55s │
│ VM Type: g4dn.xlarge Approx cloud cost: $0.53/hr │
│ │
╰──────────────────────────────────────────────────────────────────────────────╯
Output
------
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.download.nvidia.com/licenses/NVIDIA_Deep_Learning_Container_License.pdf
Calculate violations by state took: 3.470 seconds
Calculate violations by vehicle type took: 0.145 seconds
Calculate violations by day of week took: 1.238 seconds
This time we can see that our code took less than 5 seconds to run!
coiled run --gpu ... -- python cudf_pandas_coiled_demo.py # ~55 seconds of computation
coiled run --gpu ... -- python -m cudf.pandas cudf_pandas_coiled_demo.py # ~5 seconds of computation
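To put numbers on the comparison, the per-step timings printed by the two runs can be turned into speedup factors. The short calculation below uses only the six timings reported in the output above; the dictionary keys and variable names are illustrative.

```python
# Timings (in seconds) copied from the two runs shown above
cpu_seconds = {
    "violations by state": 25.128,
    "violations by vehicle type": 7.279,
    "violations by day of week": 22.253,
}
gpu_seconds = {
    "violations by state": 3.470,
    "violations by vehicle type": 0.145,
    "violations by day of week": 1.238,
}

# Speedup per step is CPU time divided by GPU time
speedups = {step: cpu_seconds[step] / gpu_seconds[step] for step in cpu_seconds}

cpu_total = sum(cpu_seconds.values())
gpu_total = sum(gpu_seconds.values())
total_speedup = cpu_total / gpu_total

for step, factor in speedups.items():
    print(f"{step}: {factor:.1f}x faster")
print(f"total: {cpu_total:.3f}s -> {gpu_total:.3f}s ({total_speedup:.1f}x faster)")
```

The totals come to roughly 54.7 seconds without cudf.pandas and 4.9 seconds with it, an overall speedup of about 11x. The individual steps vary widely, from about 7x for the first step to about 50x for the second. These figures come from a single run of each command, so they are indicative rather than a benchmark.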