- Install all required software: docker, nvidia-docker, gitlab-ci-multi-runner
- Execute: curl -s http://localhost:3476/docker/cli
- Use that data to fill the devices/volumes/volume_driver fields in /etc/gitlab-runner/config.toml (a sample of the output is shown below)
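On a host where nvidia-docker-plugin is running, that call prints the extra docker CLI arguments for the machine. The exact output depends on the host; the driver version and the number of /dev/nvidiaN devices below are only an example:

curl -s http://localhost:3476/docker/cli
--volume-driver=nvidia-docker --volume=nvidia_driver_384.81:/usr/local/nvidia:ro --device=/dev/nvidiactl --device=/dev/nvidia-uvm --device=/dev/nvidia0 --device=/dev/nvidia1

Each --device entry goes into devices, the driver volume goes into volumes, and --volume-driver becomes volume_driver in the runner configuration below.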
concurrent = 1
check_interval = 0

[[runners]]
name = "Docker runner <---complete-me--->"
url = "https://<---complete-me---->"
token = "28ce17edc8ea7437f3e49969c86341"
executor = "docker"
[runners.docker]
tls_verify = false
image = "nvidia/cuda"
privileged = false
disable_cache = false
devices = ["/dev/nvidiactl", "/dev/nvidia-uvm", "/dev/nvidia-uvm-tools", "/dev/nvidia3", "/dev/nvidia2", "/dev/nvidia1", "/dev/nvidia0"]
volumes = ["/cache", "nvidia_driver_384.81:/usr/local/nvidia:ro"]
volume_driver = "nvidia-docker"
shm_size = 0
[runners.cache]
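As a sanity check, the same devices, volume and volume driver can be tried with a plain docker run before wiring them into the runner. The driver volume name and device list below are illustrative and must match the curl output on your host:

docker run --rm \
  --volume-driver=nvidia-docker \
  --volume=nvidia_driver_384.81:/usr/local/nvidia:ro \
  --device=/dev/nvidiactl --device=/dev/nvidia-uvm --device=/dev/nvidia0 \
  nvidia/cuda nvidia-smi

If nvidia-smi lists the expected GPUs here, the runner configured as above should see them as well.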
In the [[runners]] section there's an environment keyword to define environment variables. But I guess that it won't work, because you have to pass that environment variable to docker itself.
So the only way I see is to specify NVIDIA_VISIBLE_DEVICES directly in the Dockerfile: https://github.com/NVIDIA/nvidia-docker/wiki/Usage#dockerfiles
It seems that environment in the [[runners]] section is exactly what we were looking for.
Actually, any way of setting the environment variable before the script section of the .gitlab-ci.yml configuration file runs is fine. See the following two examples: both of them worked for me.
Example 1: using gitlab-runner configuration only
In /etc/gitlab-runner/config.toml:
[[runners]]
name = "runner-gpu0-test"
url = "<url>"
token = "<token>"
executor = "docker"
environment = ["NVIDIA_VISIBLE_DEVICES=0"] # <== Notice this
[runners.docker]
runtime = "nvidia" # <== Notice this
tls_verify = false
image = "nvidia/cuda:9.0-base"
privileged = false
disable_entrypoint_overwrite = false
oom_kill_disable = false
disable_cache = false
volumes = ["/cache"]
shm_size = 0
[runners.cache]
[runners.cache.s3]
[runners.cache.gcs]
[[runners]]
name = "runner-gpu1-test"
url = "<url>"
token = "<token>"
executor = "docker"
environment = ["NVIDIA_VISIBLE_DEVICES=1"] # <== Notice this
[runners.docker]
runtime = "nvidia" # <== Notice this
tls_verify = false
image = "nvidia/cuda:9.0-base"
privileged = false
disable_entrypoint_overwrite = false
oom_kill_disable = false
disable_cache = false
volumes = ["/cache"]
shm_size = 0
[runners.cache]
[runners.cache.s3]
[runners.cache.gcs]
The .gitlab-ci.yml file:
image: nvidia/cuda:9.0-base

test:run_on_gpu0:
  stage: test
  script:
    - echo NVIDIA_VISIBLE_DEVICES=${NVIDIA_VISIBLE_DEVICES}
    - nvidia-smi
    - sleep 10s
  tags:
    - docker
    - gpu0

test:run_on_gpu1:
  stage: test
  script:
    - echo NVIDIA_VISIBLE_DEVICES=${NVIDIA_VISIBLE_DEVICES}
    - nvidia-smi
    - sleep 7s
  tags:
    - docker
    - gpu1
The two runners have been tagged with docker, gpu0 and docker, gpu1 respectively.
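For completeness, the tags come from runner registration. A sketch of how such a runner might be registered non-interactively (URL, token and description are placeholders; the binary may still be called gitlab-ci-multi-runner on older installs):

sudo gitlab-runner register \
  --non-interactive \
  --url "https://<gitlab-host>/" \
  --registration-token "<token>" \
  --executor "docker" \
  --docker-image "nvidia/cuda:9.0-base" \
  --description "runner-gpu0-test" \
  --tag-list "docker,gpu0"

After registering, add the environment and runtime lines shown above to the generated [[runners]] entry in /etc/gitlab-runner/config.toml, and repeat with --tag-list "docker,gpu1" for the second runner.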
Example 2: using GitLab CI custom environment variables
/etc/gitlab-runner/config.toml: same as Example 1.
The .gitlab-ci.yml file:
image: nvidia/cuda:9.0-base

variables:
  NVIDIA_VISIBLE_DEVICES: "3" # This is going to override definition(s) in /etc/gitlab-runner/config.toml

test:run_on_gpu0:
  stage: test
  script:
    - echo NVIDIA_VISIBLE_DEVICES=${NVIDIA_VISIBLE_DEVICES}
    - nvidia-smi
    - sleep 10s
  tags:
    - docker
    - gpu0

test:run_on_gpu1:
  stage: test
  script:
    - echo NVIDIA_VISIBLE_DEVICES=${NVIDIA_VISIBLE_DEVICES}
    - nvidia-smi
    - sleep 7s
  tags:
    - docker
    - gpu1
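Note that with a global variables: block like the one above, both jobs end up on GPU 3. If different jobs should target different GPUs while still overriding the runner configuration, GitLab CI also accepts a variables: block at job level, which takes precedence over the global one. A minimal sketch (job name and GPU id are illustrative):

test:run_on_gpu0:
  stage: test
  variables:
    NVIDIA_VISIBLE_DEVICES: "0" # job-level value overrides global and runner definitions
  script:
    - nvidia-smi
  tags:
    - docker
    - gpu0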
Do you guys know how to make it work with Docker v19.03.2, which integrates native support for NVIDIA GPUs?
The runtime = "nvidia" setting does not work anymore; containers should now be executed with the --gpus flag:
docker run -it --rm --gpus all ubuntu nvidia-smi
It is an open issue and, looking at the comments, it does not seem likely to be fixed soon.
I am using Docker 19.03 together with nvidia-docker2. This provides the new --gpus switch while keeping compatibility with the old --runtime switch (refer to https://github.com/NVIDIA/nvidia-docker/tree/master#upgrading-with-nvidia-docker2-deprecated).
The following config.toml provides GPU support (notice the runtime parameter). Yet, it is not clear to me how to restrict the GPUs assigned to the runner on a multi-GPU server. This functionality is called "GPU isolation".
The docker run command for GPU isolation follows; please notice the -e NVIDIA_VISIBLE_DEVICES=0. How can this be set for the runner in config.toml?
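Not an authoritative answer, but based on Example 1 above, the equivalent of -e NVIDIA_VISIBLE_DEVICES=0 should be achievable by letting the runner inject the variable via the environment keyword, as long as the nvidia runtime is still available through nvidia-docker2. A sketch (name, url and token are placeholders):

[[runners]]
  name = "runner-gpu0"
  url = "<url>"
  token = "<token>"
  executor = "docker"
  environment = ["NVIDIA_VISIBLE_DEVICES=0"] # same effect as -e NVIDIA_VISIBLE_DEVICES=0
  [runners.docker]
    runtime = "nvidia" # kept available by nvidia-docker2 under Docker 19.03
    image = "nvidia/cuda:9.0-base"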