@kinow
Last active February 8, 2020 08:48
Using containers to run Cylc

Document created for the Cylc meetup at NIWA Wellington in February 2020.

Docker

Some of the images mentioned here are available on Docker Hub, with the source code hosted on GitHub here and here.

Cylc Flow

Some time ago I published a Docker image that relied on an external Cylc Flow installation. That was harder to set up and run, so I reused parts of the image by Alan to create a new image that includes Cylc Flow 8.0a1.

Alpine Linux was used to keep the image small. The base image is python:3.7.6-alpine, which occupies 97.8MB. The final Cylc Flow 8.0a1 image occupies 140MB.

This image uses a user called cylc within the container. When you run the container, you will be doing so as cylc, regardless of your external user, and with no sudo access.

Running the container

$ docker run -ti --rm --name cylc-flow kinow/cylc-flow:8.0a1-alpine
Unable to find image 'kinow/cylc-flow:8.0a1-alpine' locally
8.0a1-alpine: Pulling from kinow/cylc-flow
e6b0cf9c0882: Pull complete 
da0e9bf0cc60: Pull complete 
8914a5766180: Pull complete 
e59c7f5578a9: Pull complete 
a4a511021c4d: Pull complete 
f012a8e72f51: Pull complete 
d9009ff8bd26: Pull complete 
aa61d85dbd96: Pull complete 
be10e7aace65: Pull complete 
b5e6966eb3d9: Pull complete 
Digest: sha256:77f38b739d72364f1b386d423933d498f0053d8e6f9b60ca6aa87c5570a75d7d
Status: Downloaded newer image for kinow/cylc-flow:8.0a1-alpine
bash-5.0$ cylc --version
8.0a1

Breaking down the command line we have:

  • docker run - docker is a container command line utility. Like cylc run, docker run will run a container
    • -ti - combined -t to allocate a pseudo-TTY and -i for interactive
    • --rm to remove the container after it has finished running
    • --name cylc-flow the container name, as it appears on docker ps or docker ps -a if not running and not removed
    • kinow/cylc-flow:8.0a1-alpine the image to run, in repository:tag form; it is downloaded if it is not available locally

The example above omits the optional command that can be given to the container after the image tag.

$ docker run -ti --rm --name cylc-flow kinow/cylc-flow:8.0a1-alpine ls
cylc-suites           docker-entrypoint.sh
$ docker run -ti --rm --name cylc-flow kinow/cylc-flow:8.0a1-alpine /usr/local/bin/cylc --version
8.0a1
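As a minimal sketch, the command line above can be wrapped in a small shell function that passes its arguments through as the container command. The function name cylc_container and the DOCKER and CYLC_IMAGE variables are assumptions made for this example; they are not part of Docker or Cylc.

```shell
# Assumption: DOCKER defaults to the docker CLI. Setting DOCKER=echo prints
# the arguments instead of starting a container, which is handy for checking.
DOCKER="${DOCKER:-docker}"
CYLC_IMAGE="kinow/cylc-flow:8.0a1-alpine"

# cylc_container is a hypothetical helper. Everything after the function
# name becomes the command that runs inside the container; with no
# arguments the image's default command is used.
cylc_container() {
    "$DOCKER" run -ti --rm --name cylc-flow "$CYLC_IMAGE" "$@"
}
```

With this in place, `cylc_container ls` and `cylc_container /usr/local/bin/cylc --version` correspond to the two commands shown above.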

Running a workflow inside a container

You can execute a workflow inside a container with the following command:

$ docker run -t -i --rm --name cylc-flow-hello kinow/cylc-flow:8.0a1-alpine cylc run hello --no-detach

It will create a temporary container for the workflow, and run the command you specified after the image tag.
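docker run exits with the exit status of the command that ran in the container, so the same command can be used from a script. The sketch below drops -t and -i, which are only needed for interactive use and can fail when no terminal is attached (for example under cron). The helper name run_hello is hypothetical, and the sketch assumes that cylc run returns a non-zero status when the workflow fails.

```shell
# Assumption: DOCKER defaults to the docker CLI and can be overridden.
DOCKER="${DOCKER:-docker}"

# run_hello is a hypothetical helper that runs the hello workflow in a
# temporary container and reports the outcome based on the exit status.
run_hello() {
    if "$DOCKER" run --rm --name cylc-flow-hello \
        kinow/cylc-flow:8.0a1-alpine cylc run hello --no-detach; then
        echo "workflow finished successfully"
    else
        # In the else branch, $? still holds the status of the docker command.
        status=$?
        echo "workflow failed with exit status $status" >&2
        return "$status"
    fi
}
```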

Volumes

After the container is removed, your data is gone. So if your suite produces any output files, you need to save them somewhere else. Volumes can be used for that.

$ mkdir -p workflow-output
$ docker run -ti --rm --name cylc-flow -v "$PWD/workflow-output:/home/cylc/cylc-run" kinow/cylc-flow:8.0a1-alpine
bash-5.0$ cylc register hello cylc-suites/hello/
REGISTERED hello -> /home/cylc/cylc-suites/hello
bash-5.0$ cylc run --no-detach hello
            ._.                                                       
            | |                 The Cylc Suite Engine [8.0a1]         
._____._. ._| |_____.           Copyright (C) 2008-2019 NIWA          
| .___| | | | | .___|   & British Crown (Met Office) & Contributors.  
| !___| !_! | | !___.  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
!_____!___. |_!_____!  This program comes with ABSOLUTELY NO WARRANTY.
      .___! |              It is free software, you are welcome to    
      !_____!             redistribute it under certain conditions;   
                        see `COPYING' in the Cylc source distribution. 
                                                                       
2020-02-04T20:43:01Z INFO - Suite server: url=tcp://132199d48c45:43090/ pid=13
2020-02-04T20:43:01Z INFO - Run: (re)start=0 log=1
2020-02-04T20:43:01Z INFO - Cylc version: 8.0a1
2020-02-04T20:43:01Z INFO - Run mode: live
2020-02-04T20:43:01Z INFO - Initial point: 1
2020-02-04T20:43:01Z INFO - Final point: 1
2020-02-04T20:43:01Z INFO - Cold Start 1
2020-02-04T20:43:01Z INFO - [Hello.1] -submit-num=01, owner@host=132199d48c45
2020-02-04T20:43:01Z INFO - [Hello.1] -triggered off []
2020-02-04T20:43:03Z INFO - [Hello.1] status=ready: (internal)submitted at 2020-02-04T20:43:03Z for job(01)
2020-02-04T20:43:03Z INFO - [Hello.1] -health check settings: submission timeout=None
2020-02-04T20:43:03Z INFO - [client-command] put_messages cylc@132199d48c45:cylc-message
2020-02-04T20:43:04Z INFO - [Hello.1] status=submitted: (received)started at 2020-02-04T20:43:03Z for job(01)
2020-02-04T20:43:04Z INFO - [Hello.1] -health check settings: execution timeout=None
2020-02-04T20:43:04Z INFO - [client-command] put_messages cylc@132199d48c45:cylc-message
2020-02-04T20:43:05Z INFO - [Hello.1] status=running: (received)succeeded at 2020-02-04T20:43:04Z for job(01)
2020-02-04T20:43:05Z INFO - Suite shutting down - AUTOMATIC
2020-02-04T20:43:06Z INFO - DONE
bash-5.0$ exit
$ tree workflow-output/
workflow-output/
└── hello
    ├── log
    │   ├── db
    │   ├── job
    │   │   └── 1
    │   │       └── Hello
    │   │           ├── 01
    │   │           │   ├── job
    │   │           │   ├── job-activity.log
    │   │           │   ├── job.err
    │   │           │   ├── job.out
    │   │           │   └── job.status
    │   │           └── NN -> 01
    │   ├── suite
    │   │   ├── log -> log.20200204T204301Z
    │   │   └── log.20200204T204301Z
    │   └── suiterc
    │       └── 20200204T204301Z-run.rc
    ├── share
    ├── suite.rc.processed
    └── work
        └── 1

12 directories, 10 files

We mounted the host folder workflow-output as a volume at /home/cylc/cylc-run/, the folder where Cylc Flow writes its run data. So when the workflow finished running and the container was removed, the data was still kept on the host computer.
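The two steps above (create the host folder, then mount it) can be combined in one helper. This is only a sketch: the function name run_with_output and the DOCKER variable are assumptions for the example, while the container path /home/cylc/cylc-run is the one used above.

```shell
# Assumption: DOCKER defaults to the docker CLI and can be overridden.
DOCKER="${DOCKER:-docker}"

# run_with_output [DIR]: hypothetical helper that creates DIR if needed
# (default: workflow-output) and mounts it at /home/cylc/cylc-run.
run_with_output() {
    out_dir="${1:-workflow-output}"
    mkdir -p "$out_dir"
    # A host directory mounted with -v must be given as an absolute path.
    abs_dir="$(cd "$out_dir" && pwd)"
    "$DOCKER" run -ti --rm --name cylc-flow \
        -v "$abs_dir:/home/cylc/cylc-run" \
        kinow/cylc-flow:8.0a1-alpine
}
```

Calling `run_with_output results` opens the same bash prompt as before, with the run data ending up in ./results on the host.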

Cylc UI Server

The Cylc UI Server image contains cylc-flow and cylc-uiserver, installed via pip on Alpine. jupyterhub and configurable-http-proxy are also installed.

Alpine has some issues with PAM, so the image currently uses the Dummy Authenticator, which accepts any password. It is therefore not recommended for anything but testing and development.

Running Cylc UI Server in a Docker container

You can start the UI Server with the following command:

docker run --name cylc-uiserver --rm -d -p 8000:8000 kinow/cylc-uiserver:0.1-alpine

Then access http://localhost:8000, and log in with the user cylc and any password.

That will initialize JupyterHub and the Configurable HTTP Proxy, and will probably also spawn the cylc-uiserver process, pointing to version 0.1 of the Cylc UI static files.
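Because the container is started in the background, the docker command returns before the services inside it are ready. A minimal sketch of a readiness check with curl follows. The helper name wait_for_hub and the CURL variable are hypothetical, and the sketch assumes JupyterHub serves its login page at the default /hub/login path.

```shell
# Assumption: CURL defaults to the curl CLI and can be overridden.
CURL="${CURL:-curl}"

# wait_for_hub [URL] [TRIES]: hypothetical helper that polls URL once per
# second until it answers, or gives up after TRIES attempts (default 30).
wait_for_hub() {
    url="${1:-http://localhost:8000/hub/login}"
    tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        # -f: fail on HTTP errors, -s: silent, -o /dev/null: discard the body
        if "$CURL" -fs -o /dev/null "$url"; then
            echo "JupyterHub is answering at $url"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "gave up waiting for $url" >&2
    return 1
}
```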

Let's break down the command line.

  • docker run - tell the Docker command line to run a container
    • --name cylc-uiserver - give it the name cylc-uiserver so that it appears on docker ps or docker ps -a (if not removed)
    • --rm - remove the container once its execution is over
    • -d - run it detached, in the background
    • -p 8000:8000 - bind port 8000 on the host machine to port 8000 in the container (if another process is already listening on 8000 on the host, we could use 8080:8000 instead, and the URL would then be http://localhost:8080)
    • kinow/cylc-uiserver:0.1-alpine - the image tag
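The port mapping described above can be made configurable with a small wrapper. This is a sketch only: start_uiserver and the DOCKER variable are names invented for the example. The container side of the mapping stays at 8000, and only the host side changes.

```shell
# Assumption: DOCKER defaults to the docker CLI and can be overridden.
DOCKER="${DOCKER:-docker}"

# start_uiserver [HOST_PORT]: hypothetical helper that starts the UI Server
# in the background and prints the URL for the chosen host port.
start_uiserver() {
    host_port="${1:-8000}"
    "$DOCKER" run --name cylc-uiserver --rm -d \
        -p "$host_port:8000" kinow/cylc-uiserver:0.1-alpine &&
        echo "UI Server should be reachable at http://localhost:$host_port"
}
```

For example, `start_uiserver 8080` maps host port 8080 to container port 8000. When it is no longer needed, `docker stop cylc-uiserver` stops the container, and --rm then removes it.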

The container has three workflows installed: five, families2, and complex. You can start any of them with a single command, as shown in the next section.

Running a workflow inside the Cylc UI Server container

With the Cylc UI Server running, we probably want to have at least one workflow running to see something in the Web UI.

We can start the workflow five, for example, with the following command.

$ docker exec cylc-uiserver cylc run five
            ._.                                                       
            | |                 The Cylc Suite Engine [8.0a1]         
._____._. ._| |_____.           Copyright (C) 2008-2019 NIWA          
| .___| | | | | .___|   & British Crown (Met Office) & Contributors.  
| !___| !_! | | !___.  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
!_____!___. |_!_____!  This program comes with ABSOLUTELY NO WARRANTY.
      .___! |              It is free software, you are welcome to    
      !_____!             redistribute it under certain conditions;   
                        see `COPYING' in the Cylc source distribution. 
                                                                       

*** listening on tcp://80a5d0fb08f0:43058/ ***

To view suite server program contact information:
 $ cylc get-suite-contact five

Other ways to see if the suite is still running:
 $ cylc scan -n 'five' 80a5d0fb08f0
 $ cylc ping -v --host=80a5d0fb08f0 five
 $ ps -opid,args 32  # on 80a5d0fb08f0

You should now be able to confirm that the workflow is running inside the container.

$ docker exec cylc-uiserver cylc scan
five cylc@80a5d0fb08f0:43058
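Since cylc run returns before the workflow is necessarily visible, a script may want to poll cylc scan until the workflow shows up. The sketch below does that. The helper name wait_for_workflow and the DOCKER variable are hypothetical, and the sketch assumes the scan output keeps the "name user@host:port" layout shown above.

```shell
# Assumption: DOCKER defaults to the docker CLI and can be overridden.
DOCKER="${DOCKER:-docker}"

# wait_for_workflow NAME [TRIES]: hypothetical helper that runs cylc scan
# inside the cylc-uiserver container once per second until a line starting
# with NAME appears, or gives up after TRIES attempts (default 30).
wait_for_workflow() {
    name="$1"
    tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        if "$DOCKER" exec cylc-uiserver cylc scan | grep -q "^$name "; then
            echo "$name is running"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "$name did not show up in cylc scan" >&2
    return 1
}
```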

And use the browser to visualize its tree view.


Security

I use Docker and the docker command line utility to run containers, docker-compose for local orchestration, and docker-credential-secretservice to avoid plain text passwords when using docker login.

The containers I use normally run as a dedicated user inside the container, a practice that most communities are now adopting. This includes the Cylc containers I am creating.

I run the Docker daemon as the root user. This is considered unsafe by some, but I have other services running as root too, and I use a virtual machine for Docker, where I don't have anything really sensitive.

However, the official documentation includes instructions on how to run the Docker daemon without the root user.

Furthermore, if you use a cloud provider like Azure, GCP, or AWS, these normally have extra levels of security available (e.g. AWS KMS).
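One quick check related to the points above is to ask an image which user its commands run as. The sketch below is an assumption-laden example: container_user, warn_if_root and the DOCKER variable are hypothetical names, and it relies on the image providing a whoami command (as BusyBox-based Alpine images normally do).

```shell
# Assumption: DOCKER defaults to the docker CLI and can be overridden.
DOCKER="${DOCKER:-docker}"

# container_user IMAGE: hypothetical helper printing the name of the user
# that commands run as inside IMAGE.
container_user() {
    "$DOCKER" run --rm "$1" whoami
}

# warn_if_root IMAGE: hypothetical helper that fails with a warning when
# the image runs its commands as root.
warn_if_root() {
    user="$(container_user "$1")"
    if [ "$user" = "root" ]; then
        echo "warning: $1 runs as root" >&2
        return 1
    fi
    echo "$1 runs as $user"
}
```

Based on the description of the Cylc Flow image earlier in this document, `warn_if_root kinow/cylc-flow:8.0a1-alpine` would be expected to report the cylc user.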

Singularity

Singularity does not need the root user to run containers. It requires sudo only to create the image.

It uses an input file similar to a Dockerfile, and it can import from Docker images. However, a Singularity image does not use a layered file system, so it takes up slightly more disk space.

$ sudo singularity build cylc-7.8.1.simg Singularity-cylc-7.8.1

The file produced after running this command contains Cylc Flow and all the required dependencies inside it. You can run it as you would normally run Cylc Flow.

$ ./cylc-7.8.1.simg check-software

And you can access a shell in the container with:

$ singularity shell cylc-7.8.1.simg
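Executing the image file directly is equivalent to singularity run, which invokes the image's runscript. As a sketch, this can be wrapped so that the image behaves like a local cylc command. The function name cylc_simg and the SINGULARITY variable are assumptions for this example, as is the image being in the current directory.

```shell
# Assumption: SINGULARITY defaults to the singularity CLI and can be
# overridden (e.g. SINGULARITY=echo to preview the command).
SINGULARITY="${SINGULARITY:-singularity}"

# cylc_simg is a hypothetical helper that passes its arguments to the
# runscript of the image built above.
cylc_simg() {
    "$SINGULARITY" run cylc-7.8.1.simg "$@"
}
```

With this, `cylc_simg check-software` corresponds to `./cylc-7.8.1.simg check-software`.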

Note: as I don't use Singularity for my development workflow, and also don't have any environment running containers with Singularity, there may be new features or changes in Singularity that I am not aware of.

What I wrote above is from memory, from about one year ago. I have never published the images to their equivalent of Docker Hub because it requires linking my GitHub account and giving the Singularity servers write access to my user data (which could have changed).

This section is therefore much shorter, but it shows that Singularity can give us much the same as Docker, minus a few parts such as the layered file system, the separate network, the daemon, and Docker Compose.


kinow commented Feb 5, 2020

I did not have enough time to create containers for the current versions in the master branches, but I will try to work on something by next week.

With the example above, users should be able to run Cylc Flow and Cylc UI Server on any platform with Docker.


kinow commented Feb 8, 2020

Example of other workflow engines using containers:
