Python command line utility for working with Cromwell in DSDE
Tested with Python 3, might work with Python 2
python setup.py install
Make sure Python 3.4+ is installed. From the root of this repository:
$ pyvenv-3.5 venv
$ source venv/bin/activate
$ python setup.py develop
It's a good idea to now symbolically link ./venv/bin/cromtool into a directory in your PATH. I typically link it into ~/bin. Any time you modify the contents of this directory, the cromtool executable will be immediately affected.
Run cromtool to initialize directories and files:
$ cromtool
Initializing /Users/sfrazer/.cromtool...
Initializing /Users/sfrazer/.cromtool/build...
Initializing /Users/sfrazer/.cromtool/tmp...
Writing default config to /Users/sfrazer/.cromtool/config
======================================
A default configuration file has been written to /Users/sfrazer/.cromtool/config
Please modify this file and put in values for username and passwords
======================================
Below is a description of each sub-command to cromtool
Manage a list of Cromwell servers. These will be referenced by name
$ cromtool server ls
|Name        |URL                                                 |
|------------|----------------------------------------------------|
|dsde-staging|https://cromwell.dsde-staging.broadinstitute.org:443|
|local       |http://localhost:8000                               |
|dsde-dev    |https://cromwell.dsde-dev.broadinstitute.org:443    |
$ cromtool server add local-80 http://localhost
$ cromtool server ls
|Name        |URL                                                 |
|------------|----------------------------------------------------|
|dsde-dev    |https://cromwell.dsde-dev.broadinstitute.org:443    |
|dsde-staging|https://cromwell.dsde-staging.broadinstitute.org:443|
|local       |http://localhost:8000                               |
|local-80    |http://localhost:80                                 |
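In the listing above, the URL added as http://localhost is shown as http://localhost:80, which suggests the port is made explicit when a server is stored. A minimal sketch of that normalization follows. The function name with_explicit_port and the DEFAULT_PORTS table are hypothetical and not taken from the cromtool source, and this version drops any path component.

```python
from urllib.parse import urlsplit

# Assumed defaults, matching the :80 and :443 values visible in the tables above.
DEFAULT_PORTS = {"http": 80, "https": 443}


def with_explicit_port(url):
    """Return url as scheme://host:port, filling in the scheme's default port.

    Hypothetical helper for illustration only; any path or query is discarded.
    """
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return "{}://{}:{}".format(parts.scheme, parts.hostname, port)


if __name__ == "__main__":
    print(with_explicit_port("http://localhost"))
    print(with_explicit_port("http://localhost:8000"))
```

A URL that already names a port is returned with that port unchanged.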
Acquire a Google Access Token:
$ cromtool access-token
ya29._wEWJqH4_c402W621this_is_super_secret_dont_tell_anybody_about_it
Acquire a Google Refresh Token:
$ cromtool refresh-token
1/another_secret_string
Returns a table of all VMs that are running JES jobs. This data is pieced together from the output of a gcloud compute instances list command.
The last column gives a gcloud compute ssh command that will allow SSH access to the VM where the job is running. Note that the actual process is running in a Docker container on that VM.
It is recommended that you specify --project explicitly. Otherwise, the default project that gcloud is currently configured for (see the output of gcloud info) is used.
Example usage:
$ cromtool jes-instances --project=broad-dsde-dev
> gcloud compute --project=broad-dsde-dev instances list --format json
|Name |Pipeline ID |Run ID |Machine Type |Status |gcloud |
|------------------------|--------------------|--------------------------------------------------------|-------------|-------|----------------------------------------------------------------|
|ggp-10444405266723971505|15579851035708136685|EJSbz4KvKhixw7Sqjej--JABIMO73rS7FyoMc3RhZ2luZ1F1ZXVl |n1-standard-1|RUNNING|gcloud compute ssh --zone=us-central1-a ggp-10444405266723971505|
|ggp-15855547600557060489|9083337551142404666 |EOe-6pe1KhiJ44CUlZGOhdwBIMO73rS7FyoPcHJvZHVjdGlvblF1ZXVl|n1-standard-1|RUNNING|gcloud compute ssh --zone=us-central1-a ggp-15855547600557060489|
|ggp-17029143444771312572|7915637536437851425 |EP2kgLm1Khi8p5D20ffqqewBIMO73rS7FyoPcHJvZHVjdGlvblF1ZXVl|n1-standard-1|RUNNING|gcloud compute ssh --zone=us-central1-a ggp-17029143444771312572|
|ggp-1771482711520258106 |16475625741508474430|EMytz9e3Khi60KSPv8Tkyhggw7vetLsXKg9wcm9kdWN0aW9uUXVldWU |n1-standard-1|RUNNING|gcloud compute ssh --zone=us-central1-a ggp-1771482711520258106 |
|ggp-3157797711739161870 |10910293093633253841|EJrvtpyzKhiOmoSGnImw6Ssgw7vetLsXKg9wcm9kdWN0aW9uUXVldWU |n1-standard-1|RUNNING|gcloud compute ssh --zone=us-central1-a ggp-3157797711739161870 |
|ggp-7010617938722704116 |17059662677337818778|EJPWn-qzKhj0lbGs8futpWEgw7vetLsXKg9wcm9kdWN0aW9uUXVldWU |n1-standard-1|RUNNING|gcloud compute ssh --zone=us-central1-a ggp-7010617938722704116 |
|ggp-8015122898931865014 |7747849907777471718 |EOfS-a24Khi2q7uv-PnbnW8gw7vetLsXKg9wcm9kdWN0aW9uUXVldWU |n1-standard-1|RUNNING|gcloud compute ssh --zone=us-central1-a ggp-8015122898931865014 |
|ggp-8473594771888441731 |968724465190265928 |EN7rpNm5KhiDk5LozLGQzHUgw7vetLsXKg9wcm9kdWN0aW9uUXVldWU |n1-standard-1|RUNNING|gcloud compute ssh --zone=us-central1-a ggp-8473594771888441731 |
|ggp-9284367781915948616 |954632328327854207 |EL2CrKqvKhjI7Lmp6ems7IABIMO73rS7FyoPcHJvZHVjdGlvblF1ZXVl|n1-standard-1|RUNNING|gcloud compute ssh --zone=us-central1-a ggp-9284367781915948616 |
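The command in the last column can be assembled from the instance name and zone alone. The sketch below shows one way to do that, with an optional project placed the way the jes-job output places it. The function name ssh_command is hypothetical and is not taken from the cromtool source.

```python
import shlex


def ssh_command(name, zone, project=None):
    """Build a gcloud compute ssh command line for one instance.

    Hypothetical helper; the argument order mirrors the commands printed
    in this README.
    """
    cmd = ["gcloud", "compute"]
    if project:
        cmd.append("--project=" + project)
    cmd += ["ssh", "--zone=" + zone, name]
    # Quote each part so the result is safe to paste into a shell.
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    print(ssh_command("ggp-8473594771888441731", "us-central1-a"))
    print(ssh_command("ggp-8473594771888441731", "us-central1-a",
                      project="broad-dsde-dev"))
```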
The jes-job subcommand returns detailed information about a GCE virtual machine and the Docker container running on it.
jes-job needs an operation ID (the value in the Run ID column of the jes-instances output) and a project. It is recommended to specify --project, but the default configured for gcloud will be used if one is not provided.
In the example below, the output includes the gcloud compute ssh command for connecting to the VM and the docker exec command to run, once connected, to get a shell in the container running the job.
On top of that, jes-job will print disk usage statistics (output of df -h) as well as a list of files that have been localized (via tree -h /mnt/local-disk on the GCE VM).
$ cromtool jes-job --project=broad-dsde-dev EN7rpNm5KhiDk5LozLGQzHUgw7vetLsXKg9wcm9kdWN0aW9uUXVldWU
> gcloud compute --project=broad-dsde-dev instances list --format json
Pipeline ID: 968724465190265928
Run ID: EN7rpNm5KhiDk5LozLGQzHUgw7vetLsXKg9wcm9kdWN0aW9uUXVldWU
SSH: gcloud compute --project=broad-dsde-dev ssh --zone=us-central1-a ggp-8473594771888441731
> gcloud compute --project=broad-dsde-dev ssh --zone=us-central1-a ggp-8473594771888441731 sudo "docker ps"
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
95a875c07bb7 broadgdac/tool_gistic2:141 "dumb-init /tmp/ggp-9" 17 hours ago Up 17 hours backstabbing_hodgkin
Docker exec: sudo docker exec -t -i 95a875c07bb7 bash -l
> gcloud compute --project=broad-dsde-dev ssh --zone=us-central1-a ggp-8473594771888441731 "sudo df -h"
Warning: Permanently added '104.197.167.190' (RSA) to the list of known hosts.
Filesystem Size Used Avail Use% Mounted on
rootfs 9.8G 4.2G 5.1G 45% /
udev 10M 0 10M 0% /dev
tmpfs 372M 140K 372M 1% /run
/dev/disk/by-uuid/e8292f07-3714-41a2-ae8e-4a059f383139 9.8G 4.2G 5.1G 45% /
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 743M 212K 743M 1% /run/shm
cgroup 1.9G 0 1.9G 0% /sys/fs/cgroup
/dev/disk/by-uuid/e8292f07-3714-41a2-ae8e-4a059f383139 9.8G 4.2G 5.1G 45% /var/lib/docker/aufs
/dev/sdb 9.8G 270M 9.0G 3% /mnt/local-disk
none 9.8G 4.2G 5.1G 45% /var/lib/docker/aufs/mnt/95a875c07bb726998d5d2176bf7dcad05827b13dd5b9b55c8ec2d5c49081a204
> gcloud compute --project=broad-dsde-dev ssh --zone=us-central1-a ggp-8473594771888441731 "sudo apt-get -qq install tree && tree -h /mnt/local-disk"
Warning: Permanently added '104.197.167.190' (RSA) to the list of known hosts.
/mnt/local-disk
├── [ 14M] all_data_by_genes.txt
├── [ 46K] all_lesions.conf_99.txt
├── [468K] all_thresholded.by_genes.mat
├── [5.1M] all_thresholded.by_genes.txt
├── [8.4K] amp_genes.conf_99.txt
├── [4.5M] amp_qplot.fig
├── [7.1K] amp_qplot.png
├── [2.6K] arraylistfile.txt
├── [2.6K] array_list.txt
Acquire a MySQL connection string for a particular environment. Edit ~/.cromtool/config to set the users and passwords for each environment.
This is best used in sub-shell form, $(cromtool mysql --env=dsde-dev), as seen in the example below.
NOTE: this will output a string with a MySQL password in it
$ $(cromtool mysql --env=dsde-dev)
Warning: Using a password on the command line interface can be insecure.
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 106677
Server version: 5.6.26 (Google)
Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
Runs a WDL file through Cromwell in one of two ways:
- Locally with JAR files built via the cromtool build subcommand. Requires the --build flag to specify the name of the build to use.
- Remotely via calls to the Cromwell REST API using the --server flag
The WDL file, inputs JSON file, and workflow options can be specified one of two ways:
- Via --wdl, --inputs, and --options flags, which each should specify the path to the appropriate file.
- Via --prefix which will take the value and append .wdl, .json, and .options.json to find the WDL file, inputs JSON file, and workflow options JSON file respectively.
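The --prefix expansion described above can be sketched in a few lines. The function name paths_from_prefix is hypothetical and is not taken from the cromtool source; only the three suffixes come from the description.

```python
def paths_from_prefix(prefix):
    """Expand a --prefix value into (wdl, inputs, options) file paths.

    Hypothetical helper; it only appends the suffixes and does not check
    that the files exist.
    """
    return (
        prefix + ".wdl",           # WDL file
        prefix + ".json",          # inputs JSON file
        prefix + ".options.json",  # workflow options JSON file
    )


if __name__ == "__main__":
    for path in paths_from_prefix("wdl/jes0"):
        print(path)
```

With --prefix=wdl/jes0 this yields wdl/jes0.wdl, wdl/jes0.json, and wdl/jes0.options.json.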
Example of running a job from the command line runner:
$ cromtool run --build=0.12 --prefix=wdl/jes0 --poll
> java -jar /Users/sfrazer/.cromtool/build/0.12/cromwell-0.12.jar run wdl/jes0.wdl wdl/jes0.json wdl/jes0.options.json
[info] Slf4jLogger started
[info] Default backend: LOCAL
...
Example of running a job on a remote server:
$ cromtool run --server=dsde-dev --prefix=wdl/jes0 --poll
POST https://cromwell.dsde-dev.broadinstitute.org:443/workflows/v1
Content-Length: 1658
...
When running with --server, HTTP requests and responses will be printed to standard out. The --poll option will cause cromtool to poll the workflow every 5 seconds after it's submitted. These HTTP requests and responses will be printed to standard out as well.
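The polling behaviour can be sketched as a simple loop. This is an illustration and not cromtool's actual code: the function poll_until_done, its fetch_status callback, and the set of terminal statuses are assumptions. Only the 5 second interval and the Succeeded status appear in this README.

```python
import time

# Assumption: statuses after which polling stops. Only "Succeeded" appears
# in this README; "Failed" and "Aborted" are assumed terminal states.
TERMINAL_STATUSES = {"Succeeded", "Failed", "Aborted"}


def poll_until_done(fetch_status, interval=5.0, sleep=time.sleep):
    """Call fetch_status() every `interval` seconds until a terminal status.

    fetch_status is any zero-argument callable returning a status string,
    for example a function that queries the workflow's status endpoint.
    """
    while True:
        status = fetch_status()
        print("Workflow status:", status)
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval)


if __name__ == "__main__":
    # Stand-in for real HTTP calls: replay a fixed sequence of statuses.
    fake = iter(["Submitted", "Running", "Succeeded"])
    poll_until_done(lambda: next(fake), interval=0.1)
```

Passing the status fetcher and the sleep function as arguments keeps the loop independent of any HTTP library and easy to exercise without a server.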
Manages builds of Cromwell stored in ~/.cromtool/build/
Add a build from tree-like 0.11 on the Cromwell repository:
$ cromtool build add 0.11
> git init
Initialized empty Git repository in /Users/sfrazer/.cromtool/tmp/tmpr5dqmcz4/.git/
> git remote add origin https://github.com/broadinstitute/cromwell.git
> git fetch
...
> git reset --hard 0.11
HEAD is now at ec83068 Merge branch 'develop'
> sbt assembly
...
List builds:
$ cromtool build ls
|Name|Git tree-like|JAR Path                                             |
|----|-------------|-----------------------------------------------------|
|0.11|0.11         |/Users/sfrazer/.cromtool/build/0.11/cromwell-0.11.jar|
|0.12|0.12         |/Users/sfrazer/.cromtool/build/0.12/cromwell-0.12.jar|
Remove build:
$ cromtool build rm 0.11
Given a workflow ID and a --server, this will generate HTTPie commands for querying status, outputs, logs, and metadata:
$ cromtool query --server=dsde-dev 7e88bcb9-57c5-44ea-8319-0e2179e5a327
http 'https://cromwell.dsde-dev.broadinstitute.org:443/workflows/v1/7e88bcb9-57c5-44ea-8319-0e2179e5a327/status' 'Authorization: Bearer secret_access_token'
http 'https://cromwell.dsde-dev.broadinstitute.org:443/workflows/v1/7e88bcb9-57c5-44ea-8319-0e2179e5a327/outputs' 'Authorization: Bearer secret_access_token'
http 'https://cromwell.dsde-dev.broadinstitute.org:443/workflows/v1/7e88bcb9-57c5-44ea-8319-0e2179e5a327/logs' 'Authorization: Bearer secret_access_token'
http 'https://cromwell.dsde-dev.broadinstitute.org:443/workflows/v1/7e88bcb9-57c5-44ea-8319-0e2179e5a327/metadata' 'Authorization: Bearer secret_access_token'
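The four commands differ only in the final path segment, so they can be produced from one template. The sketch below reproduces the output format shown above. The function name httpie_commands is hypothetical and is not taken from the cromtool source.

```python
# Endpoint names taken from the example output above.
ENDPOINTS = ("status", "outputs", "logs", "metadata")

TEMPLATE = "http '{base}/workflows/v1/{workflow_id}/{endpoint}' 'Authorization: Bearer {token}'"


def httpie_commands(base_url, workflow_id, access_token):
    """Return one HTTPie command string per endpoint for the given workflow.

    Hypothetical helper; base_url is the server URL as listed by
    `cromtool server ls`.
    """
    return [
        TEMPLATE.format(base=base_url, workflow_id=workflow_id,
                        endpoint=endpoint, token=access_token)
        for endpoint in ENDPOINTS
    ]


if __name__ == "__main__":
    for command in httpie_commands(
            "https://cromwell.dsde-dev.broadinstitute.org:443",
            "7e88bcb9-57c5-44ea-8319-0e2179e5a327",
            "secret_access_token"):
        print(command)
```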
Given a workflow ID and a --server, this will use the /workflows/v1/<ID>/metadata endpoint to build a table of the sub-job statuses:
$ cromtool status --server=dsde-dev ac61c1f2-21ae-46ae-a3fa-6df8ff43ad86
Workflow status: Succeeded
|FQN      |status|
|---------|------|
|sfrazer.x|Done  |
|sfrazer.z|Done  |
|sfrazer.y|Done  |
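A rough sketch of how such a table could be built from a metadata response follows. The shape of the metadata is an assumption made for illustration: a top-level "status" string and a "calls" mapping from each fully-qualified name to a list of attempts, each carrying an "executionStatus". The function name status_table is hypothetical, and rows are sorted by name here, which may differ from cromtool's ordering.

```python
def status_table(metadata):
    """Render workflow metadata as a pipe-delimited table of call statuses.

    Assumed input shape (not confirmed by this README):
      {"status": "...", "calls": {"<FQN>": [{"executionStatus": "..."}]}}
    The last attempt of each call is taken as its current status.
    """
    rows = [
        (fqn, attempts[-1]["executionStatus"])
        for fqn, attempts in sorted(metadata.get("calls", {}).items())
    ]
    header = ("FQN", "status")
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(row[i]) for row in [header] + rows) for i in range(2)]

    def fmt(row):
        return "|" + "|".join(cell.ljust(w) for cell, w in zip(row, widths)) + "|"

    lines = ["Workflow status: " + metadata["status"], fmt(header)]
    lines.append("|" + "|".join("-" * w for w in widths) + "|")
    lines.extend(fmt(row) for row in rows)
    return "\n".join(lines)


if __name__ == "__main__":
    sample = {
        "status": "Succeeded",
        "calls": {
            "sfrazer.x": [{"executionStatus": "Done"}],
            "sfrazer.y": [{"executionStatus": "Done"}],
            "sfrazer.z": [{"executionStatus": "Done"}],
        },
    }
    print(status_table(sample))
```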