August SO question & answer https://stackoverflow.com/a/78893777/8508004 : Snakemake - adapt the checkpoints example pipeline to the new version 8.18.1
{
"cells": [
{
"cell_type": "markdown",
"id": "c054608e-3102-4712-bab0-f50e24ff0cea",
"metadata": {},
"source": [
"# for SO https://stackoverflow.com/q/78892547/8508004 Snakemake - adapt the checkpoints example pipeline to the new version 8.18.1\n",
"\n",
"Set-up for situation presented.\n",
"\n",
"(I used a session launched from [here](https://gist.github.com/fomightez/6773dedf6d5132795dd4245a18c066eb); go there and click on '`launch binder`' to get started. I chose that because sessions launched from [binder's requirements.txt example repo](https://github.com/binder-examples/requirements) had an older Python version, and `pip` reported that Snakemake 8.18.1 needed Python 3.11(?). [Although installing with `pip` on Anaconda cloud with Python 3.10 seemed to work.])\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "978ca02e-a7b2-45de-8e26-31d68e506770",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting snakemake==8.18.1\n",
" Downloading snakemake-8.18.1-py3-none-any.whl.metadata (2.5 kB)\n",
"Collecting appdirs (from snakemake==8.18.1)\n",
" Downloading appdirs-1.4.4-py2.py3-none-any.whl.metadata (9.0 kB)\n",
"Collecting immutables (from snakemake==8.18.1)\n",
" Downloading immutables-0.20-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.6 kB)\n",
"Collecting configargparse (from snakemake==8.18.1)\n",
" Downloading ConfigArgParse-1.7-py3-none-any.whl.metadata (23 kB)\n",
"Collecting connection-pool>=0.0.3 (from snakemake==8.18.1)\n",
" Downloading connection_pool-0.0.3.tar.gz (3.8 kB)\n",
" Preparing metadata (setup.py) ... \u001b[?25ldone\n",
"\u001b[?25hCollecting datrie (from snakemake==8.18.1)\n",
" Downloading datrie-0.8.2.tar.gz (63 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m63.3/63.3 kB\u001b[0m \u001b[31m4.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25h Installing build dependencies ... \u001b[?25ldone\n",
"\u001b[?25h Getting requirements to build wheel ... \u001b[?25ldone\n",
"\u001b[?25h Installing backend dependencies ... \u001b[?25ldone\n",
"\u001b[?25h Preparing metadata (pyproject.toml) ... \u001b[?25ldone\n",
"\u001b[?25hCollecting docutils (from snakemake==8.18.1)\n",
" Downloading docutils-0.21.2-py3-none-any.whl.metadata (2.8 kB)\n",
"Collecting gitpython (from snakemake==8.18.1)\n",
" Downloading GitPython-3.1.43-py3-none-any.whl.metadata (13 kB)\n",
"Collecting humanfriendly (from snakemake==8.18.1)\n",
" Downloading humanfriendly-10.0-py2.py3-none-any.whl.metadata (9.2 kB)\n",
"Requirement already satisfied: jinja2<4.0,>=3.0 in /srv/conda/envs/notebook/lib/python3.11/site-packages (from snakemake==8.18.1) (3.1.4)\n",
"Requirement already satisfied: jsonschema in /srv/conda/envs/notebook/lib/python3.11/site-packages (from snakemake==8.18.1) (4.22.0)\n",
"Requirement already satisfied: nbformat in /srv/conda/envs/notebook/lib/python3.11/site-packages (from snakemake==8.18.1) (5.10.4)\n",
"Requirement already satisfied: packaging in /srv/conda/envs/notebook/lib/python3.11/site-packages (from snakemake==8.18.1) (24.1)\n",
"Requirement already satisfied: psutil in /srv/conda/envs/notebook/lib/python3.11/site-packages (from snakemake==8.18.1) (5.9.8)\n",
"Collecting pulp<2.9,>=2.3.1 (from snakemake==8.18.1)\n",
" Downloading PuLP-2.8.0-py3-none-any.whl.metadata (5.4 kB)\n",
"Requirement already satisfied: pyyaml in /srv/conda/envs/notebook/lib/python3.11/site-packages (from snakemake==8.18.1) (6.0.1)\n",
"Requirement already satisfied: requests<3.0,>=2.8.1 in /srv/conda/envs/notebook/lib/python3.11/site-packages (from snakemake==8.18.1) (2.32.3)\n",
"Collecting reretry (from snakemake==8.18.1)\n",
" Downloading reretry-0.11.8-py2.py3-none-any.whl.metadata (5.5 kB)\n",
"Collecting smart-open<8.0,>=4.0 (from snakemake==8.18.1)\n",
" Downloading smart_open-7.0.4-py3-none-any.whl.metadata (23 kB)\n",
"Collecting snakemake-interface-executor-plugins<10.0,>=9.2.0 (from snakemake==8.18.1)\n",
" Downloading snakemake_interface_executor_plugins-9.2.0-py3-none-any.whl.metadata (7.7 kB)\n",
"Collecting snakemake-interface-common<2.0,>=1.17.0 (from snakemake==8.18.1)\n",
" Downloading snakemake_interface_common-1.17.3-py3-none-any.whl.metadata (760 bytes)\n",
"Collecting snakemake-interface-storage-plugins<4.0,>=3.2.3 (from snakemake==8.18.1)\n",
" Downloading snakemake_interface_storage_plugins-3.3.0-py3-none-any.whl.metadata (9.8 kB)\n",
"Collecting snakemake-interface-report-plugins<2.0.0,>=1.0.0 (from snakemake==8.18.1)\n",
" Downloading snakemake_interface_report_plugins-1.0.0-py3-none-any.whl.metadata (3.4 kB)\n",
"Collecting stopit (from snakemake==8.18.1)\n",
" Downloading stopit-1.1.2.tar.gz (18 kB)\n",
" Preparing metadata (setup.py) ... \u001b[?25ldone\n",
"\u001b[?25hCollecting tabulate (from snakemake==8.18.1)\n",
" Downloading tabulate-0.9.0-py3-none-any.whl.metadata (34 kB)\n",
"Collecting throttler (from snakemake==8.18.1)\n",
" Downloading throttler-1.2.2-py3-none-any.whl.metadata (7.4 kB)\n",
"Collecting toposort<2.0,>=1.10 (from snakemake==8.18.1)\n",
" Downloading toposort-1.10-py3-none-any.whl.metadata (4.1 kB)\n",
"Collecting wrapt (from snakemake==8.18.1)\n",
" Downloading wrapt-1.16.0-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.6 kB)\n",
"Collecting yte<2.0,>=1.5.1 (from snakemake==8.18.1)\n",
" Downloading yte-1.5.4-py3-none-any.whl.metadata (3.4 kB)\n",
"Collecting dpath<3.0.0,>=2.1.6 (from snakemake==8.18.1)\n",
" Downloading dpath-2.2.0-py3-none-any.whl.metadata (15 kB)\n",
"Collecting conda-inject<2.0,>=1.3.1 (from snakemake==8.18.1)\n",
" Downloading conda_inject-1.3.2-py3-none-any.whl.metadata (855 bytes)\n",
"Requirement already satisfied: MarkupSafe>=2.0 in /srv/conda/envs/notebook/lib/python3.11/site-packages (from jinja2<4.0,>=3.0->snakemake==8.18.1) (2.1.5)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /srv/conda/envs/notebook/lib/python3.11/site-packages (from requests<3.0,>=2.8.1->snakemake==8.18.1) (3.3.2)\n",
"Requirement already satisfied: idna<4,>=2.5 in /srv/conda/envs/notebook/lib/python3.11/site-packages (from requests<3.0,>=2.8.1->snakemake==8.18.1) (3.7)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /srv/conda/envs/notebook/lib/python3.11/site-packages (from requests<3.0,>=2.8.1->snakemake==8.18.1) (2.2.2)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /srv/conda/envs/notebook/lib/python3.11/site-packages (from requests<3.0,>=2.8.1->snakemake==8.18.1) (2024.6.2)\n",
"Collecting argparse-dataclass<3.0.0,>=2.0.0 (from snakemake-interface-common<2.0,>=1.17.0->snakemake==8.18.1)\n",
" Downloading argparse_dataclass-2.0.0-py3-none-any.whl.metadata (7.2 kB)\n",
"Collecting plac<2.0.0,>=1.3.4 (from yte<2.0,>=1.5.1->snakemake==8.18.1)\n",
" Downloading plac-1.4.3-py2.py3-none-any.whl.metadata (5.9 kB)\n",
"Collecting gitdb<5,>=4.0.1 (from gitpython->snakemake==8.18.1)\n",
" Downloading gitdb-4.0.11-py3-none-any.whl.metadata (1.2 kB)\n",
"Requirement already satisfied: attrs>=22.2.0 in /srv/conda/envs/notebook/lib/python3.11/site-packages (from jsonschema->snakemake==8.18.1) (23.2.0)\n",
"Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /srv/conda/envs/notebook/lib/python3.11/site-packages (from jsonschema->snakemake==8.18.1) (2023.12.1)\n",
"Requirement already satisfied: referencing>=0.28.4 in /srv/conda/envs/notebook/lib/python3.11/site-packages (from jsonschema->snakemake==8.18.1) (0.35.1)\n",
"Requirement already satisfied: rpds-py>=0.7.1 in /srv/conda/envs/notebook/lib/python3.11/site-packages (from jsonschema->snakemake==8.18.1) (0.18.1)\n",
"Requirement already satisfied: fastjsonschema>=2.15 in /srv/conda/envs/notebook/lib/python3.11/site-packages (from nbformat->snakemake==8.18.1) (2.20.0)\n",
"Requirement already satisfied: jupyter-core!=5.0.*,>=4.12 in /srv/conda/envs/notebook/lib/python3.11/site-packages (from nbformat->snakemake==8.18.1) (5.7.2)\n",
"Requirement already satisfied: traitlets>=5.1 in /srv/conda/envs/notebook/lib/python3.11/site-packages (from nbformat->snakemake==8.18.1) (5.14.3)\n",
"Collecting smmap<6,>=3.0.1 (from gitdb<5,>=4.0.1->gitpython->snakemake==8.18.1)\n",
" Downloading smmap-5.0.1-py3-none-any.whl.metadata (4.3 kB)\n",
"Requirement already satisfied: platformdirs>=2.5 in /srv/conda/envs/notebook/lib/python3.11/site-packages (from jupyter-core!=5.0.*,>=4.12->nbformat->snakemake==8.18.1) (4.2.2)\n",
"Downloading snakemake-8.18.1-py3-none-any.whl (308 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m308.4/308.4 kB\u001b[0m \u001b[31m14.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hDownloading conda_inject-1.3.2-py3-none-any.whl (4.1 kB)\n",
"Downloading dpath-2.2.0-py3-none-any.whl (17 kB)\n",
"Downloading PuLP-2.8.0-py3-none-any.whl (17.7 MB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m17.7/17.7 MB\u001b[0m \u001b[31m37.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m00:01\u001b[0m00:01\u001b[0m\n",
"\u001b[?25hDownloading smart_open-7.0.4-py3-none-any.whl (61 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m61.2/61.2 kB\u001b[0m \u001b[31m9.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hDownloading snakemake_interface_common-1.17.3-py3-none-any.whl (14 kB)\n",
"Downloading ConfigArgParse-1.7-py3-none-any.whl (25 kB)\n",
"Downloading snakemake_interface_executor_plugins-9.2.0-py3-none-any.whl (22 kB)\n",
"Downloading snakemake_interface_report_plugins-1.0.0-py3-none-any.whl (6.9 kB)\n",
"Downloading snakemake_interface_storage_plugins-3.3.0-py3-none-any.whl (15 kB)\n",
"Downloading reretry-0.11.8-py2.py3-none-any.whl (5.6 kB)\n",
"Downloading throttler-1.2.2-py3-none-any.whl (7.6 kB)\n",
"Downloading toposort-1.10-py3-none-any.whl (8.5 kB)\n",
"Downloading wrapt-1.16.0-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (80 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m80.7/80.7 kB\u001b[0m \u001b[31m12.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hDownloading yte-1.5.4-py3-none-any.whl (7.7 kB)\n",
"Downloading appdirs-1.4.4-py2.py3-none-any.whl (9.6 kB)\n",
"Downloading docutils-0.21.2-py3-none-any.whl (587 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m587.4/587.4 kB\u001b[0m \u001b[31m66.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hDownloading GitPython-3.1.43-py3-none-any.whl (207 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m207.3/207.3 kB\u001b[0m \u001b[31m26.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hDownloading humanfriendly-10.0-py2.py3-none-any.whl (86 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m86.8/86.8 kB\u001b[0m \u001b[31m6.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hDownloading immutables-0.20-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (99 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m99.7/99.7 kB\u001b[0m \u001b[31m13.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hDownloading tabulate-0.9.0-py3-none-any.whl (35 kB)\n",
"Downloading argparse_dataclass-2.0.0-py3-none-any.whl (8.8 kB)\n",
"Downloading gitdb-4.0.11-py3-none-any.whl (62 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m62.7/62.7 kB\u001b[0m \u001b[31m12.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hDownloading plac-1.4.3-py2.py3-none-any.whl (22 kB)\n",
"Downloading smmap-5.0.1-py3-none-any.whl (24 kB)\n",
"Building wheels for collected packages: connection-pool, datrie, stopit\n",
" Building wheel for connection-pool (setup.py) ... \u001b[?25ldone\n",
"\u001b[?25h Created wheel for connection-pool: filename=connection_pool-0.0.3-py3-none-any.whl size=4061 sha256=4918f8d8a58b46512f41fbc00b04d40c5818a205139a504b11480239f6b82b15\n",
" Stored in directory: /home/jovyan/.cache/pip/wheels/2b/73/ac/bd9807cbc47e95c436b5d5afe6cca8299fdc69bf7bd9930618\n",
" Building wheel for datrie (pyproject.toml) ... \u001b[?25ldone\n",
"\u001b[?25h Created wheel for datrie: filename=datrie-0.8.2-cp311-cp311-linux_x86_64.whl size=179203 sha256=38434f3f5d7b73ff12a19217ce91932b9b1e2506f4dc8ac5bdb3b4ad51c5016a\n",
" Stored in directory: /home/jovyan/.cache/pip/wheels/8b/20/f4/aeacf0184f20e22473a01257c56c74cc976e1cd838a01de6d5\n",
" Building wheel for stopit (setup.py) ... \u001b[?25ldone\n",
"\u001b[?25h Created wheel for stopit: filename=stopit-1.1.2-py3-none-any.whl size=11939 sha256=63c60fd9b22d00767c5f853db696f22d50b900bd84771db6ec8a2028e2279b42\n",
" Stored in directory: /home/jovyan/.cache/pip/wheels/da/77/2d/adbc56bc4db95ad80c6d4e71cd69e2d9d122174904342e3f7f\n",
"Successfully built connection-pool datrie stopit\n",
"Installing collected packages: toposort, throttler, stopit, plac, connection-pool, appdirs, wrapt, tabulate, smmap, reretry, pulp, immutables, humanfriendly, dpath, docutils, datrie, configargparse, conda-inject, argparse-dataclass, yte, snakemake-interface-common, smart-open, gitdb, snakemake-interface-storage-plugins, snakemake-interface-report-plugins, snakemake-interface-executor-plugins, gitpython, snakemake\n",
"Successfully installed appdirs-1.4.4 argparse-dataclass-2.0.0 conda-inject-1.3.2 configargparse-1.7 connection-pool-0.0.3 datrie-0.8.2 docutils-0.21.2 dpath-2.2.0 gitdb-4.0.11 gitpython-3.1.43 humanfriendly-10.0 immutables-0.20 plac-1.4.3 pulp-2.8.0 reretry-0.11.8 smart-open-7.0.4 smmap-5.0.1 snakemake-8.18.1 snakemake-interface-common-1.17.3 snakemake-interface-executor-plugins-9.2.0 snakemake-interface-report-plugins-1.0.0 snakemake-interface-storage-plugins-3.3.0 stopit-1.1.2 tabulate-0.9.0 throttler-1.2.2 toposort-1.10 wrapt-1.16.0 yte-1.5.4\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install snakemake==8.18.1"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "33e26306-1963-4ff2-9589-1803d3c09ec3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"8.18.1\n"
]
}
],
"source": [
"!snakemake --version"
]
},
{
"cell_type": "markdown",
"id": "edae733c-a387-404d-94f9-32bf9f824b8a",
"metadata": {},
"source": [
"Worked! Installed specified version.\n",
"\n",
"\n",
"----------------\n",
"\n",
"\n",
"Now to prepare the Snakefile put forth in the question...\n",
"\n",
"The code put forth as an example is old and came from [the bottom of the EdwardsLab post 'Snakemake - How to use snakemake checkpoints'](https://edwards.flinders.edu.au/how-to-use-snakemake-checkpoints/).\n",
"\n",
"The issue is that the code there uses the `directory()` flag under the `input` directive of the rule `make_third_files`, and so you get `The flag 'directory' used in rule make_third_files is only valid for outputs, not inputs.`.\n",
"\n",
"The fix is to not use the `directory()` flag in `input` directives, as covered [here](https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#directories-as-outputs). **However, directories themselves are allowed as `input` for Snakemake, so just use the path to them without the flag.** \n",
"Plus, to get this to work, you need the two output directories as `input` for the default rule. (I'm not sure how it ever worked before, because I don't see how rules `make_some_files` & `make_more_files` would get triggered otherwise.)\n",
"\n",
"To work this out, I first put the originally provided code between the `'''` below, then edited it and re-ran the next two cells, iterating until it worked. (I later started a fresh session with what I worked out and re-ran everything to get the 'clean' final run seen here.)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "ec409220-49ac-47a1-8525-e791ac0fcc30",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Writing 'ns' (str) to file 'Snakefile'.\n"
]
}
],
"source": [
"ns='''OUTDIR = \"first_directory\"\n",
"SNDDIR = \"second_directory\"\n",
"THRDIR = \"third_directory\"\n",
"\n",
"\n",
"def combine(wildcards):\n",
" # read the first set of outputs\n",
" ck_output = checkpoints.make_some_files.get(**wildcards).output[0]\n",
" FIRSTS, = glob_wildcards(os.path.join(ck_output, \"{sample}.txt\"))\n",
" # read the second set of outputs\n",
" sn_output = checkpoints.make_more_files.get(**wildcards).output[0]\n",
" SECONDS, = glob_wildcards(os.path.join(sn_output, \"{smpl}.txt\"))\n",
" return expand(os.path.join(THRDIR, \"{first}.{second}.tsv\"), first=FIRSTS, second=SECONDS)\n",
"\n",
"rule all:\n",
" input: \n",
" OUTDIR,\n",
" SNDDIR,\n",
" combine\n",
"\n",
"checkpoint make_some_files:\n",
" output:\n",
" directory(OUTDIR)\n",
" shell:\n",
" \"\"\"\n",
" mkdir {output};\n",
" N=$(((RANDOM%5)+1));\n",
" for D in $(seq $N); do\n",
" touch {output}/$RANDOM.txt\n",
" done\n",
" \"\"\"\n",
"\n",
"checkpoint make_more_files:\n",
" output:\n",
" directory(SNDDIR)\n",
" shell:\n",
" \"\"\"\n",
" mkdir {output};\n",
" N=$(((RANDOM%5)+1));\n",
" for D in $(seq $N); do\n",
" touch {output}/$RANDOM.txt\n",
" done\n",
" \"\"\"\n",
"\n",
"rule make_third_files:\n",
" input:\n",
" OUTDIR,\n",
" SNDDIR,\n",
" output:\n",
" os.path.join(THRDIR, \"{first}.{second}.tsv\")\n",
" shell:\n",
" \"\"\"\n",
" touch {output}\n",
" \"\"\"\n",
" '''\n",
"%store ns >Snakefile"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "2381d97b-22e6-4ac8-8093-393d8f296011",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[33mAssuming unrestricted shared filesystem usage.\u001b[0m\n",
"\u001b[33mBuilding DAG of jobs...\u001b[0m\n",
"\u001b[33mUsing shell: /usr/bin/bash\u001b[0m\n",
"\u001b[33mProvided cores: 1 (use --cores to define parallelism)\u001b[0m\n",
"\u001b[33mRules claiming more threads will be scaled down.\u001b[0m\n",
"\u001b[33mJob stats:\n",
"job count\n",
"--------------- -------\n",
"all 1\n",
"make_more_files 1\n",
"make_some_files 1\n",
"total 3\n",
"\u001b[0m\n",
"\u001b[33mSelect jobs to execute...\u001b[0m\n",
"\u001b[33mExecute 1 jobs...\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mlocalcheckpoint make_more_files:\n",
" output: second_directory\n",
" jobid: 2\n",
" reason: Missing output files: second_directory\n",
" resources: tmpdir=/tmp\u001b[0m\n",
"\u001b[33mDAG of jobs will be updated after completion.\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mFinished job 2.\u001b[0m\n",
"\u001b[32m1 of 3 steps (33%) done\u001b[0m\n",
"\u001b[33mSelect jobs to execute...\u001b[0m\n",
"\u001b[33mExecute 1 jobs...\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mlocalcheckpoint make_some_files:\n",
" output: first_directory\n",
" jobid: 1\n",
" reason: Missing output files: first_directory\n",
" resources: tmpdir=/tmp\u001b[0m\n",
"\u001b[33mDAG of jobs will be updated after completion.\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mFinished job 1.\u001b[0m\n",
"\u001b[32m2 of 3 steps (67%) done\u001b[0m\n",
"\u001b[33mSelect jobs to execute...\u001b[0m\n",
"\u001b[33mExecute 1 jobs...\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mlocalrule make_third_files:\n",
" input: first_directory, second_directory\n",
" output: third_directory/32727.17967.tsv\n",
" jobid: 15\n",
" reason: Missing output files: third_directory/32727.17967.tsv\n",
" wildcards: first=32727, second=17967\n",
" resources: tmpdir=/tmp\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mFinished job 15.\u001b[0m\n",
"\u001b[32m3 of 15 steps (20%) done\u001b[0m\n",
"\u001b[33mSelect jobs to execute...\u001b[0m\n",
"\u001b[33mExecute 1 jobs...\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mlocalrule make_third_files:\n",
" input: first_directory, second_directory\n",
" output: third_directory/20288.17078.tsv\n",
" jobid: 8\n",
" reason: Missing output files: third_directory/20288.17078.tsv\n",
" wildcards: first=20288, second=17078\n",
" resources: tmpdir=/tmp\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mFinished job 8.\u001b[0m\n",
"\u001b[32m4 of 15 steps (27%) done\u001b[0m\n",
"\u001b[33mSelect jobs to execute...\u001b[0m\n",
"\u001b[33mExecute 1 jobs...\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mlocalrule make_third_files:\n",
" input: first_directory, second_directory\n",
" output: third_directory/24182.17967.tsv\n",
" jobid: 12\n",
" reason: Missing output files: third_directory/24182.17967.tsv\n",
" wildcards: first=24182, second=17967\n",
" resources: tmpdir=/tmp\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mFinished job 12.\u001b[0m\n",
"\u001b[32m5 of 15 steps (33%) done\u001b[0m\n",
"\u001b[33mSelect jobs to execute...\u001b[0m\n",
"\u001b[33mExecute 1 jobs...\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mlocalrule make_third_files:\n",
" input: first_directory, second_directory\n",
" output: third_directory/9030.8514.tsv\n",
" jobid: 16\n",
" reason: Missing output files: third_directory/9030.8514.tsv\n",
" wildcards: first=9030, second=8514\n",
" resources: tmpdir=/tmp\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mFinished job 16.\u001b[0m\n",
"\u001b[32m6 of 15 steps (40%) done\u001b[0m\n",
"\u001b[33mSelect jobs to execute...\u001b[0m\n",
"\u001b[33mExecute 1 jobs...\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mlocalrule make_third_files:\n",
" input: first_directory, second_directory\n",
" output: third_directory/24182.17078.tsv\n",
" jobid: 11\n",
" reason: Missing output files: third_directory/24182.17078.tsv\n",
" wildcards: first=24182, second=17078\n",
" resources: tmpdir=/tmp\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mFinished job 11.\u001b[0m\n",
"\u001b[32m7 of 15 steps (47%) done\u001b[0m\n",
"\u001b[33mSelect jobs to execute...\u001b[0m\n",
"\u001b[33mExecute 1 jobs...\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mlocalrule make_third_files:\n",
" input: first_directory, second_directory\n",
" output: third_directory/32727.8514.tsv\n",
" jobid: 13\n",
" reason: Missing output files: third_directory/32727.8514.tsv\n",
" wildcards: first=32727, second=8514\n",
" resources: tmpdir=/tmp\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mFinished job 13.\u001b[0m\n",
"\u001b[32m8 of 15 steps (53%) done\u001b[0m\n",
"\u001b[33mSelect jobs to execute...\u001b[0m\n",
"\u001b[33mExecute 1 jobs...\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mlocalrule make_third_files:\n",
" input: first_directory, second_directory\n",
" output: third_directory/20288.17967.tsv\n",
" jobid: 9\n",
" reason: Missing output files: third_directory/20288.17967.tsv\n",
" wildcards: first=20288, second=17967\n",
" resources: tmpdir=/tmp\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mFinished job 9.\u001b[0m\n",
"\u001b[32m9 of 15 steps (60%) done\u001b[0m\n",
"\u001b[33mSelect jobs to execute...\u001b[0m\n",
"\u001b[33mExecute 1 jobs...\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mlocalrule make_third_files:\n",
" input: first_directory, second_directory\n",
" output: third_directory/9030.17078.tsv\n",
" jobid: 17\n",
" reason: Missing output files: third_directory/9030.17078.tsv\n",
" wildcards: first=9030, second=17078\n",
" resources: tmpdir=/tmp\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mFinished job 17.\u001b[0m\n",
"\u001b[32m10 of 15 steps (67%) done\u001b[0m\n",
"\u001b[33mSelect jobs to execute...\u001b[0m\n",
"\u001b[33mExecute 1 jobs...\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mlocalrule make_third_files:\n",
" input: first_directory, second_directory\n",
" output: third_directory/9030.17967.tsv\n",
" jobid: 18\n",
" reason: Missing output files: third_directory/9030.17967.tsv\n",
" wildcards: first=9030, second=17967\n",
" resources: tmpdir=/tmp\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mFinished job 18.\u001b[0m\n",
"\u001b[32m11 of 15 steps (73%) done\u001b[0m\n",
"\u001b[33mSelect jobs to execute...\u001b[0m\n",
"\u001b[33mExecute 1 jobs...\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mlocalrule make_third_files:\n",
" input: first_directory, second_directory\n",
" output: third_directory/32727.17078.tsv\n",
" jobid: 14\n",
" reason: Missing output files: third_directory/32727.17078.tsv\n",
" wildcards: first=32727, second=17078\n",
" resources: tmpdir=/tmp\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mFinished job 14.\u001b[0m\n",
"\u001b[32m12 of 15 steps (80%) done\u001b[0m\n",
"\u001b[33mSelect jobs to execute...\u001b[0m\n",
"\u001b[33mExecute 1 jobs...\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mlocalrule make_third_files:\n",
" input: first_directory, second_directory\n",
" output: third_directory/20288.8514.tsv\n",
" jobid: 7\n",
" reason: Missing output files: third_directory/20288.8514.tsv\n",
" wildcards: first=20288, second=8514\n",
" resources: tmpdir=/tmp\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mFinished job 7.\u001b[0m\n",
"\u001b[32m13 of 15 steps (87%) done\u001b[0m\n",
"\u001b[33mSelect jobs to execute...\u001b[0m\n",
"\u001b[33mExecute 1 jobs...\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mlocalrule make_third_files:\n",
" input: first_directory, second_directory\n",
" output: third_directory/24182.8514.tsv\n",
" jobid: 10\n",
" reason: Missing output files: third_directory/24182.8514.tsv\n",
" wildcards: first=24182, second=8514\n",
" resources: tmpdir=/tmp\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mFinished job 10.\u001b[0m\n",
"\u001b[32m14 of 15 steps (93%) done\u001b[0m\n",
"\u001b[33mSelect jobs to execute...\u001b[0m\n",
"\u001b[33mExecute 1 jobs...\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mlocalrule all:\n",
" input: first_directory, second_directory, third_directory/20288.8514.tsv, third_directory/20288.17078.tsv, third_directory/20288.17967.tsv, third_directory/24182.8514.tsv, third_directory/24182.17078.tsv, third_directory/24182.17967.tsv, third_directory/32727.8514.tsv, third_directory/32727.17078.tsv, third_directory/32727.17967.tsv, third_directory/9030.8514.tsv, third_directory/9030.17078.tsv, third_directory/9030.17967.tsv\n",
" jobid: 0\n",
" reason: Input files updated by another job: third_directory/20288.8514.tsv, third_directory/9030.17967.tsv, third_directory/9030.8514.tsv, third_directory/9030.17078.tsv, third_directory/32727.8514.tsv, third_directory/32727.17967.tsv, third_directory/20288.17967.tsv, third_directory/32727.17078.tsv, third_directory/20288.17078.tsv, third_directory/24182.17078.tsv, third_directory/24182.17967.tsv, third_directory/24182.8514.tsv\n",
" resources: tmpdir=/tmp\u001b[0m\n",
"\u001b[32m\u001b[0m\n",
"\u001b[32m[Tue Aug 20 17:30:46 2024]\u001b[0m\n",
"\u001b[32mFinished job 0.\u001b[0m\n",
"\u001b[32m15 of 15 steps (100%) done\u001b[0m\n",
"\u001b[33mComplete log: .snakemake/log/2024-08-20T173046.786457.snakemake.log\u001b[0m\n"
]
}
],
"source": [
"!snakemake -c 1"
]
},
{
"cell_type": "markdown",
"id": "b991d560-2a70-4608-82c0-abcfbff46c7d",
"metadata": {},
"source": [
"Works. \n",
"\n",
"Let's show the contents of the directories as verification."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "341dfa93-ffb3-473f-95bb-10b9d1a0431e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"20288.txt 24182.txt 32727.txt 9030.txt\n"
]
}
],
"source": [
"ls first_directory/"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "968ff82a-c275-467d-8558-e40033da48dd",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"17078.txt 17967.txt 8514.txt\n"
]
}
],
"source": [
"ls second_directory/"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "5a09259e-3d08-4714-928c-b763270d55c7",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"20288.17078.tsv 24182.17078.tsv 32727.17078.tsv 9030.17078.tsv\n",
"20288.17967.tsv 24182.17967.tsv 32727.17967.tsv 9030.17967.tsv\n",
"20288.8514.tsv 24182.8514.tsv 32727.8514.tsv 9030.8514.tsv\n"
]
}
],
"source": [
"ls third_directory/"
]
},
{
"cell_type": "markdown",
"id": "c29dd437-e4c7-4305-acb9-e3cc5e52988e",
"metadata": {},
"source": [
"The content of the working `Snakefile` is below **for easier copying and pasting**:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "3f51500f-276f-407e-b5ce-c1f2a3bb50c4",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"OUTDIR = \"first_directory\"\n",
"SNDDIR = \"second_directory\"\n",
"THRDIR = \"third_directory\"\n",
"\n",
"\n",
"def combine(wildcards):\n",
" # read the first set of outputs\n",
" ck_output = checkpoints.make_some_files.get(**wildcards).output[0]\n",
" FIRSTS, = glob_wildcards(os.path.join(ck_output, \"{sample}.txt\"))\n",
" # read the second set of outputs\n",
" sn_output = checkpoints.make_more_files.get(**wildcards).output[0]\n",
" SECONDS, = glob_wildcards(os.path.join(sn_output, \"{smpl}.txt\"))\n",
" return expand(os.path.join(THRDIR, \"{first}.{second}.tsv\"), first=FIRSTS, second=SECONDS)\n",
"\n",
"rule all:\n",
" input: \n",
" OUTDIR,\n",
" SNDDIR,\n",
" combine\n",
"\n",
"checkpoint make_some_files:\n",
" output:\n",
" directory(OUTDIR)\n",
" shell:\n",
" \"\"\"\n",
" mkdir {output};\n",
" N=$(((RANDOM%5)+1));\n",
" for D in $(seq $N); do\n",
" touch {output}/$RANDOM.txt\n",
" done\n",
" \"\"\"\n",
"\n",
"checkpoint make_more_files:\n",
" output:\n",
" directory(SNDDIR)\n",
" shell:\n",
" \"\"\"\n",
" mkdir {output};\n",
" N=$(((RANDOM%5)+1));\n",
" for D in $(seq $N); do\n",
" touch {output}/$RANDOM.txt\n",
" done\n",
" \"\"\"\n",
"\n",
"rule make_third_files:\n",
" input:\n",
" OUTDIR,\n",
" SNDDIR,\n",
" output:\n",
" os.path.join(THRDIR, \"{first}.{second}.tsv\")\n",
" shell:\n",
" \"\"\"\n",
" touch {output}\n",
" \"\"\"\n",
" \n"
]
}
],
"source": [
"cat Snakefile"
]
},
{
"cell_type": "markdown",
"id": "c4bff8b2-c3a9-48ec-b11b-fe843e97040d",
"metadata": {},
"source": [
"------\n",
"\n",
"To clean up before re-running the `!snakemake -c 1` cell above, change the type of the next cell to '`Code`' and run it:"
]
},
{
"cell_type": "raw",
"id": "296e033a-8a13-4577-942c-1e4cbb2e78ba",
"metadata": {},
"source": [
"!rm -rf first_directory\n",
"!rm -rf second_directory\n",
"!rm -rf third_directory"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "657654e6-845a-455b-8296-ce034689b8b6",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
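
As a rough illustration of what the `combine` input function in the Snakefile above computes once both checkpoints have finished, here is a plain-Python sketch that runs without Snakemake. It is an approximation under stated assumptions, not Snakemake's implementation: the helper names `expected_third_files`, `stems`, and `demo` are made up for this example, `stems` is only a simplified stand-in for `glob_wildcards` (it looks at the top level of the directory and matches a literal `.txt` suffix), and the list comprehension mimics the cross product that `expand` produces by default.

```python
import itertools
import os
import tempfile


def stems(directory):
    """Simplified stand-in for glob_wildcards(os.path.join(directory, "{sample}.txt"))."""
    return sorted(
        name[: -len(".txt")]
        for name in os.listdir(directory)
        if name.endswith(".txt")
    )


def expected_third_files(first_dir, second_dir, third_dir):
    """Pair every stem in first_dir with every stem in second_dir.

    Mimics expand(os.path.join(THRDIR, "{first}.{second}.tsv"), ...),
    which builds the full cross product of its keyword arguments.
    """
    return [
        os.path.join(third_dir, f"{first}.{second}.tsv")
        for first, second in itertools.product(stems(first_dir), stems(second_dir))
    ]


def demo():
    """Create two small directories of empty .txt files and list the expected targets."""
    with tempfile.TemporaryDirectory() as tmp:
        first_dir = os.path.join(tmp, "first_directory")
        second_dir = os.path.join(tmp, "second_directory")
        os.makedirs(first_dir)
        os.makedirs(second_dir)
        # File names borrowed from the run shown in the notebook (a subset).
        for stem in ("20288", "9030"):
            with open(os.path.join(first_dir, stem + ".txt"), "w"):
                pass
        for stem in ("17078", "8514"):
            with open(os.path.join(second_dir, stem + ".txt"), "w"):
                pass
        return expected_third_files(first_dir, second_dir, "third_directory")


if __name__ == "__main__":
    for path in demo():
        print(path)
```

With two files in each directory the sketch lists four targets, in the same way that the four and three files in the notebook run yield the twelve `.tsv` files shown in `third_directory/`.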