Nate Coraor natefoo
2020-09-22 17:50:57,516 ERROR [pulsar.managers.stateful][[manager=jetstream_tacc]-[action=preprocess]-[job=1306569]] Failed job preprocessing for job 1306569:
Traceback (most recent call last):
  File "/srv/pulsar/test/venv/lib64/python3.6/site-packages/pulsar/managers/stateful.py", line 132, in _handling_of_preprocessing_state
    **launch_kwds
  File "/srv/pulsar/test/venv/lib64/python3.6/site-packages/pulsar/managers/queued_drmaa.py", line 21, in launch
    setup_params=setup_params,
  File "/srv/pulsar/test/venv/lib64/python3.6/site-packages/pulsar/managers/base/base_drmaa.py", line 65, in _build_template_attributes
    setup_params=setup_params
  File "/srv/pulsar/test/venv/lib64/python3.6/site-packages/pulsar/managers/base/directory.py", line 120, in _setup_job_file
    command_line = self._expand_command_line(command_line, dependencies_description, job_directory=self.job_directory(job_id).job_directory)
https://zenodo.org/record/3928735/files/AM1.fastq?download=1
https://zenodo.org/record/3928735/files/AM2.fastq?download=1
https://zenodo.org/record/3928735/files/AM3.fastq?download=1
https://zenodo.org/record/3928735/files/EM1.fastq?download=1
https://zenodo.org/record/3928735/files/EM3.fastq?download=1
https://zenodo.org/record/3928735/files/EM4.fastq?download=1
https://zenodo.org/record/3928735/files/EM5.fastq?download=1
if options.requeue_job:
    from galaxy.util import directory_hash_id
    job_id = options.requeue_job
    job = model.context.current.query(model.Job).enable_eagerloads(False).get(job_id)
    job_dir_hash = os.path.join(*directory_hash_id(job_id))
    job_working_dir = os.path.join(JOB_WORKING_DIR, job_dir_hash, str(job_id))
    print('Attempting to requeue job %s/%s' % (job_id, job.job_runner_external_id))
    if os.path.exists(job_working_dir):
        cleared_dir = os.path.join(JOB_WORKING_DIR, '_cleared_contents', job_dir_hash, str(job_id))
        if not os.path.exists(cleared_dir):
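
The snippet above builds a job's working directory by joining a base directory, a hashed prefix derived from the job ID, and the job ID itself. The following is a minimal standalone sketch of that path construction. It assumes a hashing scheme that buckets IDs into three-digit directory levels; `toy_directory_hash`, `job_working_dir`, and the `/jobs` base path are hypothetical names used for illustration and are not Galaxy's own `directory_hash_id` or configuration.

```python
import posixpath


def toy_directory_hash(job_id):
    """Hypothetical stand-in for directory_hash_id; not Galaxy's own code."""
    digits = str(job_id)
    if len(digits) < 4:
        # Assumption: IDs below 1000 all share a single bucket.
        return ["000"]
    # Left-pad to a multiple of three digits, then drop the last three
    # so that each leaf directory groups 1000 consecutive IDs.
    width = -(-len(digits) // 3) * 3
    padded = digits.zfill(width)[:-3]
    return [padded[i:i + 3] for i in range(0, len(padded), 3)]


def job_working_dir(base, job_id):
    """Join base directory, hashed prefix, and job ID into one path."""
    return posixpath.join(base, *toy_directory_hash(job_id), str(job_id))


if __name__ == "__main__":
    print(job_working_dir("/jobs", 1234567))
```

Spreading jobs across nested prefix directories like this keeps any single directory from accumulating an unmanageable number of entries.
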
natefoo / directory.py.diff
Created April 14, 2020 18:34
CVMFS Parrot Pulsar hack
diff --git a/pulsar/managers/base/directory.py b/pulsar/managers/base/directory.py
index 4a463dc..e527501 100644
--- a/pulsar/managers/base/directory.py
+++ b/pulsar/managers/base/directory.py
@@ -19,6 +19,14 @@ JOB_FILE_TOOL_ID = "tool_id"
JOB_FILE_TOOL_VERSION = "tool_version"
JOB_FILE_CANCELLED = "cancelled"
JOB_FILE_COMMAND_LINE = "command_line"
+JOB_WRAPPER_TEMPLATE = """#!/bin/sh
+PARROT_CVMFS_REPO="data.galaxyproject.org:url=http://cvmfs1-tacc0.galaxyproject.org/cvmfs/data.galaxyproject.org/,pubkey=$HOME/data.pub \
abyss-pe
align_families
alleyoop
bg_diamond
bg_diamond_makedb
bio_hansel
bowtie2
bowtie_color_wrapper
bowtie_wrapper
busco
import logging
from galaxy.jobs.mapper import JobMappingException
log = logging.getLogger(__name__)
DESTINATION_IDS = {
    1: 'slurm',
    2: 'slurm-2c'
}
FAILURE_MESSAGE = 'This tool could not be run because of a misconfiguration in the Galaxy job running system, please report this error'
natefoo / tutorial.md
Last active March 3, 2020 13:37
Installing Data Managers @ GAT 2020 Barcelona

Reference Genomes - Exercise

Adapted from Oslo training

Learning Outcomes

By the end of this tutorial, you should:

  1. Have an understanding of the way in which Galaxy stores and uses reference data
  2. Be able to download and use data managers to add a reference genome and its pre-calculated indices into the Galaxy reference data system
natefoo / 00-README.md
Last active February 27, 2020 20:08
uWSGI Zerg Mode + Mules

Background

Two commonly used [Galaxy][galaxy] server configurations are [uWSGI Zerg Mode][uwsgi-zerg-mode] and the use of [uWSGI Mules][uwsgi-mules] as [Galaxy job handlers][galaxy-scaling]. These features are not easily compatible because Galaxy job handlers rely heavily on having unique server names, and handlers' server names must be persistent across restarts. Because zerg mode results in running two Galaxy servers simultaneously (however briefly), using mules with zerg mode would necessarily mean running mules with overlapping server names.

Solution

In a typical Galaxy zerg mode setup, the newly started zergling (B) terminates the old zergling (A) once B is ready to serve requests. Zergling B then continues to serve requests until another zergling (C) is started and terminates B.
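
To make the naming conflict concrete, here is a toy model of a single zerg handoff. It is only an illustration of the overlap described above, not uWSGI or Galaxy code: the `Zergling` class, the `handoff` function, and the handler names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Zergling:
    """Toy model of one server instance and the handler names it runs."""
    name: str
    handler_names: list = field(default_factory=list)


def handoff(old, new):
    """Model a restart: both instances run briefly, then `old` is retired.

    Returns the surviving instance and the handler names that were
    duplicated while both instances were alive.
    """
    clashes = sorted(set(old.handler_names) & set(new.handler_names))
    return new, clashes


if __name__ == "__main__":
    names = ["handler_1", "handler_2"]  # hypothetical server names
    a = Zergling("A", list(names))

    # A new zergling with the same handler names clashes during the overlap.
    _, clashes = handoff(a, Zergling("B", list(names)))
    print("same names:", clashes)

    # A new zergling that runs no handlers has nothing to clash with.
    _, clashes = handoff(a, Zergling("B"))
    print("no handlers:", clashes)
```

In this model the clash exists only during the window in which both instances are alive, which is exactly the window zerg mode creates on every restart.
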

It is possible to get both zerg mode and mules working together by configuring zergling B to start without mules and performing a double zerg dance on each restart:

#!/usr/bin/env python
import argparse
import sys
import boto3
from jinja2 import Environment
from s3pypi.exceptions import S3PyPiError
from s3pypi.package import Package