@ionox0
Last active August 14, 2019 15:48
Debugging a failed job in the Toil jobstore

Note: The "job" that we're dealing with in this case is actually an instance of the Toil JobGraph class. Here is the JobGraph class hierarchy. You can see that a JobGraph is similar to a Job, as it is one of its descendants.

Here is the Job class, for completeness.

Let's try an example

From the Toil log files, you will be able to find the Toil job ID for the failed job. Here we show a job being submitted for the third time after two failed attempts. We can see the CWL file that defines the job, along with its resource requirements and the bsub command that is issued.

1.

There is then logging to indicate the failure of the job with job ID 9ntS9h.

DEBUG:toil.batchSystems.abstractGridEngineBatchSystem:Issued the job command: /home/johnsoni/virtualenvs/pipeline_1.1.14/bin/_toil_worker file:///home/johnsoni/pipeline_1.1.14/ACCESS-Pipeline/cwl_tools/trimgalore/trimgalore.cwl file:/home/johnsoni/juno_ACCESS/5500-FZ/5500-FZ-1.1.14/tmp/jobstore-2888ebd4-bd4a-11e9-a01d-ec0d9a88a15a u/v/job9ntS9h with job id: 45
INFO:toil.leader:Issued job 'file:///home/johnsoni/pipeline_1.1.14/ACCESS-Pipeline/cwl_tools/trimgalore/trimgalore.cwl' u/v/job9ntS9h with job batch system ID: 45 and cores: 2, disk: 20.5 G, and memory: 15.6 G

...

DEBUG:toil.batchSystems.abstractGridEngineBatchSystem:Running ['bsub', '-cwd', '.', '-J', 'toil_job_45', '-R', 'select[mem > 7] rusage[mem=7]', '-M', '7', '-n', '2', '-W', '1200', '-S', '1', '-app', 'anyOS', '-R', 'select[type==CentOS7]', '/home/johnsoni/virtualenvs/pipeline_1.1.14/bin/_toil_worker file:///home/johnsoni/pipeline_1.1.14/ACCESS-Pipeline/cwl_tools/trimgalore/trimgalore.cwl file:/home/johnsoni/juno_ACCESS/5500-FZ/5500-FZ-1.1.14/tmp/jobstore-2888ebd4-bd4a-11e9-a01d-ec0d9a88a15a u/v/job9ntS9h']

...

DEBUG:toil.batchSystems.abstractGridEngineBatchSystem:UpdatedJobsQueue Item: (45, 1)
WARNING:toil.leader:Job failed with exit value 1: 'file:///home/johnsoni/pipeline_1.1.14/ACCESS-Pipeline/cwl_tools/trimgalore/trimgalore.cwl' u/v/job9ntS9h
DEBUG:toil.leader:Job 'file:///home/johnsoni/pipeline_1.1.14/ACCESS-Pipeline/cwl_tools/trimgalore/trimgalore.cwl' u/v/job9ntS9h continues to exist (i.e. has more to do)
WARNING:toil.leader:No log file is present, despite job failing: 'file:///home/johnsoni/pipeline_1.1.14/ACCESS-Pipeline/cwl_tools/trimgalore/trimgalore.cwl' u/v/job9ntS9h
WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'file:///home/johnsoni/pipeline_1.1.14/ACCESS-Pipeline/cwl_tools/trimgalore/trimgalore.cwl' u/v/job9ntS9h with ID u/v/job9ntS9h to 0
DEBUG:toil.leader:Added job: 'file:///home/johnsoni/pipeline_1.1.14/ACCESS-Pipeline/cwl_tools/trimgalore/trimgalore.cwl' u/v/job9ntS9h to active jobs
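
When a log is long, the failure lines can be pulled out with a regular expression instead of by eye. The sketch below assumes the `WARNING:toil.leader:Job failed with exit value ...` format shown above; other Toil versions may word the message differently. `parse_failed_job` is a hypothetical helper name, not part of Toil.

```python
import re

# Pattern based on the "Job failed with exit value" log line shown above.
# Assumption: the job name is single-quoted and the job store ID follows it.
FAILED_JOB = re.compile(r"Job failed with exit value (-?\d+): '([^']*)' (\S+)")


def parse_failed_job(line):
    """Return exit value, job name and job store ID from a failure line, or None."""
    match = FAILED_JOB.search(line)
    if match is None:
        return None
    exit_value, job_name, job_store_id = match.groups()
    return {
        "exit_value": int(exit_value),
        "job_name": job_name,
        "job_store_id": job_store_id,
    }
```

Applied to each line of the leader log, this yields the job store ID (here `u/v/job9ntS9h`), whose trailing characters are what the next step searches for.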

2.

Using the job ID, use the find command to locate the temp dir for the job that failed.

The job file is a Python pickle that holds the metadata for the job (job ID, job name, retry count, resource requirements).

(pipeline_1.1.14)  accessbot@juno /home/johnsoni/juno_ACCESS/5500-FZ/5500-FZ-1.1.14/tmp > find . | grep 9ntS9h
./jobstore-2888ebd4-bd4a-11e9-a01d-ec0d9a88a15a/tmp/u/v/job9ntS9h
./jobstore-2888ebd4-bd4a-11e9-a01d-ec0d9a88a15a/tmp/u/v/job9ntS9h/g
./jobstore-2888ebd4-bd4a-11e9-a01d-ec0d9a88a15a/tmp/u/v/job9ntS9h/g/tmpgSv_cO.tmp
./jobstore-2888ebd4-bd4a-11e9-a01d-ec0d9a88a15a/tmp/u/v/job9ntS9h/job
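
The same search can be done from Python, which helps when scripting over several failed jobs. This is a minimal sketch that mimics `find . | grep <id>` by matching the ID against each full path; `find_job_paths` is a hypothetical helper name, not something Toil provides.

```python
import os


def find_job_paths(root, job_id):
    """Return every path under root that contains job_id, sorted."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Check directories and files alike, as `find` would list both.
        for name in dirnames + filenames:
            full_path = os.path.join(dirpath, name)
            if job_id in full_path:
                matches.append(full_path)
    return sorted(matches)
```

Calling `find_job_paths('.', '9ntS9h')` from the run's `tmp` directory should list the same entries as the shell command above, including the `job` file that is unpickled in the following steps.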

3.

Use an environment (conda, virtualenv, etc.) that has Toil installed, so that you can import Toil and get the classes that are required for unpickling (deserializing) the failed job object.

accessbot@juno /home/johnsoni/juno_ACCESS/5500-FZ/5500-FZ-1.1.14/tmp > source /home/johnsoni/virtualenvs/pipeline_1.1.14/bin/activate

4.

We can now use the python interpreter to unpickle and inspect the job object. Here we see the job's name, retry count, and stack of subsequent jobs to run:

>>> import pickle
>>> import toil

>>> j = pickle.load(open('./jobstore-2888ebd4-bd4a-11e9-a01d-ec0d9a88a15a/tmp/u/v/job9ntS9h/job', 'rb'))

>>> dir(j)
['__class__', '__delattr__', '__dict__', '__doc__', '__eq__', '__format__', '__getattribute__', '__hash__', '__init__', '__long__', '__module__', '__native__', '__ne__', '__new__', '__nonzero__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__unicode__', '__weakref__', '_config', '_cores', '_disk', '_memory', '_parseResource', '_preemptable', '_requirements', 'chainedJobs', 'checkpoint', 'checkpointFilesToDelete', 'command', 'cores', 'disk', 'displayName', 'errorJobStoreID', 'filesToDelete', 'fromJob', 'fromJobGraph', 'fromJobNode', 'getLogFileHandle', 'jobName', 'jobStoreID', 'logJobStoreFileID', 'memory', 'next', 'predecessorNumber', 'predecessorsFinished', 'preemptable', 'remainingRetryCount', 'restartCheckpoint', 'services', 'setupJobAfterFailure', 'stack', 'startJobStoreID', 'terminateJobStoreID', 'unitName']

>>> j.jobName
'file:///home/johnsoni/pipeline_1.1.14/ACCESS-Pipeline/cwl_tools/trimgalore/trimgalore.cwl'

>>> j.remainingRetryCount
0

>>> j.stack
[
[], 
[
    JobNode( **{
        'jobStoreID': 'Q/S/job0aV3AM',
        '_config': None,
        'displayName': 'JobGraph',
        'predecessorNumber': 2,
        'unitName': None,
        '_preemptable': False,
        'jobName': 'file:///home/johnsoni/pipeline_1.1.14/ACCESS-Pipeline/cwl_tools/bwa-mem/bwa-mem.cwl',
        '_disk': 22045261824,
        '_cores': 4,
        'command': None,
        '_memory': 31457280000
    })
]
]
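
For repeated use, the interactive steps above can be wrapped in a small function. This is a sketch under a few assumptions: `summarize_pickle` is a hypothetical name, the default field names are taken from the `dir(j)` output above, and the environment must be able to import the classes stored in the pickle (Toil, in this case). Note that `pickle.load` can execute arbitrary code, so only use it on job store files you trust.

```python
import pickle


def summarize_pickle(path, fields=("jobName", "jobStoreID", "remainingRetryCount")):
    """Unpickle the object at path and return the requested attributes as a dict."""
    # The context manager closes the file even if unpickling fails.
    with open(path, "rb") as handle:
        obj = pickle.load(handle)
    # Missing attributes are reported as None instead of raising AttributeError.
    return {name: getattr(obj, name, None) for name in fields}
```

Passing the path to the `job` file found in step 2 returns the job name, ID and remaining retry count in a single call.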

5.

Here is the result of printing the job (after formatting):

JobGraph( **{
	'predecessorNumber': 1
	'startJobStoreID': None
	'_preemptable': False
	'errorJobStoreID': None
	'remainingRetryCount': 0
	'filesToDelete': []
	'checkpointFilesToDelete': None
	'checkpoint': None
	'_cores': 2
	'logJobStoreFileID': 'u/v/job9ntS9h/g/tmpgSv_cO.tmp'
	'jobStoreID': 'u/v/job9ntS9h'
	'unitName': None
	'chainedJobs': ["'file:///home/johnsoni/pipeline_1.1.14/ACCESS-Pipeline/cwl_tools/trimgalore/trimgalore.cwl' u/v/job9ntS9h"]
	'services': []
	'predecessorsFinished': set([])
	'stack': [
		[]
		[
			JobNode( **{
				'jobStoreID': 'Q/S/job0aV3AM'
				'_config': None
				'displayName': 'JobGraph'
				'predecessorNumber': 2
				'unitName': None
				'_preemptable': False
				'jobName': 'file:///home/johnsoni/pipeline_1.1.14/ACCESS-Pipeline/cwl_tools/bwa-mem/bwa-mem.cwl'
				'_disk': 22045261824
				'_cores': 4
				'command': None
				'_memory': 31457280000
			})
		]
	]
	'_config': None
	'displayName': 'JobGraph'
	'jobName': 'file:///home/johnsoni/pipeline_1.1.14/ACCESS-Pipeline/cwl_tools/trimgalore/trimgalore.cwl'
	'_disk': 22045261824
	'command': '_toil q/P/jobsdmYZF/g/tmpNWSsZJ-_serialiseJob-stream /home/johnsoni/virtualenvs/pipeline_1.1.14/lib/python2.7/site-packages toil.cwl.cwltoil True'
	'_memory': 16777216000
	'terminateJobStoreID': None
})
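
The `_memory` and `_disk` values appear to be byte counts: dividing them by 2**30 gives figures that match the `memory: 15.6 G` and `disk: 20.5 G` reported in the leader log in step 1. A small sketch of that conversion follows; `bytes_to_gib` is a hypothetical helper, and the interpretation of the units is inferred from the log rather than from Toil documentation.

```python
def bytes_to_gib(num_bytes):
    """Convert a byte count to gibibytes (1 GiB = 2**30 bytes)."""
    return num_bytes / float(2 ** 30)


# Values copied from the printed JobGraph above.
for label, value in [("_memory", 16777216000), ("_disk", 22045261824)]:
    print("%s: %.3f GiB" % (label, bytes_to_gib(value)))
```

Comparing these numbers with the `bsub` resource flags is a quick way to check whether a job was submitted with the resources the workflow asked for.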

6.

Let's take a closer look at the command key, which contains a reference to a _serialiseJob-stream file that we will refer to as the "job stream".

Unpickling that file gives us some additional information about the CWLJob that is associated with this Toil job:

>>> js = pickle.load(open('./jobstore-2888ebd4-bd4a-11e9-a01d-ec0d9a88a15a/tmp/q/P/jobsdmYZF/g/tmpNWSsZJ-_serialiseJob-stream', 'rb'))

>>> js
<toil.cwl.cwltoil.CWLJob object at 0x7fc8e728f890>
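
The path used above can be derived from the `command` string rather than typed by hand. The sketch below assumes, as in this example, that the second whitespace-separated token of `command` is a file ID relative to the job store's `tmp` directory. That layout is inferred from the paths in this walkthrough and may not hold for other job store types or Toil versions. `job_stream_path` is a hypothetical helper name.

```python
import os


def job_stream_path(jobstore_dir, command):
    """Build the on-disk path of the job stream named in a JobGraph command."""
    tokens = command.split()
    # tokens[0] is '_toil'; tokens[1] is assumed to be the job stream file ID.
    file_id = tokens[1]
    return os.path.join(jobstore_dir, "tmp", file_id)
```

With `jobstore_dir` set to `./jobstore-2888ebd4-bd4a-11e9-a01d-ec0d9a88a15a` and `command` set to `j.command`, this returns the path that was passed to `pickle.load` above.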

Here are its fields:

>>> dir(js)
['Runner', 'Service', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__long__', '__module__', '__native__', '__new__', '__nonzero__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__unicode__', '__weakref__', '_addPredecessor', '_checkJobGraphAcylicDFS', '_children', '_config', '_cores', '_createEmptyJobGraphForJob', '_dfs', '_directPredecessors', '_disk', '_executor', '_fileStore', '_followOns', '_fulfillPromises', '_getImpliedEdges', '_isLeafVertex', '_jobName', '_loadJob', '_loadUserModule', '_makeJobGraphs', '_makeJobGraphs2', '_memory', '_parseResource', '_preemptable', '_promiseJobStore', '_requirements', '_run', '_runner', '_rvs', '_serialiseExistingJob', '_serialiseFirstJob', '_serialiseJob', '_serialiseJobGraph', '_serialiseServices', '_services', '_succeeded', '_tempDir', '_unpickle', 'addChild', 'addChildFn', 'addChildJobFn', 'addFollowOn', 'addFollowOnFn', 'addFollowOnJobFn', 'addService', 'checkJobGraphAcylic', 'checkJobGraphConnected', 'checkJobGraphForDeadlocks', 'checkNewCheckpointsAreLeafVertices', 'checkpoint', 'cores', 'cwljob', 'cwltool', 'defer', 'disk', 'displayName', 'encapsulate', 'getRootJobs', 'getTopologicalOrderingOfJobs', 'getUserScript', 'hasChild', 'hasFollowOn', 'jobName', 'log', 'memory', 'next', 'openTempDirs', 'preemptable', 'prepareForPromiseRegistration', 'registerPromise', 'run', 'runtime_context', 'rv', 'step_inputs', 'tempDir', 'unitName', 'userModule', 'workdir', 'wrapFn', 'wrapJobFn']

Here is the cwljob field from the CWLJob object associated with the original JobGraph instance (after formatting):

>>> js.cwljob

Result (sorry for the duplicated entries; I am still working on a better understanding of how this works):

{  
   u'stringency':<toil.cwl.cwltoil.StepValueFrom object at 0x7fc8e728f550>,
   u'suppress_warn':<toil.cwl.cwltoil.StepValueFrom object at 0x7fc8e71694d0>,
   
   
   u'adapter':(u'adapter',
   {  
      u'run_tools':ordereddict(      [  
         ('perl_5',
         '/opt/common/CentOS_6/perl/perl-5.20.2/bin/perl'), ('java_7',
         '/opt/common/CentOS_6/java/jdk1.7.0_75/bin/java'), ('java_8',
         '/opt/common/CentOS_6/java/jdk1.8.0_31/bin/java'), ('marianas_path',
         '/home/johnsoni/vendor_tools/Marianas-1.8.0.jar'), ('bioinfo_utils',
         ordereddict(         [  
            ('class',
            'File'),
            ('location',
            'toilfs:9/W/tmphdNHs4-x-BioinfoUtils-1.0.0.jar'),
            ('size',
            372764'),
            ('basename',
            'BioinfoUtils-1.0.0.jar'),
            ('nameroot',
            'BioinfoUtils-1.0.0'),
            ('nameext',
            '.jar')
         ])), ('trimgalore_path',
         '/opt/common/CentOS_6/trim_galore/Trim_Galore_v0.2.5/trim_galore'), ('bwa_path',
         '/opt/common/CentOS_6/bwa/bwa-0.7.5a/bwa'), ('arrg_path',
         '/home/johnsoni/vendor_tools/AddOrReplaceReadGroups-1.96.jar'), ('picard_path',
         '/home/johnsoni/vendor_tools/picard-2.8.1.jar'), ('gatk_path',
         '/opt/common/CentOS_6/gatk/GenomeAnalysisTK-3.3-0/GenomeAnalysisTK.jar'), ('abra_path',
         '/opt/common/CentOS_6/abra2/abra2-2.17/abra2-2.17.jar'), ('fx_path',
         '/opt/common/CentOS_6/picard/picard-tools-1.96/FixMateInformation.jar'), ('waltz_path',
         '/home/johnsoni/vendor_tools/Waltz-2.0.jar')
      ]),
      u'add_rg_PL':'Illumina',
      u'add_rg_ID':'C-2VAE39-L001-d',
      u'adapter':'GATCGGAAGAGC',
      u'reference_fasta_fai':'/ifs/depot/resources/dmp/data/pubdata/hg-fasta/VERSIONS/hg19/Homo_sapiens_assembly19.fasta.fai',
      u'add_rg_SM':'C-2VAE39-L001-d',
      u'adapter2':'AGATCGGAAGAGC',
      u'reference_fasta':'/ifs/depot/resources/dmp/data/pubdata/hg-fasta/VERSIONS/hg19/Homo_sapiens_assembly19.fasta',
      u'add_rg_CN':'BergerLab_MSKCC',
      u'fastq2':{  
         u'checksum':u'sha1$68e50852edbc69c061ac6928431d8497340ae4b3',
         u'basename':u'C-2VAE39-L001-d_R2.fastq.gz',
         u'nameext':u'.gz',
         u'nameroot':u'C-2VAE39-L001-d_R2.fastq',
         'http://commonwl.org/cwltool#generation':0,
         u'location':'toilfs:y/y/tmp7NvJtp-writeGlobalFileWrapper-IDH_PT9_IGO_05500_FZ_9_S47_R2_001.fastq.gz',
         u'class':u'File',
         u'size':4981617497
      },
      u'trimgalore__params':ordereddict([
         ('length', 25), ('paired', True), ('gzip', True), ('quality', 1), ('stringency', 3), ('suppress_warn', False)
      ]),
      u'mark_duplicates__params':ordereddict([
         ('create_index', True), ('assume_sorted', True), ('compression_level', 0), ('validation_stringency', 'LENIENT'), ('duplicate_scoring_strategy', 'SUM_OF_BASE_QUALITIES')
      ]),
      
      
      u'add_rg_PU':'bc435-bc435',
      
      
      u'fastq1':{  
         u'checksum':u'sha1$7630e1e7bb50c8bdcb0445bc9d57e1c10a34f4c0',
         u'basename':u'C-2VAE39-L001-d_R1.fastq.gz',
         u'nameext':u'.gz',
         u'nameroot':u'C-2VAE39-L001-d_R1.fastq',
         'http://commonwl.org/cwltool#generation':0,
         u'location':'toilfs:r/G/tmpVS_mAB-writeGlobalFileWrapper-IDH_PT9_IGO_05500_FZ_9_S47_R1_001.fastq.gz',
         u'class':u'File',
         u'size':4367341966
      },
      u'add_or_replace_read_groups__params':ordereddict([
         ('add_rg_PL', 'Illumina'), ('add_rg_CN', 'BergerLab_MSKCC'), ('sort_order', 'coordinate'), ('validation_stringency', 'LENIENT'), ('compression_level', 0), ('create_index', True)
      ]),
      u'add_rg_LB':1
   }),
   
   
   u'length':<toil.cwl.cwltoil.StepValueFrom object at 0x7fc8e70fd310>,
   u'perl':<toil.cwl.cwltoil.StepValueFrom object at 0x7fc8e70fd390>,
   u'paired':<toil.cwl.cwltoil.StepValueFrom object at 0x7fc8e7157090>,
   
   
   u'run_tools':(u'run_tools',
   {  
      u'run_tools':ordereddict(      [  
         ('perl_5',
         '/opt/common/CentOS_6/perl/perl-5.20.2/bin/perl'), ('java_7',
         '/opt/common/CentOS_6/java/jdk1.7.0_75/bin/java'), ('java_8',
         '/opt/common/CentOS_6/java/jdk1.8.0_31/bin/java'), ('marianas_path',
         '/home/johnsoni/vendor_tools/Marianas-1.8.0.jar'), ('bioinfo_utils',
         ordereddict(         [  
            ('class',
            'File'),
            ('location',
            'toilfs:9/W/tmphdNHs4-x-BioinfoUtils-1.0.0.jar'),
            ('size',
            372764'),
            ('basename',
            'BioinfoUtils-1.0.0.jar'),
            ('nameroot',
            'BioinfoUtils-1.0.0'),
            ('nameext',
            '.jar')
         ])), ('trimgalore_path',
         '/opt/common/CentOS_6/trim_galore/Trim_Galore_v0.2.5/trim_galore'), ('bwa_path',
         '/opt/common/CentOS_6/bwa/bwa-0.7.5a/bwa'), ('arrg_path',
         '/home/johnsoni/vendor_tools/AddOrReplaceReadGroups-1.96.jar'), ('picard_path',
         '/home/johnsoni/vendor_tools/picard-2.8.1.jar'), ('gatk_path',
         '/opt/common/CentOS_6/gatk/GenomeAnalysisTK-3.3-0/GenomeAnalysisTK.jar'), ('abra_path',
         '/opt/common/CentOS_6/abra2/abra2-2.17/abra2-2.17.jar'), ('fx_path',
         '/opt/common/CentOS_6/picard/picard-tools-1.96/FixMateInformation.jar'), ('waltz_path',
         '/home/johnsoni/vendor_tools/Waltz-2.0.jar')
      ]),
      
      
      u'add_rg_PL':'Illumina',
      u'add_rg_ID':'C-2VAE39-L001-d',
      u'adapter':'GATCGGAAGAGC',
      u'reference_fasta_fai':'/ifs/depot/resources/dmp/data/pubdata/hg-fasta/VERSIONS/hg19/Homo_sapiens_assembly19.fasta.fai',
      u'add_rg_SM':'C-2VAE39-L001-d',
      u'adapter2':'AGATCGGAAGAGC',
      u'reference_fasta':'/ifs/depot/resources/dmp/data/pubdata/hg-fasta/VERSIONS/hg19/Homo_sapiens_assembly19.fasta',
      u'add_rg_CN':'BergerLab_MSKCC',
      u'fastq2':{  
         u'checksum':u'sha1$68e50852edbc69c061ac6928431d8497340ae4b3',
         u'basename':u'C-2VAE39-L001-d_R2.fastq.gz',
         u'nameext':u'.gz',
         u'nameroot':u'C-2VAE39-L001-d_R2.fastq',
         'http://commonwl.org/cwltool#generation':0,
         u'location':'toilfs:y/y/tmp7NvJtp-writeGlobalFileWrapper-IDH_PT9_IGO_05500_FZ_9_S47_R2_001.fastq.gz',
         u'class':u'File',
         u'size':4981617497
      },
      u'trimgalore__params':ordereddict([
         ('length', 25), ('paired', True), ('gzip', True), ('quality', 1), ('stringency', 3), ('suppress_warn', False)
      ]),
      u'mark_duplicates__params':ordereddict([
         ('create_index', True), ('assume_sorted', True), ('compression_level', 0), ('validation_stringency', 'LENIENT'), ('duplicate_scoring_strategy', 'SUM_OF_BASE_QUALITIES')
      ]),
      u'add_rg_PU':'bc435-bc435',
      u'fastq1':{  
         u'checksum':u'sha1$7630e1e7bb50c8bdcb0445bc9d57e1c10a34f4c0',
         u'basename':u'C-2VAE39-L001-d_R1.fastq.gz',
         u'nameext':u'.gz',
         u'nameroot':u'C-2VAE39-L001-d_R1.fastq',
         'http://commonwl.org/cwltool#generation':0,
         u'location':'toilfs:r/G/tmpVS_mAB-writeGlobalFileWrapper-IDH_PT9_IGO_05500_FZ_9_S47_R1_001.fastq.gz',
         u'class':u'File',
         u'size':4367341966
      },
      u'add_or_replace_read_groups__params':ordereddict([
         ('add_rg_PL', 'Illumina'), ('add_rg_CN', 'BergerLab_MSKCC'), ('sort_order', 'coordinate'), ('validation_stringency', 'LENIENT'), ('compression_level', 0), ('create_index', True)
      ]),
      u'add_rg_LB':1
   }),
   
   
   u'params':(u'trimgalore__params',
   {  
      u'run_tools':ordereddict(      [  
         ('perl_5',
         '/opt/common/CentOS_6/perl/perl-5.20.2/bin/perl'), ('java_7',
         '/opt/common/CentOS_6/java/jdk1.7.0_75/bin/java'), ('java_8',
         '/opt/common/CentOS_6/java/jdk1.8.0_31/bin/java'), ('marianas_path',
         '/home/johnsoni/vendor_tools/Marianas-1.8.0.jar'), ('bioinfo_utils',
         ordereddict(         [  
            ('class',
            'File'),
            ('location',
            'toilfs:9/W/tmphdNHs4-x-BioinfoUtils-1.0.0.jar'),
            ('size',
            372764'),
            ('basename',
            'BioinfoUtils-1.0.0.jar'),
            ('nameroot',
            'BioinfoUtils-1.0.0'),
            ('nameext',
            '.jar')
         ])), ('trimgalore_path',
         '/opt/common/CentOS_6/trim_galore/Trim_Galore_v0.2.5/trim_galore'), ('bwa_path',
         '/opt/common/CentOS_6/bwa/bwa-0.7.5a/bwa'), ('arrg_path',
         '/home/johnsoni/vendor_tools/AddOrReplaceReadGroups-1.96.jar'), ('picard_path',
         '/home/johnsoni/vendor_tools/picard-2.8.1.jar'), ('gatk_path',
         '/opt/common/CentOS_6/gatk/GenomeAnalysisTK-3.3-0/GenomeAnalysisTK.jar'), ('abra_path',
         '/opt/common/CentOS_6/abra2/abra2-2.17/abra2-2.17.jar'), ('fx_path',
         '/opt/common/CentOS_6/picard/picard-tools-1.96/FixMateInformation.jar'), ('waltz_path',
         '/home/johnsoni/vendor_tools/Waltz-2.0.jar')
      ]),
      u'add_rg_PL':'Illumina',
      u'add_rg_ID':'C-2VAE39-L001-d',
      u'adapter':'GATCGGAAGAGC',
      u'reference_fasta_fai':'/ifs/depot/resources/dmp/data/pubdata/hg-fasta/VERSIONS/hg19/Homo_sapiens_assembly19.fasta.fai',
      u'add_rg_SM':'C-2VAE39-L001-d',
      u'adapter2':'AGATCGGAAGAGC',
      u'reference_fasta':'/ifs/depot/resources/dmp/data/pubdata/hg-fasta/VERSIONS/hg19/Homo_sapiens_assembly19.fasta',
      u'add_rg_CN':'BergerLab_MSKCC',
      u'fastq2':{  
         u'checksum':u'sha1$68e50852edbc69c061ac6928431d8497340ae4b3',
         u'basename':u'C-2VAE39-L001-d_R2.fastq.gz',
         u'nameext':u'.gz',
         u'nameroot':u'C-2VAE39-L001-d_R2.fastq',
         'http://commonwl.org/cwltool#generation':0,
         u'location':'toilfs:y/y/tmp7NvJtp-writeGlobalFileWrapper-IDH_PT9_IGO_05500_FZ_9_S47_R2_001.fastq.gz',
         u'class':u'File',
         u'size':4981617497
      },
      u'trimgalore__params':ordereddict([
         ('length', 25), ('paired', True), ('gzip', True), ('quality', 1), ('stringency', 3), ('suppress_warn', False)
      ]),
      u'mark_duplicates__params':ordereddict([
         ('create_index', True), ('assume_sorted', True), ('compression_level', 0), ('validation_stringency', 'LENIENT'), ('duplicate_scoring_strategy', 'SUM_OF_BASE_QUALITIES')
      ]),
      u'add_rg_PU':'bc435-bc435',
      u'fastq1':{  
         u'checksum':u'sha1$7630e1e7bb50c8bdcb0445bc9d57e1c10a34f4c0',
         u'basename':u'C-2VAE39-L001-d_R1.fastq.gz',
         u'nameext':u'.gz',
         u'nameroot':u'C-2VAE39-L001-d_R1.fastq',
         'http://commonwl.org/cwltool#generation':0,
         u'location':'toilfs:r/G/tmpVS_mAB-writeGlobalFileWrapper-IDH_PT9_IGO_05500_FZ_9_S47_R1_001.fastq.gz',
         u'class':u'File',
         u'size':4367341966
      },
      u'add_or_replace_read_groups__params':ordereddict([
         ('add_rg_PL', 'Illumina'), ('add_rg_CN', 'BergerLab_MSKCC'), ('sort_order', 'coordinate'), ('validation_stringency', 'LENIENT'), ('compression_level', 0), ('create_index', True)
      ]),
      u'add_rg_LB':1
   }),
   
   
   u'trimgalore':<toil.cwl.cwltoil.StepValueFrom object at 0x7fc8e7169390>,
   u'gzip':<toil.cwl.cwltoil.StepValueFrom object at 0x7fc8e71693d0>,
   
   
   u'fastq2':(u'fastq2',
   {  
      u'run_tools':ordereddict(      [  
         ('perl_5',
         '/opt/common/CentOS_6/perl/perl-5.20.2/bin/perl'), ('java_7',
         '/opt/common/CentOS_6/java/jdk1.7.0_75/bin/java'), ('java_8',
         '/opt/common/CentOS_6/java/jdk1.8.0_31/bin/java'), ('marianas_path',
         '/home/johnsoni/vendor_tools/Marianas-1.8.0.jar'), ('bioinfo_utils',
         ordereddict(         [  
            ('class',
            'File'),
            ('location',
            'toilfs:9/W/tmphdNHs4-x-BioinfoUtils-1.0.0.jar'),
            ('size',
            372764'),
            ('basename',
            'BioinfoUtils-1.0.0.jar'),
            ('nameroot',
            'BioinfoUtils-1.0.0'),
            ('nameext',
            '.jar')
         ])), ('trimgalore_path',
         '/opt/common/CentOS_6/trim_galore/Trim_Galore_v0.2.5/trim_galore'), ('bwa_path',
         '/opt/common/CentOS_6/bwa/bwa-0.7.5a/bwa'), ('arrg_path',
         '/home/johnsoni/vendor_tools/AddOrReplaceReadGroups-1.96.jar'), ('picard_path',
         '/home/johnsoni/vendor_tools/picard-2.8.1.jar'), ('gatk_path',
         '/opt/common/CentOS_6/gatk/GenomeAnalysisTK-3.3-0/GenomeAnalysisTK.jar'), ('abra_path',
         '/opt/common/CentOS_6/abra2/abra2-2.17/abra2-2.17.jar'), ('fx_path',
         '/opt/common/CentOS_6/picard/picard-tools-1.96/FixMateInformation.jar'), ('waltz_path',
         '/home/johnsoni/vendor_tools/Waltz-2.0.jar')
      ]),
      u'add_rg_PL':'Illumina',
      u'add_rg_ID':'C-2VAE39-L001-d',
      u'adapter':'GATCGGAAGAGC',
      u'reference_fasta_fai':'/ifs/depot/resources/dmp/data/pubdata/hg-fasta/VERSIONS/hg19/Homo_sapiens_assembly19.fasta.fai',
      u'add_rg_SM':'C-2VAE39-L001-d',
      u'adapter2':'AGATCGGAAGAGC',
      u'reference_fasta':'/ifs/depot/resources/dmp/data/pubdata/hg-fasta/VERSIONS/hg19/Homo_sapiens_assembly19.fasta',
      u'add_rg_CN':'BergerLab_MSKCC',
      u'fastq2':{  
         u'checksum':u'sha1$68e50852edbc69c061ac6928431d8497340ae4b3',
         u'basename':u'C-2VAE39-L001-d_R2.fastq.gz',
         u'nameext':u'.gz',
         u'nameroot':u'C-2VAE39-L001-d_R2.fastq',
         'http://commonwl.org/cwltool#generation':0,
         u'location':'toilfs:y/y/tmp7NvJtp-writeGlobalFileWrapper-IDH_PT9_IGO_05500_FZ_9_S47_R2_001.fastq.gz',
         u'class':u'File',
         u'size':4981617497
      },
      u'trimgalore__params':ordereddict([
         ('length', 25), ('paired', True), ('gzip', True), ('quality', 1), ('stringency', 3), ('suppress_warn', False)
      ]),
      u'mark_duplicates__params':ordereddict([
         ('create_index', True), ('assume_sorted', True), ('compression_level', 0), ('validation_stringency', 'LENIENT'), ('duplicate_scoring_strategy', 'SUM_OF_BASE_QUALITIES')
      ]),
      u'add_rg_PU':'bc435-bc435',
      u'fastq1':{  
         u'checksum':u'sha1$7630e1e7bb50c8bdcb0445bc9d57e1c10a34f4c0',
         u'basename':u'C-2VAE39-L001-d_R1.fastq.gz',
         u'nameext':u'.gz',
         u'nameroot':u'C-2VAE39-L001-d_R1.fastq',
         'http://commonwl.org/cwltool#generation':0,
         u'location':'toilfs:r/G/tmpVS_mAB-writeGlobalFileWrapper-IDH_PT9_IGO_05500_FZ_9_S47_R1_001.fastq.gz',
         u'class':u'File',
         u'size':4367341966
      },
      u'add_or_replace_read_groups__params':ordereddict([
         ('add_rg_PL', 'Illumina'), ('add_rg_CN', 'BergerLab_MSKCC'), ('sort_order', 'coordinate'), ('validation_stringency', 'LENIENT'), ('compression_level', 0), ('create_index', True)
      ]),
      u'add_rg_LB':1
   }),
   
   
   u'fastq1':(u'fastq1',
   {  
      u'run_tools':ordereddict(      [  
         ('perl_5', '/opt/common/CentOS_6/perl/perl-5.20.2/bin/perl'), ('java_7', '/opt/common/CentOS_6/java/jdk1.7.0_75/bin/java'), ('java_8', '/opt/common/CentOS_6/java/jdk1.8.0_31/bin/java'), ('marianas_path', '/home/johnsoni/vendor_tools/Marianas-1.8.0.jar'), ('bioinfo_utils',
         ordereddict(         [  
            ('class', 'File'),
            ('location', 'toilfs:9/W/tmphdNHs4-x-BioinfoUtils-1.0.0.jar'),
            ('size', 372764'),
            ('basename', 'BioinfoUtils-1.0.0.jar'),
            ('nameroot', 'BioinfoUtils-1.0.0'),
            ('nameext', '.jar')
         ])), ('trimgalore_path', '/opt/common/CentOS_6/trim_galore/Trim_Galore_v0.2.5/trim_galore'), ('bwa_path', '/opt/common/CentOS_6/bwa/bwa-0.7.5a/bwa'), ('arrg_path', '/home/johnsoni/vendor_tools/AddOrReplaceReadGroups-1.96.jar'), ('picard_path', '/home/johnsoni/vendor_tools/picard-2.8.1.jar'), ('gatk_path', '/opt/common/CentOS_6/gatk/GenomeAnalysisTK-3.3-0/GenomeAnalysisTK.jar'), ('abra_path', '/opt/common/CentOS_6/abra2/abra2-2.17/abra2-2.17.jar'), ('fx_path', '/opt/common/CentOS_6/picard/picard-tools-1.96/FixMateInformation.jar'), ('waltz_path', '/home/johnsoni/vendor_tools/Waltz-2.0.jar')
      ]),
      u'add_rg_PL':'Illumina',
      u'add_rg_ID':'C-2VAE39-L001-d',
      u'adapter':'GATCGGAAGAGC',
      u'reference_fasta_fai':'/ifs/depot/resources/dmp/data/pubdata/hg-fasta/VERSIONS/hg19/Homo_sapiens_assembly19.fasta.fai',
      u'add_rg_SM':'C-2VAE39-L001-d',
      u'adapter2':'AGATCGGAAGAGC',
      u'reference_fasta':'/ifs/depot/resources/dmp/data/pubdata/hg-fasta/VERSIONS/hg19/Homo_sapiens_assembly19.fasta',
      u'add_rg_CN':'BergerLab_MSKCC',
      u'fastq2':{  
         u'checksum':u'sha1$68e50852edbc69c061ac6928431d8497340ae4b3',
         u'basename':u'C-2VAE39-L001-d_R2.fastq.gz',
         u'nameext':u'.gz',
         u'nameroot':u'C-2VAE39-L001-d_R2.fastq',
         'http://commonwl.org/cwltool#generation':0,
         u'location':'toilfs:y/y/tmp7NvJtp-writeGlobalFileWrapper-IDH_PT9_IGO_05500_FZ_9_S47_R2_001.fastq.gz',
         u'class':u'File',
         u'size':4981617497
      },
      u'trimgalore__params':ordereddict([
         ('length', 25), ('paired', True), ('gzip', True), ('quality', 1), ('stringency', 3), ('suppress_warn', False)
      ]),
      u'mark_duplicates__params':ordereddict([
         ('create_index', True), ('assume_sorted', True), ('compression_level', 0), ('validation_stringency', 'LENIENT'), ('duplicate_scoring_strategy', 'SUM_OF_BASE_QUALITIES')
      ]),
      u'add_rg_PU':'bc435-bc435',
      u'fastq1':{  
         u'checksum':u'sha1$7630e1e7bb50c8bdcb0445bc9d57e1c10a34f4c0',
         u'basename':u'C-2VAE39-L001-d_R1.fastq.gz',
         u'nameext':u'.gz',
         u'nameroot':u'C-2VAE39-L001-d_R1.fastq',
         'http://commonwl.org/cwltool#generation':0,
         u'location':'toilfs:r/G/tmpVS_mAB-writeGlobalFileWrapper-IDH_PT9_IGO_05500_FZ_9_S47_R1_001.fastq.gz',
         u'class':u'File',
         u'size':4367341966
      },
      u'add_or_replace_read_groups__params':ordereddict([
         ('add_rg_PL', 'Illumina'), ('add_rg_CN', 'BergerLab_MSKCC'), ('sort_order', 'coordinate'), ('validation_stringency', 'LENIENT'), ('compression_level', 0), ('create_index', True)
      ]),
      u'add_rg_LB':1
   }),
   
   
   u'quality':<toil.cwl.cwltoil.StepValueFrom object at 0x7fc8e7169450>,
   
   
   u'adapter2':(u'adapter2',
   {  
      u'run_tools':ordereddict(      [  
         ('perl_5', '/opt/common/CentOS_6/perl/perl-5.20.2/bin/perl'), ('java_7',
         '/opt/common/CentOS_6/java/jdk1.7.0_75/bin/java'), ('java_8',
         '/opt/common/CentOS_6/java/jdk1.8.0_31/bin/java'), ('marianas_path', '/home/johnsoni/vendor_tools/Marianas-1.8.0.jar'), ('bioinfo_utils',
         ordereddict(         [  
            ('class', 'File'),
            ('location', 'toilfs:9/W/tmphdNHs4-x-BioinfoUtils-1.0.0.jar'),
            ('size', 372764'),
            ('basename', 'BioinfoUtils-1.0.0.jar'),
            ('nameroot', 'BioinfoUtils-1.0.0'),
            ('nameext', '.jar')
         ])), ('trimgalore_path', '/opt/common/CentOS_6/trim_galore/Trim_Galore_v0.2.5/trim_galore'), ('bwa_path', '/opt/common/CentOS_6/bwa/bwa-0.7.5a/bwa'), ('arrg_path', '/home/johnsoni/vendor_tools/AddOrReplaceReadGroups-1.96.jar'), ('picard_path', '/home/johnsoni/vendor_tools/picard-2.8.1.jar'), ('gatk_path', '/opt/common/CentOS_6/gatk/GenomeAnalysisTK-3.3-0/GenomeAnalysisTK.jar'), ('abra_path', '/opt/common/CentOS_6/abra2/abra2-2.17/abra2-2.17.jar'), ('fx_path', '/opt/common/CentOS_6/picard/picard-tools-1.96/FixMateInformation.jar'), ('waltz_path', '/home/johnsoni/vendor_tools/Waltz-2.0.jar')
      ]),
      u'add_rg_PL':'Illumina',
      u'add_rg_ID':'C-2VAE39-L001-d',
      u'adapter':'GATCGGAAGAGC',
      u'reference_fasta_fai':'/ifs/depot/resources/dmp/data/pubdata/hg-fasta/VERSIONS/hg19/Homo_sapiens_assembly19.fasta.fai',
      u'add_rg_SM':'C-2VAE39-L001-d',
      u'adapter2':'AGATCGGAAGAGC',
      u'reference_fasta':'/ifs/depot/resources/dmp/data/pubdata/hg-fasta/VERSIONS/hg19/Homo_sapiens_assembly19.fasta',
      u'add_rg_CN':'BergerLab_MSKCC',
      u'fastq2':{  
         u'checksum':u'sha1$68e50852edbc69c061ac6928431d8497340ae4b3',
         u'basename':u'C-2VAE39-L001-d_R2.fastq.gz',
         u'nameext':u'.gz',
         u'nameroot':u'C-2VAE39-L001-d_R2.fastq',
         'http://commonwl.org/cwltool#generation':0,
         u'location':'toilfs:y/y/tmp7NvJtp-writeGlobalFileWrapper-IDH_PT9_IGO_05500_FZ_9_S47_R2_001.fastq.gz',
         u'class':u'File',
         u'size':4981617497
      },
      u'trimgalore__params':ordereddict([  
         ('length', 25), ('paired', True), ('gzip', True), ('quality', 1), ('stringency', 3), ('suppress_warn', False)
      ]),
      u'mark_duplicates__params':ordereddict([  
         ('create_index', True), ('assume_sorted', True), ('compression_level', 0), ('validation_stringency', 'LENIENT'), ('duplicate_scoring_strategy', 'SUM_OF_BASE_QUALITIES')
      ]),
      u'add_rg_PU':'bc435-bc435',
      u'fastq1':{  
         u'checksum':u'sha1$7630e1e7bb50c8bdcb0445bc9d57e1c10a34f4c0',
         u'basename':u'C-2VAE39-L001-d_R1.fastq.gz',
         u'nameext':u'.gz',
         u'nameroot':u'C-2VAE39-L001-d_R1.fastq',
         'http://commonwl.org/cwltool#generation':0,
         u'location':'toilfs:r/G/tmpVS_mAB-writeGlobalFileWrapper-IDH_PT9_IGO_05500_FZ_9_S47_R1_001.fastq.gz',
         u'class':u'File',
         u'size':4367341966
      },
      u'add_or_replace_read_groups__params':ordereddict([  
         ('add_rg_PL', 'Illumina'), ('add_rg_CN', 'BergerLab_MSKCC'), ('sort_order', 'coordinate'), ('validation_stringency', 'LENIENT'), ('compression_level', 0), ('create_index', True)
      ]),
      u'add_rg_LB':1
   })
   
   
}

7.

We can see that the workdir in this case refers to this folder:

/home/johnsoni/juno_ACCESS/5500-FZ/5500-FZ-1.1.14/tmp/tmpbdDRo9

This is different from the temp dirs that are found in the jobstore.

8.

From this, we can see that the original fastqs associated with this job were:

C-2VAE39-L001-d_R1.fastq.gz
C-2VAE39-L001-d_R2.fastq.gz
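
Rather than reading the file names out of the large `cwljob` dump by eye, they can be collected with a short recursive walk. This is a sketch that assumes CWL File entries are mappings with `class` set to `File` and a `basename` key, as in the output in step 6, and that the mapping types involved subclass `dict`. `collect_file_basenames` is a hypothetical helper name.

```python
def collect_file_basenames(value, found=None):
    """Recursively gather the basename of every CWL File entry inside value."""
    if found is None:
        found = set()
    if isinstance(value, dict):
        # A CWL File object is a mapping with class == 'File'.
        if value.get("class") == "File" and "basename" in value:
            found.add(value["basename"])
        for child in value.values():
            collect_file_basenames(child, found)
    elif isinstance(value, (list, tuple)):
        # cwljob values in this example are (name, dict) tuples.
        for child in value:
            collect_file_basenames(child, found)
    return found
```

Run against `js.cwljob`, this would also pick up tool files such as `BioinfoUtils-1.0.0.jar`, so filtering the result on a `.fastq.gz` suffix narrows it to the input reads. Using a set collapses the duplicated entries noted in step 6.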