Created
February 13, 2025 16:28
workflows.yaml file for brc
workflows:
- trs_id: '#workflow/github.com/iwc-workflows/assembly-with-flye/main/versions/v0.2'
  workflow_categories:
  - ASSEMBLY
  workflow_name: Genome assembly with Flye
  workflow_description: Assemble long reads with Flye, then view assembly statistics
    and assembly graph
  ploidy: any
  parameters:
    Input sequence reads:
      class: File
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/atacseq/main/versions/v1.0'
  workflow_categories:
  - REGULATION
  workflow_name: ATACseq
  workflow_description: 'This workflow takes as input a collection of paired fastq.
    It will remove low-quality bases and adapters with cutadapt. Map with Bowtie2 end-to-end.
    Will remove reads on MT and non-concordant pairs and pairs with mapping quality
    below 30 and PCR duplicates. Will compute the pile-up on 5'' +- 100bp. Will call
    peaks and count the number of reads falling in the 1kb region centered on the
    summit. Will compute 2 normalizations for coverage: normalized by million reads
    and normalized by million reads in peaks. Will plot the number of reads for each
    fragment length.'
  ploidy: any
  parameters:
    PE fastq input:
      class: Collection
    reference_genome:
      class: text
    effective_genome_size:
      class: integer
    bin_size:
      class: integer
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/average-bigwig-between-replicates/main/versions/v0.2'
  workflow_categories:
  - REGULATION
  - TRANSCRIPTOMICS
  workflow_name: Average Bigwig between replicates
  workflow_description: 'We assume the identifiers of the input list are like:
    sample_name_replicateID.
    The identifiers of the output list will be:
    sample_name'
  ploidy: any
  parameters:
    Bigwig to average:
      class: Collection
    bin_size:
      class: integer
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/bacterial-genome-assembly/main/versions/v1.1.5'
  workflow_categories:
  - ASSEMBLY
  workflow_name: Bacterial Genome Assembly using Shovill
  workflow_description: Assembly of bacterial paired-end short read data with generation
    of quality metrics and reports
  ploidy: any
  parameters:
    Input adapter trimmed sequence reads (forward):
      class: File
      ext:
      - fastq
      - fastq.gz
      - fastqsanger
      - fastqsanger.gz
    Input adapter trimmed sequence reads (reverse):
      class: File
      ext:
      - fastq
      - fastq.gz
      - fastqsanger
      - fastqsanger.gz
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/baredsc/baredSC-1d-logNorm/versions/v0.5'
  workflow_categories:
  - TRANSCRIPTOMICS
  workflow_name: baredSC_1d_logNorm
  workflow_description: Run baredSC in 1 dimension in logNorm for 1 to N gaussians
    and combine models.
  ploidy: any
  parameters:
    Tabular with raw expression values:
      class: File
      ext: tabular
    Gene name:
      class: text
    Maximum value in logNorm:
      class: float
    Maximum number of Gaussians to study:
      class: integer
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/baredsc/baredSC-2d-logNorm/versions/v0.5'
  workflow_categories:
  - TRANSCRIPTOMICS
  workflow_name: baredSC_2d_logNorm
  workflow_description: Run baredSC in 2 dimensions in logNorm for 1 to N gaussians
    and combine models.
  ploidy: any
  parameters:
    Tabular with raw expression values:
      class: File
      ext: tabular
    Gene name for x axis:
      class: text
    maximum value in logNorm for x-axis:
      class: float
    Gene name for y axis:
      class: text
    maximum value in logNorm for y-axis:
      class: float
    Maximum number of Gaussians to study:
      class: integer
    compute p-value:
      class: boolean
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/brew3r/main/versions/v0.2'
  workflow_categories:
  - TRANSCRIPTOMICS
  workflow_name: BREW3R
  workflow_description: This workflow takes a collection of BAM (output of STAR) and
    a gtf. It extends the input gtf using de novo annotation.
  ploidy: any
  parameters:
    Input gtf:
      class: File
      ext: gtf
    BAM collection:
      class: Collection
      ext: bam
    strandedness:
      class: text
    minimum coverage:
      class: integer
    minimum FPKM for merge:
      class: float
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/chipseq-pe/main/versions/v0.12'
  workflow_categories:
  - REGULATION
  workflow_name: ChIPseq_PE
  workflow_description: This workflow takes as input a collection of paired fastqs.
    Remove adapters with cutadapt, map pairs with bowtie2. Keep MAPQ30 and concordant
    pairs. MACS2 for paired bam.
  ploidy: any
  parameters:
    PE fastq input:
      class: Collection
    adapter_forward:
      class: text
    adapter_reverse:
      class: text
    reference_genome:
      class: text
    effective_genome_size:
      class: integer
    normalize_profile:
      class: boolean
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/chipseq-sr/main/versions/v0.12'
  workflow_categories:
  - REGULATION
  workflow_name: ChIPseq_SR
  workflow_description: This workflow takes as input a collection of fastqs (single
    reads). Remove adapters with cutadapt, map with bowtie2. Keep MAPQ30. MACS2 for
    bam with fixed extension or model.
  ploidy: any
  parameters:
    SR fastq input:
      class: Collection
    adapter_forward:
      class: text
    reference_genome:
      class: text
    effective_genome_size:
      class: integer
    normalize_profile:
      class: boolean
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/consensus-peaks/consensus-peaks-atac-cutandrun/versions/v1.2'
  workflow_categories:
  - REGULATION
  workflow_name: Get Confident Peaks From ATAC or CUTandRUN replicates
  workflow_description: This workflow takes as input BAM from ATAC-seq or CUT&RUN.
    It calls peaks on each replicate and intersects them. In parallel, each BAM is
    subsetted to the smallest number of reads. Peaks are called using all subsets combined.
    Only peaks called using a combination of all subsets which have summits intersecting
    the intersection of at least x replicates will be kept.
  ploidy: any
  parameters:
    n rmDup BAM:
      class: Collection
      ext: bam
    Minimum number of overlap:
      class: integer
    effective_genome_size:
      class: integer
    bin_size:
      class: integer
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/consensus-peaks/consensus-peaks-chip-pe/versions/v1.2'
  workflow_categories:
  - REGULATION
  workflow_name: Get Confident Peaks From ChIP_PE replicates
  workflow_description: This workflow takes as input PE BAM from ChIP-seq. It calls
    peaks on each replicate and intersects them. In parallel, each BAM is subsetted
    to the smallest number of reads. Peaks are called using all subsets combined. Only
    peaks called using a combination of all subsets which have summits intersecting
    the intersection of at least x replicates will be kept.
  ploidy: any
  parameters:
    n rmDup BAMPE:
      class: Collection
      ext: bam
    Minimum number of overlap:
      class: integer
    effective_genome_size:
      class: integer
    bin_size:
      class: integer
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/consensus-peaks/consensus-peaks-chip-sr/versions/v1.2'
  workflow_categories:
  - REGULATION
  workflow_name: Get Confident Peaks From ChIP_SR replicates
  workflow_description: This workflow takes as input SR BAM from ChIP-seq. It calls
    peaks on each replicate and intersects them. In parallel, each BAM is subsetted
    to the smallest number of reads. Peaks are called using all subsets combined. Only
    peaks called using a combination of all subsets which have summits intersecting
    the intersection of at least x replicates will be kept.
  ploidy: any
  parameters:
    n rmDup BAMSR:
      class: Collection
      ext: bam
    Minimum number of overlap:
      class: integer
    effective_genome_size:
      class: integer
    bin_size:
      class: integer
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/cutandrun/main/versions/v0.13'
  workflow_categories:
  - REGULATION
  workflow_name: CUTandRUN
  workflow_description: This workflow takes as input a collection of paired fastq.
    Remove adapters with cutadapt, map pairs with bowtie2 allowing dovetail. Keep
    MAPQ30 and concordant pairs. BAM to BED. MACS2 with "ATAC" parameters.
  ploidy: any
  parameters:
    PE fastq input:
      class: Collection
    adapter_forward:
      class: text
    adapter_reverse:
      class: text
    reference_genome:
      class: text
    effective_genome_size:
      class: integer
    normalize_profile:
      class: boolean
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/fastq-to-matrix-10x/scrna-seq-fastq-to-matrix-10x-cellplex/versions/v0.5'
  workflow_categories:
  - TRANSCRIPTOMICS
  workflow_name: scRNA-seq_preprocessing_10X_cellPlex
  workflow_description: This workflow processes the CMO fastqs with CITE-seq-Count
    and includes the translation step required for cellPlex processing. In parallel
    it processes the Gene Expression fastqs with STARsolo, filters cells with DropletUtils
    and reformats all outputs to be easily used by the function 'Read10X' from Seurat.
  ploidy: any
  parameters:
    fastq PE collection GEX:
      class: Collection
    reference genome:
      class: text
    gtf:
      class: File
    cellranger_barcodes_3M-february-2018.txt:
      class: File
    Barcode Size is same size of the Read:
      class: boolean
    fastq PE collection CMO:
      class: Collection
    sample name and CMO sequence collection:
      class: Collection
      ext: csv
    Number of expected cells:
      class: integer
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/fastq-to-matrix-10x/scrna-seq-fastq-to-matrix-10x-v3/versions/v0.5'
  workflow_categories:
  - TRANSCRIPTOMICS
  workflow_name: scRNA-seq_preprocessing_10X_v3_Bundle
  workflow_description: This workflow processes the Gene Expression fastqs with STARsolo,
    filters cells with DropletUtils and reformats all outputs to be easily used by the
    function 'Read10X' from Seurat.
  ploidy: any
  parameters:
    fastq PE collection:
      class: Collection
    reference genome:
      class: text
    gtf:
      class: File
    cellranger_barcodes_3M-february-2018.txt:
      class: File
    Barcode Size is same size of the Read:
      class: boolean
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/generic-variant-calling-wgs-pe/main/versions/v0.1.1'
  workflow_categories:
  - VARIANT_CALLING
  workflow_name: Generic variation analysis on WGS PE data
  workflow_description: Workflow for variant analysis against a reference genome
    in GenBank format
  ploidy: any
  parameters:
    Paired Collection:
      class: Collection
      ext:
      - fastqsanger
      - fastqsanger.gz
    GenBank genome:
      class: File
      ext: genbank
    Name for genome database:
      class: text
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/goseq/main/versions/v0.1'
  workflow_categories:
  - TRANSCRIPTOMICS
  workflow_name: Goseq GO-KEGG Enrichment Analysis
  workflow_description: This workflow is used for GO and KEGG enrichment analysis
    using GOseq tools.
  ploidy: any
  parameters:
    Select genome to use:
      class: text
    Differential expression result:
      class: File
      ext: tabular
    Select gene ID format:
      class: text
    gene length:
      class: File
      ext: tabular
    KEGG pathways:
      class: File
      ext: tabular
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/haploid-variant-calling-wgs-pe/main/versions/v0.1'
  workflow_categories:
  - VARIANT_CALLING
  workflow_name: Paired end variant calling in haploid system
  workflow_description: Workflow for variant analysis against a reference genome in
    GenBank format
  ploidy: any
  parameters:
    Paired Collection:
      class: Collection
      ext:
      - fastqsanger
      - fastqsanger.gz
    Annotation GTF:
      class: File
    Genome fasta:
      class: File
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/hic-hicup-cooler/chic-fastq-to-cool-hicup-cooler/versions/v0.3'
  workflow_categories:
  - REGULATION
  workflow_name: cHi-C_fastqToCool_hicup_cooler
  workflow_description: This workflow takes as input a collection of paired fastq.
    It uses HiCUP to go from fastq to validPair file. The pairs are filtered for MAPQ
    and for the region captured. Then, they are sorted by cooler to generate a tabix
    dataset. Cooler is used to generate a balanced cool file to the desired resolution.
  ploidy: any
  parameters:
    PE fastq input:
      class: Collection
    genome name:
      class: text
    Restriction enzyme:
      class: text
    No fill-in:
      class: boolean
    minimum MAPQ:
      class: integer
    Bin size in bp:
      class: integer
    Interactions to consider to calculate weights in normalization step:
      class: text
    capture region (chromosome):
      class: text
    capture region (start):
      class: integer
    capture region (end):
      class: integer
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/hic-hicup-cooler/hic-fastq-to-cool-hicup-cooler/versions/v0.3'
  workflow_categories:
  - REGULATION
  workflow_name: Hi-C_fastqToCool_hicup_cooler
  workflow_description: This workflow takes as input a collection of paired fastq.
    It uses HiCUP to go from fastq to validPair file using the middle of the fragment
    as coordinates. The pairs are filtered for MAPQ and sorted by cooler to generate
    a tabix dataset. Cooler is used to generate a balanced cool file to the desired
    resolution.
  ploidy: any
  parameters:
    PE fastq input:
      class: Collection
    genome name:
      class: text
    Restriction enzyme:
      class: text
    No fill-in:
      class: boolean
    minimum MAPQ:
      class: integer
    Bin size in bp:
      class: integer
    Interactions to consider to calculate weights in normalization step:
      class: text
    region for matrix plotting:
      class: text
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/hic-hicup-cooler/hic-fastq-to-pairs-hicup/versions/v0.3'
  workflow_categories:
  - REGULATION
  workflow_name: Hi-C_fastqToPairs_hicup
  workflow_description: This workflow takes as input a collection of paired fastq.
    It uses HiCUP to go from fastq to validPair file. First truncate the fastq using
    the cutting sequence to guess the fill-in. Then map the truncated fastq. Then
    assign to fragment and filter the self-ligated and dangling ends or internal (it
    can also filter for the size). Then it removes the duplicates. Convert the output
    to be compatible with juicebox or cooler using the middle of the fragment as coordinates.
    Finally filter for mapping quality.
  ploidy: any
  parameters:
    PE fastq input:
      class: Collection
    genome name:
      class: text
    Restriction enzyme:
      class: text
    No fill-in:
      class: boolean
    minimum MAPQ:
      class: integer
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/hic-hicup-cooler/hic-juicermediumtabix-to-cool-cooler/versions/v0.3'
  workflow_categories:
  - REGULATION
  workflow_name: Hi-C_juicermediumtabixToCool_cooler
  workflow_description: This workflow uses as input a collection of juicer medium
    tabix files and a genome name. It builds a balanced cool file to the desired resolution.
  ploidy: any
  parameters:
    Bin size in bp:
      class: integer
    genome name:
      class: text
    Juicer Medium Tabix with validPairs:
      class: Collection
    Interactions to consider to calculate weights in normalization step:
      class: text
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/polish-with-long-reads/main/versions/v0.1'
  workflow_categories:
  - ASSEMBLY
  workflow_name: Assembly polishing with long reads
  workflow_description: Racon polish with long reads, x4
  ploidy: any
  parameters:
    Assembly to be polished:
      class: File
    long reads:
      class: File
    'minimap setting (for long reads) ':
      class: text
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/pseudobulk-worflow-decoupler-edger/main/versions/v0.1.1'
  workflow_categories:
  - TRANSCRIPTOMICS
  workflow_name: Differential gene expression for single-cell data using pseudo-bulk
    counts with edgeR
  workflow_description: This workflow uses the decoupler tool in Galaxy to generate
    pseudobulk counts from an annotated AnnData file obtained from scRNA-seq analysis.
    Following the pseudobulk step, differentially expressed genes (DEG) are calculated
    using the edgeR tool. The workflow also includes data sanitation steps to ensure
    smooth operation of edgeR and minimize potential issues. Additionally, a Volcano
    plot tool is used to visualize the results after the DEG analysis.
  ploidy: any
  parameters:
    Source AnnData file:
      class: File
      ext:
      - h5
      - h5ad
    'Pseudo-bulk: Fields to merge':
      class: text
    Group by column:
      class: text
    Sample key column:
      class: text
    Name Your Raw Counts Layer:
      class: text
    Factor fields:
      class: text
    Formula:
      class: text
    Gene symbol column:
      class: text
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/quality-and-contamination-control/main/versions/v1.1.6'
  workflow_categories:
  - ASSEMBLY
  workflow_name: Quality and Contamination Control For Genome Assembly
  workflow_description: Short paired-end read analysis to provide quality analysis,
    read cleaning and taxonomic assignment
  ploidy: any
  parameters:
    Input sequence reads (forward):
      class: File
      ext:
      - fastq
      - fastq.gz
      - fastqsanger
      - fastqsanger.gz
    Input sequence reads (reverse):
      class: File
      ext:
      - fastq
      - fastq.gz
      - fastqsanger
      - fastqsanger.gz
    Select a taxonomy database:
      class: text
    Select a NCBI taxonomy database:
      class: text
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/rnaseq-de/main/versions/v0.2'
  workflow_categories:
  - TRANSCRIPTOMICS
  workflow_name: RNAseq_DE_filtering_plotting
  workflow_description: 'This workflow can only work on an experimental setup with
    exactly 2 conditions. It takes two collections of count tables as input and performs
    differential expression analysis. Additionally, it filters for DE genes based on
    adjusted p-value and log2 fold change thresholds. It also generates informative
    plots.
    '
  ploidy: any
  parameters:
    Counts from changed condition:
      class: Collection
    Counts from reference condition:
      class: Collection
    Count files have header:
      class: boolean
    Gene Annotaton:
      class: File
    Adjusted p-value threshold:
      class: float
    log2 fold change threshold:
      class: float
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/rnaseq-pe/main/versions/v1.1'
  workflow_categories:
  - TRANSCRIPTOMICS
  workflow_name: RNA-seq for Paired-end fastqs
  workflow_description: 'This workflow takes as input a list of paired-end fastqs.
    Adapters and bad quality bases are removed with fastp. Reads are mapped with STAR
    with ENCODE parameters and genes are counted simultaneously as well as normalized
    coverage (per million mapped reads) on uniquely mapped reads. The counts are reprocessed
    to be similar to HTSeq-count output. Alternatively, featureCounts can be used
    to count the reads/fragments per gene. FPKM are computed with cufflinks and/or
    with StringTie. The unstranded normalized coverage is computed with bedtools.
    '
  ploidy: any
  parameters:
    Collection paired FASTQ files:
      class: Collection
    Forward adapter:
      class: text
    Reverse adapter:
      class: text
    Generate additional QC reports:
      class: boolean
    Reference genome:
      class: text
    GTF file of annotation:
      class: File
    Strandedness:
      class: text
    Use featureCounts for generating count tables:
      class: boolean
    Compute Cufflinks FPKM:
      class: boolean
    GTF with regions to exclude from FPKM normalization with Cufflinks:
      class: File
    Compute StringTie FPKM:
      class: boolean
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/rnaseq-sr/main/versions/v1.1'
  workflow_categories:
  - TRANSCRIPTOMICS
  workflow_name: RNA-seq for Single-read fastqs
  workflow_description: 'This workflow takes as input a list of single-end fastqs.
    Adapters and bad quality bases are removed with fastp. Reads are mapped with STAR
    with ENCODE parameters and genes are counted simultaneously as well as normalized
    coverage (per million mapped reads) on uniquely mapped reads. The counts are reprocessed
    to be similar to HTSeq-count output. Alternatively, featureCounts can be used
    to count the reads/fragments per gene. FPKM are computed with cufflinks and/or
    with StringTie. The unstranded normalized coverage is computed with bedtools.
    '
  ploidy: any
  parameters:
    Collection of FASTQ files:
      class: Collection
    Forward adapter:
      class: text
    Generate additional QC reports:
      class: boolean
    Reference genome:
      class: text
    GTF file of annotation:
      class: File
    Strandedness:
      class: text
    Use featureCounts for generating count tables:
      class: boolean
    Compute Cufflinks FPKM:
      class: boolean
    GTF with regions to exclude from FPKM normalization with Cufflinks:
      class: File
    Compute StringTie FPKM:
      class: boolean
  active: false
- trs_id: '#workflow/github.com/iwc-workflows/variation-reporting/main/versions/v0.1.1'
  workflow_categories:
  - VARIANT_CALLING
  workflow_name: Generic variation analysis reporting
  workflow_description: This workflow takes a VCF dataset of variants produced by
    any of the variant calling workflows in https://github.com/galaxyproject/iwc/tree/main/workflows/sars-cov-2-variant-calling
    and generates tabular lists of variants by Samples and by Variant, and an overview
    plot of variants and their allele-frequencies.
  ploidy: any
  parameters:
    Variation data to report:
      class: Collection
      ext:
      - vcf
      - vcf_bgzip
    AF Filter:
      class: float
    DP Filter:
      class: integer
    DP_ALT Filter:
      class: integer
  active: false