These experiments have transcript TPM columns that are entirely NA, and quantile normalization fails on them:
- E-MTAB-2039 Oryza sativa, Nipponbare RNA-Seq - Conserved Poaceae Specific Genes Project
- E-MTAB-4400 Sorghum bicolor, BTx623 RNA-Seq - Conserved Poaceae Specific Genes Project
- E-MTAB-4401 Brachypodium distachyon, Bd21 RNA-Seq - Conserved Poaceae Specific Genes Project
- E-MTAB-4818 Solanum lycopersicum Transcriptome or Gene expression
This one probably also needs an ISL rerun because not all assays are present in the files:
- E-GEOD-42871
For E-MTAB-2039, most columns in the transcript TPM data file are NA:
[wbazant@ebi-cli-001 ~]$ head /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-MTAB-2039/E-MTAB-2039-transcripts-tpms.tsv.undecorated
Transcript ID SRR352184 SRR352187 SRR352189 SRR352190 SRR352192 SRR352194 SRR352204 SRR352206 SRR352207 SRR352209 SRR352211
OS11T0599200-01 33.18 107.81 NA NA NA NA NA NA NA NA NA
OS11T0599101-01 0 0.09 NA NA NA NA NA NA NA NA NA
OS11T0599100-01 0.01 0.06 NA NA NA NA NA NA NA NA NA
OS11T0599000-00 0 0.03 NA NA NA NA NA NA NA NA NA
OS11T0598900-00 0 0.67 NA NA NA NA NA NA NA NA NA
OS11T0598800-01 0 3.67 NA NA NA NA NA NA NA NA NA
OS11T0598700-00 0 0.14 NA NA NA NA NA NA NA NA NA
OS11T0598500-00 0.3 0.32 NA NA NA NA NA NA NA NA NA
OS11T0598300-00 3.38 7.71 NA NA NA NA NA NA NA NA NA
The run fails like so:
INFO - Reading XML config from /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-MTAB-2039/E-MTAB-2039-configuration.xml ...
INFO - Successfully read XML config.
INFO - Reading matrix from /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-MTAB-2039/E-MTAB-2039-transcripts-tpms.tsv.undecorated ...
INFO - Successfully read /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-MTAB-2039/E-MTAB-2039-transcripts-tpms.tsv.undecorated
INFO - Running quantile normalization...
INFO - Running quantile normalization in R...
FATAL - Errors during R script execution, details below:
Error in xy.coords(x, y) : 'x' and 'y' lengths differ
Calls: lapply ... normalizeQuantiles -> approx -> regularize.values -> xy.coords
Execution halted
Errors during R script execution, details below:
Error in xy.coords(x, y) : 'x' and 'y' lengths differ
Calls: lapply ... normalizeQuantiles -> approx -> regularize.values -> xy.coords
Execution halted
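A pre-check along the following lines could flag the problem columns before normalization is attempted. This is only a sketch: the function name all_na_columns is made up for illustration, and it assumes a tab-separated matrix with one header row and the transcript ID in the first column, as in the head output above.

```shell
# Hypothetical helper (not part of the existing pipeline): print the header
# name of every data column whose values are all the literal string "NA".
# Assumes tab-separated input, one header row, ID in column 1.
all_na_columns() {
  awk -F'\t' '
    NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; ncol = NF; next }
    # Remember every column that has at least one value other than NA.
    { for (i = 2; i <= ncol; i++) if ($i != "NA") seen[i] = 1 }
    END {
      if (NR < 2) exit            # header only: nothing to report
      for (i = 2; i <= ncol; i++) if (!(i in seen)) print name[i]
    }
  ' "$1"
}
```

It could be called with the path of a matrix, for example the E-MTAB-2039-transcripts-tpms.tsv.undecorated file above; any run ID it prints is a column with no usable values.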
I think there are NAs in the data file because the raw counts for these assays are all zero:
/nfs/production3/ma/home/irap_prod/single_lib/studies/E-MTAB-2039/oryza_sativa/transcripts.raw.kallisto.tsv
This affects only the transcript-level results; the other pipeline does pick up counts:
/nfs/production3/ma/home/irap_prod/single_lib/studies/E-MTAB-2039/oryza_sativa/genes.raw.tsv
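The all-zero hypothesis could be checked with something like the sketch below. The function name all_zero_columns is hypothetical, and the layout of transcripts.raw.kallisto.tsv is an assumption (tab-separated, one header row of run IDs, feature ID in the first column); it has not been verified here.

```shell
# Hypothetical helper: print the header name of every data column that
# contains no nonzero value.
# Assumes tab-separated input, one header row, ID in column 1.
# Note: non-numeric values such as NA evaluate to 0 in awk and are therefore
# treated as zero.
all_zero_columns() {
  awk -F'\t' '
    NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; ncol = NF; next }
    # Remember every column that has at least one nonzero value.
    { for (i = 2; i <= ncol; i++) if ($i + 0 != 0) nonzero[i] = 1 }
    END {
      if (NR < 2) exit            # header only: nothing to report
      for (i = 2; i <= ncol; i++) if (!(i in nonzero)) print name[i]
    }
  ' "$1"
}
```

If the hypothesis holds, running it on the transcript-level raw file should list the same runs that are NA in the TPM file, while the gene-level file should list none of them.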
For E-MTAB-4400 I suspect the same cause as for E-MTAB-2039, because the error is the same. The file looks like:
Transcript ID SRR349643 SRR349644 SRR349645 SRR349646 SRR349754 SRR349767 SRR349768 SRR349769 SRR349771 SRR349772
EES11166 2.97 11.41 28.81 NA NA NA NA NA NA NA
EES11195 4.06 4.12 1.93 NA NA NA NA NA NA NA
EES11310 1.29 5.78 9.52 NA NA NA NA NA NA NA
EES12864 106.56 332.3 537.51 NA NA NA NA NA NA NA
KXG27123 13.73 22.64 3.17 NA NA NA NA NA NA NA
KXG27122 3.36 8.08 8 NA NA NA NA NA NA NA
EES12867 0 0 0 NA NA NA NA NA NA NA
KXG25909 102.97 56.82 46.25 NA NA NA NA NA NA NA
KXG25908 0 0 39.23 NA NA NA NA NA NA NA
Same for E-MTAB-4401:
[fg_atlas@ebi-cli-001 ~]$ head /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-MTAB-4401/E-MTAB-4401-transcripts-tpms.tsv.undecorated
Transcript ID SRR349785 SRR349786 SRR349787 SRR352137 SRR352138 SRR352139 SRR352140 SRR352141 SRR352142 SRR352143 SRR352144
BRADI0007S00200.2 0.03 NA NA NA NA NA NA NA NA NA NA
BRADI0007S00210.2 565.61 NA NA NA NA NA NA NA NA NA NA
BRADI0007S00220.2 0.03 NA NA NA NA NA NA NA NA NA NA
BRADI0007S00233.1 0 NA NA NA NA NA NA NA NA NA NA
BRADI0009S00203.1 0.07 NA NA NA NA NA NA NA NA NA NA
BRADI0009S00210.5 25.52 NA NA NA NA NA NA NA NA NA NA
BRADI0009S00210.6 0.52 NA NA NA NA NA NA NA NA NA NA
BRADI0009S00210.7 172.63 NA NA NA NA NA NA NA NA NA NA
BRADI0009S00220.2 0 NA NA NA NA NA NA NA NA NA NA
Same for E-MTAB-4818:
head /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-MTAB-4818/E-MTAB-4818-transcripts-tpms.tsv.undecorated
Transcript ID SRR346617 SRR346618 SRR346619 SRR346620 SRR346621 SRR346622 SRR346623 SRR346624 SRR346625 SRR346626 SRR346627 SRR346628 SRR346629 SRR346630 SRR346631 SRR346632 SRR346633 SRR346634 SRR346635 SRR346636
Solyc02g062170.2.1 NA NA NA 3.18 NA NA NA 0 NA NA NA 0 NA 0 0.41 0 0.18 0 2.17 6.13
Solyc02g070370.2.1 NA NA NA 1.19 NA NA NA 1.69 NA NA NA 0.43 NA 0.57 0.43 1.56 0 0.22 0.28 0
Solyc02g085580.2.1 NA NA NA 0.45 NA NA NA 1.37 NA NA NA 2.05 NA 13.51 12.12 18.79 0.08 0.08 2.64 4.33
Solyc02g088390.2.1 NA NA NA 200.07 NA NA NA 41.31 NA NA NA 20.48 NA 33.14 51.74 113.17 549.54 247.02 247.46 44.25
Solyc03g097980.2.1 NA NA NA 6.05 NA NA NA 7.77 NA NA NA 7.31 NA 6.97 7.28 8.83 0.48 0.2 1.46 0
Solyc03g111820.2.1 NA NA NA 0.37 NA NA NA 0 NA NA NA 0 NA 0 0.86 0 2.3 3.36 46.51 0
Solyc03g121880.2.1 NA NA NA 34.2 NA NA NA 82.24 NA NA NA 80.96 NA 90.29 94.3 86.36 140.75 21.66 60.32 58.36
Solyc04g071620.2.1 NA NA NA 38.52 NA NA NA 582.73 NA NA NA 192.3 NA 17.05 36.8 66.95 11.58 8.73 217.61 530.28
Solyc04g076870.2.1 NA NA NA 50.14 NA NA NA 75.43 NA NA NA 33.47 NA 52.65 43.39 31.55 76.18 14.76 42.83 77.61
For E-GEOD-42871, the results in ISL are missing columns and were not synced to $ATLAS_PROD/analysis.
[fg_atlas@ebi-cli-001 analysis_archive]$ ls -latorh /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-GEOD-42871/E-GEOD-42871-configuration.xml
-rw-r--r-- 1 fg_atlas 1.8K Feb 8 2016 /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-GEOD-42871/E-GEOD-42871-configuration.xml
[fg_atlas@ebi-cli-001 analysis_archive]$ grep '' /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-GEOD-42871/E-GEOD-42871-configuration.xml | wc -l
16
find /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-GEOD-42871 -type f -name *tsv* | while read -r file ; do ls -latorh $file; head -n1 $file | tr $'\t' $'\n' | wc -l ; done
-rw-r--r-- 1 fg_atlas 2.7K Feb 2 2016 /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-GEOD-42871/qc/E-GEOD-42871-findCRAMFiles-report.tsv
5
-rw-r--r-- 1 fg_atlas 1.1K Feb 2 2016 /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-GEOD-42871/E-GEOD-42871-analysis-methods.tsv
2
-rw-r--r-- 1 fg_atlas 3.4M Feb 2 2016 /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-GEOD-42871/E-GEOD-42871-raw-counts.tsv.undecorated
17
-rw-r--r-- 1 fg_atlas 4.5M May 19 12:58 /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-GEOD-42871/E-GEOD-42871-tpms.tsv.undecorated
17
-rw-r--r-- 1 fg_atlas 6.9M May 19 12:59 /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-GEOD-42871/E-GEOD-42871-tpms.tsv.undecorated.aggregated
10
-rw-r--r-- 1 fg_atlas 6.8M Sep 27 04:12 /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-GEOD-42871/E-GEOD-42871-fpkms.tsv
11
-rw-r--r-- 1 fg_atlas 62M Jul 11 12:06 /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-GEOD-42871/E-GEOD-42871-coexpressions.tsv.gz
1
-rw-r--r-- 1 fg_atlas 6.8M Feb 2 2016 /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-GEOD-42871/E-GEOD-42871-fpkms.tsv.undecorated.aggregated
10
-rw-r--r-- 1 fg_atlas 6.9M Sep 27 04:14 /nfs/production3/ma/home/atlas3-production/analysis/baseline/rna-seq/experiments/E-GEOD-42871/E-GEOD-42871-tpms.tsv
11
Missing these runs: SRR639163 SRR639165
find /nfs/production3/ma/home/irap_prod/single_lib/studies/E-GEOD-42871/glycine_max -type f -name '*.tsv' | while read -r file ; do ls -latorh $file; head -n1 $file | tr $'\t' $'\n' | wc -l ; done
-rw-r--r-- 1 fg_atlas 26M Jun 30 2016 /nfs/production3/ma/home/irap_prod/single_lib/studies/E-GEOD-42871/glycine_max/exons.tpm.dexseq.tsv
17
-rw-r--r-- 1 fg_atlas 4.9M Sep 4 15:14 /nfs/production3/ma/home/irap_prod/single_lib/studies/E-GEOD-42871/glycine_max/transcripts.fpkm.kallisto.tsv
14
-rw-r--r-- 1 fg_atlas 3.7M Sep 4 15:14 /nfs/production3/ma/home/irap_prod/single_lib/studies/E-GEOD-42871/glycine_max/genes.fpkm.htseq2.tsv
14
-rw-r--r-- 1 fg_atlas 27M Jun 30 2016 /nfs/production3/ma/home/irap_prod/single_lib/studies/E-GEOD-42871/glycine_max/exons.fpkm.dexseq.tsv
17
-rw-r--r-- 1 fg_atlas 1.2K Sep 4 15:12 /nfs/production3/ma/home/irap_prod/single_lib/studies/E-GEOD-42871/glycine_max/irap.versions.tsv
4
-rw-r--r-- 1 fg_atlas 19M Jun 30 2016 /nfs/production3/ma/home/irap_prod/single_lib/studies/E-GEOD-42871/glycine_max/exons.raw.dexseq.tsv
17
-rw-r--r-- 1 fg_atlas 3.7M Sep 4 15:14 /nfs/production3/ma/home/irap_prod/single_lib/studies/E-GEOD-42871/glycine_max/genes.tpm.htseq2.tsv
14
-rw-r--r-- 1 fg_atlas 2.9M Sep 4 15:14 /nfs/production3/ma/home/irap_prod/single_lib/studies/E-GEOD-42871/glycine_max/genes.raw.htseq2.tsv
14
-rw-r--r-- 1 fg_atlas 5.0M Sep 4 15:14 /nfs/production3/ma/home/irap_prod/single_lib/studies/E-GEOD-42871/glycine_max/transcripts.tpm.kallisto.tsv
14
-rw-r--r-- 1 fg_atlas 3.7M Sep 4 15:14 /nfs/production3/ma/home/irap_prod/single_lib/studies/E-GEOD-42871/glycine_max/genes.tpm.kallisto.tsv
14
-rw-r--r-- 1 fg_atlas 3.7M Sep 4 15:14 /nfs/production3/ma/home/irap_prod/single_lib/studies/E-GEOD-42871/glycine_max/genes.fpkm.kallisto.tsv
14
-rw-r--r-- 1 fg_atlas 3.6M Sep 4 15:14 /nfs/production3/ma/home/irap_prod/single_lib/studies/E-GEOD-42871/glycine_max/transcripts.riu.kallisto.tsv
14
-rw-r--r-- 1 fg_atlas 6.0M Sep 4 15:14 /nfs/production3/ma/home/irap_prod/single_lib/studies/E-GEOD-42871/glycine_max/transcripts.raw.kallisto.tsv
14
-rw-r--r-- 1 fg_atlas 4.4M Sep 4 15:14 /nfs/production3/ma/home/irap_prod/single_lib/studies/E-GEOD-42871/glycine_max/genes.raw.kallisto.tsv
14
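Rather than comparing column counts by eye, the missing runs could be listed directly by comparing header rows. The sketch below is one way to do that; the function name missing_runs is made up for illustration, and it assumes both files are tab-separated with run IDs in the header and an ID column first, which has not been checked for the ISL files.

```shell
# Hypothetical helper: print the run IDs that appear in the header of the
# reference file ($1) but not in the header of the other file ($2).
# Assumes tab-separated headers with the ID column first in both files.
missing_runs() {
  ref_header=$(head -n 1 "$1")
  other_header=$(head -n 1 "$2")
  # Feed the other header first so its IDs are known when the reference
  # header is scanned; output follows the reference column order.
  printf '%s\n%s\n' "$other_header" "$ref_header" | awk -F'\t' '
    NR == 1 { for (i = 2; i <= NF; i++) have[$i] = 1; next }
    NR == 2 { for (i = 2; i <= NF; i++) if (!($i in have)) print $i }
  '
}
```

For example, the reference could be the 17-column E-GEOD-42871-raw-counts.tsv.undecorated in the Atlas directory and the other file the 14-column genes.raw.htseq2.tsv in the ISL study directory.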