@philippbayer
Created September 28, 2017 02:49
SRA libraries for B. oleracea and rapa where len(R1) != len(R2)
Oleracea/fastq/SRR1212999_1.fastq has different read lengths, 100 and 99
Oleracea/fastq/SRR1213000_1.fastq has different read lengths, 100 and 99
Oleracea/fastq/SRR1212998_1.fastq has different read lengths, 100 and 99
Oleracea/fastq/SRR1213087_1.fastq has different read lengths, 100 and 0
Rapa/fastq/SRR1777752_1.fastq has different read lengths, 71 and 76
Rapa/fastq/SRR496614_1.fastq has different read lengths, 156 and 142
Rapa/fastq/SRR396843_1.fastq has different read lengths, 156 and 142
Rapa/fastq/SRR385951_1.fastq has different read lengths, 156 and 142
Rapa/fastq/SRR385948_1.fastq has different read lengths, 106 and 92
Rapa/fastq/SRR385947_1.fastq has different read lengths, 156 and 142
Rapa/fastq/SRR385946_1.fastq has different read lengths, 106 and 92
Rapa/fastq/SRR396842_1.fastq has different read lengths, 106 and 92
Rapa/fastq/SRR385942_1.fastq has different read lengths, 106 and 92
Rapa/fastq/SRR385944_1.fastq has different read lengths, 156 and 142
Rapa/fastq/SRR385945_1.fastq has different read lengths, 156 and 142
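
The script that produced the list above is not shown. A minimal sketch of one way to get such a report, assuming the check compares only the first read of each `_1.fastq` file with the first read of its `_2.fastq` mate (the function names here are hypothetical, not taken from the gist):

```python
from pathlib import Path


def first_read_length(fastq_path):
    """Return the length of the first sequence line in a FASTQ file (0 if the file is empty)."""
    with open(fastq_path) as handle:
        handle.readline()  # skip the '@' header line
        return len(handle.readline().rstrip("\r\n"))


def mismatch_message(r1_path):
    """Compare the first read of an R1 file with its R2 mate.

    Assumes the mate file sits next to R1 and differs only by the
    '_1.fastq' / '_2.fastq' suffix. Returns a report line, or None if
    the two lengths agree.
    """
    r1 = Path(r1_path)
    r2 = r1.with_name(r1.name.replace("_1.fastq", "_2.fastq"))
    len1, len2 = first_read_length(r1), first_read_length(r2)
    if len1 != len2:
        return f"{r1_path} has different read lengths, {len1} and {len2}"
    return None
```

Checking only the first record is cheap but would miss files whose reads vary in length further down; a stricter version would walk every fourth line of both files.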

osris commented Sep 29, 2017

Three of the Oleracea runs are really 101x100; the mate1 files all begin with an 'N'. I'm pulling SRR1213087 from tape to take a look.
The Rapa runs were misloaded due to a typo in submission. We have been catching this situation for the past couple of years but haven't cleaned up all the old mistakes. Before downloading, you can check the stats in the statistics block on the RunBrowser web page, or pull the XML by adding the &retmode=xml param, like this:
https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?run=SRR496614&retmode=xml
Right now it looks like this:
<Statistics nreads="2" nspots="125742182">
  <Read index="0" count="125742182" average="157" stdev="0"/>
  <Read index="1" count="125742182" average="143" stdev="0"/>
</Statistics>

These are queued for reload tonight (along with a few hundred others).
After the reload they will be even: 150/150, 100/100, or 75/75.
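
The pre-download check described in this comment could be scripted roughly as follows. This is only a sketch: it parses the Statistics element quoted in the comment, and assumes that element can be located by tag name inside whatever document the `retmode=xml` URL returns. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The Statistics block quoted in the comment for SRR496614.
SAMPLE = (
    '<Statistics nreads="2" nspots="125742182">'
    '<Read index="0" count="125742182" average="157" stdev="0"/>'
    '<Read index="1" count="125742182" average="143" stdev="0"/>'
    '</Statistics>'
)


def read_averages(xml_text):
    """Return the average read lengths, ordered by read index."""
    root = ET.fromstring(xml_text)
    # Accept either a bare Statistics element or one nested in a larger document.
    stats = root if root.tag == "Statistics" else root.find(".//Statistics")
    if stats is None:
        return []
    reads = sorted(stats.findall("Read"), key=lambda r: int(r.get("index")))
    return [float(r.get("average")) for r in reads]


def mates_match(xml_text):
    """True if the run has exactly two reads with equal average length."""
    averages = read_averages(xml_text)
    return len(averages) == 2 and averages[0] == averages[1]


print(read_averages(SAMPLE))  # [157.0, 143.0]
print(mates_match(SAMPLE))    # False
```

To use it on a live run, the XML text could be fetched from the URL above with `urllib.request.urlopen` and passed to `mates_match`.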

osris commented Oct 11, 2017

Forgot to update this, but these have been fixed. Feel free to send me any other SRA/NCBI data issues you come across.
