@mmterpstra
Created September 21, 2016 07:41
Basecall Recipes
#!/bin/bash
# Recipe: single read plus one index read; all index cycles are used as barcode (mask y*,i*).
set -e
set -x
set -o pipefail
cd "$(dirname "$0")"
BCLRUN=run1
ml purge
ml bcl2fastq2/v2.17.1.14-foss-2015b
bcl2fastq \
--sample-sheet SampleSheet.csv \
-i Data/Intensities/BaseCalls/ \
-R ./ \
--intensities-dir Data/Intensities/ \
-o $BCLRUN/ \
--use-bases-mask 'y*,i*' \
--barcode-mismatches 1 \
--minimum-trimmed-read-length 0 \
--create-fastq-for-index-reads
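#keep a copy of this script with the output (cwd is already the script dir)
cp "$(basename "$0")" "$BCLRUN/"
#manual zip
HTMLREPORTZIP="$(pwd)/$BCLRUN/$(basename "$(pwd)")_report.zip"
(cd "$BCLRUN/Reports/" && zip -ru "$HTMLREPORTZIP" html) &
ZIPPID=$!
#create md5sums
find "$(pwd)/$BCLRUN/" -name '*.fastq.gz' -type f -print0 |
while IFS= read -r -d '' i; do
(cd "$(dirname "$i")" && md5sum "$(basename "$i")" | tee "$i.md5")
done
#make sure the background zip has finished before anything is copied
wait "$ZIPPID" || echo "warning: report zip exited non-zero" >&2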
#cp to prm
RAWDIR="${PRM02:?PRM02 is not set}/data/raw/$(basename "$(pwd)")/"
mkdir -p "$RAWDIR"
#for i in $(echo $(find $BCLRUN -name *.fastq.gz* -type f) $(find $BCLRUN -name *.zip -type f)); do
# cp "$i" "$RAWDIR";
#done
#cp SampleSheet.csv $0 $RAWDIR
#(for i in $(find $RAWDIR -name \*.fastq.gz.md5 -type f); do
# (cd $(dirname $i) && md5sum -c $i );
#done)| column -t
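#!/bin/bash
# Recipe: paired-end plus an index read whose first 8 cycles are the barcode;
# the remaining index cycles are written out as an extra read (mask y*,i8y*,y*).
set -e
set -x
set -o pipefail
cd "$(dirname "$0")"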
BCLRUN=run1
ml purge
ml bcl2fastq2/v2.17.1.14-foss-2015b
bcl2fastq \
--sample-sheet SampleSheet.csv \
-i Data/Intensities/BaseCalls/ \
-R ./ \
--intensities-dir Data/Intensities/ \
-o $BCLRUN/ \
--use-bases-mask 'y*,i8y*,y*' \
--barcode-mismatches 1 \
--minimum-trimmed-read-length 0 \
--create-fastq-for-index-reads
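#keep a copy of this script with the output (cwd is already the script dir)
cp "$(basename "$0")" "$BCLRUN/"
#manual zip
HTMLREPORTZIP="$(pwd)/$BCLRUN/$(basename "$(pwd)")_report.zip"
(cd "$BCLRUN/Reports/" && zip -ru "$HTMLREPORTZIP" html) &
ZIPPID=$!
#create md5sums
find "$(pwd)/$BCLRUN/" -name '*.fastq.gz' -type f -print0 |
while IFS= read -r -d '' i; do
(cd "$(dirname "$i")" && md5sum "$(basename "$i")" | tee "$i.md5")
done
#make sure the background zip has finished before anything is copied
wait "$ZIPPID" || echo "warning: report zip exited non-zero" >&2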
#cp to prm
RAWDIR="${PRM02:?PRM02 is not set}/data/raw/$(basename "$(pwd)")/"
mkdir -p "$RAWDIR"
#for i in $(echo $(find $BCLRUN -name *.fastq.gz* -type f) $(find $BCLRUN -name *.zip -type f)); do
# cp "$i" "$RAWDIR";
#done
#cp SampleSheet.csv $0 $RAWDIR
#(for i in $(find $RAWDIR -name \*.fastq.gz.md5 -type f); do
# (cd $(dirname $i) && md5sum -c $i );
#done)| column -t
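#!/bin/bash
# Recipe: single read plus an index read whose first 8 cycles are the barcode;
# the remaining index cycles are written out as an extra read (mask y*,i8y*).
set -e
set -x
set -o pipefail
cd "$(dirname "$0")"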
BCLRUN=run1
ml purge
ml bcl2fastq2/v2.17.1.14-foss-2015b
bcl2fastq \
--sample-sheet SampleSheet.csv \
-i Data/Intensities/BaseCalls/ \
-R ./ \
--intensities-dir Data/Intensities/ \
-o $BCLRUN/ \
--use-bases-mask 'y*,i8y*' \
--barcode-mismatches 1 \
--minimum-trimmed-read-length 0 \
--create-fastq-for-index-reads
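#keep a copy of this script with the output (cwd is already the script dir)
cp "$(basename "$0")" "$BCLRUN/"
#manual zip
HTMLREPORTZIP="$(pwd)/$BCLRUN/$(basename "$(pwd)")_report.zip"
(cd "$BCLRUN/Reports/" && zip -ru "$HTMLREPORTZIP" html) &
ZIPPID=$!
#create md5sums
find "$(pwd)/$BCLRUN/" -name '*.fastq.gz' -type f -print0 |
while IFS= read -r -d '' i; do
(cd "$(dirname "$i")" && md5sum "$(basename "$i")" | tee "$i.md5")
done
#make sure the background zip has finished before anything is copied
wait "$ZIPPID" || echo "warning: report zip exited non-zero" >&2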
#cp to prm
RAWDIR="${PRM02:?PRM02 is not set}/data/raw/$(basename "$(pwd)")/"
mkdir -p "$RAWDIR"
#for i in $(echo $(find $BCLRUN -name *.fastq.gz* -type f) $(find $BCLRUN -name *.zip -type f)); do
# cp "$i" "$RAWDIR";
#done
#cp SampleSheet.csv $0 $RAWDIR
#(for i in $(find $RAWDIR -name \*.fastq.gz.md5 -type f); do
# (cd $(dirname $i) && md5sum -c $i );
#done)| column -t
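
Each script ends with a commented-out loop that re-checks the copied files with md5sum -c. A minimal sketch of that check as a reusable function might look like the following. The name verify_md5s is hypothetical (it is not part of the recipes), and the sketch assumes GNU coreutils md5sum and checksum files named *.fastq.gz.md5 that sit next to the files they describe.

```shell
# Hypothetical helper, not part of the original recipes.
# Verifies every *.fastq.gz.md5 found under the directory given as $1.
verify_md5s() {
  # The .md5 files hold bare file names, so each check has to run
  # from the directory that contains the checksum file.
  find "$1" -name '*.fastq.gz.md5' -type f -exec sh -c '
    status=0
    for f in "$@"; do
      (cd "$(dirname "$f")" && md5sum -c --quiet "$(basename "$f")") || status=1
    done
    exit "$status"
  ' sh {} +
}
```

Called as verify_md5s "$RAWDIR", it stays silent for files that match and returns non-zero if any checksum fails, so under set -e a bad copy would stop the script.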
[Header]
IEMFileVersion,4
Investigator Name,JohnDoe
Experiment Name,Experiment1
Date,01/01/2000
Workflow,GenerateFASTQ
Application,FASTQ Only
Assay,TruSeq LT
Description,Nugene samples
Chemistry,Default
[Reads]
151
151
[Settings]
ReverseComplement,0
Adapter,AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
AdapterRead3,AGATCGGAAGAGCGTCGTGTAGGGAAAGA
[Data]
Sample_ID,Sample_Name,Sample_Plate,Sample_Well,I7_Index_ID,index,Sample_Project,Description
s1,s1,,,A001,ACGTCGTT,Experiment1,
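
The i8 in the masks above only makes sense if the barcodes in the sample sheet are 8 bases long. As a rough sanity check, the sketch below prints the index length for every row of the [Data] section. The name index_lengths is hypothetical, and the sketch assumes a sheet laid out like the one above, with a header row directly after [Data] that contains a column called index.

```shell
# Hypothetical helper, not part of bcl2fastq or the original recipes.
# Prints "<Sample_ID> <index length>" for each sample in the [Data] section of $1.
index_lengths() {
  awk -F, '
    { sub(/\r$/, "") }                              # tolerate Windows line endings
    /^\[Data\]/ { in_data = 1; header = 0; next }   # start of the data section
    /^\[/       { in_data = 0; next }               # any other section ends it
    in_data && !header {                            # first row after [Data] is the header
      for (c = 1; c <= NF; c++) if ($c == "index") col = c
      header = 1; next
    }
    in_data && col && NF >= col { print $1, length($col) }
  ' "$1"
}
```

For the sheet above, index_lengths SampleSheet.csv prints s1 8, which agrees with the i8 part of the mask.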