David Rio drio

🐢

I don't know

I have no idea how to do that, let me get started.


drio / gist:392508

Created May 6, 2010 18:28

	fprintf(stderr, "DRD>> name: %s\n", a->readName);
	fprintf(stderr, "DRD>> length: %d\n", a->readNameLength);
	fprintf(stderr, "DRD>> space: %d\n", a->space);
	fprintf(stderr, "DRD>> numEnds: %d\n", a->numEnds);

drio / go2.sh

Created May 7, 2010 15:54

$ ruby -pe '$_.gsub!(%r{/stornext/snfs1/next-gen/solid/hgsc.solid.pipeline/hgsc.bfast.pipe}, "/stornext/snfs1/next-gen/drio-scratch/working.copies/hgsc.bfast.pipe")' go.sh > go2.sh

drio / lsf script

Created May 13, 2010 21:09

lsf script mode

	$ cat test.lsf
	#BSUB -o output.txt
	#BSUB -e error.txt
	#BSUB -q normal
	touch ./great.txt

drio / gist:402918

Created May 16, 2010 15:03

dnaa patch

	Subject: [PATCH] There may be cases in PE/MP data where the second read is not available.
	We have to consider those reads (singletons) as read1 not read2.

	---
	dqc/dqc_postalignqc.c | 12 +++++++++++-
	1 files changed, 11 insertions(+), 1 deletions(-)

	diff --git a/dqc/dqc_postalignqc.c b/dqc/dqc_postalignqc.c
	index d3a4881..250b4d1 100644
	--- a/dqc/dqc_postalignqc.c

drio / gist:404080

Created May 17, 2010 18:38

bfast2

	DRD>> matching one end: RN : T30100230100230100230100230100101002301002301000000
	DRD>> matching one end: #Entries: 1
	DRD>> i: 0 referencePositions[ctr]: -1923810816 m->positions[i]: 80
	DRD>> matching one end: RN : T3100230100230100230100023
	DRD>> matching one end: #Entries: 27
	DRD>> i: 0 referencePositions[ctr]: -1923810816 m->positions[i]: -1272874568

	bfast: Align.c:235: AlignRGMatchesOneEnd: Assertion `readStartInsertionLengths[ctr] + readEndInsertionLengths[ctr] <= readLength' failed.
	./run_bfast.sh: line 48: 16959 Aborted (core dumped) $bbin localalign -U $space -t -f $ref -n1 -m $seed.bmf > $seed.baf

drio / gist:405196

Created May 18, 2010 16:32

bfast script

	$ cat run_bfast.sh
	#!/bin/bash
	#
	set -e
	#set -x

	dist=$(pwd)
	#fq="$dist/reads/ecoli.reads.fastq"
	fq="$dist/reads/reads.problem_zlib.fastq"
	ref_h="/stornext/snfs4/next-gen/solid/bf.references/h/hsap.36.1.hg18/hsap_36.1_hg18.fa"

drio / gist:406715

Created May 19, 2010 19:15

	BOOM!: read name: 429_1207_1471
	[bns_coor_pac2real] bug! Coordinate is longer than sequence (4294967294>=3080436051). Abort!
	./run_bfast.sh: line 42: 27646 Aborted (core dumped) $bbin bwaaln -c -t8 $ref $fq > $seed.bmf

drio / gist:407751

Created May 20, 2010 16:15

	[bwa_aln_core] write to the disk...
	>> read name: 429_1207_1471 (p->aln[j].a= 0)

	>> bwt[1]->seq_len: 3080436051

	>> bwt_sa(bwt[1]): 3080436002

	>> p->len: 51

	[bns_coor_pac2real] bug! Coordinate is longer than sequence (4294967294>=3080436051). Abort!
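
One way to read the coordinate in the abort message: 4294967294 is 2^32 - 2, which is the value -2 takes when stored in an unsigned 32-bit integer. The log does not show where the value comes from, so the sketch below only checks that arithmetic. The helper names are hypothetical and not part of bwa.

```ruby
# Hypothetical helpers for reinterpreting 32-bit values; not part of bwa.
UINT32_MODULUS = 2**32

# Value an integer takes when stored in an unsigned 32-bit slot.
def as_uint32(n)
  n % UINT32_MODULUS
end

# Signed (two's complement) interpretation of an unsigned 32-bit value.
def as_int32(u)
  u >= 2**31 ? u - UINT32_MODULUS : u
end

reported = 4_294_967_294   # coordinate from the abort message
seq_len  = 3_080_436_051   # bwt[1]->seq_len from the log

puts as_int32(reported)          # signed view of the reported coordinate
puts as_uint32(-2) == reported   # does -2 wrap around to the reported value?
puts seq_len > 2**31 - 1         # does seq_len exceed the signed 32-bit range?
```

This prints `-2`, `true`, and `true`, which is consistent with a small negative intermediate wrapping around, though the log alone does not confirm that this is what happens.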

drio / gist:409433

Created May 21, 2010 21:05

	For the match step, the software first reads a binary version of the reference genome
	into memory:

	/stornext/snfs4/next-gen/solid/bf.references/h/hsap.36.1.hg18/hsap_36.1_hg18.fa.nt.brg

	Then it splits the input data (reads from stornext) into 8 tmp files (/space1/tmp).

	Then, for each of the indexes (13G files located in
	/stornext/snfs4/next-gen/solid/bf.references/h/hsap.36.1.hg18/hsap_36.1_hg18.fa.cs.*.bif),
	it loads one at a time and spawns 8 threads, each processing the data from the tmp files (8 files).
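
The control flow described above can be sketched roughly as follows. This is a toy model, not bfast code: every name is hypothetical, an "index" is just a list of known reads, the "tmp files" are in-memory arrays, and it assumes one chunk per thread, which the description does not state outright.

```ruby
# Toy model of the match step; all names are hypothetical, not bfast's.
NUM_THREADS = 8

# Split the reads round-robin into n chunks (standing in for the tmp files).
def split_reads(reads, n)
  chunks = Array.new(n) { [] }
  reads.each_with_index { |read, i| chunks[i % n] << read }
  chunks
end

# Placeholder for loading one index from disk; here it is a no-op.
def load_index(index)
  index
end

# Load the indexes one at a time; for each, search every chunk in its own thread.
def match(reads, indexes, n = NUM_THREADS)
  chunks = split_reads(reads, n)
  hits = Hash.new { |h, k| h[k] = [] }
  indexes.each_with_index do |index, index_id|
    loaded = load_index(index)
    threads = chunks.map do |chunk|
      Thread.new { chunk.select { |read| loaded.include?(read) } }
    end
    # Collect results in the main thread so the hash is never written concurrently.
    threads.each { |t| t.value.each { |read| hits[read] << index_id } }
  end
  hits
end

reads = %w[r1 r2 r3 r4 r5 r6 r7 r8 r9 r10]
indexes = [%w[r1 r3], %w[r3 r10]]
p match(reads, indexes)
```

Because only one index is resident at a time, each index has to be loaded and released in turn, so the per-index load cost is paid once for every index regardless of how many threads search it.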

drio / gist:410252

Created May 22, 2010 18:11

	#!/usr/bin/env ruby19
	#
	# Total time loading the reference genome: 0 hour, 3 minutes and 2 seconds.
	# Total time loading and deleting indexes: 6 hour, 28 minutes and 36 seconds.
	# Total time searching indexes: 1 hour, 30 minutes and 2 seconds.
	# Total time merging and writing output: 0 hour, 12 minutes and 25 seconds.
	# Total time elapsed: 8 hours, 20 minutes and 12 seconds.
	#
	require 'find'
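
The phase timings quoted in the header comment of this script can be tallied to see where the run spends its time. The sketch below is separate from the script itself: the constant and method names are made up for illustration, and the durations are copied from the comment.

```ruby
# Phase durations copied from the header comment, as [hours, minutes, seconds].
PHASES = {
  "loading the reference genome" => [0, 3, 2],
  "loading and deleting indexes" => [6, 28, 36],
  "searching indexes"            => [1, 30, 2],
  "merging and writing output"   => [0, 12, 25],
}
TOTAL_ELAPSED = [8, 20, 12]

# Convert hours, minutes, seconds into a number of seconds.
def to_seconds(h, m, s)
  h * 3600 + m * 60 + s
end

total = to_seconds(*TOTAL_ELAPSED)

# Print each phase with its share of the total elapsed time.
PHASES.each do |name, hms|
  secs = to_seconds(*hms)
  printf("%-30s %6d s  %5.1f%%\n", name, secs, 100.0 * secs / total)
end

accounted = PHASES.values.sum { |hms| to_seconds(*hms) }
puts "not covered by the four phases: #{total - accounted} s"
```

By this tally, loading and deleting indexes takes 23316 of 30012 seconds, a little under 78% of the run, and the four phases together leave 367 seconds of the reported total unaccounted for.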
