QC Illumina flowcell, demultiplex, and QC mapped BAM
from collections import defaultdict
import matplotlib.pyplot as plt
import numpy as np
import vcf
from palettable.colorbrewer.qualitative import Paired_12
from scipy.stats import gaussian_kde
from vcf_figures.helpers import *
I hereby claim:
- I am clintval on github.
- I am clintval (https://keybase.io/clintval) on keybase.
- I have a public key ASA1U0QRFP5NAtESP-6ztuMfcyXE23LNCGe9r7Vlb71nRgo
To claim this, I am signing this object:
Name,Slack Call Sign,Location,Interests,Skills (General),Preferred Language,Programming Languages,Education Background,Github Username,Preferred group within collab
Mark van der Sman,mvdsman,"Leiden, The Netherlands","Visualisation, genomics, pattern recognition/ML, statistical analysis, phylogenetics/evolution (and many more)","Visualisation, genomics, web development, some transcriptomics and some Machine Learning/Natural Computing. Biologist going for MSc BI",R,"Most experienced in R/RShiny, decent in Python and HTML/CSS/Markdown, basics in C++, PHP, SQL. Flaming hatred towards Matlab","B.Sc. Biology + minor Data Science, going for M.Sc. Bioinformatics",MvdSman,Visualisation
Anthony Fejes,apfejes,"Bay Area, California","Visualization, transcriptomics, epigenomics, genomics,","Biologist, biochemist, programmer, entrepreneur",Python,"Python, C, SQL/No-SQL, Java, etc. (Anything except R, Javascript)","B.Sc. Biochemistry
B.I.S.
M.Sc. Microbiology and Immunology
PhD. Bioinformatics",apfejes,
Dimitrios - Georgios
The International Nucleotide Sequence Database Collaboration (INSDC) consists of the DNA DataBank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL) and GenBank at NCBI. As part of the Collaboration, all three organizations accept new sequence submissions and share sequence data among the three databases. To facilitate the exchange of data, each member of the collaboration is assigned certain accession prefixes. In addition to the accession number, GenBank records also have a GI number. The GI number is simply a series of digits assigned consecutively to sequences submitted to NCBI.
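The distinction between an accession number and a GI number can be illustrated with a small classifier. This is a minimal sketch, not an official parser: the function name `classify_identifier` is hypothetical, and the accession pattern is an assumption that covers only the classic nucleotide shapes (one letter plus five digits, or two letters plus six digits, with an optional `.version` suffix). Other INSDC formats, such as protein or WGS accessions, would come back as `"unknown"`.

```python
import re

# Assumption: only the classic nucleotide accession shapes are recognised here
# (1 letter + 5 digits, or 2 letters + 6 digits), optionally with ".version".
ACCESSION = re.compile(r"(?:[A-Z]\d{5}|[A-Z]{2}\d{6})(?:\.\d+)?")

# A GI number is simply a series of digits.
GI_NUMBER = re.compile(r"\d+")


def classify_identifier(identifier: str) -> str:
    """Return "gi", "accession", or "unknown" for a sequence identifier."""
    token = identifier.strip()
    if GI_NUMBER.fullmatch(token):
        return "gi"
    if ACCESSION.fullmatch(token.upper()):
        return "accession"
    return "unknown"


if __name__ == "__main__":
    for example in ("U49845.1", "AF123456", "1293613", "not-an-id"):
        print(example, classify_identifier(example))
```

Checking for the all-digit GI form first keeps the two cases unambiguous, since an accession in these formats always starts with a letter.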
# Requires the STAR executable to be at:
# /pipeline/packages/star
#
# Overhang is set to the read length (template cycles) of 142 - 1:
#
OVERHANG=141
git clone \
    https://github.com/dpryan79/ChromosomeMappings.git \
# After a `pip install sample-sheet pendant`
from sample_sheet import SampleSheet
from pendant.aws.s3 import S3Uri

def s3_validate_sample_sheet(path):
    for sample in SampleSheet(path):
        left = S3Uri(sample.PathToFastq1)
        right = S3Uri(sample.PathToFastq2)
        assert left.object_exists()
object SampleUtil {
  /** Join all of the data across a collection of samples. All fields will be joined on the delimiter `";"`. Regardless
    * of the lanes the libraries were sequenced on, the resulting sample will have the lanes field cleared to [[None]].
    * The merged sample will have its ordinal set to zero.
    *
    * @throws IllegalArgumentException when there are no libraries to merge
    * @throws IllegalArgumentException when trying to join samples with different sample names
    */
  def merge(samples: Seq[Sample]): Sample = {
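The Scala body of `merge` is cut off here, so the following is only a Python sketch of the behavior the doc comment describes, not the original implementation. The `Sample` dataclass and its fields (`ordinal`, `name`, `library_id`, `lanes`) are hypothetical stand-ins, since the real class is not shown; `ValueError` stands in for `IllegalArgumentException`, and joining values in input order without de-duplication is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

DELIMITER = ";"


@dataclass(frozen=True)
class Sample:
    # Hypothetical fields: the real Scala `Sample` class is not shown in the source.
    ordinal: int
    name: str
    library_id: str
    lanes: Optional[str] = None


def merge(samples: Sequence[Sample]) -> Sample:
    """Merge samples that share a name into one sample, as the doc comment describes."""
    if not samples:
        raise ValueError("there are no libraries to merge")
    names = {sample.name for sample in samples}
    if len(names) > 1:
        raise ValueError(f"cannot join samples with different sample names: {sorted(names)}")
    return Sample(
        ordinal=0,  # the merged sample has its ordinal set to zero
        name=samples[0].name,  # all names are equal at this point
        # Assumption: values are joined in input order, without de-duplication.
        library_id=DELIMITER.join(sample.library_id for sample in samples),
        lanes=None,  # lanes are cleared regardless of the input lanes
    )


if __name__ == "__main__":
    merged = merge([Sample(1, "s1", "libA", "1"), Sample(2, "s1", "libB", "2")])
    print(merged)
```

Both error conditions are checked before any joining happens, so a failed merge never returns a partially built sample.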
Fooled by the timezone you live in?
EDT - implicit
❯ aws s3 ls s3://example-ngs-data/30-415555663/ | head -n1
2020-10-14 17:57:17 24784494053 sample-1_S21_L003_R1_001.fastq.gz
PDT - explicit