Downloaded "Antigenic Formulae Of The Salmonella Serovars 2007 9th edition" from:
https://www.pasteur.fr/ip/portal/action/WebdriveActionEvent/oid/01s-000036-089
Remove page numbers:
- match
| #! /usr/bin/Rscript --vanilla | |
| library(getopt) | |
| spec <- matrix(c( | |
| 'msa_dir_path','d',1,'character','MSA directory path (required)' | |
| ,'msa_file_ext','e',2,'character','MSA file extension (optional; default: "aln")' | |
| ,'out','o',2,'character','Output core SNP matrix CSV filename (optional; default: "core_distance_matrix.csv")' | |
| ,'n_cores','c',2,'integer','Number of cores to use for computation (optional; default: 2)' | |
| ,'dna_distance_model','m',2,'character','DNA distance model (default: "N"). |
Downloaded "Antigenic Formulae Of The Salmonella Serovars 2007 9th edition" from:
https://www.pasteur.fr/ip/portal/action/WebdriveActionEvent/oid/01s-000036-089
Remove page numbers:
| #!/usr/bin/env python | |
| import argparse | |
| import textwrap | |
| import os | |
| import sys | |
| import json | |
| import re | |
| /*bitbucket.org dark css theme*/ | |
| body, aside { | |
| background: #222 !important; | |
| background-color: #222 !important; | |
| color: #bbb !important; | |
| } | |
| h1, h2, h3, h4, h5, span { | |
| background-color: transparent !important; | |
| color: #FFC963 !important; |
This JS+D3 gist creates a scatterplot with zooming and panning enabled as well as a brush for selecting or deselecting points using the iris dataset within data.tsv.
The "Get Selection" button gets the current selection of points and prints their ids to the JS console (i.e. console.log(selection);).
The "Clear Selection" button clears the current selection.
| import argparse | |
| import textwrap | |
| import os | |
| import sys | |
| from datetime import timedelta, datetime | |
| # function for reading a multifasta file | |
| # returns a dictionary with sequence headers and nucleotide sequences | |
| def get_seqs_from_fasta(filepath): |
| """ | |
| SAM-based reboot | |
| """ | |
| import sys, os, subprocess, itertools, array, datetime, socket, heapq, tempfile | |
| library(RColorBrewer) | |
| qualitative_colours <- function(n, light=FALSE) { | |
| # Get a specified number of qualitative colours if possible. | |
| # This function will default to a continuous color scheme if there are more | |
| # than 21 colours needed. | |
| # rainbow12equal <- c("#BF4D4D", "#BF864D", "#BFBF4D", "#86BF4D", "#4DBF4D", "#4DBF86", "#4DBFBF", "#4D86BF", "#4D4DBF", "#864DBF", "#BF4DBF", "#BF4D86") | |
| rich12equal <- c("#000040", "#000093", "#0020E9", "#0076FF", "#00B8C2", "#04E466", "#49FB25", "#E7FD09", "#FEEA02", "#FFC200", "#FF8500", "#FF3300") |
| aln_snps = {} | |
| for aln in aln_files: | |
| recs = [f for f in SeqIO.parse(aln, 'fasta')] | |
| # strain names should be the last dash delimited element in fasta header | |
| strains = [rec.name.split('-')[-1] for rec in recs] | |
| # get a dictionary of strain names and sequences | |
| strain_seq = {rec.name.split('-')[-1]:''.join([nt for nt in rec.seq]) \ | |
| for rec in recs} | |
| # get length of the MSA and check that all of the seq are the same length |
| # This file contains a set of functions for parsing out some useful information | |
| # from BLAST results files saved in BLAST's tabular output format ("-outfmt 6"). | |
| # Biopython is required for reading multifasta files and storing sequences. | |
| from Bio.Seq import Seq | |
| from Bio.SeqRecord import SeqRecord | |
| from Bio.Alphabet import IUPAC | |
| # if all of your genome sequences are within one multifasta file | |
| recs = [rec for rec in SeqIO.parse('all_genomes.fasta', 'fasta')] |