Day 1: 25 June 2019
- Martin providing some "brief logistics"
Conference info: https://bioc2018.bioconductor.org/
My first Bioconductor meeting, and I'm not a BioC or R expert so these notes are probably going to be naïve!
Working through running HiFive on a Hi-C dataset.
First, a note on memory and performance: bin size influences everything. Starting with a bin size of 40kb, loading data in hg38 seems to stay under ~16GB. At fend-level resolution, memory requirements approach ~32GB and running time increases several fold.
HiFive stores a fend file with information on the locations of restriction fragments in the genome. We need to get the locations of the RE sites into a BED
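As a rough illustration of that step, here is a minimal sketch that scans a FASTA file for a restriction enzyme recognition sequence and writes one BED line per site. It does not use HiFive itself: the function names and the default site (`AAGCTT`, the HindIII recognition sequence) are assumptions for the example. Whether HiFive wants the full recognition span or only the cut position in the BED file is also an assumption here, so check its documentation before using the output.

```python
import re


def read_fasta(handle):
    """Yield (name, sequence) pairs from an open FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the sequence name.
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def re_sites_to_bed(fasta, bed, site="AAGCTT"):
    """Write a BED line (0-based, half-open) for every occurrence of `site`.

    `site` is a hypothetical default; substitute the enzyme actually used.
    Returns the number of sites written.
    """
    # A lookahead lets the scan also report overlapping occurrences.
    pattern = re.compile("(?=%s)" % re.escape(site), re.IGNORECASE)
    count = 0
    for chrom, seq in read_fasta(fasta):
        for match in pattern.finditer(seq):
            start = match.start()
            bed.write(f"{chrom}\t{start}\t{start + len(site)}\n")
            count += 1
    return count
```

Both arguments are open file handles, so a call might look like `re_sites_to_bed(open("genome.fa"), open("sites.bed", "w"))`, where both file names are placeholders.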
Once upon a time there was the Genome ALignment and Annotation database or GALA, which allowed for analysis of genomic elements alongside comparative genomic information. However, this tool supported only a few analyses. What-would-be-galaxy was born from the idea of being able to easily take any existing analysis tool and quickly integrate it into this platform. But what should we call this next direction? Bob Harris suggested the use of X/Y to represent this "next dimension" of analysis. GALA + XY ⟶ GALAXY ⟶ Galaxy.
Or at least this is how I remember it.
#usegalaxy
# Mostly based on this:
# https://github.com/Homebrew/linuxbrew/wiki/Standalone-Installation
# But I started with nothing (no ruby, no gcc)
# Ruby and GCC will go here
mkdir bootstrap
# Get GCC 4.4 and install under bootstrap
# We also need libstdc++ when we get to building gcc-4.9 because somebody decided it was a good idea to start writing GCC in C++
wget http://ftp1.scientificlinux.org/linux/scientific/55/x86_64/SL/gcc44-4.4.0-6.el5.x86_64.rpm
/**
 * usage: node scrape_gs.js USERKEY
 *
 * Determine h-index for papers published AFTER each year found in a Google
 * scholar profile. The USERKEY is found in your Google scholar citations
 * page url.
 */
var request = require('request');
var cheerio = require('cheerio');
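The calculation described in that header comment can be sketched independently of the scraping. The Python below is not the script's own code. It assumes the papers are already available as `(year, citations)` pairs and that "after" means strictly later than the given year. Both function names are made up for the example.

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def h_index_after_each_year(papers):
    """Map each publication year to the h-index of papers published after it.

    `papers` is a list of (year, citations) pairs. "After" is taken to mean
    strictly later than the year, which is an assumption about the script.
    """
    years = sorted({year for year, _ in papers})
    return {
        year: h_index([cites for pub_year, cites in papers if pub_year > year])
        for year in years
    }
```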
/*
 * BLAST - Search two DNA sequences for locally maximal segment pairs. The basic
 * command syntax is
 *
 * BLAST sequence1 sequence2
 *
 * where sequence1 and sequence2 name files containing DNA sequences. Lines
 * at the beginnings of the files that don't start with 'A', 'C', 'T' or 'G'
 * are discarded. Thus a typical sequence file might begin:
 *