Rik Smith-Unna blahah

  • upward spiral ∞⟨X∴↯⟩∞
  • Bristol / Berlin / Nairobi
  • X @blahah404
@blahah
blahah / summarise_assemblies.R
Last active December 4, 2015 15:40
Summary statistics and plots of transcriptome assemblies (for ICIPE TReND NGS students)
setwd("~/project_icipe")
library(ggplot2)
# 1. load the files
oases <- read.csv('./oases.csv')
soap <- read.csv('./soapdenovotrans.csv')
# check the number of rows and columns
dim(oases)
dim(soap)
blahah / refseq_to_gene_symbol_biomart.R
Created December 15, 2015 16:28
get RefSeq mRNA mapping to human gene symbol via ensembl biomart
# This script demonstrates how to download a mapping from RefSeq mRNA to human gene symbols by using the ensembl biomart service and the bioconductor `biomaRt` package in R.
source("https://bioconductor.org/biocLite.R")
biocLite("biomaRt")
library("biomaRt")
# work around bug in resolving host (https://support.bioconductor.org/p/74304/)
listMarts(host="www.ensembl.org")
blahah / get_elsevier_pdf.sh
Created January 19, 2016 19:25
wget PDFs from sciencedirect.com (elsevier)
# with example PDF URL (quoted, so ?, = and & need no backslash escapes)
PDF_URL="http://www.sciencedirect.com/science/article/pii/S0191886909000877/pdfft?md5=25fb6461c7b1ebe114acb2cd5ab91ddd&pid=1-s2.0-S0191886909000877-main.pdf"
wget \
  --header "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36" \
  --verbose \
  --save-cookies cookies.txt \
  --keep-session-cookies \
  "$PDF_URL"
prefix	publisher	journals	dois
10.12679		0	0
10.7579	123Doc Education	0	0
10.3731	21st Century COE Program (Toplogical Science and Technology)	1	40
10.5775	A. I. Rosu Cultural Scientific Foundation Fundatia cultural-stiintifica A. I. Rosu	1	80
10.4037	AACN Publishing	2	766
10.1306	AAPG/Datapages	4	21817
10.3183	AB Svensk Papperstidning	1	1550
10.5769	ABEAT - Associacao Brasileira de Especialistas em Alta Tecnologia	1	57
10.7597	ACOPIOS - Revista Iberica de Mineralogia	1	9
blahah / concept-graph.js
Last active May 21, 2019 18:58 — forked from kanesee/concept-graph.js
d3 2-way tree
var CollapsibleTree = function(elt) {
  var m = [20, 120, 20, 120],
      w = 1280 - m[1] - m[3],
      h = 580 - m[0] - m[2],
      i = 0,
      root,
      root2;
  var tree = d3.layout.tree()
{
  "options": {
    "directed": true,
    "multigraph": true,
    "compound": true
  },
  "nodes": [
    {
      "v": "questions",
      "value": {
blahah / .block
Last active April 5, 2016 12:01 — forked from mbostock/.block
Programmatic Pan+Zoom
license: gpl-3.0
blahah / demo.sh
Last active April 11, 2016 16:25
UNIX one-liner to split a EuropePMC XML archive into a stream of articles
sed '1d;$d' | sed 's/<\/article>/<\/article>♛/g' | tr -d '\n' | tr '♛' '\n' | less -S
# e.g. download every EuropePMC archive and convert them to a stream of all articles
curl --silent http://europepmc.org/ftp/oa/ | \
grep -o 'PMC[0-9]*_PMC[0-9]*\.xml\.gz' | \
sort | uniq | \
sed 's/^/http:\/\/europepmc.org\/ftp\/oa\//' | \
xargs -n 1 curl --silent | gunzip | grep -vP '<\/?articles>' | \
sed 's/<\/article>/<\/article>♛/g' | tr -d '\n' | tr '♛' '\n' | less -S
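The same splitting step can be sketched in Node.js for cases where the archive is already in memory. This is only an illustration of the technique, not part of the gist: `splitArticles` is a hypothetical helper, and it assumes the wrapper element sits alone on the first and last lines, as the `sed '1d;$d'` step implies.

```javascript
// Hypothetical helper mirroring the sed/tr pipeline in plain JavaScript.
// Assumes the first and last lines hold only the wrapper element.
function splitArticles (xml) {
  // Ignore a single trailing newline so the last line is the closing wrapper.
  var lines = xml.replace(/\n$/, '').split('\n')
  // Drop the first and last lines, like `sed '1d;$d'`, and remove all
  // remaining newlines, like `tr -d '\n'`.
  var body = lines.slice(1, -1).join('')
  // Cut after every closing tag so each article becomes one entry.
  return body
    .split('</article>')
    .filter(function (chunk) { return chunk.trim().length > 0 })
    .map(function (chunk) { return chunk + '</article>' })
}

var sample = '<articles>\n<article>\n<title>A</title>\n</article>\n' +
  '<article>\n<title>B</title>\n</article>\n</articles>\n'
splitArticles(sample).forEach(function (article) {
  console.log(article)
})
```

Unlike the shell version, this holds the whole archive in memory, so it suits single files rather than the full download stream.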
blahah / README.md
Created April 17, 2016 07:34
Comparison of NoSQL databases that can be embedded in a Node.js / Electron app

Making an Electron app and want to embed a database? Here's a table to help you choose the right database software.

Work in progress

Project Language NPM package? Flat file? In memory? Full-text search? Maturity Embedded size Dependencies
NeDB Javascript Yes Always No High Low Simple
Lunr Javascript No Yes Yes High Tiny None
Search-index Javascript
Sqlite3 C Yes No Yes (fts extens
blahah / optimise_eupmc.js
Last active April 18, 2016 17:37
europe pubmed central bibJSON -> search index optimised term array
#!/usr/bin/env node
// this script optimises the eupmc metadata for inclusion in a full-text
// search index.
var Transform = require("stream").Transform
var util = require("util")
var natural = require('natural')
var tokenize = (new natural.TreebankWordTokenizer()).tokenize
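The preview stops after the tokenizer is set up. As a rough sketch of the general approach only, and not the gist's actual code, an object-mode `Transform` could turn each metadata record into an array of unique terms. Everything named here is an assumption: `TermStream` and `simpleTokenize` are hypothetical, the regex tokenizer stands in for `natural`, and the `title` and `abstract` field names are guesses at the record shape.

```javascript
var Transform = require('stream').Transform
var util = require('util')

// Hypothetical stand-in for a real tokenizer: lower-case the text and
// split on anything that is not an ASCII letter or digit.
function simpleTokenize (text) {
  return String(text).toLowerCase().split(/[^a-z0-9]+/).filter(Boolean)
}

// Hypothetical transform: one metadata object in, one array of unique
// terms out.
function TermStream () {
  Transform.call(this, { objectMode: true })
}
util.inherits(TermStream, Transform)

TermStream.prototype._transform = function (doc, encoding, done) {
  var seen = Object.create(null)
  var terms = []
  // `title` and `abstract` are assumed field names, not taken from the gist.
  var fields = [doc.title, doc.abstract]
  fields.forEach(function (field) {
    if (!field) return
    simpleTokenize(field).forEach(function (term) {
      if (!seen[term]) {
        seen[term] = true
        terms.push(term)
      }
    })
  })
  this.push(terms)
  done()
}
```

Deduplicating terms per record keeps the index input small; whether the original script does the same is not visible in the preview.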