Conor Lawless CnrLwlss

@CnrLwlss
CnrLwlss / SciProg2020.md
Last active May 14, 2020 09:32
Scientific programming

What makes a great scientific computing language?

  • Expressivity: it should be possible to express mathematical and programming concepts concisely and elegantly. This reduces the mental overhead of writing and reading models and algorithms as code.
  • Speed: it should be possible to turn code into programs that get the most out of modern computer hardware
    • Functional programming
    • Parallel computing
  • Extensibility: the language should have a powerful, unambiguous, easy-to-use package management system built into its core
  • Visualisation: a large and very important part of scientific programming is generating side effects: writing reports, generating plots, creating interactive documents

CnrLwlss / SciProg2020.md
Created May 13, 2020 22:53
Scientific programming

Why Julia?

  • Scientific programming language
  • Open source
  • Package management
  • Supports functional programming
  • Designed to take advantage of computers with multiple CPUs

Why R?

CnrLwlss / usingSpeedTest.R
Created November 13, 2019 12:06
Generate broadband speed test reports
#https://github.com/hrbrmstr/speedtest
library(speedtest)
fname = "speedtest_results2.txt"
makeplots = FALSE
if(!makeplots){
  config = spd_config()
  servers = spd_servers(config = config)
  servers = spd_closest_servers(servers, config)
  best = spd_best_servers(servers, config, max = 3)
CnrLwlss / StripchartOpacity.R
Last active June 14, 2019 16:57
R script demonstrating a few things: 1) drawing boxplots with notches roughly indicating significance of differences 2) stripcharts using transparency to highlight values of high density 3) overlaying boxplot on stripcharts and 4) writing plots as multi-page .pdf reports
# Info about boxplot notches https://sites.google.com/site/davidsstatistics/home/notched-box-plots
# Article about why not to use barplots https://doi.org/10.1371/journal.pbio.1002128
# Article about being wary of summary statistics and why raw data plots are better than boxplots https://www.autodeskresearch.com/publications/samestats
# Generate some fake data
concs = seq(0,10,1)
concobs = rep(concs,each=500)
mdel = function(x) -x^2+10*x+20
vals = mdel(concobs) + rnorm(length(concobs),0,12)
dat = data.frame(conc=concobs,val=vals)
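The notches mentioned in the description can be sketched outside R as well. The Python below is an illustration only, not part of the gist: it assumes the commonly used rule that a notch spans the median ± 1.58 × IQR / √n, and the function names `notch_interval` and `notches_overlap` are hypothetical. Quartiles here come from `statistics.quantiles`, which can differ slightly from the hinges R uses.

```python
import math
import statistics


def notch_interval(values):
    """Return (lower, median, upper) for the notch around the median.

    Assumes the conventional notch half-width 1.58 * IQR / sqrt(n).
    """
    # With method="inclusive" and n=4, the middle cut point is the median.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = 1.58 * (q3 - q1) / math.sqrt(len(values))
    return median - half_width, median, median + half_width


def notches_overlap(a, b):
    """True if the notch intervals of two samples overlap."""
    lo_a, _, hi_a = notch_interval(a)
    lo_b, _, hi_b = notch_interval(b)
    return lo_a <= hi_b and lo_b <= hi_a


if __name__ == "__main__":
    group_low = list(range(1, 9))     # made-up example values
    group_high = list(range(11, 19))  # made-up example values
    print(notch_interval(group_low))
    print(notches_overlap(group_low, group_high))
```

Non-overlapping notches are only a rough visual hint that two medians differ; they do not replace a formal test.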
CnrLwlss / mtDNASpecies.R
Created March 14, 2019 13:38
Testing whether small deletions are under-represented within fibres containing multiple deletion species
# Testing whether small deletions are under-represented within fibres containing multiple deletion species
# Assume 200 mtDNA molecules per fibre section
Nassume = 200
# P7, P15 and P16 are the proportions of smaller mtDNA species in fibres with two or more mtDNA species
P7 = c(0.6,0.2,0.49,0.33,0.57,0.75,0.47,0.29,0.27,0.23,0.51,0.54)
N7 = rep(Nassume,length(P7))
Ndel7 = c(2,2,3,2,2,2,2,3,2,2,2,2)
P15 = c(0.39,0.73,0.37,0.17,0.43,0.53,0.54,0.57)
CnrLwlss / mutationLoadMeasures.R
Last active September 19, 2018 09:35
Comparing measures of mtDNA mutation load in single cells.
library(grDevices)
library(gtools)
dat = read.delim("data/RTdata.txt",sep="\t",stringsAsFactors=FALSE)
dat$PNUM = as.numeric(gsub("P","",dat$Patient))
dat$ID = sprintf("P%02d_%04d",dat$PNUM,dat$Cell.number)
dat$PAT = sprintf("P%02d",dat$PNUM)
colfunc = colorRamp(c("blue","yellow","red"),space="Lab")
colfun = function(x, alpha=1.0) {
CnrLwlss / explore_mitocyto.R
Created May 25, 2018 09:59
Generating some exploratory plots of mitocyto data.
# Read data and rename columns
dat = read.delim("mitocyto_merged_results.csv",sep=",",stringsAsFactors=FALSE)
colnames(dat) = c("value","id","channel","patient_id","patient_type")
# Specify which ids correspond to patients, and which to control
dat$patient_type = ifelse(dat$patient_id=="M1105","patient","control")
dat$patient_id = paste(toupper(substr(dat$patient_type,1,1)),dat$patient_id,sep="_")
# Specify some colours for plotting
dat$colour = "black"
CnrLwlss / estimate_mu.R
Last active November 13, 2019 09:46
How many replicate samples do we need to estimate the mean of a distribution? 2-panel plot, random output.
mu = 5
stdev = 2
N = 10000
data = rnorm(N,mu,stdev)
pdf = function(x) dnorm(x,mu,stdev)
bestmu = function(N,x) sum(x[1:N])/N
op=par(mfrow=c(1,2))
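The question in the description (how many replicates are needed to estimate a mean) can be explored with a short Python sketch. It is not a translation of the truncated R script: it reuses the values `mu = 5`, `stdev = 2` and `N = 10000` from above, while the seed, the sample sizes and the helper names `running_mean_error` and `standard_error` are assumptions made for illustration.

```python
import math
import random
import statistics


def running_mean_error(samples, true_mean, sizes):
    """Return (n, estimate, abs_error) using the mean of the first n samples."""
    results = []
    for n in sizes:
        estimate = statistics.fmean(samples[:n])
        results.append((n, estimate, abs(estimate - true_mean)))
    return results


def standard_error(stdev, n):
    """Theoretical standard error of the mean of n independent samples."""
    return stdev / math.sqrt(n)


mu, stdev, N = 5, 2, 10000  # same values as the R snippet
rng = random.Random(1)      # fixed seed is an assumption, for repeatability
data = [rng.gauss(mu, stdev) for _ in range(N)]

if __name__ == "__main__":
    for n, est, err in running_mean_error(data, mu, [10, 100, 1000, 10000]):
        se = standard_error(stdev, n)
        print(f"n={n:>5}  estimate={est:.3f}  |error|={err:.3f}  SE={se:.3f}")
```

Because the standard error shrinks as 1/√n, halving the expected error requires roughly four times as many replicates.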
#install.packages(c("mixOmics","RVAideMemoire"))
library(mixOmics)
library(RVAideMemoire)
# Calculate whether measure is greater in control group after scaling
# Used to colour points in VIP plots
direction=function(dt,measure){
  dts = as.data.frame(scale(dt[,-1]))
  dts$Group = dt$Group
  res = median(dts[[measure]][dts$Group=="Control"],na.rm=TRUE) > median(dts[[measure]][dts$Group!="Control"],na.rm=TRUE)
CnrLwlss / checktiff.py
Created February 9, 2018 12:01
Converting non-image .tiff data to pseudo-images (8-bit greyscale, high contrast).
from PIL import Image
import numpy as np
import os
for fname in os.listdir("."):
    if fname.endswith(".tiff"):
        im=Image.open(fname)
        arr=np.array(im,dtype=np.uint16)
        maxval = np.max(arr[arr>0])
        minval = np.min(arr[arr>0])
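The preview stops after the minimum and maximum positive values are found. One plausible next step, shown here only as a hedged sketch, is a linear contrast stretch of those values onto the 8-bit range 0 to 255. The function `stretch_to_8bit` is hypothetical, and the sketch uses plain Python lists rather than numpy or PIL so that it runs without extra dependencies. How the original script actually rescales is not visible in the source.

```python
def stretch_to_8bit(rows):
    """Linearly rescale positive values in a 2-D list to the range 0..255.

    Non-positive entries are treated as background and mapped to 0.
    This mapping is an assumption, not taken from the original script.
    """
    positive = [v for row in rows for v in row if v > 0]
    if not positive:
        # Nothing to stretch: return an all-zero image of the same shape.
        return [[0 for _ in row] for row in rows]
    lo, hi = min(positive), max(positive)
    span = hi - lo

    def scale(v):
        if v <= 0:
            return 0
        if span == 0:
            return 255  # a single positive level maps to full intensity
        return round(255 * (v - lo) / span)

    return [[scale(v) for v in row] for row in rows]


if __name__ == "__main__":
    raw = [[0, 100], [200, 500]]  # made-up 16-bit style values
    print(stretch_to_8bit(raw))
```

Stretching between the smallest and largest positive values gives maximum contrast, at the cost of mapping the smallest positive value to the same grey level as the background.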