Christina Chatzipantsiou cgpu

@cgpu
cgpu / r5.metal_r3.8xlarge_MapReads_trace.csv
Last active January 29, 2020 17:29
Benchmarks for MapReads {r5.metal, r3.8xlarge}
name,instance,realtime,status,exit,submit,duration,X.cpu,peak_rss,peak_vmem,rchar,wchar
MapReads_BWA_k19 (NA12878-NA12878_2-k default (19)),r3.8xlarge,8m 19s,COMPLETED,0,2020-01-29 14:55:46.318,8m 26s,3089.8%,6.9 GB,8.3 GB,6.9 GB,6.7 GB
MapReads_BWA_k19 (NA12878-NA12878_2-k default (19)),r5.metal,6m 22s,CACHED,0,2020-01-29T11:57:05.696Z,7m 12s,3043.8%,6.9 GB,8.3 GB,6.9 GB,6.7 GB
MapReads_BWA_k23 (NA12878-NA12878_2-k 23),r3.8xlarge,7m 26s,COMPLETED,0,2020-01-29 15:04:12.358,7m 34s,3082.5%,6.8 GB,8.3 GB,6.9 GB,6.7 GB
MapReads_BWA_k23 (NA12878-NA12878_2-k 23),r5.metal,5m 48s,CACHED,0,2020-01-29T11:57:05.682Z,6m 37s,3012.2%,6.8 GB,8.3 GB,6.9 GB,6.7 GB
MapReads_sambamba_NO_pipe (NA12878-NA12878_2-m 220GB),r5.metal,7m 16s,COMPLETED,0,2020-01-29T12:08:43.046Z,7m 19s,3034.4%,6.9 GB,173.8 GB,15.6 GB,10.7 GB
MapReads_sambamba_pipe (NA12878-NA12878_2-m 220GB),r3.8xlarge,9m 29s,COMPLETED,0,2020-01-29 15:11:46.208,9m 33s,3081.3%,12.4 GB,184.1 GB,15.6 GB,10.7 GB
MapReads_sambamba_pipe (NA12878-NA12878_2-m 220GB),r5.metal,4
@cgpu
cgpu / r3.8xlarge_MapReads_trace.csv
Last active January 29, 2020 17:19
MapReads Benchmark r3.8xlarge instance
name,status,exit,submit,duration,realtime,X.cpu,peak_rss,peak_vmem,rchar,wchar,instance
ScatterIntervalList (wgs_calling_regions.hg38.interval_list),COMPLETED,0,2020-01-29 14:25:42.103,47.8s,1.8s,292.3%,275.2 MB,35.4 GB,11 MB,18.8 MB,r3.8xlarge
MapReads_BWA_k23 (NA12878-NA12878_2-k 23),COMPLETED,0,2020-01-29 15:04:12.358,7m 34s,7m 26s,3082.5%,6.8 GB,8.3 GB,6.9 GB,6.7 GB,r3.8xlarge
MapReads_BWA_k19 (NA12878-NA12878_2-k default (19)),COMPLETED,0,2020-01-29 14:55:46.318,8m 26s,8m 19s,3089.8%,6.9 GB,8.3 GB,6.9 GB,6.7 GB,r3.8xlarge
MapReads_sambamba_pipe (NA12878-NA12878_2-m 220GB),COMPLETED,0,2020-01-29 15:11:46.208,9m 33s,9m 29s,3081.3%,12.4 GB,184.1 GB,15.6 GB,10.7 GB,r3.8xlarge
MapReads_samsort_pipe (NA12878-NA12878_2),COMPLETED,0,2020-01-29 14:26:29.933,9m 45s,8m 30s,3052.0%,12.9 GB,32.6 GB,13.6 GB,8.7 GB,r3.8xlarge
MapReads_samsort_NO_pipe_mem (NA12878-NA12878_2-m 7G),COMPLETED,0,2020-01-29 14:46:39.327,9m 7s,9m 4s,2898.5%,6.9 GB,227.5 GB,15.8 GB,10.9 GB,r3.8xlarge
MapReads_samsort_NO_pipe (NA12878-NA12878_2
@cgpu
cgpu / MapReads_trace_r5.metal.csv
Last active January 29, 2020 14:17
Benchmarks for MapReads process
name,status,exit,submit,duration,realtime,%cpu,peak_rss,peak_vmem,rchar,wchar
MapReads_sambamba_pipe (NA12878-NA12878_2-m 220GB),COMPLETED,0,2020-01-29T12:16:02.039Z,4m 45s,4m 42s,3375.2%,12.2 GB,184.1 GB,15.6 GB,10.7 GB
MapReads_BWA_k23 (NA12878-NA12878_2-k 23),CACHED,0,2020-01-29T11:57:05.682Z,6m 37s,5m 48s,3012.2%,6.8 GB,8.3 GB,6.9 GB,6.7 GB
MapReads_BWA_k19 (NA12878-NA12878_2-k default (19)),CACHED,0,2020-01-29T11:57:05.696Z,7m 12s,6m 22s,3043.8%,6.9 GB,8.3 GB,6.9 GB,6.7 GB
MapReads_sambamba_NO_pipe (NA12878-NA12878_2-m 220GB),COMPLETED,0,2020-01-29T12:08:43.046Z,7m 19s,7m 16s,3034.4%,6.9 GB,173.8 GB,15.6 GB,10.7 GB
MapReads_samsort_NO_pipe_mem (NA12878-NA12878_2-m 7G),COMPLETED,0,2020-01-29T12:08:43.150Z,7m 21s,7m 18s,2794.7%,6.8 GB,228 GB,15.8 GB,10.9 GB
MapReads_samsort_NO_pipe (NA12878-NA12878_2),COMPLETED,0,2020-01-29T12:08:43.055Z,7m 32s,7m 28s,2663.8%,6.9 GB,27.9 GB,15.8 GB,10.9 GB
MapReads_samsort_pipe (NA12878-NA12878_2),CACHED,0,2020-01-29T11:57:05.691Z,8m 20s,7m 17s,2712.3%,12.7 GB,32.6 GB,13.6
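The trace rows above report `realtime` as strings such as `8m 19s`, which makes the two instance types awkward to compare directly. The following is a minimal sketch, not part of the original gists, that converts those strings to seconds and computes the r3.8xlarge to r5.metal ratio for the three tasks that have a complete row on both instances. The helper `to_seconds` and the column names `r3_8xlarge` and `r5_metal` are hypothetical, and the parser assumes only minute (`m`) and second (`s`) units occur, as they do in these files.

```r
# Realtime values copied from the trace rows above.
realtime <- data.frame(
  task       = c("MapReads_BWA_k19", "MapReads_BWA_k23", "MapReads_sambamba_pipe"),
  r3_8xlarge = c("8m 19s", "7m 26s", "9m 29s"),
  r5_metal   = c("6m 22s", "5m 48s", "4m 42s"),
  stringsAsFactors = FALSE
)

# Convert strings such as "8m 19s" or "47.8s" to seconds.
# Assumption: only "m" and "s" units appear; other units (e.g. "ms", "h")
# are not handled by this sketch.
to_seconds <- function(x) {
  vapply(x, function(value) {
    tokens  <- regmatches(value, gregexpr("[0-9.]+[ms]", value))[[1]]
    amounts <- as.numeric(sub("[ms]$", "", tokens))
    units   <- sub("^[0-9.]+", "", tokens)
    sum(amounts * ifelse(units == "m", 60, 1))
  }, numeric(1), USE.NAMES = FALSE)
}

realtime$r3_seconds <- to_seconds(realtime$r3_8xlarge)
realtime$r5_seconds <- to_seconds(realtime$r5_metal)

# A ratio above 1 means the task finished faster on r5.metal.
realtime$speedup <- realtime$r3_seconds / realtime$r5_seconds

print(realtime[, c("task", "r3_seconds", "r5_seconds", "speedup")])
```

The two BWA rows for r5.metal have status `CACHED` in the trace, so this comparison reuses whatever timings were recorded for them.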
@cgpu
cgpu / gamlss.family.csv
Created December 22, 2019 15:26
gamlss.family {gamlss.dist}
Distributions,R.names,No.of.parameters
Beta,BE(),2
Beta Binomial,BB(),2
Beta negative binomial,BNB(),3
Beta one inflated,BEOI(),3
Beta zero inflated,BEZI(),3
Beta inflated,BEINF(),4
Binomial,BI(),1
Box-Cox Cole and Green,BCCG(),3
Box-Cox Power Exponential,BCPE(),4
@cgpu
cgpu / instancetypes.csv
Created December 20, 2019 17:21
AWS instance types
Instance type,Instance family,Instance size,Availability zones,Free-Tier eligible,Bare metal,Hypervisor,vCPUs,Architecture,Cores,Valid cores,Threads per core,Valid threads per core,Sustained clock speed (GHz),Memory (MiB),Storage (GB),Local instance storage,Storage type,Storage disk count,EBS encryption support,EBS optimization support,Network performance,ENA support,Maximum number of network interfaces,IPv4 addresses per interface,IPv6 addresses per interface,IPv6 support,Supported placement group strategies,GPUs,FPGAs,Auto Recovery support,Supported root devices,Dedicated Host support,On-Demand Hibernation support,Burstable Performance support,Current generation,On-Demand Linux pricing,On-Demand Windows pricing
c4.2xlarge,c4,2xlarge,"eu-west-2a, eu-west-2b, eu-west-2c",false,false,xen,8,x86_64,4,"1,2,3,4",2,"1,2",2.9,15360,-,-,-,-,supported,default,High,unsupported,4,15,15,true,"partition, spread, cluster",-,-,true,-,true,true,-,true,0.476 USD per Hour,0.844 USD per Hour
c4.4xlarge,c4,4xlarge,"eu-west-2a, e
@cgpu
cgpu / github_packages_metabolomics.csv
Created December 18, 2019 21:00
githubinstall::gh_search_packages("metabolomics")
username,package_name,title
CarlBrunius,<a href='https://github.com/CarlBrunius/batchCorr'/target='blank'>batchCorr</a>,Within and between batch correction of LC-MS metabolomics data
JustinZZW,<a href='https://github.com/JustinZZW/ZZWtool'/target='blank'>ZZWtool</a>,ZZWtool: common tools in metabolomics data process
PlantDefenseMetabolism,<a href='https://github.com/PlantDefenseMetabolism/MetCirc'/target='blank'>MetCirc</a>,A workflow for metabolomics data in R
PlantDefenseMetabolism,<a href='https://github.com/PlantDefenseMetabolism/MetabolomicTools'/target='blank'>MetabolomicTools</a>,A workflow for metabolomics data in R
Viant-Metabolomics,<a href='https://github.com/Viant-Metabolomics/msPurity'/target='blank'>msPurity</a>,Automated Evaluation of Precursor Ion Purity for Mass Spectrometry Based Fragmentation in Metabolomics
YonghuiDong,<a href='https://github.com/YonghuiDong/Miso'/target='blank'>Miso</a>,R package for Multi-isotope Labeling for Metabolomics Analysis
afukushima,<a href='https://github.com/a
@cgpu
cgpu / AllMetaRbolomics.csv
Created December 18, 2019 18:19
metaRbolomics packageverse
Table,Section,Functionalities,Package,Code_link,Reference,Repo
Table 1: R packages for mass spectrometry data handling and (pre-)processing.,MS data handling,"Parser for common file formats: mzXML, mzData, mzML and netCDF. Usually not used directly by the end user, but provides functions to read raw data for other packages.",mzR,https://doi.org/doi:10.18129/B9.bioc.mzR,[@chambers_2012],BioC
Table 1: R packages for mass spectrometry data handling and (pre-)processing.,MS data handling,"Infrastructure to manipulate, process and visualise MS and proteomics data, ranging from raw to quantitative and annotated data.",MSnbase,https://doi.org/doi:10.18129/B9.bioc.MSnbase,[@gatto_2012],BioC
Table 1: R packages for mass spectrometry data handling and (pre-)processing.,MS data handling,Export and import of processed metabolomics MS results to and from the mzTab-M for metabolomics data format.,rmzTab-M,https://lifs-tools.github.io/rmzTab-m/index.html,[@hoffmann_2019],GitHub
Table 1: R packages for mass spectrometry data
@cgpu
cgpu / nomnoml_awesomeness.Rmd
Last active December 18, 2019 13:34
Hello world example of the {nomnoml} R package for programmatically constructing graphs https://github.com/javierluraschi/nomnoml
---
title: "nomnoml"
output: html_document
---
```{r}
library(nomnoml)
```
@cgpu
cgpu / doi2citation.R
Created December 18, 2019 12:21
Cite papers in 1 of 2070 available styles with rOpenSci https://github.com/ropensci/rcrossref
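# Install rcrossref (only needed once).
install.packages("rcrossref")

# Look up the DOI and return the citation as plain text in the "bioinformatics" style.
rcrossref::cr_cn(dois = "10.1186/1471-2105-9-375",
                 format = "text",
                 style = "bioinformatics")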