@edmundmiller
Last active September 11, 2024 14:45
{"$schema":"https://raw.githubusercontent.com/jsonresume/resume-schema/v1.0.0/schema.json","basics":{"name":"Edmund Miller","label":"PhD Candidate | nf-core Core Team","image":"https://0.gravatar.com/avatar/765e027729c5964a4e5c83b90d6013bb9029b41ef2be474c2fee3eac3ba2be28","email":"[email protected]","phone":"","url":"https://edmundmiller.dev","summary":"","location":{"countryCode":"US","address":"United States"},"profiles":[{"url":"https://github.com/edmundmiller","username":"edmundmiller","network":"github"},{"network":"LinkedIn","username":"edmundmiller","url":"https://www.linkedin.com/in/edmundmiller/"},{"network":"Mastodon","username":"@[email protected]","url":"https://genomic.social/@emiller"},{"network":"ORCiD","username":"0000-0002-2398-0334","url":"https://orcid.org/0000-0002-2398-0334"},{"network":"Twitter","username":"E_miller88","url":"https://twitter.com/E_miller88"}]},"work":[{"name":"Element Biosciences","position":"Bioinformatics Engineering Associate","startDate":"2021-08-10","endDate":"2023-02-28","highlights":["Designed and created a system for internal pipelines and automating secondary analysis.","Gave a talk at Nextflow Summit 2022 on secondary analysis automation.","Contributed to converting the Loop Genomics pipeline from Python and Azure API calls to Nextflow, reducing cost from $500+ per sample to $40 per sample.","Restructured tertiary analysis and conducted various analyses.","Created the TumorNormal pipeline."],"summary":null,"url":"https://www.elementbiosciences.com","location":"Remote"},{"name":"Element Biosciences","position":"Bioinformatics Engineering Intern","startDate":"2021-06-01","endDate":"2021-08-10","highlights":["Converted Internal Whole Genome Sequencing Pipeline from Jupyter and bash scripts to Nextflow.","Performed variant calling for COVID‑19 AmpliSeq samples.","Processed 10x spatial transcriptomics data.","Created internal standards for Nextflow modules, MultiQC, and pipeline testing.","Analyzed ERCC spike‑in data."],"summary":null,"url":"https://www.elementbiosciences.com","location":"Remote"},{"name":"Olypsis Technologies","position":"Blockchain Software Engineer","startDate":"2018-06-30","endDate":"2020-06-30","highlights":["Led development and design of BlockNKey, including creating a functioning test suite for legacy code, containerizing the entire system, and designing and implementing a REST API and smart contracts for the system.","Created a novel ERC20 payment splitter smart contract in Solidity for Digital Assets Foundry.","Developed the DAWN protocol to transfer files in a decentralized, peer-to-peer fashion that does not rely on a trusted third party, using","Worked with a variety of Web3 technologies including Whisper, IPFS, AES256, and React."],"summary":null,"url":"https://www.linkedin.com/company/30006902","location":"Remote"}],"volunteer":[],"education":[{"institution":"The University of Texas at Dallas","area":"Molecular and Cell Biology","studyType":"Doctor of Philosophy - PhD","startDate":"2020-08-20","endDate":"2025-05-12","score":"","courses":[]},{"institution":"The University of Texas at Dallas","area":"Biotechnology","studyType":"Master of Science - MS","startDate":"2018-08-20","endDate":"2019-12-13","score":"","courses":[]},{"institution":"The University of Texas at Dallas","area":"Molecular Biology","studyType":"Bachelor of Science - BS","startDate":"2015-08-20","endDate":"2018-08-04","score":"","courses":[]}],"awards":[],"certificates":[],"publications":[{"name":"Scalable and efficient DNA sequencing analysis on different compute infrastructures aiding variant discovery","publisher":"bioRxiv","releaseDate":"2023-07-19","url":"https://doi.org/10.1101/2023.07.19.549462","summary":"DNA variation analysis has become indispensable in many aspects of modern biomedicine, most prominently in the comparison of normal and tumor samples. Thousands of samples are collected in local sequencing efforts and public databases requiring highly scalable, portable, and automated workflows for streamlined processing. Here, we present nf-core/sarek 3, a well-established, comprehensive variant calling and annotation pipeline for germline and somatic samples. It is suitable for any genome with a known reference. We present a full rewrite of the original pipeline showing a significant reduction of storage requirements by using the CRAM format and runtime by increasing intra-sample parallelization. Both are leading to a 70% cost reduction in commercial clouds enabling users to do large-scale and cross-platform data analysis while keeping costs and CO2 emissions low. The code is available at https://nf-co.re/sarek."},{"name":"Sequencing by avidity enables high accuracy with low reagent consumption","publisher":"Nature Biotechnology","releaseDate":"2023-05-25","url":"https://doi.org/10.1038/s41587-023-01750-7","summary":"We present avidity sequencing, a sequencing chemistry that separately optimizes the processes of stepping along a DNA template and that of identifying each nucleotide within the template. Nucleotide identification uses multivalent nucleotide ligands on dye-labeled cores to form polymerase–polymer–nucleotide complexes bound to clonal copies of DNA targets. These polymer–nucleotide substrates, termed avidites, decrease the required concentration of reporting nucleotides from micromolar to nanomolar and yield negligible dissociation rates. Avidity sequencing achieves high accuracy, with 96.2% and 85.4% of base calls having an average of one error per 1,000 and 10,000 base pairs, respectively. We show that the average error rate of avidity sequencing remained stable following a long homopolymer."},{"name":"mlf-core: a framework for deterministic machine learning","publisher":"Bioinformatics","releaseDate":"2023-04-03","url":"https://doi.org/10.1093/bioinformatics/btad164","summary":"Machine learning has shown extensive growth in recent years and is now routinely applied to sensitive areas. To allow appropriate verification of predictive models before deployment, models must be deterministic. Solely fixing all random seeds is not sufficient for deterministic machine learning, as major machine learning libraries default to the usage of nondeterministic algorithms based on atomic operations."}],"skills":[{"name":"Bioinformatics","level":"5","keywords":["Nextflow","Seqera Platform","Snakemake","Genomics","NGS"]},{"name":"Computational Biology","level":"5","keywords":["Rust","Python"]},{"name":"High Performance Computing","level":"5","keywords":["Linux","Slurm","Kubernetes","AWS Batch"]},{"name":"DevOps","level":"5","keywords":["Linux","NixOS","Kubernetes","GitHub Actions","AWS","Terraform","Ansible"]},{"name":"Software Packaging","level":"","keywords":["Nix","Docker","Singularity","Conda"]},{"name":"Data Science","level":"","keywords":["Python","Julia","SQL","DuckDB","Machine Learning","Deep Learning"]},{"name":"Decentralized Computing","level":"4","keywords":["IPFS","Ethereum","Solidity","web3.js","Whisper"]},{"name":"Web Development","level":"3","keywords":["TypeScript","Vue","Astro","Deno","React","HTML","CSS","JavaScript"]}],"interests":[],"references":[],"projects":[{"name":"nf-core/nascent","startDate":"2020-03-20","description":"Nascent Transcription Processing Pipeline","url":"https://github.com/nf-core/nascent"}],"meta":{"version":"v1.0.0","canonical":"https://github.com/jsonresume/resume-schema/blob/v1.0.0/schema.json"}}
@edmundmiller
Author

Thanks! I was just tossing this up here to try to consume it in a nix-flake.
