Rob Patro rob-p

rob-p / compare_hits.py
Created September 30, 2021 04:02
compare containment hits
from pafpy import PafFile
import argparse
import sys
def parse_tab6(tf):
    res = []
    first = True
    with open(tf, 'r') as ifile:
        for l in ifile:
            if first:
rob-p / filter_blast.py
Created September 30, 2021 03:59
Filter blast output by containment criteria
from pafpy import PafFile
import argparse
import sys
def main(args):
    pi = args.input
    of = args.out
    ld = {}
    with open(args.lens, 'r') as ifile:
rob-p / filter_contain.py
Created September 28, 2021 21:17
filter containments out of a PAF file
from pafpy import PafFile
import argparse
import sys
def main(args):
    pi = args.paf
    of = args.tblast
def frac_covered(r):
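The preview cuts off before the body of `frac_covered`, so the gist's actual logic is not shown here. As a rough illustration of what a containment filter over PAF records might compute, the following is a minimal sketch that parses the standard PAF columns directly instead of using `pafpy`. The names `PafHit`, `parse_paf_line`, `is_contained`, and the 0.95 threshold are assumptions made for this example, not taken from the gist.

```python
# Hypothetical sketch: NOT the gist's frac_covered, whose body is not shown.
from typing import NamedTuple


class PafHit(NamedTuple):
    """The subset of PAF fields needed for a query-coverage check."""
    qname: str
    qlen: int
    qstart: int
    qend: int
    tname: str
    tlen: int


def parse_paf_line(line: str) -> PafHit:
    """Parse PAF columns 1-4 (query) and 6-7 (target) from one line."""
    f = line.rstrip("\n").split("\t")
    return PafHit(f[0], int(f[1]), int(f[2]), int(f[3]), f[5], int(f[6]))


def frac_covered(hit: PafHit) -> float:
    """Fraction of the query spanned by the alignment.

    PAF coordinates are 0-based with an exclusive end, so the aligned
    query span is simply qend - qstart.
    """
    if hit.qlen <= 0:
        return 0.0
    return (hit.qend - hit.qstart) / hit.qlen


def is_contained(hit: PafHit, min_frac: float = 0.95) -> bool:
    """Assumed criterion: a non-self hit covering at least min_frac of the query."""
    return hit.qname != hit.tname and frac_covered(hit) >= min_frac


# A made-up PAF record used only to exercise the functions above.
example_line = "readA\t1000\t10\t990\t+\treadB\t5000\t200\t1180\t950\t980\t60"
example_hit = parse_paf_line(example_line)
```

With this example record, 980 of the 1000 query bases are spanned, so the hit passes the assumed 0.95 threshold but would fail a stricter one such as 0.99.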
rob-p / main.nf
Created June 24, 2021 00:38
nextflow MWE 2177
nextflow.enable.dsl=2
process foo {
  publishDir "output"
  input:
    val(prefix)
  output:
    path "${prefix}_foo"
rob-p / plfit.py
Created December 1, 2020 16:57
powerlaw fit with functional fun
from math import *
# function [alpha, xmin, L]=plfit(x, varargin)
# PLFIT fits a power-law distributional model to data.
# Source: http://www.santafe.edu/~aaronc/powerlaws/
#
# PLFIT(x) estimates x_min and alpha according to the goodness-of-fit
# based method described in Clauset, Shalizi, Newman (2007). x is a
# vector of observations of some quantity to which we wish to fit the
# power-law distribution p(x) ~ x^-alpha for x >= xmin.
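The header comment above describes estimating `alpha` and `xmin` by a goodness-of-fit method; the rest of the gist is not shown in this preview. As a minimal sketch of the two quantities that method evaluates for a single candidate `xmin` with continuous data, the block below computes the maximum-likelihood exponent, alpha = 1 + n / sum(ln(x_i / xmin)), and the Kolmogorov-Smirnov distance between the empirical and fitted tail distributions. The function names `alpha_mle` and `ks_distance` and the synthetic data are assumptions for illustration; this is not the gist's implementation, and it does not scan over `xmin`.

```python
# Hypothetical sketch for one fixed xmin; not the plfit code from the gist.
from math import log


def alpha_mle(data, xmin):
    """Continuous power-law MLE for alpha using only observations >= xmin."""
    tail = [x for x in data if x >= xmin]
    log_sum = sum(log(x / xmin) for x in tail)
    if not tail or log_sum == 0.0:
        raise ValueError("need observations strictly above xmin")
    return 1.0 + len(tail) / log_sum


def ks_distance(data, xmin, alpha):
    """Largest gap between the empirical tail CDF and the fitted power-law CDF."""
    tail = sorted(x for x in data if x >= xmin)
    n = len(tail)
    worst = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / xmin) ** (1.0 - alpha)
        # The empirical CDF steps from i/n to (i+1)/n at x; check both sides.
        worst = max(worst, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return worst


# Deterministic synthetic sample: quantiles of a power law with alpha = 2.5.
true_alpha, xmin, n = 2.5, 1.0, 1000
data = [xmin * (1.0 - (i + 0.5) / n) ** (-1.0 / (true_alpha - 1.0)) for i in range(n)]

alpha_hat = alpha_mle(data, xmin)
ks = ks_distance(data, xmin, alpha_hat)
```

A full fit in the style the comment describes would repeat these two steps for each candidate `xmin` and keep the one with the smallest KS distance.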
rob-p / rust_and_cpp.md
Last active July 5, 2024 20:53
Some thoughts on Rust vs. C++

This is an interesting question, and I think the answer depends on what your primary goal is. Istvan makes good points in favor of straightforward C (integrated with Python) for building tools that others, even those without much programming-language experience, can easily modify. However, if your primary goal is to make efficient and robust tools for others to use, then let me offer a somewhat different perspective.

The language in which one develops a project has important implications beyond just the speed and memory profiles that will result. Both C++ and Rust will allow you to write very fast tools, making use of zero-cost abstractions where feasible, with predictable and frugal memory profiles (assuming you choose the right algorithms and are careful in designing the programs). However, two important areas where I see these languages diverging are safety and maintainability. By safety, I mean minimizing the types of nefarious memory and correctness bugs that can easily slip into "bare-met

rob-p / convert_to_pdfa.sh
Created July 29, 2020 16:53
Convert a PDF to PDF/A
#!/usr/bin/env bash
# Path of the input PDF; quoted below so paths containing spaces work.
pn=$1
convert "$pn" "$pn.ps"
gs -dPDFA -dBATCH -dNOPAUSE -dNOOUTERSAVE -dUseCIEColor -sProcessColorModel=DeviceCMYK -sDEVICE=pdfwrite -sPDFACompatibilityPolicy=1 -sOutputFile="$pn.a.pdf" "$pn.ps"
rob-p / Dockerfile
Created June 13, 2020 02:17
Dockerfile to reproduce jemalloc segfault with pufferfish
# image: COMBINE-lab/salmon
# This dockerfile is based on the one created by
# Titus Brown (available at https://github.com/ctb/2015-docker-building/blob/master/salmon/Dockerfile)
FROM centos:latest
ENV PACKAGES git gcc make g++ liblzma-dev libbz2-dev \
ca-certificates zlib1g-dev libcurl4-openssl-dev curl unzip autoconf apt-transport-https ca-certificates gnupg software-properties-common wget
ENV SALMON_VERSION 1.2.1
# salmon binary will be installed in /home/salmon/bin/salmon
/**
MWE for salmon segfault
**/
#include <boost/thread/thread.hpp>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <fstream>
rob-p / try_enq.cpp
Created January 8, 2019 03:52
custom queue block size
#include "concurrentqueue.h"
//#include "blockingconcurrentqueue.h"
#include <thread>
#include <atomic>
#include <vector>
#include <iostream>
using namespace moodycamel ;
using namespace std ;