Bede Constantinides bede
@bede
bede / gist:fb8683029c6c10d4ad99a015c9fc8b7b
Created May 20, 2016 11:09
GKNO build failure - cat build_freebayes.*
$ cat build_freebayes.*
bgzf.c:44:1: warning: unused function 'kh_clear_cache' [-Wunused-function]
KHASH_MAP_INIT_INT64(cache, cache_t)
^
./khash.h:468:2: note: expanded from macro 'KHASH_MAP_INIT_INT64'
KHASH_INIT(name, uint64_t, khval_t, 1, kh_int64_hash_func, kh_int64_hash_equal)
^
./khash.h:140:21: note: expanded from macro 'KHASH_INIT'
static inline void kh_clear_##name(kh_##name##_t *h) \
^
@bede
bede / lists_generators_and_laziness.ipynb
Last active August 17, 2016 13:11
Generators vs. lists for sequence filtering
@bede
bede / simple_python3_parallelism.ipynb
Last active January 23, 2017 13:17
Simple Python3 parallelism
@bede
bede / Dockerfile
Created July 27, 2017 21:27
IVA Ubuntu Dockerfile
FROM ubuntu:16.04
RUN apt-get update && \
    apt-get --yes install \
    kmc smalt python3-pip zlib1g-dev libncurses5-dev libncursesw5-dev mummer samtools
RUN pip3 install iva
ENTRYPOINT ["iva"]
@bede
bede / cluster_df.py
Last active May 8, 2019 12:53
Distance matrix clustering
import pandas as pd
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import fcluster, linkage
def cluster_df(df, method='single', threshold=100):
    '''
    Accepts a square distance matrix as an indexed DataFrame and returns a dict of index-keyed flat clusters.
    Performs single linkage clustering by default; see the scipy.cluster.hierarchy.linkage docs for other methods.
    '''
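
The preview above stops at the docstring, so the function body is not shown. What follows is only a minimal sketch of how such a function could work, not the author's implementation. It assumes the DataFrame is symmetric with a zero diagonal, that `threshold` is a cophenetic distance cut-off, and that the returned dict maps each index label to a cluster id. The name `cluster_df_sketch` and the toy data are hypothetical.

```python
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_df_sketch(df, method='single', threshold=100):
    '''Hypothetical sketch: map each index label to a flat cluster id.'''
    # linkage() expects a condensed distance vector, so convert the square
    # matrix first (squareform checks symmetry and the zero diagonal)
    condensed = squareform(df.values.astype(float))
    tree = linkage(condensed, method=method)
    # Cut the tree so that members of a flat cluster are joined at a
    # cophenetic distance no greater than the threshold
    labels = fcluster(tree, t=threshold, criterion='distance')
    return {name: int(label) for name, label in zip(df.index, labels)}


# Toy distance matrix: a and b are close, c is far from both
names = ['a', 'b', 'c']
dist = pd.DataFrame(
    [[0, 50, 500], [50, 0, 450], [500, 450, 0]],
    index=names, columns=names,
)
clusters = cluster_df_sketch(dist)
```

With the default threshold of 100, `a` and `b` should share a cluster id while `c` receives a different one.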
import pandas as pd
from bokeh.models.widgets import Select
from bokeh.layouts import widgetbox
from bokeh.models import ColumnDataSource, DataTable, TableColumn, CustomJS
from bokeh.io import show, output_file, output_notebook, reset_output
from bokeh.layouts import row, column, layout
raw_data = {'ORG': ['APPLE', 'ORANGE', 'MELON'],
'APPROVED': [5, 10, 15],
@bede
bede / split_summary_by_barcode.py
Created February 11, 2021 10:25
Split Guppy sequencing summaries by barcode
def split_summary_by_barcode(summary_path, out_dir, run_name):
    '''Given a sequencing summary file path, write per-barcode summaries to an output directory'''
    dtypes = {
        'filename_fastq': 'object',
        'filename_fast5': 'object',
        'read_id': 'object',
        'run_id': 'category',
        'channel': 'int64',
        'mux': 'int64',
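
The preview is cut off inside the `dtypes` dict, so the splitting logic itself is not visible. As a rough illustration of the idea in the docstring, the sketch below groups a tab-separated summary by a barcode column and writes one file per barcode. It is not the author's code: the column name `barcode_arrangement`, the output file naming, and the function name `split_summary_sketch` are all assumptions.

```python
from pathlib import Path

import pandas as pd


def split_summary_sketch(summary_path, out_dir, run_name,
                         barcode_col='barcode_arrangement'):
    '''Hypothetical sketch: write one summary file per barcode.'''
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Sequencing summaries are assumed to be tab-separated with a header row
    df = pd.read_csv(summary_path, sep='\t', dtype={'read_id': 'object'})
    written = {}
    for barcode, group in df.groupby(barcode_col):
        # Assumed naming scheme: <run_name>_<barcode>.txt
        out_path = out_dir / f'{run_name}_{barcode}.txt'
        group.to_csv(out_path, sep='\t', index=False)
        written[barcode] = out_path
    return written
```

Passing explicit dtypes, as the original does, is likely to matter for large summaries because it avoids type inference and lets repetitive columns be stored as categories.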
@bede
bede / custom_check.py
Last active May 26, 2023 15:49
Pandera MWE – I want a single failure case when region_is_valid fails indicating the sample_name of the row that failed (cDNA-VOC-1-v4-1)
from io import StringIO
import pandas as pd
import pandera as pa
import pandera.extensions as extensions
from pandera.typing import Index, Series
csv_string = """
sample_name,country,region
cDNA-VOC-1-v4-1,USA,Bretagne
@bede
bede / concat_by_barcode.py
Last active August 8, 2024 10:51
Concatenate demultiplexed ONT FASTQs by barcode (for one or more runs)
"""
Purpose: Concatenate demultiplexed FASTQs by barcode for one or more ONT runs
Usage: python concat_by_barcode.py run1/fastq_pass run2/fastq_pass -o output/
Author: Bede Constantinides
"""
import subprocess
import sys
import argparse
from collections import defaultdict
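
Only the header and imports of the script are shown here. The sketch below illustrates the stated purpose under several assumptions: each run directory contains one subdirectory per barcode, reads are stored as `*.fastq.gz`, and output files are named `<barcode>.fastq.gz`. The original imports `subprocess`; this sketch swaps in plain byte-level copying with `shutil` instead, which works because concatenated gzip members form a valid gzip file. The function name `concat_by_barcode_sketch` is hypothetical.

```python
import shutil
from collections import defaultdict
from pathlib import Path


def concat_by_barcode_sketch(run_dirs, out_dir, pattern='*.fastq.gz'):
    '''Hypothetical sketch: merge per-barcode FASTQs across runs.'''
    # Collect files per barcode, keyed by subdirectory name (assumed layout)
    files_by_barcode = defaultdict(list)
    for run_dir in run_dirs:
        for barcode_dir in sorted(Path(run_dir).iterdir()):
            if barcode_dir.is_dir():
                files_by_barcode[barcode_dir.name].extend(
                    sorted(barcode_dir.glob(pattern))
                )

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = {}
    for barcode, paths in files_by_barcode.items():
        if not paths:
            continue  # skip barcode directories with no matching files
        out_path = out_dir / f'{barcode}.fastq.gz'
        with open(out_path, 'wb') as out_fh:
            for path in paths:
                # Append raw bytes without decompressing
                with open(path, 'rb') as in_fh:
                    shutil.copyfileobj(in_fh, out_fh)
        outputs[barcode] = out_path
    return outputs
```

Called as `concat_by_barcode_sketch(['run1/fastq_pass', 'run2/fastq_pass'], 'output/')`, this mirrors the usage line in the script header.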
@bede
bede / bioinformatics.patch
Created January 10, 2024 14:08
Necessary modifications to oup-authoring-template.tex for Oxford Bioinformatics submission
22,23c22,23
< \documentclass[unnumsec,webpdf,contemporary,large]{oup-authoring-template}%
< %\documentclass[unnumsec,webpdf,contemporary,large,namedate]{oup-authoring-template}% uncomment this line for author year citations and comment the above
---
> % \documentclass[unnumsec,webpdf,contemporary,large]{oup-authoring-template}%
> \documentclass[unnumsec,webpdf,contemporary,large,namedate]{oup-authoring-template}% uncomment this line for author year citations and comment the above
957,958c957,958
< %\bibliographystyle{abbrvnat}
< %\bibliography{reference}
---