Wayne's Bioinformatics Code Portal fomightez

@fomightez
fomightez / useful_pandas_snippets.py
Last active September 11, 2025 18:25 — forked from bsweger/useful_pandas_snippets.md
Useful Pandas Snippets
# List unique values in a DataFrame column
df['Column Name'].unique() # Note: `NaN` is included as a unique value. If you just want the count, use `nunique()`, which stands
# for 'number of unique values'. By default it excludes `NaN`; `.nunique(dropna=False)` includes `NaN` in the count of unique values.
# To extract a specific column (subset the dataframe), you can use [ ] (brackets) or attribute notation.
df.height
df['height']
# are the same thing! (from http://www.stephaniehicks.com/learnPython/pages/pandas.html
# -or-
# http://www.datacarpentry.org/python-ecology-lesson/02-index-slice-subset/)
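A minimal sketch of the points above, using a small made-up DataFrame (the `height` values are illustrative only). It shows how `NaN` is counted by `unique()` and `nunique()`, and that attribute and bracket access return the same column. Bracket notation is the one to use when a column name contains spaces, as in `'Column Name'`.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one repeated value and one missing value.
df = pd.DataFrame({"height": [1.5, 1.7, np.nan, 1.5]})

# unique() keeps NaN as one of the distinct values.
unique_values = df["height"].unique()

# nunique() skips NaN unless dropna=False is passed.
n_without_nan = df["height"].nunique()
n_with_nan = df["height"].nunique(dropna=False)

# Attribute access and bracket access select the same column.
same_column = df.height.equals(df["height"])

print(unique_values, n_without_nan, n_with_nan, same_column)
```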
Updated 2025-01-17 thanks to Yemster's comment.
This should work on any architecture of Amazon Linux 2.
(_Although not tested, it should also work on Amazon Linux 2023_).
**Prereq**
- Visit https://johnvansickle.com/ffmpeg/ to grab the link to the relevant tarball for your server's architecture.
- Use `uname -a` to find out your architecture if you don't know it.
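If Python is available on the server, the architecture check can also be sketched as below. The mapping from machine names to build labels (`amd64`, `arm64`, `i686`) is an assumption about how the builds on that page are named, and `build_label` is a hypothetical helper; confirm the label against the page before downloading anything.

```python
import platform

# Assumed mapping from the machine name reported by the kernel
# (what `uname -m` prints) to the build label used on the download page.
ARCH_TO_BUILD = {
    "x86_64": "amd64",
    "aarch64": "arm64",
    "i686": "i686",
}


def build_label(machine):
    """Return the assumed build label for a machine name, or None if unknown."""
    return ARCH_TO_BUILD.get(machine)


# platform.machine() reports the same value as `uname -m`.
print(platform.machine(), "->", build_label(platform.machine()))
```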
### TL;DR
@sminot
sminot / ncbi_taxonomy.py
Last active January 7, 2024 09:37
Class for using the NCBI taxonomy, reading from taxdump files
import os
from functools import lru_cache
from collections import defaultdict
# Read in the taxonomy
class NCBITaxonomy():
    def __init__(self, folder):
        self.tax = defaultdict(dict)
        # Read in the file of taxid information
        names_fp = os.path.join(folder, 'names.dmp')
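The preview stops before the file is read. As a rough sketch only, and not the gist's actual code, rows in the taxdump `names.dmp` format could be parsed like this. Fields are separated by `\t|\t` and each row ends with `\t|`. The helper `parse_names_lines` and the `"name"` key are hypothetical names chosen for illustration, and the sample rows are made up in the file's layout.

```python
from collections import defaultdict


def parse_names_lines(lines):
    """Collect the scientific name for each taxid from names.dmp-style rows."""
    tax = defaultdict(dict)
    for line in lines:
        row = line.rstrip("\n")
        # Drop the row terminator before splitting on the field separator.
        if row.endswith("\t|"):
            row = row[:-2]
        fields = row.split("\t|\t")
        if len(fields) < 4:
            continue  # skip malformed rows
        taxid, name, _unique_name, name_class = fields[:4]
        if name_class == "scientific name":
            tax[taxid]["name"] = name  # "name" is a hypothetical key
    return tax


# Illustrative rows in the names.dmp layout.
sample = [
    "1\t|\tall\t|\t\t|\tsynonym\t|\n",
    "1\t|\troot\t|\t\t|\tscientific name\t|\n",
    "2\t|\tBacteria\t|\tBacteria <bacteria>\t|\tscientific name\t|\n",
]
names = parse_names_lines(sample)
print(dict(names))
```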
int[][] result;
float t, c;
float ease(float p) {
  return 3*p*p - 2*p*p*p;
}
float ease(float p, float g) {
  if (p < 0.5)
    return 0.5 * pow(2*p, g);
@jakevdp
jakevdp / PythonLogo.ipynb
Last active April 7, 2024 18:40
Creating the Python Logo in Matplotlib
@fomightez
fomightez / realtime_vpython_matplotlib_combo.py
Last active November 4, 2016 13:44
code to paste into a cell of a notebook from VPython Binder to demonstrate realtime integration of matplotlib with VPython
%matplotlib notebook
# use `%matplotlib widget` (requires the ipympl package) instead if you are using current JupyterLab
from vpython import *
import matplotlib.pyplot as plt
plt.style.use('ggplot')
# based on "AtomicSolid" by Bruce Sherwood
# adapted to include realtime matplotlib by Wayne Decatur
@fomightez
fomightez / Launch VPython Binder with Seaborn Support.md
Last active July 6, 2017 03:29
Launch VPython Binder with Seaborn Support
  • Go to my fork of the VPython Binder repository in your browser.

  • Click on the Binder on the bottom of that page.

  • That will take you to a new page and trigger deployment of a version of the Jupyter notebook environment from the correct repository. You shouldn't need to do anything while this takes place; you can watch the progress bar roughly in the middle of the screen, just below the Launch button. It may take about a minute. After it boots up, it should bring you to a dashboard that looks like the one below.

zexample_dashboard.png

@fomightez
fomightez / How-to for Launching VPython Binder.md
Last active March 26, 2017 18:15
How-to for Launching VPython Binder
  • Go to VPython.org in your browser. The landing page will look like below.

zvpythonDOTorg.png

  • Click on the Binder package link on that page. That link is near the very bottom of the part of the page shown above; it is just below Demo Programs.

  • A notebook will then launch. (The first launch sometimes hangs; if it does, just hit reload in your browser.)
    After it loads fully, it will look like the image below, with a URL similar to but different from the one shown.
    zexample_VPython_launch.png

@JoaoRodrigues
JoaoRodrigues / bio_align.py
Last active April 9, 2025 11:01
Sequence-based structure alignment of protein structures with Biopython
#!/usr/bin/env python
"""Sequence-based structural alignment of two proteins."""
import argparse
import pathlib
from Bio.PDB import FastMMCIFParser, MMCIFIO, PDBParser, PDBIO, Superimposer
from Bio.PDB.Polypeptide import is_aa