Selin Jessa sjessa
@whitead
whitead / self_cite.py
Last active December 1, 2024 11:16
Compute number of self citations with Semantic Scholar
# License CC0
import httpx
async def analyze_self_citations(doi):
    async with httpx.AsyncClient() as client:
        response = await client.get(
            f"https://api.semanticscholar.org/graph/v1/paper/DOI:{doi}",
            params={"fields": "title,authors,references.authors"}
        )
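
The preview cuts off before the counting step. As a rough sketch, and not the gist's actual logic, one way to count self-citations from a payload with the requested fields is to treat a reference as a self-citation when it shares at least one author with the paper. The function name `count_self_citations`, the matching on `authorId`, and the sample payload are all assumptions made for illustration.

```python
def count_self_citations(paper: dict) -> tuple[int, int]:
    """Return (self_citations, total_references) for a paper payload.

    Assumes `paper` has an "authors" list and a "references" list whose
    items each carry their own "authors" list, with an "authorId" per author.
    """
    # Author IDs of the paper itself; entries without an ID are skipped.
    own_ids = {a["authorId"] for a in paper.get("authors", []) if a.get("authorId")}
    references = paper.get("references") or []
    self_cites = 0
    for ref in references:
        ref_ids = {a["authorId"] for a in (ref.get("authors") or []) if a.get("authorId")}
        # Any shared author ID marks the reference as a self-citation.
        if own_ids & ref_ids:
            self_cites += 1
    return self_cites, len(references)


# Hypothetical payload, shaped like the fields requested in the query above.
sample = {
    "title": "Example paper",
    "authors": [{"authorId": "1", "name": "A"}, {"authorId": "2", "name": "B"}],
    "references": [
        {"authors": [{"authorId": "2", "name": "B"}]},  # shares author 2
        {"authors": [{"authorId": "3", "name": "C"}]},  # no shared author
        {"authors": []},                                # no author data
    ],
}
print(count_self_citations(sample))  # (1, 3)
```

Matching on IDs rather than names avoids false positives from authors who share a name, at the cost of missing references whose author records lack an ID.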
@veekaybee
veekaybee / normcore-llm.md
Last active July 10, 2025 00:56
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Pre-Transformer Models

@krlmlr
krlmlr / Rprofile-entrace
Last active January 7, 2021 01:59
Pretty stack traces in R
# Add this to your .Rprofile
options(
  error = quote(rlang::entrace()),
  rlang__backtrace_on_error = "collapse" # or "branch" or "full"
)
@iandanforth
iandanforth / rlreproducibilitychecklist.md
Last active May 11, 2019 01:34
RL Reproducibility Checklist

A Checklist for Reproducibility in Reinforcement Learning

From a slide in a NeurIPS 2018 keynote by Joelle Pineau

For all algorithms presented, check if you include:

  • A clear description of the algorithm.
  • An analysis of the complexity (time, space, sample size) of the algorithm.
  • A link to downloadable source code, including all dependencies.
@schochastics
schochastics / umap.R
Last active October 10, 2018 19:24
Quick and dirty way of using UMAP in R using rPython
#install UMAP from https://github.com/lmcinnes/umap
#install.packages("rPython")
umap <- function(x,n_neighbors=10,n_components=2,min_dist=0.1,metric="euclidean"){
x <- as.matrix(x)
colnames(x) <- NULL
rPython::python.exec( c( "def umap(data,n,d,mdist,metric):",
"\timport umap" ,
"\timport numpy",
"\tembedding = umap.UMAP(n_neighbors=n,n_components=d,min_dist=mdist,metric=metric).fit_transform(data)",
@mattpitkin
mattpitkin / README.md
Last active March 15, 2023 10:52
Singularity & Docker in jupyter

Use Singularity and Docker to run a kernel in a jupyter notebook

This is an extension to this post about creating a kernel in a Jupyter notebook that runs a Singularity container.

Download Singularity (see here).

Create a Singularity file, e.g., (making sure to install the ipykernel module in it):

Bootstrap: docker
@jxtx
jxtx / GLBio_3D.md
Last active May 17, 2017 22:05
Notes for 3D genome track at GLBio 2017

Keles -- Statistical Methods for profiling long range chromatin interactions from repetitive regions of the genome

  • Multi-mapping reads (multi-reads) are typically thrown out in many HTS analyses, including Hi-C
    • Assays predominantly rely on short reads (50-150 bp), so multi-reads are common
    • Using ChIP-seq as an example, incorporating multi-reads finds peaks in regions where "uni-reads" do not
    • e.g. Perm-seq using DHS + ChIP-seq data and multi-reads: 27.3% more peaks compared to the ENCODE uniform processing pipeline
  • How to combine this with Hi-C data?
    • Hi-C read processing
      • Typical pipelines discard singletons, multi-mapping ends, low-mapping-quality reads, and unaligned reads
  • Evaluation of the impact of this using IMR90 and Plasmodium datasets

A Few Useful Things to Know about Machine Learning

The paper presents some key lessons and "folk wisdom" that machine learning researchers and practitioners have learnt from experience and which are hard to find in textbooks.

1. Learning = Representation + Evaluation + Optimization

All machine learning algorithms have three components:

  • Representation: the set of classifiers/functions that the learner can possibly learn. This set is called the hypothesis space. If a function is not in the hypothesis space, it cannot be learnt.
  • Evaluation: a function that scores how good a candidate model is.
  • Optimisation: the method used to search the hypothesis space for the highest-scoring model.
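
As a toy illustration of the three components (not taken from the paper), the sketch below uses a hypothesis space of one-dimensional threshold classifiers, accuracy as the evaluation function, and exhaustive search as the optimiser. The data, the candidate thresholds, and all function names are made up for the example.

```python
from typing import Sequence

# Toy data: 1-D points with binary labels (invented for illustration).
X = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
y = [0, 0, 0, 1, 1, 1]

# Representation: the hypothesis space is the set of rules "predict 1 if x > t"
# for t in this list. A rule outside this set can never be learnt.
hypothesis_space = [0.0, 1.0, 2.0, 3.0, 4.0]


def predict(t: float, x: float) -> int:
    """Apply the threshold classifier with threshold t to a single point."""
    return int(x > t)


def accuracy(t: float, X: Sequence[float], y: Sequence[int]) -> float:
    """Evaluation: fraction of points the threshold classifier labels correctly."""
    correct = sum(predict(t, xi) == yi for xi, yi in zip(X, y))
    return correct / len(X)


def fit(thresholds: Sequence[float], X: Sequence[float], y: Sequence[int]) -> float:
    """Optimisation: exhaustive search for the highest-scoring threshold."""
    return max(thresholds, key=lambda t: accuracy(t, X, y))


best_t = fit(hypothesis_space, X, y)
print(best_t, accuracy(best_t, X, y))  # 2.0 1.0
```

Swapping any one component (a richer hypothesis space, a different score, gradient-based search) gives a different learner while the other two stay unchanged.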
@Pathoschild
Pathoschild / google-sheets-color-preview.js
Last active August 1, 2024 21:43
A Google Sheets script which adds color preview to cells. When you edit a cell containing a valid CSS hexadecimal color code (like #000 or #000000), the background color is changed to that color and the font color is changed to the inverse color for readability.
/*
This script is meant to be used with a Google Sheets spreadsheet. When you edit a cell containing a
valid CSS hexadecimal color code (like #000 or #000000), the background color will change to that
color and the font color will be changed to the inverse color for readability.
To use this script in a Google Sheets spreadsheet:
1. go to Tools » Script Editor;
2. replace everything in the text editor with this code;
3. click File » Save;
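
The script body is not shown in this preview. As a language-neutral sketch of the colour logic the description implies, written in Python rather than Apps Script, the example below validates a 3- or 6-digit CSS hex code and computes a readable foreground colour. Interpreting "inverse" as the per-channel complement (255 minus each channel) is an assumption, and `parse_hex_color` and `inverse_color` are hypothetical names, not functions from the gist.

```python
import re

# Matches "#rgb" or "#rrggbb" with hexadecimal digits only.
HEX_COLOR = re.compile(r"^#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})$")


def parse_hex_color(text: str):
    """Return an (r, g, b) tuple for a valid CSS hex colour, else None."""
    text = text.strip()
    if not HEX_COLOR.match(text):
        return None
    digits = text[1:]
    if len(digits) == 3:
        # Expand shorthand: "#abc" means "#aabbcc".
        digits = "".join(ch * 2 for ch in digits)
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def inverse_color(rgb) -> str:
    """Return the per-channel complement as a 6-digit hex string."""
    return "#" + "".join(f"{255 - c:02x}" for c in rgb)


print(parse_hex_color("#000"))                    # (0, 0, 0)
print(inverse_color(parse_hex_color("#000")))     # #ffffff
print(inverse_color(parse_hex_color("#1a2b3c")))  # #e5d4c3
print(parse_hex_color("red"))                     # None
```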
@pbugnion
pbugnion / ipython_notebook_in_git.md
Last active October 22, 2023 12:25
Keeping IPython notebooks under Git version control

This gist lets you keep IPython notebooks in git repositories. It tells git to ignore prompt numbers and program outputs when checking that a file has changed.

To use the script, follow the instructions given in the script's docstring.

For further details, read this blogpost.

The procedure outlined here is inspired by this answer on Stack Overflow.
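
The gist's own script is not reproduced in this preview. As a minimal sketch of the idea of ignoring prompt numbers and outputs, the function below blanks those fields in a notebook's JSON so that two versions differing only in execution state compare equal. It assumes the current notebook layout (a top-level `cells` list with `outputs` and `execution_count` on code cells), which may differ from the older format the gist was written for; `strip_volatile_fields` is a hypothetical name.

```python
import json


def strip_volatile_fields(notebook: dict) -> dict:
    """Return a copy of the notebook with outputs and prompt numbers cleared."""
    # Round-trip through JSON to get a deep copy and leave the input untouched.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Hypothetical two-cell notebook used only to exercise the function.
nb = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "source": ["print(1 + 1)"],
            "execution_count": 7,
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["2\n"]}],
        },
    ],
}

cleaned = strip_volatile_fields(nb)
print(json.dumps(cleaned["cells"][1], indent=1))
```

A function like this could be run on a notebook before git compares it, for example from a git clean filter, so that re-running cells alone does not register as a change.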