@cthoyt
cthoyt / clinicaltrials-summary.py
Last active January 24, 2025 14:12
A script that generates a histogram over ClinicalTrials.gov study types.
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "clinicaltrials-downloader>=0.0.2",
#     "pyobo[grounding]",
#     "tabulate",
#     "pystow",
#     "click",
# ]
#
@cthoyt
cthoyt / swear_the_oaths.py
Last active November 18, 2024 22:45
Automate downloading audiobook chapters for Wind and Truth
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "beautifulsoup4",
#     "requests",
#     "yt-dlp",
# ]
# ///
"""Download chapters from Wind and Truth.
@cthoyt
cthoyt / bioregistry_negative_mappings.py
Created November 2, 2024 13:41
Generate SSSOM for the negative mappings in the Bioregistry
import json
import bioregistry
if __name__ == "__main__":
    r = json.load(open("mismatch.json"))
    with open("mismatches.sssom.tsv", "w") as file:
        print(
            "subject_id",
            "subject_label",
@cthoyt
cthoyt / snapquery_demo.py
Created October 18, 2024 10:32
Demonstrate using the Snapquery package to run a query that takes in some parameters
from typing import Any
from snapquery.snapquery_core import QueryName, NamedQueryManager, QueryBundle
#: See documentation at https://snapquery.bitplan.com/docs
API_ENDPOINT_FMT = "https://snapquery.bitplan.com/api/query/{domain}/{namespace}/{name}"
type Result = dict[str, Any]
type Results = list[Result]
@cthoyt
cthoyt / bioregistry_pydantic_validator.py
Created June 22, 2024 09:39
Use the Bioregistry for Pydantic (v2) validation
import bioregistry
from pydantic.functional_validators import AfterValidator
def validate_local_identifier(prefix: str) -> AfterValidator:
    """Make a validator function based on a Bioregistry prefix.
    Example usage:
    .. code-block:: python
@cthoyt
cthoyt / bioregistry_records_for_contact_curation.py
Created April 18, 2024 10:49
Find Bioregistry records with publications but no contact information to prioritize curation of new contact information
from tabulate import tabulate
import bioregistry
if __name__ == "__main__":
    rows = []
    for resource in bioregistry.resources():
        if resource.is_deprecated():
            continue
        if resource.get_contact():
@cthoyt
cthoyt / get_bioregistry_versions.py
Created April 16, 2024 09:00
Get bioregistry data from first of each month
import requests
from dateutil.parser import parse
res_json = requests.get(
    'https://pypi.org/pypi/bioregistry/json',
    headers={'Accept': 'application/json'}
).json()
releases = {
    parse(data[0]['upload_time']): version
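The preview above is cut off before the selection logic. As a minimal sketch of one plausible reading of "first of each month", the function below groups releases by calendar month and keeps the earliest upload in each. The function name `first_release_per_month` and the `sample` data are hypothetical, and the standard-library `datetime.fromisoformat` stands in for `dateutil`; the input is assumed to be shaped like the `releases` field of the PyPI JSON response.

```python
from datetime import datetime


def first_release_per_month(
    releases: dict[str, list[dict]],
) -> dict[tuple[int, int], str]:
    """Map each (year, month) to the version uploaded earliest in that month.

    Hypothetical helper, not taken from the original gist.
    """
    earliest: dict[tuple[int, int], tuple[datetime, str]] = {}
    for version, files in releases.items():
        if not files:  # skip releases that have no uploaded files
            continue
        # Use the upload time of the first file, as the preview above does
        uploaded = datetime.fromisoformat(files[0]["upload_time"])
        key = (uploaded.year, uploaded.month)
        if key not in earliest or uploaded < earliest[key][0]:
            earliest[key] = (uploaded, version)
    return {key: version for key, (_, version) in sorted(earliest.items())}


# Hypothetical sample shaped like the "releases" field of the PyPI response
sample = {
    "0.1.0": [{"upload_time": "2024-03-02T10:00:00"}],
    "0.1.1": [{"upload_time": "2024-03-20T08:30:00"}],
    "0.2.0": [{"upload_time": "2024-04-01T12:00:00"}],
    "0.2.1": [],
}
by_month = first_release_per_month(sample)
```

With real data, `res_json["releases"]` from the request above could be passed in place of `sample`.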
@cthoyt
cthoyt / carolio_matches.ipynb
Last active January 23, 2024 23:25
Find matches between terms in proposed CaroliO ontology and existing OBO Foundry ontologies
@cthoyt
cthoyt / torch_max_mem_benchmark.py
Created January 12, 2024 12:52
This is a script I was using a long time ago with `torch-max-mem`, relevant for https://github.com/mberr/torch-max-mem/issues/14. It causes crashes on my MPS GPU.
import torch
from torch_max_mem import maximize_memory_utilization
import logging
import torch.mps
from humanize.filesize import naturalsize
logging.basicConfig(level=logging.DEBUG)
@maximize_memory_utilization()
@cthoyt
cthoyt / clo-to-sssom.ipynb
Created June 29, 2023 18:59
Extract and clean up cross-references from the Cell Line Ontology (CLO, v2.1.178)