Shantanu Singh shntnu

shntnu / compoundvalet_identifier_gap.sql
Last active February 17, 2026 02:28
CompoundValet CSV identifier gap analysis — run in https://shell.duckdb.org/
-- CompoundValet CSV: no structural identifiers → can't join to other databases
-- Paste into https://shell.duckdb.org/
--
-- The Drug Repurposing Hub is a similar drug-target database but includes
-- InChIKey, SMILES, and PubChem CID — making it instantly joinable.
-- CompoundValet has only drug names.
-- Load CompoundValet
CREATE TABLE cv AS
SELECT * FROM read_csv_auto(
shntnu / compute_etag.sh
Created July 25, 2022 18:51 — forked from rajivnarayan/compute_etag.sh
Calculate checksum corresponding to the entity-tag hash (ETag) of Amazon S3 objects
#!/bin/bash
#
# Calculate checksum corresponding to the entity-tag hash (ETag) of Amazon S3 objects
#
# Usage: compute_etag.sh <filename> <part_size_mb>
#
# filename: file to process
# part_size_mb: chunk size in MiB used for multipart uploads.
# This is 8 MB by default for the AWS CLI. See:
# https://docs.aws.amazon.com/cli/latest/topic/s3-config.html#multipart_chunksize
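
The preview stops at the header comment. As a rough illustration of the calculation that header describes, here is a minimal sketch and not the gist's actual code. The function name `etag_sketch` is hypothetical, and the sketch assumes bash with GNU coreutils (`dd`, `md5sum`, `wc`). It also assumes that a file smaller than one part was uploaded in a single request, that every part except the last is exactly `part_size_mb` MiB, and that the object is not encrypted in a way that changes its ETag. For multipart uploads, the ETag is commonly observed to be the MD5 of the concatenated binary MD5 digests of the parts, followed by a dash and the part count.

```shell
#!/bin/bash
# Sketch only: etag_sketch is a hypothetical helper, not the original script.
# Usage: etag_sketch <filename> [part_size_mb]
etag_sketch() {
    local file=$1
    local part_size_mb=${2:-8}                     # 8 matches the AWS CLI default
    local part_bytes=$(( part_size_mb * 1024 * 1024 ))
    local size
    size=$(wc -c < "$file")

    # Assumption: files smaller than one part were uploaded in a single
    # request, so the ETag is the plain MD5 hex digest of the content.
    if (( size < part_bytes )); then
        md5sum "$file" | cut -d' ' -f1
        return
    fi

    # Number of parts, rounding up for a short final part.
    local parts=$(( (size + part_bytes - 1) / part_bytes ))

    # Collect the hex MD5 of each part, in order.
    local hex_digests="" part_md5 i
    for (( i = 0; i < parts; i++ )); do
        part_md5=$(dd if="$file" bs=1M skip=$(( i * part_size_mb )) \
                      count="$part_size_mb" 2>/dev/null | md5sum | cut -d' ' -f1)
        hex_digests+=$part_md5
    done

    # Turn the hex digests back into raw bytes, hash them, append the part count.
    local combined
    combined=$(printf "$(printf '%s' "$hex_digests" | sed 's/../\\x&/g')" \
                   | md5sum | cut -d' ' -f1)
    printf '%s-%d\n' "$combined" "$parts"
}
```

Calling `etag_sketch myfile.bin 8` prints a value to compare against the ETag that S3 reports for the object. The two will only match if the upload actually used the same part size.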