@seandavi
Created August 15, 2025 22:06
biodatalake ducklake example

Connecting

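Start DuckDB, either on the command line or from a client such as R. To connect to the biodatalake ducklake, run the following inside the DuckDB session. Note that `.read` is a DuckDB command-line dot-command, so other clients need to execute the statements in that script some other way: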

install httpfs;
load httpfs;
.read 'https://store.cancerdatasci.org/ducklake_config/ducklake_ro_connect.sql'

Querying

Once connected, it can be helpful to start the DuckDB UI to browse the available tables:

call start_ui();
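
If a browser UI is not convenient (for example, in a remote or scripted session), the same overview can be had with plain SQL. This is a minimal sketch: `show all tables` and `information_schema.tables` are standard DuckDB features, while the `bronze` schema filter is only an assumption drawn from the queries in this gist; the lake may contain other schemas.

```sql
-- List every table visible across all attached catalogs.
show all tables;

-- Or narrow the listing to the bronze schema (assumed from the queries in this gist).
select table_schema, table_name
from information_schema.tables
where table_schema = 'bronze'
order by table_name;
```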

BugSigDB Exports

To take a look at the raw(ish) `bronze.bugsigdb__export` table:

select * from bronze.bugsigdb__export;
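
Selecting every row from a remote table can be slow, so it may be worth checking the columns and a small sample first. The sketch below uses only DuckDB's built-in `describe` statement and a `limit` clause; the limit of 10 is an arbitrary choice, and no column names are assumed.

```sql
-- Show column names and types without reading the table's rows.
describe bronze.bugsigdb__export;

-- Pull a small sample instead of the whole table.
select * from bronze.bugsigdb__export limit 10;
```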

SRA metadata

To preview the first 100 SRA studies, then count studies by study type:

select * from bronze.sra__studies limit 100;
select count(*) as n, study_type from bronze.sra__studies group by study_type order by n desc;
