- Date: 2026-03-18
- Branch: zeming/lance-mapper
- Script: claude_scratchpad/fold_val_test.py
- Dataset: /bio/projects/es/zlin/esmc2_datasets/260312_uniref_seqonly/val_filtered.lance
- Model: January trainout of ESMCFold hero medium (24blk, 12 diffusion steps, no MSA, confidence-trained)
- Checkpoint: conf_esmcfold_hero_medium_24blk_12diffu_no_msa_bs128_ctx512_mult2_noise1.1_step1.0_nodiffcond/epoch-0000-step-7000_cleaned.ckpt
| """ | |
| Benchmark lance scalar index lookups: sequential, batched, and async. | |
| Lance (https://lancedb.github.io/lance/) is a columnar format that supports | |
| BTREE scalar indices, making point lookups fast even over S3 — no local copy | |
| or database server needed. This script benchmarks three lookup patterns against | |
| a 120M-row dataset stored on S3: | |
| - Sequential: one filter query at a time (baseline) | |
| - Batched IN: single query with WHERE protein_hash IN (...) |
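The first two lookup patterns named in the docstring above can be sketched without touching a real dataset. This is a minimal illustration, not the benchmark script itself: the helpers `eq_filter`, `in_filter`, `time_sequential`, and `time_batched` are hypothetical names, and `run_query` is a stand-in for whatever executes a filter string against the dataset.

```python
import time
from typing import Callable, Sequence


def _quote(value: str) -> str:
    """Quote a string literal for a SQL-style filter, doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def eq_filter(column: str, key: str) -> str:
    """Build a point-lookup filter such as: protein_hash = 'abc'."""
    return f"{column} = {_quote(key)}"


def in_filter(column: str, keys: Sequence[str]) -> str:
    """Build a batched filter such as: protein_hash IN ('a', 'b')."""
    if not keys:
        raise ValueError("IN filter needs at least one key")
    return f"{column} IN ({', '.join(_quote(k) for k in keys)})"


def time_sequential(run_query: Callable[[str], object],
                    column: str, keys: Sequence[str]) -> float:
    """Issue one query per key and return the elapsed seconds (baseline)."""
    start = time.perf_counter()
    for key in keys:
        run_query(eq_filter(column, key))
    return time.perf_counter() - start


def time_batched(run_query: Callable[[str], object],
                 column: str, keys: Sequence[str]) -> float:
    """Issue a single IN query covering all keys and return the elapsed seconds."""
    start = time.perf_counter()
    run_query(in_filter(column, keys))
    return time.perf_counter() - start


if __name__ == "__main__":
    issued = []  # records the filter strings instead of hitting storage
    time_sequential(issued.append, "protein_hash", ["a", "b"])
    time_batched(issued.append, "protein_hash", ["a", "b"])
    for f in issued:
        print(f)
```

With the `lance` Python package, `run_query` could plausibly be `lambda f: ds.to_table(filter=f)` for a dataset opened with `lance.dataset(uri)`; that wiring is an assumption here and is not taken from the script.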
| """ | |
| lance_mapper: parallel map over Lance datasets on SLURM | |
| ======================================================== | |
| This example shows how to use LanceMapper to run an embarrassingly parallel | |
| computation over a Lance dataset using SLURM job arrays. | |
| The pattern: | |
| 1. Subclass LanceMapper | |
| 2. Set key_column (unique ID column) and rows_per_shard |
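To make the role of `rows_per_shard` concrete, here is a rough sketch of how a SLURM array task might pick its slice of rows. It does not show LanceMapper's actual implementation or API, which is not visible in the preview above: `num_shards`, `shard_range`, and `current_shard_index` are hypothetical helpers, and the only external fact assumed is that SLURM exposes the array index in the `SLURM_ARRAY_TASK_ID` environment variable.

```python
import math
import os
from typing import Tuple


def num_shards(total_rows: int, rows_per_shard: int) -> int:
    """Number of shards (and hence array tasks) needed to cover all rows."""
    if rows_per_shard <= 0:
        raise ValueError("rows_per_shard must be positive")
    return math.ceil(total_rows / rows_per_shard)


def shard_range(total_rows: int, rows_per_shard: int,
                shard_index: int) -> Tuple[int, int]:
    """Half-open [start, stop) row range handled by one shard."""
    if not 0 <= shard_index < num_shards(total_rows, rows_per_shard):
        raise IndexError(f"shard {shard_index} is out of range")
    start = shard_index * rows_per_shard
    return start, min(start + rows_per_shard, total_rows)


def current_shard_index(default: int = 0) -> int:
    """Read the shard index from SLURM's job-array variable, if it is set."""
    return int(os.environ.get("SLURM_ARRAY_TASK_ID", default))


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    total, per_shard = 10, 4
    idx = current_shard_index()
    print(idx, shard_range(total, per_shard, idx))
```

Because each task derives its range from its own index, tasks need no coordination with each other, which is what makes the computation embarrassingly parallel.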
| import torch | |
| from torch import nn | |
| from torch.nn import functional as F | |
| from torch.autograd import Variable | |
| import sys | |
| nlen = 5 | |
| model_type = nn.LSTM | |
| running_loss = 1 |
| pkgname='singularity-container' | |
| pkgver='2.3' | |
| pkgrel='0' | |
| pkgdesc='Container platform focused on supporting "Mobility of Compute".' | |
| arch=('i686' 'x86_64') | |
| url='http://singularity.lbl.gov' | |
| license=('BSD') | |
| depends=('bash' 'python') | |
| source=("https://github.com/singularityware/singularity/releases/download/${pkgver}/singularity-${pkgver}.tar.gz") | |
| md5sums=('dbc02b17f15680c378c1ec9e4d80956d') |
| #include <ctime> | |
| #include <iostream> | |
| #include "replayer.h" | |
| using namespace std; | |
| using namespace torchcraft::replayer; | |
| int main() { | |
| std::clock_t start; | |
| double duration; |
| # This script tries as best as possible to filter out bad replays | |
| # Pass it a subdir, and it will read all '.rep' files, and spit out a list | |
| # of the corrupt files in stdout | |
| from __future__ import print_function | |
| from pyreplib import replay | |
| from itertools import repeat | |
| from multiprocessing import Pool, Process, Pipe | |
| from multiprocessing.pool import ThreadPool | |
| from Queue import Queue | |
| import os |
| # This script tries as best as possible to filter out bad replays | |
| # Pass it a subdir, and it will read all '.rep' files, and spit out a list | |
| # of the corrupt files in stdout | |
| from __future__ import print_function | |
| from pyreplib import replay # https://github.com/HearthSim/pyreplib/ | |
| from itertools import repeat | |
| from multiprocessing import Pool, Process, Pipe | |
| from multiprocessing.pool import ThreadPool | |
| import os | |
| import sys |
| import torch | |
| import torch.nn as nn | |
| import torch.nn.functional as F | |
| from torch.autograd import Variable | |
| class Policy(nn.Module): | |
| def __init__(self): | |
| super(Policy, self).__init__() | |
| self.affine1 = nn.Linear(4, 128) |
| import argparse | |
| import gym | |
| import numpy as np | |
| from itertools import count | |
| import torch | |
| import torch.nn as nn | |
| import torch.nn.functional as F | |
| import torch.optim as optim | |
| import torch.autograd as autograd |