@nousr
Last active October 8, 2022 04:42
2022-10-08 03:30:04 1922586752 img_emb_0000.npy
2022-10-08 03:57:15 1920735360 img_emb_0022.npy
2022-10-08 03:28:48 1921157248 img_emb_0044.npy
2022-10-08 03:28:30 1922345088 img_emb_0066.npy
2022-10-08 03:26:16 1922654336 img_emb_0088.npy
2022-10-08 03:31:30 1922353280 img_emb_0110.npy
2022-10-08 03:28:30 1922943104 img_emb_0132.npy
2022-10-08 03:26:54 1923940480 img_emb_0154.npy
2022-10-08 03:31:19 1922218112 img_emb_0176.npy
2022-10-08 03:28:39 1922607232 img_emb_0198.npy
2022-10-08 03:32:57 1922828416 img_emb_0220.npy
2022-10-08 03:28:01 1923815552 img_emb_0242.npy
2022-10-08 03:26:43 1922560128 img_emb_0264.npy
2022-10-08 03:26:43 1922967680 img_emb_0286.npy
2022-10-08 03:27:35 1922427008 img_emb_0308.npy
2022-10-08 03:27:40 1922832512 img_emb_0330.npy
2022-10-08 03:28:55 1923082368 img_emb_0352.npy
2022-10-08 03:27:47 1923119232 img_emb_0374.npy
2022-10-08 03:27:39 1921964160 img_emb_0396.npy
2022-10-08 03:27:38 1922584704 img_emb_0418.npy
2022-10-08 03:27:33 1921833088 img_emb_0440.npy
2022-10-08 03:27:46 1922943104 img_emb_0462.npy
2022-10-08 03:27:50 1923614848 img_emb_0484.npy
2022-10-08 03:27:43 1921953920 img_emb_0506.npy
2022-10-08 03:28:00 1922412672 img_emb_0528.npy
2022-10-08 03:27:43 1922779264 img_emb_0550.npy
2022-10-08 03:28:02 1922465920 img_emb_0572.npy
2022-10-08 03:27:39 1922818176 img_emb_0594.npy
2022-10-08 03:27:47 1922420864 img_emb_0616.npy
2022-10-08 03:27:38 1922777216 img_emb_0638.npy
2022-10-08 03:28:04 1922248832 img_emb_0660.npy
2022-10-08 03:27:49 1923229824 img_emb_0682.npy
2022-10-08 03:27:46 1923850368 img_emb_0704.npy
2022-10-08 03:28:02 1922738304 img_emb_0726.npy
2022-10-08 03:27:44 1922048128 img_emb_0748.npy
2022-10-08 03:27:45 1922816128 img_emb_0770.npy
2022-10-08 03:27:37 1922525312 img_emb_0792.npy
2022-10-08 03:27:57 1922705536 img_emb_0814.npy
2022-10-08 03:27:53 1922912384 img_emb_0836.npy
2022-10-08 03:27:40 1922680960 img_emb_0858.npy
2022-10-08 03:27:38 1923070080 img_emb_0880.npy
2022-10-08 03:27:51 1922060416 img_emb_0902.npy
2022-10-08 03:27:32 1921562752 img_emb_0924.npy
2022-10-08 03:27:37 1923018880 img_emb_0946.npy
2022-10-08 03:27:37 1922805888 img_emb_0968.npy
2022-10-08 03:27:44 1922154624 img_emb_0990.npy
2022-10-08 03:27:41 1922824320 img_emb_1012.npy
2022-10-08 03:27:45 1922283648 img_emb_1034.npy
2022-10-08 03:27:38 1922242688 img_emb_1056.npy
2022-10-08 03:27:28 1921378432 img_emb_1078.npy
2022-10-08 03:27:34 1920444544 img_emb_1100.npy
2022-10-08 03:27:41 1921202304 img_emb_1122.npy
2022-10-08 03:27:57 1922375808 img_emb_1144.npy
2022-10-08 03:27:37 1923512448 img_emb_1166.npy
2022-10-08 03:27:41 1922578560 img_emb_1188.npy
2022-10-08 03:27:55 1921818752 img_emb_1210.npy
2022-10-08 03:27:55 1922689152 img_emb_1232.npy
2022-10-08 03:27:41 1922623616 img_emb_1254.npy
2022-10-08 03:27:37 1921390720 img_emb_1276.npy
2022-10-08 03:27:45 1922076800 img_emb_1298.npy
2022-10-08 03:28:04 1922015360 img_emb_1320.npy
2022-10-08 03:27:55 1922920576 img_emb_1342.npy
2022-10-08 03:27:34 1922005120 img_emb_1364.npy
2022-10-08 03:27:51 1922238592 img_emb_1386.npy
2022-10-08 03:27:38 1922621568 img_emb_1408.npy
2022-10-08 03:27:04 1907409024 img_emb_1430.npy
2022-10-08 03:27:46 1922533504 img_emb_1452.npy
2022-10-08 03:27:46 1922306176 img_emb_1474.npy
2022-10-08 03:27:46 1922742400 img_emb_1496.npy
2022-10-08 03:27:38 1920686208 img_emb_1518.npy
2022-10-08 03:27:38 1922500736 img_emb_1540.npy
2022-10-08 03:27:42 1922103424 img_emb_1562.npy
2022-10-08 03:27:34 1920682112 img_emb_1584.npy
2022-10-08 03:27:44 1922971776 img_emb_1606.npy
2022-10-08 03:27:41 1922730112 img_emb_1628.npy
2022-10-08 03:27:39 1921812608 img_emb_1650.npy
2022-10-08 03:27:46 1922064512 img_emb_1672.npy
2022-10-08 03:27:35 1921951872 img_emb_1694.npy
2022-10-08 03:28:06 1921894528 img_emb_1716.npy
2022-10-08 03:27:40 1920256128 img_emb_1738.npy
2022-10-08 03:27:47 1921583232 img_emb_1760.npy
2022-10-08 03:33:15 1921009792 img_emb_1782.npy
2022-10-08 03:27:48 1920817280 img_emb_1804.npy
2022-10-08 03:27:45 1922064512 img_emb_1826.npy
2022-10-08 03:27:45 1922674816 img_emb_1848.npy
2022-10-08 03:27:58 1922832512 img_emb_1870.npy
2022-10-08 03:27:45 1922361472 img_emb_1892.npy
2022-10-08 03:27:48 1922324608 img_emb_1914.npy
2022-10-08 03:27:47 1922568320 img_emb_1936.npy
2022-10-08 03:27:55 1922297984 img_emb_1958.npy
2022-10-08 03:27:52 1922859136 img_emb_1980.npy
2022-10-08 03:27:40 1921443968 img_emb_2002.npy
2022-10-08 03:27:47 1922105472 img_emb_2024.npy
2022-10-08 03:27:55 1922168960 img_emb_2046.npy
2022-10-08 03:27:40 1920700544 img_emb_2068.npy
2022-10-08 03:27:40 1921730688 img_emb_2090.npy
2022-10-08 03:27:43 1922168960 img_emb_2112.npy
2022-10-08 03:27:33 1920577664 img_emb_2134.npy
2022-10-08 03:27:39 1921333376 img_emb_2156.npy
2022-10-08 03:27:35 1922615424 img_emb_2178.npy
2022-10-08 03:27:27 1922080896 img_emb_2200.npy
2022-10-08 03:27:45 1923205248 img_emb_2222.npy
2022-10-08 03:26:50 1922422912 img_emb_2244.npy
2022-10-08 03:25:52 1903550592 img_emb_2266.npy
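The listing above can be checked mechanically rather than by eye. The following is a minimal sketch, assuming every line has the shape `date time size_in_bytes img_emb_NNNN.npy`; the helper names `parse_listing` and `index_strides` are made up for illustration, and only three lines of the listing are embedded to keep the example short.

```python
import re

# Three lines copied from the listing above; the remaining lines have the same shape.
LISTING = """\
2022-10-08 03:30:04 1922586752 img_emb_0000.npy
2022-10-08 03:57:15 1920735360 img_emb_0022.npy
2022-10-08 03:28:48 1921157248 img_emb_0044.npy
"""

# date, time, size in bytes, partition index (assumed line format).
LINE_RE = re.compile(r"^(\S+) (\S+) (\d+) img_emb_(\d+)\.npy$")


def parse_listing(text):
    """Return a list of (partition_index, size_in_bytes) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            rows.append((int(match.group(4)), int(match.group(3))))
    return rows


def index_strides(rows):
    """Return the set of gaps between consecutive partition indices."""
    indices = sorted(index for index, _ in rows)
    return {b - a for a, b in zip(indices, indices[1:])}


rows = parse_listing(LISTING)
strides = index_strides(rows)
print(rows)
print(strides)
```

Pasting the full listing into `LISTING` would show whether the gap between partition indices is constant and which files are noticeably smaller than the rest.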
Traceback (most recent call last):
  File "/fsx/nousr/clip-retrieval/.env/bin/clip-retrieval", line 11, in <module>
    load_entry_point('clip-retrieval', 'console_scripts', 'clip-retrieval')()
  File "/fsx/nousr/clip-retrieval/clip_retrieval/cli.py", line 27, in main
    "parquet_to_arrow": parquet_to_arrow,
  File "/fsx/nousr/clip-retrieval/.env/lib64/python3.7/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/fsx/nousr/clip-retrieval/.env/lib64/python3.7/site-packages/fire/core.py", line 471, in _Fire
    target=component.__name__)
  File "/fsx/nousr/clip-retrieval/.env/lib64/python3.7/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/fsx/nousr/clip-retrieval/clip_retrieval/clip_inference/worker.py", line 118, in worker
    runner(task)
  File "/fsx/nousr/clip-retrieval/clip_retrieval/clip_inference/runner.py", line 27, in __call__
    reader = self.reader_builder(sampler)
  File "/fsx/nousr/clip-retrieval/clip_retrieval/clip_inference/worker.py", line 75, in reader_builder
    cache_path=cache_path,
  File "/fsx/nousr/clip-retrieval/clip_retrieval/clip_inference/reader.py", line 238, in __init__
    input_sampler=sampler,
  File "/fsx/nousr/clip-retrieval/clip_retrieval/clip_inference/reader.py", line 129, in create_webdataset
    dataset = wds.WebDataset(urls, cache_dir=cache_path, cache_size=10**10, handler=wds.handlers.warn_and_continue)
  File "/fsx/nousr/clip-retrieval/.env/lib64/python3.7/site-packages/webdataset/dataset.py", line 80, in WebDataset
    result = PytorchShardList(urls, shuffle=shardshuffle)
  File "/fsx/nousr/clip-retrieval/.env/lib64/python3.7/site-packages/webdataset/shardlists.py", line 284, in __init__
    urls = SimpleShardSample(urls)
  File "/fsx/nousr/clip-retrieval/.env/lib64/python3.7/site-packages/webdataset/shardlists.py", line 173, in __init__
    assert isinstance(self.urls[0], str)
IndexError: list index out of range
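The last frame indexes `self.urls[0]`, so the `IndexError` indicates that the shard URL list reaching `wds.WebDataset` was empty for this worker. Below is a minimal sketch of that failure mode together with an early guard; `require_shards` is a hypothetical helper written for illustration and is not part of clip-retrieval or webdataset.

```python
def require_shards(urls, task_name="task"):
    """Fail early with a readable message when a worker receives no shards.

    Hypothetical helper for illustration only.
    """
    urls = list(urls)
    if not urls:
        raise ValueError(f"{task_name}: no shard URLs to read; check the input sampler")
    return urls


# Indexing an empty list reproduces the exception type from the traceback.
try:
    [][0]
except IndexError as err:
    reproduced = str(err)

# The guard passes a non-empty list through unchanged.
checked = require_shards(["shard-000000.tar"])
print(reproduced)
print(checked)
```

A check of this kind, placed before the dataset is constructed, would turn the bare `IndexError` into a message that names the task with no input.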
#!/bin/bash
clip-retrieval inference \
--input_dataset="pipe:aws s3 cp --quiet s3://s-datasets/laion5b/laion2B-data/{000000..231349}.tar -" \
--output_folder="s3://s-laion/vit-h-14-embeddings" \
--input_format="webdataset" \
--enable_metadata=True \
--write_batch_size=1000000 \
--num_prepro_workers=2 \
--batch_size=64 \
--enable_wandb=True \
--clip_model="open_clip:ViT-H-14" \
--distribution_strategy="slurm" \
--slurm_job_name="h-embeds" \
--slurm_partition="gpu" \
--slurm_jobs=104 \
--slurm_job_comment="dalle2" \
--slurm_job_timeout=150000 \
--cache_path=None \
--clip_cache_path=None \
--slurm_cache_path=None
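The `--input_dataset` value uses a brace range to name the input shards. As a rough sketch, assuming the range expands like a shell-style brace range (inclusive at both ends, zero padding preserved), the number of shard URLs can be counted as follows; `expand_numeric_range` is an illustrative helper, not the expansion routine the tools themselves use.

```python
import re

# Value copied from the --input_dataset flag in the script above.
INPUT_DATASET = "pipe:aws s3 cp --quiet s3://s-datasets/laion5b/laion2B-data/{000000..231349}.tar -"


def expand_numeric_range(pattern):
    """Expand a single {start..end} numeric range, keeping zero padding."""
    match = re.search(r"\{(\d+)\.\.(\d+)\}", pattern)
    if match is None:
        return [pattern]
    start, end = match.group(1), match.group(2)
    width = len(start)
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    return [f"{prefix}{i:0{width}d}{suffix}" for i in range(int(start), int(end) + 1)]


shards = expand_numeric_range(INPUT_DATASET)
print(len(shards))
print(shards[0])
print(shards[-1])
```

Comparing this count with the number of partitions and jobs is one way to sanity-check whether every worker can expect a non-empty share of the input.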