@lgray
lgray / example.py
Created January 8, 2024 20:53
client.compute() example for coffea
with Client() as client:  # distributed Client scheduler
    # Run preprocess
    print("\nRunning preprocess...")
    dataset_runnable, dataset_updated = preprocess(
        fileset,
        maybe_step_size=50_000,
        align_clusters=False,
        files_per_batch=1,
        #skip_bad_files=True,
@lgray
lgray / profile.txt
Created January 10, 2024 13:43
profile of running dak.necessary_columns on a wwz analysis
Running necessary_columns...
  _     ._   __/__   _ _  _  _ _/_   Recorded: 07:32:07  Samples:  154011
 /_//_/// /_\ / //_// / //_'/ //     Duration: 164.460   CPU time: 164.503
/   _/                      v4.6.1
Program: run_wwz4l.py ../../input_samples/sample_jsons/test_samples/UL17_WWZJetsTo4L2Nu_forCI.json,../../input_samples/sample_jsons/test_samples/UL17_WWZJetsTo4L2Nu_forCI_extra.json -x iterative
164.462 <module> run_wwz4l.py:1
└─ 164.462 report_necessary_columns dask_awkward/lib/inspect.py:118
@lgray
lgray / ewkcoffea_setup.txt
Last active February 12, 2024 22:04
instructions for slow analysis example
# in a fresh conda environment, >= py3.8
conda install xrootd -c conda-forge
pip install coffea xgboost mt2
git clone https://github.com/TopEFT/topcoffea.git -b coffea2023
pushd topcoffea
pip install -e .
popd
git clone https://github.com/cmstas/ewkcoffea.git -b coffea2023
# in a fresh conda environment, >= py3.8
conda install xrootd -c conda-forge
# install dask-awkward@main, install dask-histogram from https://github.com/lgray/dask-histogram/tree/map_reduce_agg_hist_adds
pip install coffea xgboost mt2 distributed==2024.2.0 dask==2024.2.0
git clone https://github.com/TopEFT/topcoffea.git -b coffea2023
pushd topcoffea
pip install -e .
popd
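
The second set of instructions pins distributed and dask to 2024.2.0. After installing, it can be worth confirming that the environment actually resolved to those versions. The following is a minimal sketch using only the standard library's importlib.metadata; the helper name check_pins and the PINS dictionary are hypothetical and are not part of coffea, topcoffea, or ewkcoffea.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the pip install line in the setup notes.
PINS = {"distributed": "2024.2.0", "dask": "2024.2.0"}


def check_pins(pins, get_version=version):
    """Return {package: (expected, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed.
    `get_version` is injectable so the check can be exercised without
    touching the real environment (hypothetical convenience, not a library API).
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    problems = check_pins(PINS)
    for name, (expected, found) in problems.items():
        print(f"{name}: expected {expected}, found {found}")
    if not problems:
        print("dask and distributed match the pinned versions")
```

Run it inside the activated conda environment. An empty result means both packages match the pins; otherwise each mismatch is listed with the expected and installed version.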