To get this to run, I needed to backtrack through the scenario as stored in the GARD evaluations. My "travelogue" follows the "How To", in case I need to recapitulate this for another job.
First make things look like a "standard" armory installation:
mkdir -p ~/git/twosixlabs
cd ~/git/twosixlabs
git clone https://github.com/twosixlabs/armory.git
git clone https://github.com/twosixlabs/gard-submissions.git
git clone https://github.com/twosixlabs/gard-evaluations.git
Here ~/git/twosixlabs is the default local_git_dir specified in ~/.armory/config.json; this is where armory will look to satisfy the references in the configuration.
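For concreteness, the relevant entry in ~/.armory/config.json looks something like this (with <user> as a placeholder; other keys omitted, and the exact key set depends on the armory version, so treat this as a sketch rather than a full config):
{
    "local_git_dir": "/home/<user>/git/twosixlabs"
}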
Next, check out the eval3 workbranches:
cd gard-submissions && git checkout eval3 && cd ..
cd gard-evaluations && git checkout eval3 && cd ..
To get armory to run in a no-docker environment, first make and activate a virtualenv, then install the requirements shown below.
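A minimal sketch of the virtualenv step (the ~/venvs/armory location is my choice, not part of the original recipe):
python3 -m venv ~/venvs/armory
source ~/venvs/armory/bin/activate
With the virtualenv active, install: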
pip install armory-testbed
pip install -r ~/git/twosixlabs/armory/requirements.txt
pip install -r ~/git/twosixlabs/armory/host-requirements.txt
pip install tensorflow
(The last install is there because data/datasets.py needs it.) Bring the gard-submissions and gard-evaluations directories up to date:
gard-submissions$ git fetch; git checkout eval3
gard-evaluations$ git fetch; git checkout eval3
Finally,
armory run JHUM_resisc_ddpa.json --no-docker --check
That was focused on getting that configuration to run. Now, looking back at Kevin's original report:
Getting an error when running the code during the model download process for JHUM_resisc_ddpa.json
...
File "/workspace/armory/data/utils.py", line 75, in maybe_download_weights_from_s3
f"{weights_file} was not found in the armory public & submission S3 buckets."
ValueError: resisc_resnet_test.pth.tar was not found in the armory public & submission S3 buckets.
I've checked in the gard-evaluations config file (92a20a8e) with the weights name as submitted, JHUM_resisc10_pth.tar.
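That name lives in the model block of the scenario config, roughly like this (abbreviated; the surrounding fields follow the usual armory config schema, so treat the exact shape as approximate):
"model": {
    "weights_file": "JHUM_resisc10_pth.tar",
    ...
},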
The original validation run has completed:
"results": {
"benign_validation_accuracy": 0.63,
"benign_validation_accuracy_targeted_class": 0.51,
"poisoned_targeted_misclassification_accuracy": 0.21,
"poisoned_test_accuracy": 0.661
},
These agree with the performer's benign/targeted expectations of 0.663 and 0.22.
This is kept here for reference in case anyone needs to revisit it.
Running armory --check yields
ModuleNotFoundError: No module named 'armory.scenarios.poisoning_resisc10_scenario'
after pulling the github ref mmoayeri/JHU-GARD-Eval3-DDPA@main. The db5 job uses the twosixarmory/pytorch:0.12.3 container, and the module armory.scenarios.poisoning_resisc10_scenario was added in armory 0.13.1.
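That module is named in the scenario block of the config, which looks roughly like this (abbreviated to the relevant field):
"scenario": {
    "module": "armory.scenarios.poisoning_resisc10_scenario",
    ...
},
so the armory bundled in the 0.12.3 container simply has nothing to import under that name.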
I suspect that this was worked around by the addition of
"local_repo_path": [
"twosixlabs/gard-submissions/repos/twosixlabs__armory__default/armory",
"twosixlabs/gard-submissions/repos/mmoayeri__JHU-GARD-Eval3-DDPA__main/JHU-GARD-Eval3-DDPA",
"twosixlabs/gard-evaluations/JHUM_resisc_ddpa"
],
…
which tries to pull in the scenario that didn't exist in the twosixarmory/pytorch:0.12.3
container. Unfortunately, I do not know how to recapitulate that environment.
So I will try to re-target the configuration to twosixarmory/pytorch:0.13.3.
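In the config this is the sysconfig docker_image field, i.e. something like (again abbreviated):
"sysconfig": {
    "docker_image": "twosixarmory/pytorch:0.13.3",
    ...
},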
And... that fails in a complicated fashion:
2021-07-22 19:34:11 6f131c5cbe32 armory.scenarios.poisoning_resisc10_scenario[6] INFO Fitting model of dpa_ensemble.get_art_model...
2021-07-22 19:34:12 6f131c5cbe32 armory.scenarios.base[6] ERROR Encountered error during scenario evaluation.
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/armory/scenarios/base.py", line 84, in evaluate
config, num_eval_batches, skip_benign, skip_attack, skip_misclassified
File "/opt/conda/lib/python3.7/site-packages/armory/scenarios/poisoning_resisc10_scenario.py", line 266, in _evaluate
shuffle=True,
File "/opt/conda/lib/python3.7/site-packages/art/estimators/classification/classifier.py", line 71, in replacement_function
return fdict[func_name](self, *args, **kwargs)
File "/armory/tmp/2021-07-22T193355.982539/external/JHU-GARD-Eval3-DDPA/dpa_ensemble.py", line 65, in fit
labels = torch.as_tensor(y,device=self.device)
TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.
2021-07-22 19:34:12.060842: W tensorflow/core/kernels/data/cache_dataset_ops.cc:757] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
Which suggests that it is worth another try on repos/twosixlabs__armory__default/armory.
I found a very specific fix for the "can't convert" fault in the recent traceback: a single GitHub commit adds an .astype(int) cast to the argument of the torch.as_tensor call. That commit is dated 16-Jul-2021 (six days ago).
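My reading of that change, reconstructed from the traceback rather than copied verbatim from the commit, is roughly:
# in dpa_ensemble.py, fit()
# before (from the traceback): fails because y arrives as a numpy object array
#   labels = torch.as_tensor(y, device=self.device)
# after (with the commit's cast): convert y to an integer dtype first
labels = torch.as_tensor(y.astype(int), device=self.device)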
Because the local_repo_path block shown above references both repos/twosixlabs__armory__default and repos/mmoayeri__JHU-GARD-Eval3-DDPA__main, I hope that armory will pull them.
But it does not:
File "/opt/conda/lib/python3.7/site-packages/armory/utils/external_repo.py", line 32, in add_path
raise ValueError(f"{path} is not a valid directory path")
ValueError: /armory/git/twosixlabs/gard-submissions/repos/twosixlabs__armory__default is not a valid directory path
So it is time to find out how armory processes local_repo_path. Sure enough, the code in armory/scenarios/base.py explicitly loads local_repo_path and then calls external_repo.add_local_repo on it, which appears to load those three paths without error. What is not clear is whether those definitions, which duplicate the running bits of armory, will actually get called.
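From the ValueError above and the names in that code path, the processing appears to be roughly the following. This is my paraphrase of the behavior, not armory's actual source:
import os
import sys

def add_path(path):
    # the configured local_repo_path entry must exist on disk ...
    if not os.path.isdir(path):
        raise ValueError(f"{path} is not a valid directory path")
    # ... and is then, presumably, prepended to sys.path so that modules
    # found there shadow the armory installed in the container or virtualenv
    sys.path.insert(0, path)
Whether that shadowing takes effect for modules armory has already imported is exactly the open question above.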
There was a hack in eval2 which said "need to run armory inside the armory source repository", but that doesn't have the desired effect because the source repo would be a released v0.13.x and we need the overridden symbols. So I'm trying it in the --no-docker environment...