tiramisu-brulee exercise
================================================================================
The goal of this exercise is to get up-and-running with the tiramisu-brulee
package found here: https://github.com/jcreinhold/tiramisu-brulee
Use the documentation if you get stuck: https://tiramisu-brulee.readthedocs.io
If you run into a bug, raise an issue here:
https://github.com/jcreinhold/tiramisu-brulee/issues
(Follow the bug template!)
Author: Jacob Reinhold ([email protected])
--------------------------------------------------------------------------------

Steps:

1) Create a conda environment with PyTorch (with an appropriate version of
   CUDA).

2) Install tiramisu-brulee with the [lesionseg] extras.

   Open python and run:

       >>> import tiramisu_brulee
       >>> tiramisu_brulee.__version__
       '0.1.15'

   Your version should be '0.1.15'. If it's not, re-run `pip install` with
   the -U flag (to upgrade your version).

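   For example, here is a minimal sketch of steps 1 and 2; the environment
   name, Python version, and CUDA toolkit version are placeholders, so adjust
   them to your system:

       conda create -n tiramisu python=3.9
       conda activate tiramisu
       conda install pytorch cudatoolkit=11.1 -c pytorch -c nvidia
       pip install "tiramisu-brulee[lesionseg]"
       python -c "import tiramisu_brulee; print(tiramisu_brulee.__version__)"
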
3) Create a configuration file for training a network and use all of the
   train_*.csv files for the `train_csv` argument and all of the valid_*.csv
   files for the `valid_csv` argument. Make sure they are in correspondence!

   You can create a config file for training by running, e.g.,

       lesion-train --print_config > train_config.yaml

   You can create a config file for training *with help comments* by
   running, e.g.,

       lesion-train --print_config=comments > train_config.yaml

   (IMO, this is hard to read, but it may be useful for starting out.)

   ** All images are assumed to be co-registered (i.e., the anatomy is in
   alignment and the image dimensions are the same for each contrast as well
   as the labels). This is already the case with the data in this exercise,
   but, in general, tiramisu-brulee assumes the images used for
   training/prediction have already been preprocessed (e.g., registered,
   bias field corrected, intensity normalized). **

   You'll need to modify the `num_input` argument to match the number of
   non-label columns in the csv files.

   Generally, this package is used for (binary) lesion segmentation; however,
   there is limited support for multi-class segmentation. Set `num_classes`
   to however many classes there are in your target images. tiramisu-brulee
   assumes that the multi-class image labels are arrays with a unique integer
   representing each class. Note that only the `combo` loss function (the sum
   of Dice and cross-entropy) can be used for multi-class segmentation.

   Make sure to set the `gpus` argument to 1 or 2. If you set 2, you should
   also change `accelerator` to `ddp` and set `sync_batchnorm` to true.

   Note that you'll have to figure out the right batch size (or use
   `auto_scale_batch_size`)! The default may cause you to run out of CUDA
   memory.

   You should also consider using a 2.5D (or pseudo-3D) network; the
   state-of-the-art in MS lesion segmentation uses such a methodology.
   Basically, a 2.5D network uses 2D operations instead of 3D, and the input
   to the network is a stack of adjacent slices concatenated along the
   channel dimension.

   To use a 2.5D/pseudo-3D network, determine which axis you want to stack
   the slices on (for the ISBI 15 data, if you want axial slices it should be
   2) and set `pseudo3d_dim` to that axis; it must be 0, 1, or 2. Then change
   the patch size to something like:

       - 128
       - 128

   and set `pseudo3d_size` to something small and odd, like 3. If you have
   set `num_input` to N and `pseudo3d_size` to M, then this will result in a
   2D network with N * M input channels, trained/validated on 128x128 patches
   (if you set the patch size as above).

   Note that you can set `pseudo3d_dim` per set of train/valid CSVs, e.g., if
   you have two train/valid CSV pairs, then you can set `pseudo3d_dim` to:

       - 1
       - 2

   which sets the network corresponding to the first train/valid CSV pair to
   have pseudo3d_dim == 1 and the network for the second pair to have
   pseudo3d_dim == 2 (the same `pseudo3d_size` applies to both). This can be
   useful to replicate the training/prediction scheme used in the original
   Tiramisu 2.5D paper.

   I'd set `verbosity` to 1, because there is some helpful logging. It's also
   best practice to set `benchmark` to true; it should speed up training when
   the input to the network doesn't change in size (which is the case with
   lesion-train).

   Consider setting `precision` to 16 for mixed-precision training (it's more
   memory-efficient, so you can use a larger batch size). Also, decide about
   setting `label_sampler`, `spatial_augmentation`, and `mixup` to true; read
   about them in the documentation!

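   To tie the options above together, here is a sketch of how the fields
   discussed in this step might look in train_config.yaml for a two-GPU,
   2.5D setup. The field names come from the discussion above, but the values
   and CSV file names are placeholders, other required fields generated by
   --print_config are omitted, and the exact layout may differ from the
   generated file (trust the generated file if they disagree):

       train_csv:
         - train_1.csv
         - train_2.csv
       valid_csv:
         - valid_1.csv
         - valid_2.csv
       num_input: 2                # e.g., T1 and FLAIR columns in the CSVs
       num_classes: 1              # binary lesion segmentation
       gpus: 2
       accelerator: ddp
       sync_batchnorm: true
       batch_size: 16              # tune this (or use auto_scale_batch_size)
       patch_size:
         - 128
         - 128
       pseudo3d_dim:
         - 1                       # one entry per train/valid CSV pair
         - 2
       pseudo3d_size: 3            # small and odd
       verbosity: 1
       benchmark: true
       precision: 16
       label_sampler: true
       spatial_augmentation: true
       mixup: true
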
4) Use the config file to train a set of lesion segmentation neural networks
   with:

       lesion-train --config train_config.yaml

   This will create a directory called `lesion_tiramisu_experiment` in the
   directory in which you run the above command. The more times you run the
   above command, the more `lesion_tiramisu_experiment/version_*` directories
   will be created, so the current run is usually in the last `version_*`
   directory. Note that if you provide N `train_csv` files, N `version_*`
   directories will be created per run, e.g., if `lesion_tiramisu_experiment`
   contains `version_12` and you start training with 3 CSV files, then
   lesion-train will create `version_13`, `version_14`, and `version_15`.

   You can monitor your experiment with tensorboard by running, e.g.,

       tensorboard --logdir=lesion_tiramisu_experiment/version_13

   (where the version number is changed appropriately based on your
   experiment). On your local machine you can run, e.g.,

       ssh -N -f -L localhost:6006:localhost:6006 [username]@par.ece.jhu.edu

   Change the username and hostname (par) appropriately based on your
   username and the hostname you start tensorboard on. (You might also have
   to change 6006 to something else if 6006 is already being used;
   tensorboard will print out the number, and it'll be something close to
   6006, like 6007 or 6008.) Then you can open your browser, go to
   localhost:6006 (or 6007 or whatever), and view the tensorboard log of
   metrics, images, and predictions.

5) Once training is completed, a `predict_config.yaml` file will be generated
   in every `lesion_tiramisu_experiment/version_*` directory, with the best
   model path already filled in according to the (approximate) ISBI 15 score
   on the validation data.

   Copy one of those config files and modify it to use the `predict.csv`
   file. I'd also enable 1 GPU to speed things up.

   You can either use patches for prediction (by setting `patch_size`) or
   predict the whole image volume at once. If you predict with patches,
   you'll need to tune the batch size. If you predict the whole volume at
   once, I'd leave `batch_size` at 1 because, even though the image is
   cropped based on estimated foreground values and only the foreground
   region is input to the network, whole-volume prediction is still
   memory-intensive.

   If you set `pseudo3d_dim` for training, an option to make prediction
   faster is to set `patch_size` to use the full image dimensions along the
   non-`pseudo3d_dim` axes. To do so, you can set `patch_size` to:

       - null
       - null

   Note that `pseudo3d_size` must be the same as used in training. If the
   image to predict on has shape `H x W x D`, then the input to the network
   will be `H x W x (M * N)`, where `N` is the `num_input` set in training
   and `M` is the `pseudo3d_size`. This will speed up prediction because
   redundant computation is avoided by predicting non-overlapping patches.
   In general, you should leave `patch_overlap` as `null`, regardless,
   because the correct `patch_overlap` will be automatically determined from
   `patch_size` such that there are no missing predictions.

   If you are using multiple networks for prediction (by providing multiple
   model paths) and those networks are pseudo-3D networks, then you should
   set `pseudo3d_dim` to either a single number to be used across all models,
   e.g.,

       pseudo3d_dim:
         - 1

   or, if each model doesn't use the same `pseudo3d_dim`, one number per
   model, e.g.,

       pseudo3d_dim:
         - 1
         - 2

   where each number corresponds to a model path.

   If you run out of memory, try it on a machine with more memory, use
   patch-based prediction, and/or try setting the precision to 16.

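   As an illustrative sketch, the prediction-related fields discussed above
   might look like the following in your copied config. The `predict_csv`
   field name and the exact layout are assumptions, so defer to the generated
   predict_config.yaml:

       predict_csv: predict.csv    # assumed field name; check the generated file
       batch_size: 1
       patch_size:
         - null
         - null                    # full size along the non-pseudo3d axes
       patch_overlap: null         # determined automatically from patch_size
       pseudo3d_dim:
         - 1                       # one entry per model path
         - 2
       pseudo3d_size: 3            # must match training
       precision: 16
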
6) Create a directory for the output predictions and modify the `predict.csv`
   file's `out` column to save the images in that path. You need to create a
   unique filename for each prediction, so brush up on your text
   editor/sed/awk skills, or use Python with pandas: read the CSV with
   `pandas.read_csv(...)`, modify the dataframe, and save the modified CSV
   with `.to_csv(...)`.

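   For example, here is a minimal Python sketch; the CSV path, output
   directory, and filename pattern are placeholders, and only the `out`
   column name comes from the step above:

       import os
       import pandas as pd

       df = pd.read_csv("predict.csv")
       out_dir = "predictions"
       os.makedirs(out_dir, exist_ok=True)
       # give each row a unique output filename based on its row index
       df["out"] = [os.path.join(out_dir, f"pred_{i}.nii.gz")
                    for i in range(len(df))]
       df.to_csv("predict.csv", index=False)
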
7) Use `lesion-predict --config predict_config.yaml` to run prediction. Once
   you've run this once, you can run it again with the `only_aggregate`
   option set to true to test out a different threshold without recomputing
   all of the intermediate predictions. If you want to keep the final
   predictions from your previous run, make sure you move them out first,
   because the new run will overwrite them!

   Alternatively, use the `lesion-predict-image` script for single time-point
   prediction. Note that this interface doesn't accept a config file and that
   you input each image using the same name used for the header in training,
   e.g.,

       lesion-predict-image --t1 /path/to/t1.nii --flair /path/to/flair.nii \
           --out path/to/prediction.nii ...

   where `--out` is the output prediction and `--label` is excluded.

8) You're done! Now you know the basics of how to use tiramisu-brulee CLIs
   for lesion segmentation. There are many options to read about and ways to
   modify the network for your specific problem. Report any bugs or major
   difficulties at the link in the header. Best of luck!