tiramisu-brulee exercise
================================================================================
The goal of this exercise is to get up-and-running with the tiramisu-brulee
package found here: https://github.com/jcreinhold/tiramisu-brulee
Use the documentation if you get stuck: https://tiramisu-brulee.readthedocs.io
If you run into a bug, raise an issue here:
https://github.com/jcreinhold/tiramisu-brulee/issues
(Follow the bug template!)
Author: Jacob Reinhold ([email protected])
--------------------------------------------------------------------------------

Steps:

1) Create a conda environment with PyTorch (with an appropriate version of
   CUDA).

2) Install tiramisu-brulee with the [lesionseg] extras.

   Open python and run:

       >>> import tiramisu_brulee
       >>> tiramisu_brulee.__version__
       '0.1.15'

   Your version should be '0.1.15'. If it's not, re-run `pip install` with
   the -U flag (to upgrade your version).

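   For example, here is a minimal sketch of steps 1 and 2; the environment
   name, Python version, and CUDA toolkit version are placeholders, so adjust
   them to your system:

       conda create -n tiramisu python=3.9
       conda activate tiramisu
       conda install pytorch cudatoolkit=11.1 -c pytorch -c nvidia
       pip install "tiramisu-brulee[lesionseg]"
       python -c "import tiramisu_brulee; print(tiramisu_brulee.__version__)"
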
3) Create a configuration file for training a network and use all of the
   train_*.csv files for the `train_csv` argument and all of the valid_*.csv
   files for the `valid_csv` argument. Make sure they are in correspondence!

   You can create a config file for training by running, e.g.,

       lesion-train --print_config > train_config.yaml

   You can create a config file for training *with help comments* by
   running, e.g.,

       lesion-train --print_config=comments > train_config.yaml

   (IMO, this is hard to read, but it may be useful for starting out.)

   ** All images are assumed to be co-registered (i.e., the anatomy is in
   alignment and the image dimensions are the same for each contrast as well
   as the labels). This is already the case with the data in this exercise,
   but, in general, tiramisu-brulee assumes the images used for
   training/prediction have already been preprocessed (e.g., registered,
   bias field corrected, intensity normalized). **

   You'll need to modify the `num_input` argument to match the number of
   non-label columns in the csv files.

   Generally, this package is used for (binary) lesion segmentation; however,
   there is limited support for multi-class segmentation. Set `num_classes`
   to however many classes there are in your target images. tiramisu-brulee
   assumes that the multi-class image labels are arrays with a unique integer
   representing each class. Note that only the `combo` loss function (the sum
   of Dice and cross-entropy) can be used for multi-class segmentation.

   Make sure to set the `gpus` argument to 1 or 2. If you set 2, you should
   also change `accelerator` to `ddp` and set `sync_batchnorm` to true.

   Note that you'll have to figure out the right batch size (or use
   `auto_scale_batch_size`)! The default may cause you to run out of CUDA
   memory.

   You should also consider using a 2.5D (or pseudo-3D) network; the
   state-of-the-art in MS lesion segmentation uses such a methodology.
   Basically, a 2.5D network uses 2D operations instead of 3D, and the input
   to the network is a stack of adjacent slices concatenated along the
   channel dimension.

   To use a 2.5D/pseudo-3D network, determine which axis you want to stack
   the slices on (for the ISBI 15 data, if you want axial slices it should be
   2) and set `pseudo3d_dim` to that axis; it must be 0, 1, or 2. Then change
   the patch size to something like:

       - 128
       - 128

   and set `pseudo3d_size` to something small and odd, like 3. If you have
   set `num_input` to N and `pseudo3d_size` to M, then this will result in a
   2D network with N * M input channels, trained/validated on 128x128 patches
   (if you set the patch size as above).

   Note that you can set `pseudo3d_dim` per set of train/valid CSVs, e.g., if
   you have two train/valid CSV pairs, then you can set `pseudo3d_dim` to:

       - 1
       - 2

   which sets the network corresponding to the first train/valid CSV pair to
   have pseudo3d_dim == 1 and the network for the second pair to have
   pseudo3d_dim == 2 (the same `pseudo3d_size` applies to both). This can be
   useful to replicate the training/prediction scheme used in the original
   Tiramisu 2.5D paper.

   I'd set `verbosity` to 1, because there is some helpful logging. It's also
   best practice to set `benchmark` to true; it should speed up training when
   the input to the network doesn't change in size (which is the case with
   lesion-train).

   Consider setting `precision` to 16 for mixed-precision training (it's more
   memory-efficient, so you can use a larger batch size). Also, decide about
   setting `label_sampler`, `spatial_augmentation`, and `mixup` to true; read
   about them in the documentation!

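   To tie the options above together, here is a sketch of how the fields
   discussed in this step might look in train_config.yaml for a two-GPU,
   2.5D setup. The field names come from the discussion above, but the values
   and CSV file names are placeholders, other required fields generated by
   --print_config are omitted, and the exact layout may differ from the
   generated file (trust the generated file if they disagree):

       train_csv:
         - train_1.csv
         - train_2.csv
       valid_csv:
         - valid_1.csv
         - valid_2.csv
       num_input: 2                # e.g., T1 and FLAIR columns in the CSVs
       num_classes: 1              # binary lesion segmentation
       gpus: 2
       accelerator: ddp
       sync_batchnorm: true
       batch_size: 16              # tune this (or use auto_scale_batch_size)
       patch_size:
         - 128
         - 128
       pseudo3d_dim:
         - 1                       # one entry per train/valid CSV pair
         - 2
       pseudo3d_size: 3            # small and odd
       verbosity: 1
       benchmark: true
       precision: 16
       label_sampler: true
       spatial_augmentation: true
       mixup: true
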
4) Use the config file to train a set of lesion segmentation neural networks
   with:

       lesion-train --config train_config.yaml

   This will create a directory called `lesion_tiramisu_experiment` in the
   directory in which you run the above command. The more times you run the
   above command, the more `lesion_tiramisu_experiment/version_*` directories
   will be created, so the current run is usually in the last `version_*`
   directory. Note that if you provide N `train_csv` files, N `version_*`
   directories will be created per run, e.g., if `lesion_tiramisu_experiment`
   contains `version_12` and you start training with 3 CSV files, then
   lesion-train will create `version_13`, `version_14`, and `version_15`.

   You can monitor your experiment with tensorboard by running, e.g.,

       tensorboard --logdir=lesion_tiramisu_experiment/version_13

   (where the version number is changed appropriately based on your
   experiment). On your local machine you can run, e.g.,

       ssh -N -f -L localhost:6006:localhost:6006 [username]@par.ece.jhu.edu

   Change the username and hostname (par) appropriately based on your
   username and the hostname you start tensorboard on. (You might also have
   to change 6006 to something else if 6006 is already being used;
   tensorboard will print out the number, and it'll be something close to
   6006, like 6007 or 6008.) Then you can open your browser, go to
   localhost:6006 (or 6007 or whatever), and view the tensorboard log of
   metrics, images, and predictions.

5) Once training is completed, a `predict_config.yaml` file will be generated
   in every `lesion_tiramisu_experiment/version_*` directory, with the best
   model path already filled in according to the (approximate) ISBI 15 score
   on the validation data.

   Copy one of those config files and modify it to use the `predict.csv`
   file. I'd also enable 1 GPU to speed things up.

   You can either use patches for prediction (by setting `patch_size`) or
   predict the whole image volume at once. If you predict with patches,
   you'll need to tune the batch size. If you predict the whole volume at
   once, I'd leave `batch_size` at 1 because, even though the image is
   cropped based on estimated foreground values and only the foreground
   region is input to the network, whole-volume prediction is still
   memory-intensive.

   If you set `pseudo3d_dim` for training, an option to make prediction
   faster is to set `patch_size` to use the full image dimensions along the
   non-`pseudo3d_dim` axes. To do so, you can set `patch_size` to:

       - null
       - null

   Note that `pseudo3d_size` must be the same as used in training. If the
   image to predict on has shape `H x W x D`, then the input to the network
   will be `H x W x (M * N)`, where `N` is the `num_input` set in training
   and `M` is the `pseudo3d_size`. This will speed up prediction because
   redundant computation is avoided by predicting non-overlapping patches.
   In general, you should leave `patch_overlap` as `null`, regardless,
   because the correct `patch_overlap` will be automatically determined from
   `patch_size` such that there are no missing predictions.

   If you are using multiple networks for prediction (by providing multiple
   model paths) and those networks are pseudo-3D networks, then you should
   set `pseudo3d_dim` to either a single number to be used across all models,
   e.g.,

       pseudo3d_dim:
         - 1

   or, if each model doesn't use the same `pseudo3d_dim`, one number per
   model, e.g.,

       pseudo3d_dim:
         - 1
         - 2

   where each number corresponds to a model path.

   If you run out of memory, try it on a machine with more memory, use
   patch-based prediction, and/or try setting the precision to 16.

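   As an illustrative sketch, the prediction-related fields discussed above
   might look like the following in your copied config. The `predict_csv`
   field name and the exact layout are assumptions, so defer to the generated
   predict_config.yaml:

       predict_csv: predict.csv    # assumed field name; check the generated file
       batch_size: 1
       patch_size:
         - null
         - null                    # full size along the non-pseudo3d axes
       patch_overlap: null         # determined automatically from patch_size
       pseudo3d_dim:
         - 1                       # one entry per model path
         - 2
       pseudo3d_size: 3            # must match training
       precision: 16
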
6) Create a directory for the output predictions and modify the `predict.csv`
   file's `out` column to save the images in that path. You need to create a
   unique filename for each prediction, so brush up on your text
   editor/sed/awk skills, or use Python with pandas: read the CSV with
   `pandas.read_csv(...)`, modify the dataframe, and save the modified CSV
   with `.to_csv(...)`.

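   For example, here is a minimal Python sketch; the CSV path, output
   directory, and filename pattern are placeholders, and only the `out`
   column name comes from the step above:

       import os
       import pandas as pd

       df = pd.read_csv("predict.csv")
       out_dir = "predictions"
       os.makedirs(out_dir, exist_ok=True)
       # give each row a unique output filename based on its row index
       df["out"] = [os.path.join(out_dir, f"pred_{i}.nii.gz")
                    for i in range(len(df))]
       df.to_csv("predict.csv", index=False)
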
7) Use `lesion-predict --config predict_config.yaml` to run prediction. Once
   you've run this once, you can run it again with the `only_aggregate`
   option set to true to test out a different threshold without recomputing
   all of the intermediate predictions. If you want to keep the final
   predictions from your previous run, make sure you move them out first,
   because the new run will overwrite them!

   Alternatively, use the `lesion-predict-image` script for single time-point
   prediction. Note that this interface doesn't accept a config file and that
   you input each image using the same name used for the header in training,
   e.g.,

       lesion-predict-image --t1 /path/to/t1.nii --flair /path/to/flair.nii \
           --out path/to/prediction.nii ...

   where `--out` is the output prediction and `--label` is excluded.

8) You're done! Now you know the basics of how to use tiramisu-brulee CLIs
   for lesion segmentation. There are many options to read about and ways to
   modify the network for your specific problem. Report any bugs or major
   difficulties at the link in the header. Best of luck!