Procedure for setting up PyTorch for training and predicting with Inception CNN models. We were previously using TensorFlow + Keras, but are transitioning to PyTorch.
The CRC has instructions for setting up a Python environment on the cluster. In short, we create a new virtual environment on the cluster like this:
module purge
module load python/3.7.0 venv/wrap
mkvirtualenv pytorch
workon pytorch
pip install torch==1.4.0 torchvision==0.5.0 numpy pillow pandas
or on a Mac (e.g. Robin) like this:
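# create a new conda environment named pytorch
# (python=3.7 is specified so the environment gets its own Python and pip; the version is chosen to match the cluster)
conda create -n pytorch python=3.7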
# activate the environment
source activate pytorch
# install pytorch packages and other useful packages
pip install torch==1.4.0 torchvision==0.5.0 numpy pillow pandas
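After installing, it can help to confirm that the packages are importable from the active environment. The snippet below is a minimal sketch of such a check, not part of the original procedure; the helper name `missing_packages` is hypothetical. Note that the pillow package is imported as `PIL`.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names for the packages installed above (pillow is imported as PIL)
required = ["torch", "torchvision", "numpy", "PIL", "pandas"]
print("missing:", missing_packages(required))
```

An empty list means every package was found. If something is listed, the environment is probably not activated or the pip install went into a different Python.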
We write a Dataset class specific to our training data, which provides the data to the model training step.
First, we use the class to calculate the mean and standard deviation over the entire dataset of images. Then, we write a similar class that subtracts the mean and divides by the standard deviation to standardize each image. That second class is the one that actually provides training data to the model.
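As a small illustration of what standardizing means, the sketch below applies (image - mean) / std to a toy array and checks that the result has mean 0 and std 1. The random array and the function name `standardize` are stand-ins for illustration only; no real images are involved.

```python
import numpy as np


def standardize(images, mean, std):
    """Subtract the dataset mean and divide by the dataset standard deviation."""
    return (images - mean) / std


# Toy stand-in for a dataset: 10 single-channel 4x4 "images" with values in [0, 1)
rng = np.random.default_rng(0)
images = rng.random((10, 4, 4))

# Statistics over every pixel of every image (sample std, i.e. dividing by N - 1)
mean = images.mean()
std = images.std(ddof=1)

standardized = standardize(images, mean, std)
print(standardized.mean(), standardized.std(ddof=1))
```

The two printed values should be 0 and 1 up to floating-point error.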
The script below calculates the mean and std over all images:
calculate_mean_std.py
import torch
import torch.nn as nn
import pandas as pd
import numpy as np
from PIL import Image
from torchvision import transforms


class WothTrainingData(torch.utils.data.Dataset):
    def __init__(self, csv_file, height=299, width=299):
        self.data = pd.read_csv(csv_file)
        # height and width are needed below to count pixels for the mean and std
        self.height = height
        self.width = width
        self.to_tensor = transforms.ToTensor()

    # __len__ and __getitem__ allow us to index the object like dataGen[0]
    def __len__(self):
        return len(self.data)

    # dataGen[idx] returns one preprocessed image and its label
    def __getitem__(self, idx):
        # all of our preprocessing happens here
        row = self.data.iloc[idx, :]
        image = Image.open(row['filename'])

        # preprocessing: 3 channels, and resizing the image
        image = image.convert("RGB")
        image = image.resize((self.width, self.height))  # width, height
        image = np.array(image)

        # now get the label for this species
        species_name = 'hylocichla-mustelina'
        label = row[species_name]
        labels = [label]

        return {
            "image_tensor": self.to_tensor(image),
            "label": torch.from_numpy(np.array(labels))  # for multiclass we would return a vector
        }


dataGen = WothTrainingData('woth_training_data_labels_bgfs_png.csv')
#dataloader = torch.utils.data.DataLoader(dataGen, batch_size=32, shuffle=True, num_workers=4)

first = dataGen[0]
print(len(first))
print(first['image_tensor'].shape)
print(first['label'])

n_pixels = len(dataGen) * dataGen.height * dataGen.width

# first pass: mean over all pixels of all images
mean = torch.tensor([0.0, 0.0, 0.0])
for i in range(len(dataGen)):
    d = dataGen[i]
    for channel in range(1):  # only channel 0, since our channels are identical; if they differ, loop over all 3
        mean[channel] += d['image_tensor'][channel].sum()
mean = mean / n_pixels
print(f'means: {mean}')

# second pass: sample standard deviation around that mean
std = torch.tensor([0.0, 0.0, 0.0])
for i in range(len(dataGen)):
    d = dataGen[i]
    for channel in range(1):
        dif = d['image_tensor'][channel] - mean[channel]
        std[channel] += (dif**2).sum()
prefactor = 1 / (n_pixels - 1)
std = torch.sqrt(prefactor * std)
print(f'standard deviations: {std}')
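The script above reads every image twice, once for the mean and once for the standard deviation. As an alternative sketch only (calculate_mean_std.py remains the script actually used), both statistics can be accumulated in a single pass from the running sum and the running sum of squares. The function name `running_mean_std` and the toy batches are hypothetical, and float64 is used because this formula is more sensitive to rounding than the two-pass version.

```python
import numpy as np


def running_mean_std(batches):
    """One-pass mean and sample std over all values in an iterable of arrays."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for batch in batches:
        batch = np.asarray(batch, dtype=np.float64)
        n += batch.size
        total += batch.sum()
        total_sq += (batch ** 2).sum()
    mean = total / n
    # sample variance: (sum(x^2) - n * mean^2) / (n - 1)
    variance = (total_sq - n * mean ** 2) / (n - 1)
    return mean, np.sqrt(variance)


# Toy stand-in for image data: 3 batches of 5 single-channel 4x4 "images"
rng = np.random.default_rng(1)
toy_batches = [rng.random((5, 4, 4)) for _ in range(3)]
toy_mean, toy_std = running_mean_std(toy_batches)
print(toy_mean, toy_std)
```

For a real dataset, each batch would be channel 0 of one or more image tensors converted to a NumPy array.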
We use calculate_mean_std.py only to compute the mean and standard deviation of all images in the training set so that we can rescale them; after that, we never use this class again. Instead, we create a very similar class that subtracts the mean from the image tensor and divides by the standard deviation, so that the training set is "standardized" with mean 0 and std 1. To do this, we initialize the class with attributes self.mean and self.std, then transform the image tensor as (image - mean) / std.
The modified class looks like this:
import torch
import torch.nn as nn
import pandas as pd
import numpy as np
from PIL import Image
from torchvision import transforms


class WothTrainingData(torch.utils.data.Dataset):
    def __init__(self, csv_file, height=299, width=299):
        self.data = pd.read_csv(csv_file)
        self.height = height
        self.width = width
        # values computed with calculate_mean_std.py
        self.mean = 0.8013
        self.std = 0.1576
        self.to_tensor = transforms.ToTensor()

    # these 2 functions allow us to index the object like dataGen[0]
    # dataGen[idx] returns one preprocessed image and its label
    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # all of our preprocessing happens here
        row = self.data.iloc[idx, :]
        image = Image.open(row['filename'])

        # preprocessing: 3 channels, and resizing the image
        image = image.convert("RGB")
        image = image.resize((self.width, self.height))  # width, height
        image = np.array(image)

        # now get the label for this species
        species_name = 'hylocichla-mustelina'
        label = row[species_name]
        labels = [label]

        # standardize: subtract the dataset mean and divide by the dataset std
        image_tensor = (self.to_tensor(image) - self.mean) / self.std

        return {
            "image_tensor": image_tensor,
            "label": torch.from_numpy(np.array(labels))  # for multiclass we would return a vector
        }


# example of using the generator:
dataGen = WothTrainingData('woth_training_data_labels_bgfs_png.csv')
#dataloader = torch.utils.data.DataLoader(dataGen, batch_size=32, shuffle=True, num_workers=4)

first = dataGen[100]
print(len(first))
print('mean value in image: ')
print(first['image_tensor'][0].sum() / (dataGen.height * dataGen.width))
print(first['image_tensor'].shape)
print(first['label'])
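The commented-out DataLoader line above hints at how the class will be consumed during training. The sketch below shows what a batch looks like when a dataset returns dictionaries of this form. `ToyImageData` is a hypothetical in-memory stand-in filled with random pixels so that the example runs without the CSV or image files; the batch size and `num_workers=0` are arbitrary choices for the illustration.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyImageData(Dataset):
    """Hypothetical stand-in for WothTrainingData: random pixels, no files needed."""

    def __init__(self, n_items=10, height=299, width=299):
        torch.manual_seed(0)
        self.images = torch.rand(n_items, 3, height, width)
        self.labels = torch.randint(0, 2, (n_items, 1))

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # same dictionary layout as the real class
        return {"image_tensor": self.images[idx], "label": self.labels[idx]}


dataset = ToyImageData()
dataloader = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=0)

# the default collate function stacks each dictionary entry along a new batch dimension
batch = next(iter(dataloader))
print(batch["image_tensor"].shape)  # torch.Size([4, 3, 299, 299])
print(batch["label"].shape)         # torch.Size([4, 1])
```

With the real class, `dataset` would be replaced by `dataGen`, and a training loop would iterate over `dataloader` to get one such batch per step.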
Barry is working on a training script; after that, we will run prediction.