Installing and setting up PyTorch

This is the procedure for setting up PyTorch to train and predict with Inception CNN models. We previously used TensorFlow + Keras but are transitioning to PyTorch.

CRC has instructions for setting up a Python environment on the cluster. Basically, we set up a new virtual environment on the cluster like this:

module purge 
module load python/3.7.0 venv/wrap

mkvirtualenv pytorch

workon pytorch

pip install torch==1.4.0 torchvision==0.5.0 numpy pillow pandas

or on a Mac (e.g. Robin) like this:

# create a new conda environment named pytorch
# (requesting python explicitly so the environment gets its own pip)
conda create -n pytorch python=3.7

# activate the environment
source activate pytorch

# install pytorch packages and other useful packages
pip install torch==1.4.0 torchvision==0.5.0 numpy pillow pandas

Creating a data generator

We write a class specific to our training data which will provide the data to the model training step.

First, we use the class to calculate mean and standard deviation of the entire dataset of images. Then, we write a similar class which subtracts mean and divides by standard deviation to standardize each image. That class will be the one that actually provides training data to the model.
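
The idea can be sketched on a small synthetic tensor before looking at the real classes. The random `images` tensor below is an assumption standing in for the actual dataset; it only illustrates that subtracting the dataset-wide mean and dividing by the dataset-wide standard deviation yields data with mean close to 0 and std close to 1.

```python
import torch

# Synthetic stand-in for a stack of images: 8 images, 3 channels, 4x4 pixels,
# with values in [0, 1] like the output of transforms.ToTensor().
torch.manual_seed(0)
images = torch.rand(8, 3, 4, 4)

# Step 1: statistics over every pixel of every image.
mean = images.mean()
std = images.std()

# Step 2: standardize with those dataset-wide statistics.
standardized = (images - mean) / std

print(f"before: mean={images.mean().item():.4f}, std={images.std().item():.4f}")
print(f"after:  mean={standardized.mean().item():.4f}, std={standardized.std().item():.4f}")
```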

The script that calculates the mean and std over all images is calculate_mean_std.py:

import torch
import torch.nn as nn
import pandas as pd
import numpy as np
from PIL import Image
from torchvision import transforms

class WothTrainingData(torch.utils.data.Dataset):
    def __init__(self, csv_file, height=299, width=299):
        self.data = pd.read_csv(csv_file)
        #image size; also used below when averaging over all pixels
        self.height = height
        self.width = width

        self.to_tensor = transforms.ToTensor()

    #these allow us to index the object with Object[0]...
    def __len__(self):
        return len(self.data)

    #we will ask for a preprocessed image WothTrainingData[0]
    def __getitem__(self, idx):

        #all of our preprocessing happens here
        row = self.data.iloc[idx,:]
        image = Image.open(row['filename'])

        #preprocessing: 3 channels, and resizing the image
        image = image.convert("RGB")
        image = image.resize((self.width, self.height)) #width, height
        image = np.array(image)

        #now get the label for this species
        species_name = 'hylocichla-mustelina'
        label = row[species_name]
        labels = [label]

        return {
            "image_tensor": self.to_tensor(image),
            "label": torch.from_numpy(np.array(labels)) #for multiclass we would return a vector
            }


dataGen = WothTrainingData('woth_training_data_labels_bgfs_png.csv')
#dataloader = torch.utils.data.DataLoader(dataGen, batch_size=32, shuffle=True, num_workers=4)

first = dataGen[0]
print(len(first))
print(first['image_tensor'].shape)
print(first['label'])

mean = torch.tensor([0.0,0.0,0.0])
for d in dataGen:
    for channel in range(1): #since our channels are identical. If they differ, loop over all 3
        mean[channel] += d['image_tensor'][channel].sum()
mean = mean / (len(dataGen) * dataGen.height * dataGen.width)
print(f'means: {mean}')

std =  torch.tensor([0.0,0.0,0.0])
for d in dataGen:
    for channel in range(1):
        dif = d['image_tensor'][channel] - mean[channel]
        std[channel] += (dif**2).sum()
prefactor = 1 /( (len(dataGen) * dataGen.height * dataGen.width) -1 )
std = torch.sqrt(prefactor * std)

print(f'standard deviations: {std}')

print(mean)
print(std)
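
The two loops above read every image twice. One possible alternative, not the method used in the script, is to accumulate the sum and the sum of squares in a single pass and derive both statistics at the end. The sketch below assumes a hypothetical `FakeImages` dataset of random tensors so that it runs without the CSV or image files. It uses float64 accumulators because the sum-of-squares formula is more sensitive to rounding than the two-pass version.

```python
import torch


class FakeImages(torch.utils.data.Dataset):
    """Hypothetical stand-in dataset: random 3 x height x width tensors in [0, 1]."""

    def __init__(self, n=20, height=8, width=8):
        torch.manual_seed(0)
        self.images = torch.rand(n, 3, height, width)

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return {"image_tensor": self.images[idx]}


def channel_mean_std(dataset):
    """Per-channel mean and sample standard deviation in one pass over the data."""
    total = torch.zeros(3, dtype=torch.float64)
    total_sq = torch.zeros(3, dtype=torch.float64)
    n_pixels = 0
    for i in range(len(dataset)):
        image = dataset[i]["image_tensor"].double()
        total += image.sum(dim=(1, 2))            # sum over height and width
        total_sq += (image ** 2).sum(dim=(1, 2))
        n_pixels += image.shape[1] * image.shape[2]
    mean = total / n_pixels
    # Sample variance with the same n - 1 denominator as the script above.
    variance = (total_sq - n_pixels * mean ** 2) / (n_pixels - 1)
    return mean, torch.sqrt(variance)


mean, std = channel_mean_std(FakeImages())
print(f"means: {mean}")
print(f"standard deviations: {std}")
```

Unlike the script above, this version loops over all three channels, so it also covers images whose channels differ.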

We use this script to compute the mean and standard deviation of all images in the training set so that we can rescale them. After that, we never use this class again. Instead, we create a very similar class that subtracts the mean from the image tensor and divides by the standard deviation, so that the training set is "standardized" with mean 0 and std 1. To do this, we initialize the class with attributes self.mean and self.std, then compute (image - mean) / std on the image tensor.

The exact change is like this:

import torch
import torch.nn as nn
import pandas as pd
import numpy as np
from PIL import Image
from torchvision import transforms

class WothTrainingData(torch.utils.data.Dataset):
    def __init__(self, csv_file, height=299,width=299):
        self.data = pd.read_csv(csv_file)
        self.height = height
        self.width = width

        self.mean = 0.8013
        self.std = 0.1576

        self.to_tensor = transforms.ToTensor()

    #these 2 functions allow us to index the object with Object[0]...
    #we will ask for a preprocessed image WothTrainingData[0]
    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):

        #all of our preprocessing happens here
        row = self.data.iloc[idx,:]
        image = Image.open(row['filename'])

        #preprocessing: 3 channels, and reshaping the image
        image = image.convert("RGB")
        image = image.resize((self.width, self.height)) #width, height
        image = np.array(image)

        #now get the label for this species
        species_name = 'hylocichla-mustelina'
        label = row[species_name]
        labels = [label]

        image_tensor =  (self.to_tensor(image)-self.mean) / self.std
        return {
            "image_tensor": image_tensor,
            "label": torch.from_numpy(np.array(labels)) #for multiclass we would return a vector
            }

#example of using the generator:
dataGen = WothTrainingData('woth_training_data_labels_bgfs_png.csv')
#dataloader = torch.utils.data.DataLoader(dataGen, batch_size=32, shuffle=True, num_workers=4)

first = dataGen[100]
print(len(first))
print('mean value in image: ')
print(first['image_tensor'][0].sum() / (dataGen.height * dataGen.width) )
print(first['image_tensor'].shape)
print(first['label'])
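
The commented-out `DataLoader` line above is how the dataset would be fed to a training loop in batches. A minimal sketch follows, assuming a hypothetical `FakeWothData` dataset that returns the same dictionary layout as `WothTrainingData` but with random tensors, so that it runs without the CSV or image files. `num_workers=0` keeps loading in the main process for simplicity.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class FakeWothData(Dataset):
    """Hypothetical stand-in with the same output layout as WothTrainingData."""

    def __init__(self, n=10, height=299, width=299):
        torch.manual_seed(0)
        self.images = torch.rand(n, 3, height, width)
        self.labels = torch.randint(0, 2, (n, 1))  # one binary label per image

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return {"image_tensor": self.images[idx], "label": self.labels[idx]}


dataloader = DataLoader(FakeWothData(), batch_size=4, shuffle=True, num_workers=0)

# The default collate function stacks each dictionary entry along a new
# leading batch dimension.
for batch in dataloader:
    print(batch["image_tensor"].shape, batch["label"].shape)
```

With 10 items and a batch size of 4, the last batch is smaller than the others; passing `drop_last=True` to `DataLoader` would discard it instead.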

Model Training

Barry is working on a training script.

Predictions

Then we will predict.
