@0xfaust, created February 22, 2019 11:57
Jupyter Notebook for Experiment D in 'Data synthesis methods for semantic segmentation in agriculture: A Capsicum annuum dataset' paper
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "capsicum_annuum.ipynb",
"version": "0.3.2",
"provenance": [],
"collapsed_sections": [],
"toc_visible": true
},
"kernelspec": {
"name": "python2",
"display_name": "Python 2"
},
"accelerator": "GPU"
},
"cells": [
{
"metadata": {
"id": "vvk4jDgC70VO",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### Jupyter Notebook for Experiments in [Capsicum Annuum Dataset](https://doi.org/10.1016/j.compag.2017.12.001) Paper using Deeplab Models\n"
]
},
{
"metadata": {
"id": "J8HY15Mv9RAb",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"**Import Libraries and Download Tensorflow Models**"
]
},
{
"metadata": {
"id": "jBg6BPGZ1vWd",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"import os\n",
"import math\n",
"import sys\n",
"from IPython.display import HTML"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "Wfzu6S35WZDj",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"import tensorflow as tf\n",
"device_name = tf.test.gpu_device_name()\n",
"if device_name != '/device:GPU:0':\n",
" raise SystemError('GPU device not found')\n",
"print('Found GPU at: {}'.format(device_name))"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "dTVhVb4SRN0J",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"# Install the PyDrive wrapper & import libraries.\n",
"# This only needs to be done once in a notebook.\n",
"!pip install -U -q PyDrive\n",
"from pydrive.auth import GoogleAuth\n",
"from pydrive.drive import GoogleDrive\n",
"from google.colab import auth\n",
"from oauth2client.client import GoogleCredentials\n",
"\n",
"# Authenticate and create the PyDrive client.\n",
"# This only needs to be done once in a notebook.\n",
"auth.authenticate_user()\n",
"gauth = GoogleAuth()\n",
"gauth.credentials = GoogleCredentials.get_application_default()\n",
"drive = GoogleDrive(gauth)"
],
"execution_count": 0,
"outputs": []
},
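{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"*Added note: the `drive` client above is created but not used again in the cells shown below. The next cell is a minimal sketch of how it could be used to persist results (e.g. an exported model) to Google Drive; `upload_to_drive` and the example filename are illustrative, not part of the original workflow.*"
]
},
{
"metadata": {
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"# Sketch: upload a local file to Google Drive via the PyDrive client above.\n",
"# The commented-out call uses a placeholder filename.\n",
"def upload_to_drive(local_path):\n",
"  f = drive.CreateFile({'title': os.path.basename(local_path)})\n",
"  f.SetContentFile(local_path)\n",
"  f.Upload()\n",
"  print('Uploaded {} (id: {})'.format(local_path, f['id']))\n",
"\n",
"# upload_to_drive('results.tar.gz')"
],
"execution_count": 0,
"outputs": []
},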
{
"metadata": {
"id": "5FknKhTZ1vW4",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"!git clone https://github.com/tensorflow/models.git"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "Q0NboZCxIQ97",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### Experiment A \n",
"> Train: synthetic (1–8750). Test: synthetic (8851–8900).\n",
">> *This experiment was run to obtain a performance reference point of the model when having access to a large and detailed annotated dataset for this domain.*\n",
" \n",
"### Experiment B\n",
">Train: synthetic (1–8750). Test: empirical (41–50).\n",
">> *To determine to what extent a synthetically trained model can generalise to a similar set in the same domain without fine-tuning.*\n",
"\n",
"### Experiment C\n",
">Train: empirical (1–30). Test: empirical (41–50).\n",
">> *As a reference to see if the model can learn using a small dataset, using empirical data.*\n",
"\n",
"### Experiment D.\n",
">Train: PASCAL VOC. Fine-tune: empirical (1–30). Test: empirical (41–50).\n",
">> *To compare the effect of bootstrapping with a non-related dataset.*\n",
"\n",
"### Experiment E.\n",
">Train: synthetic (1–8750). Fine-tune: empirical (1–30). Test: empirical (41–50).\n",
">> *To assess the effect of bootstrapping with a related dataset.*\n"
]
},
{
"metadata": {
"id": "qc8O5pszIV5d",
"colab_type": "code",
"cellView": "form",
"colab": {}
},
"cell_type": "code",
"source": [
"#@title Select Experiment\n",
"\n",
"EXP = 'Exp. A' #@param [\"Pascal Voc Test\", \"Exp. A\", \"Exp. B\", \"Exp. C\", \"Exp. D\", \"Exp. E\"]\n",
"\n",
"print(EXP)"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "L_R3xxs3RbnA",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"print (EXP)"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "Y6OIRkFPMMhN",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"%%bash\n",
"pwd\n",
"cd models/research/deeplab\n",
"pwd\n",
"sh ./local_test.sh"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "PH8kC47d9dO7",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"**Setup Directories and Download Dataset Data and Uncompress Files** \n",
"Dataset: https://data.4tu.nl/repository/uuid:884958f5-b868-46e1-b3d8-a0b5d91b02c0 \n",
"Execute in /tensorflow/models/research/deeplab/datasets"
]
},
{
"metadata": {
"id": "2sX-yrNx1vXI",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"# empirical data\n",
"BASE_URL = 'https://data.4tu.nl/bulk/uuid_884958f5-b868-46e1-b3d8-a0b5d91b02c0'\n",
"FILENAME_DATA = 'empirical_image_color.zip'\n",
"FILENAME_LABELS = 'empirical_label_class_grayscale.zip'"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "KRN7Jozf1vXZ",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"# synthetic data\n",
"BASE_URL = 'https://data.4tu.nl/bulk/uuid_884958f5-b868-46e1-b3d8-a0b5d91b02c0'\n",
"FILENAME_DATA = 'synthetic_image_color.zip'\n",
"FILENAME_LABELS = 'synthetic_label_class_grayscale.zip'"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "CWwAGNA91vXx",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"# setup directories\n",
"RESEARCH_DIR = os.getcwd()+'/models/research'\n",
"DEEPLAB_DIR = RESEARCH_DIR+'/deeplab'\n",
"DATASET_DIR = DEEPLAB_DIR+'/datasets'\n",
"CAPSICUM_ANNUUM_DIR = DATASET_DIR+'/capsicum_annuum'\n",
"LIST_DIR = CAPSICUM_ANNUUM_DIR+'/image_sets'\n",
"ANNOTATED_DIR = CAPSICUM_ANNUUM_DIR+'/segmentation_class'\n",
"INIT_DIR = CAPSICUM_ANNUUM_DIR+'/init_models'\n",
"EXP_DIR = CAPSICUM_ANNUUM_DIR+'/exp'\n",
"\n",
"if(EXP == 'Exp. A'):\n",
" IMAGE_DIR = CAPSICUM_ANNUUM_DIR+'/synthetic_image_color'\n",
" GROUND_TRUTH_DIR = CAPSICUM_ANNUUM_DIR+'/synthetic_label_class_grayscale/synthetic_label_class_all_grayscale'\n",
" EXP_ID = EXP_DIR+'/a'\n",
"elif(EXP == 'Exp. B'):\n",
" IMAGE_DIR = CAPSICUM_ANNUUM_DIR+'/x'\n",
" GROUND_TRUTH_DIR = CAPSICUM_ANNUUM_DIR+'/x'\n",
" EXP_ID = EXP_DIR+'/b'\n",
"elif(EXP == 'Exp. C'):\n",
" IMAGE_DIR = CAPSICUM_ANNUUM_DIR+'/x'\n",
" GROUND_TRUTH_DIR = CAPSICUM_ANNUUM_DIR+'/x'\n",
" EXP_ID = EXP_DIR+'/c'\n",
"elif(EXP == 'Exp. D'):\n",
" IMAGE_DIR = CAPSICUM_ANNUUM_DIR+'/empirical_image_color'\n",
" GROUND_TRUTH_DIR = CAPSICUM_ANNUUM_DIR+'/empirical_label_class_grayscale/empirical_label_class_all_grayscale'\n",
" EXP_ID = EXP_DIR+'/d'\n",
"else:\n",
" IMAGE_DIR = CAPSICUM_ANNUUM_DIR+'/x'\n",
" GROUND_TRUTH_DIR = CAPSICUM_ANNUUM_DIR+'/x'\n",
" EXP_ID = EXP_DIR+'/e'\n",
"\n",
"TRAIN_LOGDIR = EXP_ID+'/train'\n",
"EVAL_LOGDIR = EXP_ID+'/eval'\n",
"VIS_LOGDIR = EXP_ID+'/vis'\n",
"EXPORT_DIR = EXP_ID+'/export'\n",
"TF_RECORD_DIR = CAPSICUM_ANNUUM_DIR+'/tfrecord'\n",
" \n",
"%mkdir -p \"$CAPSICUM_ANNUUM_DIR\"\n",
"%mkdir -p \"$LIST_DIR\"\n",
"%mkdir -p \"$ANNOTATED_DIR\"\n",
"%mkdir -p \"$INIT_DIR\"\n",
"%mkdir -p \"$TRAIN_LOGDIR\"\n",
"%mkdir -p \"$EVAL_LOGDIR\"\n",
"%mkdir -p \"$VIS_LOGDIR\"\n",
"%mkdir -p \"$EXPORT_DIR\"\n",
"%mkdir -p \"$TF_RECORD_DIR\""
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "oBdtIkgn1vX8",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"%%bash -s \"$CAPSICUM_ANNUUM_DIR\" \"$BASE_URL\" \"$FILENAME_DATA\" \"$FILENAME_LABELS\" \n",
"CAPSICUM_ANNUUM_DIR=$1\n",
"cd \"${CAPSICUM_ANNUUM_DIR}\"\n",
"\n",
"# file urls\n",
"BASE_URL=$2\n",
"FILENAME_DATA=$3\n",
"FILENAME_LABELS=$4\n",
"\n",
"# Helper function to download dataset.\n",
"download(){\n",
" local BASE_URL=${1}\n",
" local FILENAME=${2}\n",
"\n",
" if [ ! -f \"${FILENAME}\" ]; then\n",
" echo \"Downloading ${FILENAME} to ${CAPSICUM_ANNUUM_DIR}\"\n",
" wget -q -nd -c \"${BASE_URL}/${FILENAME}\"\n",
" fi\n",
"}\n",
"\n",
"# Download the images.\n",
"download \"${BASE_URL}\" \"${FILENAME_DATA}\"\n",
"download \"${BASE_URL}\" \"${FILENAME_LABELS}\""
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "5bqeRvnyDtIY",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"%%bash -s \"$CAPSICUM_ANNUUM_DIR\" \"$BASE_URL\" \"$FILENAME_DATA\" \"$FILENAME_LABELS\" \n",
"CAPSICUM_ANNUUM_DIR=$1\n",
"cd \"${CAPSICUM_ANNUUM_DIR}\"\n",
"\n",
"# file urls\n",
"BASE_URL=$2\n",
"FILENAME_DATA=$3\n",
"FILENAME_LABELS=$4\n",
"\n",
"# Helper function to unpack dataset.\n",
"uncompress() {\n",
" local BASE_URL=${1}\n",
" local FILENAME=${2}\n",
"\n",
" echo \"Uncompressing ${FILENAME}\"\n",
" unzip \"${FILENAME}\"\n",
"}\n",
"\n",
"# Uncompress the images.\n",
"uncompress \"${BASE_URL}\" \"${FILENAME_DATA}\"\n",
"uncompress \"${BASE_URL}\" \"${FILENAME_LABELS}\""
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "H89SXXi310zl",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"%%bash -s \"$CAPSICUM_ANNUUM_DIR\" \"$GROUND_TRUTH_DIR\" \"$ANNOTATED_DIR\" \n",
"CAPSICUM_ANNUUM_DIR=$1\n",
"GROUND_TRUTH_DIR=$2\n",
"ANNOTATED_DIR=$3\n",
"\n",
"cd \"${CAPSICUM_ANNUUM_DIR}\"\n",
"\n",
"echo \"Removing the color map in ground truth annotations...\"\n",
"echo \"Ground truth directory: $GROUND_TRUTH_DIR\"\n",
"\n",
"python ../remove_gt_colormap.py \\\n",
" --original_gt_folder=\"$GROUND_TRUTH_DIR\" \\\n",
"--output_dir=\"$ANNOTATED_DIR/raw\""
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "xekr_spl-CKu",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"**Define Training and Evaluation Images**"
]
},
{
"metadata": {
"id": "g4x87JyT1vYN",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"%%bash -s \"$IMAGE_DIR\" \"$LIST_DIR\"\n",
"\n",
"IMAGE_DIR=$1\n",
"LIST_DIR=$2\n",
"\n",
"cd \"${IMAGE_DIR}\"\n",
"\n",
"ls -v | head -30 | cut -d '.' -f 1 > ${LIST_DIR}/train.txt\n",
"ls -v | tail -9 | cut -d '.' -f 1 > ${LIST_DIR}/val.txt\n",
"cat ${LIST_DIR}/train.txt ${LIST_DIR}/val.txt > ${LIST_DIR}/trainval.txt"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "M_Z5uT4tV7sb",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"%%bash -s \"$IMAGE_DIR\" \"$LIST_DIR\"\n",
"\n",
"IMAGE_DIR=$1\n",
"LIST_DIR=$2\n",
"\n",
"cd \"${IMAGE_DIR}\"\n",
"\n",
"ls -v | head -8750 | cut -d '.' -f 1 > ${LIST_DIR}/train.txt\n",
"ls -v | tail -9 | cut -d '.' -f 1 > ${LIST_DIR}/val.txt\n",
"cat ${LIST_DIR}/train.txt ${LIST_DIR}/val.txt > ${LIST_DIR}/trainval.txt\n",
"\n",
"> Train: synthetic (1–8750). Test: synthetic (8851–8900).\n",
"Fine-tune: empirical (1–30). Test: empirical (41–50)\n"
],
"execution_count": 0,
"outputs": []
},
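{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"*Added sanity check: print the number of entries in each split list written above, to confirm the `head`/`tail` counts match the experiment description.*"
]
},
{
"metadata": {
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"# Sanity check (added): report the size of each image-set list.\n",
"for split in ['train', 'val', 'trainval']:\n",
"  list_path = os.path.join(LIST_DIR, split + '.txt')\n",
"  with open(list_path) as f:\n",
"    names = [line.strip() for line in f if line.strip()]\n",
"  print('{}: {} images (first entries: {})'.format(split, len(names), names[:2]))"
],
"execution_count": 0,
"outputs": []
},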
{
"metadata": {
"id": "d82aFYsy-Nj6",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"**Set Environment Path for Colab and Linux**"
]
},
{
"metadata": {
"id": "Q10uAcnJ4It7",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"os.environ['PYTHONPATH'] += \":/content/models/research\"\n",
"os.environ['PYTHONPATH'] += \":/content/models/research/slim\""
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "BKirt22E1vYi",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"%%bash -s \"$RESEARCH_DIR\" \"$DATASET_DIR\"\n",
"\n",
"RESEARCH_DIR=$1\n",
"DATASET_DIR=$2\n",
"\n",
"cd \"${RESEARCH_DIR}\"\n",
"pwd\n",
"echo \"${PYTHONPATH}\"\n",
"export PYTHONPATH=$PYTHONPATH:`pwd`/slim\n",
"python deeplab/model_test.py\n",
"echo \"${DATASET_DIR}\"\n",
"cd \"${DATASET_DIR}\"\n",
"pwd"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "fmhJJpMA-aZB",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"**Clean and Rename Annotated Data**"
]
},
{
"metadata": {
"scrolled": true,
"id": "tX1Kw5lM1vZE",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"%%bash -s \"$ANNOTATED_DIR\"\n",
"\n",
"ANNOTATED_DIR=$1\n",
"\n",
"cd \"${ANNOTATED_DIR}/raw\"\n",
"echo \"${ANNOTATED_DIR}\"\n",
"cp *.png ../\n",
"cd \"${ANNOTATED_DIR}\"\n",
"rename 's/label_class_all_grayscale/image_color/' *.png\n"
],
"execution_count": 0,
"outputs": []
},
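{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"*Added sanity check: after the rename, every image listed in `trainval.txt` should have a same-named annotation in `ANNOTATED_DIR`, since the TFRecord conversion below joins the list names with `.png` in both folders.*"
]
},
{
"metadata": {
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"# Sanity check (added): verify image/annotation filename pairing.\n",
"with open(os.path.join(LIST_DIR, 'trainval.txt')) as f:\n",
"  names = [line.strip() for line in f if line.strip()]\n",
"missing = [n for n in names\n",
"           if not os.path.isfile(os.path.join(ANNOTATED_DIR, n + '.png'))]\n",
"print('{} of {} annotations missing'.format(len(missing), len(names)))\n",
"assert not missing, missing[:5]"
],
"execution_count": 0,
"outputs": []
},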
{
"metadata": {
"id": "GTXPro0I-g73",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"**Generate TFRecords for Tensorflow**"
]
},
{
"metadata": {
"id": "lzw_yRQc1vY6",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"%cd \"{DATASET_DIR}\"\n",
"!pwd"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "nxHolq7c1vZT",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"import build_data\n",
"\n",
"FLAGS = tf.app.flags.FLAGS\n",
"\n",
"####Delete all flags before declare#####\n",
"\n",
"def del_all_flags(FLAGS):\n",
" flags_dict = FLAGS._flags() \n",
" keys_list = [keys for keys in flags_dict] \n",
" for keys in keys_list:\n",
" FLAGS.__delattr__(keys)\n",
"\n",
"del_all_flags(tf.flags.FLAGS)\n",
"\n",
"tf.app.flags.DEFINE_string('image_folder',\n",
" IMAGE_DIR,\n",
" 'Folder containing images.')\n",
"\n",
"tf.app.flags.DEFINE_string(\n",
" 'semantic_segmentation_folder',\n",
" ANNOTATED_DIR,\n",
" 'Folder containing semantic segmentation annotations.')\n",
"\n",
"tf.app.flags.DEFINE_string(\n",
" 'list_folder',\n",
" LIST_DIR,\n",
" 'Folder containing lists for training and validation')\n",
"\n",
"tf.app.flags.DEFINE_string(\n",
" 'image_format',\n",
" \"png\",\n",
" 'Format of images.')\n",
"\n",
"tf.app.flags.DEFINE_string(\n",
" 'label_format',\n",
" \"png\",\n",
" 'Format of labels.')\n",
"\n",
"tf.app.flags.DEFINE_string(\n",
" 'output_dir',\n",
" TF_RECORD_DIR,\n",
" 'Path to save converted SSTable of TensorFlow examples.')\n",
"\n",
"\n",
"_NUM_SHARDS = 4\n",
"\n",
"\n",
"def _convert_dataset(dataset_split):\n",
" \"\"\"Converts the specified dataset split to TFRecord format.\n",
"\n",
" Args:\n",
" dataset_split: The dataset split (e.g., train, test).\n",
"\n",
" Raises:\n",
" RuntimeError: If loaded image and label have different shape.\n",
" \"\"\"\n",
" dataset = os.path.basename(dataset_split)[:-4]\n",
" sys.stdout.write('Processing ' + dataset)\n",
" filenames = [x.strip('\\n') for x in open(dataset_split, 'r')]\n",
" num_images = len(filenames)\n",
" num_per_shard = int(math.ceil(num_images / float(_NUM_SHARDS)))\n",
"\n",
" image_reader = build_data.ImageReader('png', channels=3)\n",
" label_reader = build_data.ImageReader('png', channels=1)\n",
"\n",
" for shard_id in range(_NUM_SHARDS):\n",
" output_filename = os.path.join(\n",
" FLAGS.output_dir,\n",
" '%s-%05d-of-%05d.tfrecord' % (dataset, shard_id, _NUM_SHARDS))\n",
" with tf.python_io.TFRecordWriter(output_filename) as tfrecord_writer:\n",
" start_idx = shard_id * num_per_shard\n",
" end_idx = min((shard_id + 1) * num_per_shard, num_images)\n",
" for i in range(start_idx, end_idx):\n",
" sys.stdout.write('\\r>> Converting image %d/%d shard %d' % (\n",
" i + 1, len(filenames), shard_id))\n",
" sys.stdout.flush()\n",
" # Read the image.\n",
" image_filename = os.path.join(\n",
" FLAGS.image_folder, filenames[i] + '.png') #+ FLAGS.image_format)\n",
" image_data = tf.gfile.FastGFile(image_filename, 'rb').read()\n",
" height, width = image_reader.read_image_dims(image_data)\n",
" # Read the semantic segmentation annotation.\n",
" seg_filename = os.path.join(\n",
" FLAGS.semantic_segmentation_folder,\n",
" filenames[i] + '.png') #+ FLAGS.label_format)\n",
" seg_data = tf.gfile.FastGFile(seg_filename, 'rb').read()\n",
" seg_height, seg_width = label_reader.read_image_dims(seg_data)\n",
" if height != seg_height or width != seg_width:\n",
" raise RuntimeError('Shape mismatched between image and label.')\n",
" # Convert to tf example.\n",
" example = build_data.image_seg_to_tfexample(\n",
" image_data, filenames[i], height, width, seg_data)\n",
" tfrecord_writer.write(example.SerializeToString())\n",
" sys.stdout.write('\\n')\n",
" sys.stdout.flush()\n",
"\n",
"\n",
"def main(unused_argv):\n",
" dataset_splits = tf.gfile.Glob(os.path.join(FLAGS.list_folder, '*.txt'))\n",
" for dataset_split in dataset_splits:\n",
" _convert_dataset(dataset_split)\n",
"\n",
"\n",
"if __name__ == '__main__':\n",
" tf.app.run()\n"
],
"execution_count": 0,
"outputs": []
},
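{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"*Added sanity check: count the examples written to each TFRecord shard, using the same TF 1.x `tf.python_io` API as the writer above.*"
]
},
{
"metadata": {
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"# Sanity check (added): count records per generated shard.\n",
"import glob\n",
"\n",
"for shard in sorted(glob.glob(os.path.join(TF_RECORD_DIR, '*.tfrecord'))):\n",
"  n = sum(1 for _ in tf.python_io.tf_record_iterator(shard))\n",
"  print('{}: {} examples'.format(os.path.basename(shard), n))"
],
"execution_count": 0,
"outputs": []
},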
{
"metadata": {
"id": "Gdv4geJ3-m4b",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"**Download PASCAL VOC Checkpoint for Exp. D**"
]
},
{
"metadata": {
"id": "rfzDtOg41vZo",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"%%bash -s \"$INIT_DIR\" \n",
"\n",
"INIT_DIR=$1\n",
"\n",
"TF_INIT_ROOT=\"http://download.tensorflow.org/models\"\n",
"TF_INIT_CKPT=\"deeplabv3_pascal_train_aug_2018_01_04.tar.gz\"\n",
"cd \"${INIT_DIR}\"\n",
"pwd\n",
"wget -nd -c \"${TF_INIT_ROOT}/${TF_INIT_CKPT}\"\n",
"tar -xf \"${TF_INIT_CKPT}\""
],
"execution_count": 0,
"outputs": []
},
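{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"*Added sanity check: list a few variables from the downloaded checkpoint. The extracted directory name `deeplabv3_pascal_train_aug` is assumed from the tarball name; adjust the path if the archive layout differs.*"
]
},
{
"metadata": {
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"# Sanity check (added): inspect the PASCAL VOC checkpoint with the TF 1.x reader.\n",
"CKPT_PATH = INIT_DIR + '/deeplabv3_pascal_train_aug/model.ckpt'\n",
"reader = tf.train.NewCheckpointReader(CKPT_PATH)\n",
"var_shapes = reader.get_variable_to_shape_map()\n",
"print('{} variables, e.g.:'.format(len(var_shapes)))\n",
"for name in sorted(var_shapes)[:5]:\n",
"  print('  {} {}'.format(name, var_shapes[name]))"
],
"execution_count": 0,
"outputs": []
},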
{
"metadata": {
"id": "OLx7rbvu-0Gh",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"**Modify Segmentation Dataset Config**"
]
},
{
"metadata": {
"id": "35KklVKf1vZ4",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"!pwd\n",
"%cd /content/models/research/"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "1jrAUwB71vap",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"#@title segmentation_dataset.py {display-mode: \"form\"}\n",
"\n",
"# This code will be hidden when the notebook is loaded.\n",
"SEG_DATA = \"\"\"import collections\n",
"import os.path\n",
"import tensorflow as tf\n",
"\n",
"slim = tf.contrib.slim\n",
"\n",
"dataset = slim.dataset\n",
"\n",
"tfexample_decoder = slim.tfexample_decoder\n",
"\n",
"\n",
"_ITEMS_TO_DESCRIPTIONS = {\n",
" 'image': 'A color image of varying height and width.',\n",
" 'labels_class': ('A semantic segmentation label whose size matches image.'\n",
" 'Its values range from 0 (background) to num_classes.'),\n",
"}\n",
"\n",
"# Named tuple to describe the dataset properties.\n",
"DatasetDescriptor = collections.namedtuple(\n",
" 'DatasetDescriptor',\n",
" ['splits_to_sizes', # Splits of the dataset into training, val, and test.\n",
" 'num_classes', # Number of semantic classes, including the background\n",
" # class (if exists). For example, there are 20\n",
" # foreground classes + 1 background class in the PASCAL\n",
" # VOC 2012 dataset. Thus, we set num_classes=21.\n",
" 'ignore_label', # Ignore label value.\n",
" ]\n",
")\n",
"\n",
"_CITYSCAPES_INFORMATION = DatasetDescriptor(\n",
" splits_to_sizes={\n",
" 'train': 2975,\n",
" 'val': 500,\n",
" },\n",
" num_classes=19,\n",
" ignore_label=0,\n",
")\n",
"\n",
"_PASCAL_VOC_SEG_INFORMATION = DatasetDescriptor(\n",
" splits_to_sizes={\n",
" 'train': 1464,\n",
" 'train_aug': 10582,\n",
" 'trainval': 2913,\n",
" 'val': 1449,\n",
" },\n",
" num_classes=21,\n",
" ignore_label=255,\n",
")\n",
"\n",
"_CAPSICUM_ANNUUM_INFORMATION = DatasetDescriptor(\n",
" splits_to_sizes={\n",
" 'train': 30,\n",
" 'trainval': 39,\n",
" 'val': 9,\n",
" },\n",
" num_classes=9,\n",
" ignore_label=255,\n",
")\n",
"# These number (i.e., 'train'/'test') seems to have to be hard coded\n",
"# You are required to figure it out for your training/testing example.\n",
"_ADE20K_INFORMATION = DatasetDescriptor(\n",
" splits_to_sizes={\n",
" 'train': 20210, # num of samples in images/training\n",
" 'val': 2000, # num of samples in images/validation\n",
" },\n",
" num_classes=151,\n",
" ignore_label=0,\n",
")\n",
"\n",
"\n",
"_DATASETS_INFORMATION = {\n",
" 'cityscapes': _CITYSCAPES_INFORMATION,\n",
" 'pascal_voc_seg': _PASCAL_VOC_SEG_INFORMATION,\n",
" 'ade20k': _ADE20K_INFORMATION,\n",
" 'capsicum_annuum': _CAPSICUM_ANNUUM_INFORMATION,\n",
"}\n",
"\n",
"# Default file pattern of TFRecord of TensorFlow Example.\n",
"_FILE_PATTERN = '%s-*'\n",
"\n",
"\n",
"def get_cityscapes_dataset_name():\n",
" return 'cityscapes'\n",
"\n",
"\n",
"def get_dataset(dataset_name, split_name, dataset_dir):\n",
" \n",
" if dataset_name not in _DATASETS_INFORMATION:\n",
" raise ValueError('The specified dataset is not supported yet.')\n",
"\n",
" splits_to_sizes = _DATASETS_INFORMATION[dataset_name].splits_to_sizes\n",
"\n",
" if split_name not in splits_to_sizes:\n",
" raise ValueError('data split name %s not recognized' % split_name)\n",
"\n",
" # Prepare the variables for different datasets.\n",
" num_classes = _DATASETS_INFORMATION[dataset_name].num_classes\n",
" ignore_label = _DATASETS_INFORMATION[dataset_name].ignore_label\n",
"\n",
" file_pattern = _FILE_PATTERN\n",
" file_pattern = os.path.join(dataset_dir, file_pattern % split_name)\n",
"\n",
" # Specify how the TF-Examples are decoded.\n",
" keys_to_features = {\n",
" 'image/encoded': tf.FixedLenFeature(\n",
" (), tf.string, default_value=''),\n",
" 'image/filename': tf.FixedLenFeature(\n",
" (), tf.string, default_value=''),\n",
" 'image/format': tf.FixedLenFeature(\n",
" (), tf.string, default_value='jpeg'),\n",
" 'image/height': tf.FixedLenFeature(\n",
" (), tf.int64, default_value=0),\n",
" 'image/width': tf.FixedLenFeature(\n",
" (), tf.int64, default_value=0),\n",
" 'image/segmentation/class/encoded': tf.FixedLenFeature(\n",
" (), tf.string, default_value=''),\n",
" 'image/segmentation/class/format': tf.FixedLenFeature(\n",
" (), tf.string, default_value='png'),\n",
" }\n",
" items_to_handlers = {\n",
" 'image': tfexample_decoder.Image(\n",
" image_key='image/encoded',\n",
" format_key='image/format',\n",
" channels=3),\n",
" 'image_name': tfexample_decoder.Tensor('image/filename'),\n",
" 'height': tfexample_decoder.Tensor('image/height'),\n",
" 'width': tfexample_decoder.Tensor('image/width'),\n",
" 'labels_class': tfexample_decoder.Image(\n",
" image_key='image/segmentation/class/encoded',\n",
" format_key='image/segmentation/class/format',\n",
" channels=1),\n",
" }\n",
"\n",
" decoder = tfexample_decoder.TFExampleDecoder(\n",
" keys_to_features, items_to_handlers)\n",
"\n",
" return dataset.Dataset(\n",
" data_sources=file_pattern,\n",
" reader=tf.TFRecordReader,\n",
" decoder=decoder,\n",
" num_samples=splits_to_sizes[split_name],\n",
" items_to_descriptions=_ITEMS_TO_DESCRIPTIONS,\n",
" ignore_label=ignore_label,\n",
" num_classes=num_classes,\n",
" name=dataset_name,\n",
" multi_label=True)\n",
"\"\"\"\n",
"with open(DATASET_DIR+'/segmentation_dataset.py', \"w\") as file:\n",
" file.write(SEG_DATA)"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "ZiPTvA2fbc4T",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"#@title train_utils.py {display-mode: \"form\"}\n",
"\n",
"# This code will be hidden when the notebook is loaded.\n",
"\n",
"TRAIN_UTILS = \"\"\"\n",
"import six\n",
"\n",
"import tensorflow as tf\n",
"from deeplab.core import preprocess_utils\n",
"\n",
"slim = tf.contrib.slim\n",
"\n",
"\n",
"def add_softmax_cross_entropy_loss_for_each_scale(scales_to_logits,\n",
" labels,\n",
" num_classes,\n",
" ignore_label,\n",
" loss_weight=1.0,\n",
" upsample_logits=True,\n",
" scope=None):\n",
" if labels is None:\n",
" raise ValueError('No label for softmax cross entropy loss.')\n",
"\n",
" for scale, logits in six.iteritems(scales_to_logits):\n",
" loss_scope = None\n",
" if scope:\n",
" loss_scope = '%s_%s' % (scope, scale)\n",
"\n",
" if upsample_logits:\n",
" # Label is not downsampled, and instead we upsample logits.\n",
" logits = tf.image.resize_bilinear(\n",
" logits,\n",
" preprocess_utils.resolve_shape(labels, 4)[1:3],\n",
" align_corners=True)\n",
" scaled_labels = labels\n",
" else:\n",
" # Label is downsampled to the same size as logits.\n",
" scaled_labels = tf.image.resize_nearest_neighbor(\n",
" labels,\n",
" preprocess_utils.resolve_shape(logits, 4)[1:3],\n",
" align_corners=True)\n",
"\n",
" scaled_labels = tf.reshape(scaled_labels, shape=[-1])\n",
" not_ignore_mask = tf.to_float(tf.not_equal(scaled_labels,\n",
" ignore_label)) * loss_weight\n",
" one_hot_labels = slim.one_hot_encoding(\n",
" scaled_labels, num_classes, on_value=1.0, off_value=0.0)\n",
" tf.losses.softmax_cross_entropy(\n",
" one_hot_labels,\n",
" tf.reshape(logits, shape=[-1, num_classes]),\n",
" weights=not_ignore_mask,\n",
" scope=loss_scope)\n",
"\n",
"\n",
"def get_model_init_fn(train_logdir,\n",
" tf_initial_checkpoint,\n",
" initialize_last_layer,\n",
" last_layers,\n",
" ignore_missing_vars=False):\n",
" \n",
" if tf_initial_checkpoint is None:\n",
" tf.logging.info('Not initializing the model from a checkpoint.')\n",
" return None\n",
"\n",
" if tf.train.latest_checkpoint(train_logdir):\n",
" tf.logging.info('Ignoring initialization; other checkpoint exists')\n",
" return None\n",
"\n",
" tf.logging.info('Initializing model from path: %s', tf_initial_checkpoint)\n",
"\n",
" # Variables that will not be restored.\n",
" exclude_list = ['global_step','logits']\n",
" if not initialize_last_layer:\n",
" exclude_list.extend(last_layers)\n",
"\n",
" variables_to_restore = slim.get_variables_to_restore(exclude=exclude_list)\n",
"\n",
" if variables_to_restore:\n",
" return slim.assign_from_checkpoint_fn(\n",
" tf_initial_checkpoint,\n",
" variables_to_restore,\n",
" ignore_missing_vars=ignore_missing_vars)\n",
" return None\n",
"\n",
"\n",
"def get_model_gradient_multipliers(last_layers, last_layer_gradient_multiplier):\n",
" gradient_multipliers = {}\n",
"\n",
" for var in slim.get_model_variables():\n",
" # Double the learning rate for biases.\n",
" if 'biases' in var.op.name:\n",
" gradient_multipliers[var.op.name] = 2.\n",
"\n",
" # Use larger learning rate for last layer variables.\n",
" for layer in last_layers:\n",
" if layer in var.op.name and 'biases' in var.op.name:\n",
" gradient_multipliers[var.op.name] = 2 * last_layer_gradient_multiplier\n",
" break\n",
" elif layer in var.op.name:\n",
" gradient_multipliers[var.op.name] = last_layer_gradient_multiplier\n",
" break\n",
"\n",
" return gradient_multipliers\n",
"\n",
"\n",
"def get_model_learning_rate(\n",
" learning_policy, base_learning_rate, learning_rate_decay_step,\n",
" learning_rate_decay_factor, training_number_of_steps, learning_power,\n",
" slow_start_step, slow_start_learning_rate):\n",
" \n",
" global_step = tf.train.get_or_create_global_step()\n",
" if learning_policy == 'step':\n",
" learning_rate = tf.train.exponential_decay(\n",
" base_learning_rate,\n",
" global_step,\n",
" learning_rate_decay_step,\n",
" learning_rate_decay_factor,\n",
" staircase=True)\n",
" elif learning_policy == 'poly':\n",
" learning_rate = tf.train.polynomial_decay(\n",
" base_learning_rate,\n",
" global_step,\n",
" training_number_of_steps,\n",
" end_learning_rate=0,\n",
" power=learning_power)\n",
" else:\n",
" raise ValueError('Unknown learning policy.')\n",
"\n",
" # Employ small learning rate at the first few steps for warm start.\n",
" return tf.where(global_step < slow_start_step, slow_start_learning_rate,\n",
" learning_rate)\n",
"\n",
"\"\"\"\n",
"with open(DEEPLAB_DIR+'/utils/train_utils.py', \"w\") as file:\n",
" file.write(TRAIN_UTILS)"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "EvS21exIegWc",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"#@title train.py {display-mode: \"form\"}\n",
"\n",
"# This code will be hidden when the notebook is loaded.\n",
"\n",
"TRAIN = \"\"\"\n",
"\n",
"import six\n",
"import tensorflow as tf\n",
"from deeplab import common\n",
"from deeplab import model\n",
"from deeplab.datasets import segmentation_dataset\n",
"from deeplab.utils import input_generator\n",
"from deeplab.utils import train_utils\n",
"from deployment import model_deploy\n",
"\n",
"slim = tf.contrib.slim\n",
"\n",
"prefetch_queue = slim.prefetch_queue\n",
"\n",
"flags = tf.app.flags\n",
"\n",
"FLAGS = flags.FLAGS\n",
"\n",
"# Settings for multi-GPUs/multi-replicas training.\n",
"\n",
"flags.DEFINE_integer('num_clones', 1, 'Number of clones to deploy.')\n",
"\n",
"flags.DEFINE_boolean('clone_on_cpu', False, 'Use CPUs to deploy clones.')\n",
"\n",
"flags.DEFINE_integer('num_replicas', 1, 'Number of worker replicas.')\n",
"\n",
"flags.DEFINE_integer('startup_delay_steps', 15,\n",
" 'Number of training steps between replicas startup.')\n",
"\n",
"flags.DEFINE_integer('num_ps_tasks', 0,\n",
" 'The number of parameter servers. If the value is 0, then '\n",
" 'the parameters are handled locally by the worker.')\n",
"\n",
"flags.DEFINE_string('master', '', 'BNS name of the tensorflow server')\n",
"\n",
"flags.DEFINE_integer('task', 0, 'The task ID.')\n",
"\n",
"# Settings for logging.\n",
"\n",
"flags.DEFINE_string('train_logdir', None,\n",
" 'Where the checkpoint and logs are stored.')\n",
"\n",
"flags.DEFINE_integer('log_steps', 10,\n",
" 'Display logging information at every log_steps.')\n",
"\n",
"flags.DEFINE_integer('save_interval_secs', 1200,\n",
" 'How often, in seconds, we save the model to disk.')\n",
"\n",
"flags.DEFINE_integer('save_summaries_secs', 600,\n",
" 'How often, in seconds, we compute the summaries.')\n",
"\n",
"flags.DEFINE_boolean('save_summaries_images', False,\n",
" 'Save sample inputs, labels, and semantic predictions as '\n",
" 'images to summary.')\n",
"\n",
"# Settings for training strategy.\n",
"\n",
"flags.DEFINE_enum('learning_policy', 'poly', ['poly', 'step'],\n",
" 'Learning rate policy for training.')\n",
"\n",
"# Use 0.007 when training on PASCAL augmented training set, train_aug. When\n",
"# fine-tuning on PASCAL trainval set, use learning rate=0.0001.\n",
"flags.DEFINE_float('base_learning_rate', .00005,\n",
" 'The base learning rate for model training.')\n",
"\n",
"flags.DEFINE_float('learning_rate_decay_factor', 0.1,\n",
" 'The rate to decay the base learning rate.')\n",
"\n",
"flags.DEFINE_integer('learning_rate_decay_step', 2000,\n",
" 'Decay the base learning rate at a fixed step.')\n",
"\n",
"flags.DEFINE_float('learning_power', 0.9,\n",
" 'The power value used in the poly learning policy.')\n",
"\n",
"flags.DEFINE_integer('training_number_of_steps', 30000,\n",
" 'The number of steps used for training')\n",
"\n",
"flags.DEFINE_float('momentum', 0.9, 'The momentum value to use')\n",
"\n",
"# When fine_tune_batch_norm=True, use at least batch size larger than 12\n",
"# (batch size more than 16 is better). Otherwise, one could use smaller batch\n",
"# size and set fine_tune_batch_norm=False.\n",
"flags.DEFINE_integer('train_batch_size', 10,\n",
" 'The number of images in each batch during training.')\n",
"\n",
"# For weight_decay, use 0.00004 for MobileNet-V2 or Xcpetion model variants.\n",
"# Use 0.0001 for ResNet model variants.\n",
"flags.DEFINE_float('weight_decay', 0.00004,\n",
" 'The value of the weight decay for training.')\n",
"\n",
"flags.DEFINE_multi_integer('train_crop_size', [513, 513],\n",
" 'Image crop size [height, width] during training.')\n",
"\n",
"flags.DEFINE_float('last_layer_gradient_multiplier', 1.0,\n",
" 'The gradient multiplier for last layers, which is used to '\n",
" 'boost the gradient of last layers if the value > 1.')\n",
"\n",
"flags.DEFINE_boolean('upsample_logits', True,\n",
" 'Upsample logits during training.')\n",
"\n",
"# Settings for fine-tuning the network.\n",
"\n",
"flags.DEFINE_string('tf_initial_checkpoint', None,\n",
" 'The initial checkpoint in tensorflow format.')\n",
"\n",
"# Set to False if one does not want to re-use the trained classifier weights.\n",
"flags.DEFINE_boolean('initialize_last_layer', False,\n",
" 'Initialize the last layer.')\n",
"\n",
"flags.DEFINE_boolean('last_layers_contain_logits_only', False,\n",
" 'Only consider logits as last layers or not.')\n",
"\n",
"flags.DEFINE_integer('slow_start_step', 0,\n",
" 'Training model with small learning rate for few steps.')\n",
"\n",
"flags.DEFINE_float('slow_start_learning_rate', 1e-4,\n",
" 'Learning rate employed during slow start.')\n",
"\n",
"# Set to True if one wants to fine-tune the batch norm parameters in DeepLabv3.\n",
"# Set to False and use small batch size to save GPU memory.\n",
"flags.DEFINE_boolean('fine_tune_batch_norm', False,\n",
" 'Fine tune the batch norm parameters or not.')\n",
"\n",
"flags.DEFINE_float('min_scale_factor', 0.5,\n",
" 'Mininum scale factor for data augmentation.')\n",
"\n",
"flags.DEFINE_float('max_scale_factor', 2.,\n",
" 'Maximum scale factor for data augmentation.')\n",
"\n",
"flags.DEFINE_float('scale_factor_step_size', 0.25,\n",
" 'Scale factor step size for data augmentation.')\n",
"\n",
"# For `xception_65`, use atrous_rates = [12, 24, 36] if output_stride = 8, or\n",
"# rates = [6, 12, 18] if output_stride = 16. For `mobilenet_v2`, use None. Note\n",
"# one could use different atrous_rates/output_stride during training/evaluation.\n",
"flags.DEFINE_multi_integer('atrous_rates', None,\n",
" 'Atrous rates for atrous spatial pyramid pooling.')\n",
"\n",
"flags.DEFINE_integer('output_stride', 16,\n",
" 'The ratio of input to output spatial resolution.')\n",
"\n",
"# Dataset settings.\n",
"flags.DEFINE_string('dataset', 'capsicum_annuum',\n",
" 'Name of the segmentation dataset.')\n",
"\n",
"flags.DEFINE_string('train_split', 'train',\n",
" 'Which split of the dataset to be used for training')\n",
"\n",
"flags.DEFINE_string('dataset_dir', None, 'Where the dataset reside.')\n",
"\n",
"\n",
"def _build_deeplab(inputs_queue, outputs_to_num_classes, ignore_label):\n",
"\n",
" samples = inputs_queue.dequeue()\n",
"\n",
" # Add name to input and label nodes so we can add to summary.\n",
" samples[common.IMAGE] = tf.identity(\n",
" samples[common.IMAGE], name=common.IMAGE)\n",
" samples[common.LABEL] = tf.identity(\n",
" samples[common.LABEL], name=common.LABEL)\n",
"\n",
" model_options = common.ModelOptions(\n",
" outputs_to_num_classes=outputs_to_num_classes,\n",
" crop_size=FLAGS.train_crop_size,\n",
" atrous_rates=FLAGS.atrous_rates,\n",
" output_stride=FLAGS.output_stride)\n",
" outputs_to_scales_to_logits = model.multi_scale_logits(\n",
" samples[common.IMAGE],\n",
" model_options=model_options,\n",
" image_pyramid=FLAGS.image_pyramid,\n",
" weight_decay=FLAGS.weight_decay,\n",
" is_training=True,\n",
" fine_tune_batch_norm=FLAGS.fine_tune_batch_norm)\n",
"\n",
" # Add name to graph node so we can add to summary.\n",
" output_type_dict = outputs_to_scales_to_logits[common.OUTPUT_TYPE]\n",
" output_type_dict[model.MERGED_LOGITS_SCOPE] = tf.identity(\n",
" output_type_dict[model.MERGED_LOGITS_SCOPE],\n",
" name=common.OUTPUT_TYPE)\n",
"\n",
" for output, num_classes in six.iteritems(outputs_to_num_classes):\n",
" train_utils.add_softmax_cross_entropy_loss_for_each_scale(\n",
" outputs_to_scales_to_logits[output],\n",
" samples[common.LABEL],\n",
" num_classes,\n",
" ignore_label,\n",
" loss_weight=1.0,\n",
" upsample_logits=FLAGS.upsample_logits,\n",
" scope=output)\n",
"\n",
" return outputs_to_scales_to_logits\n",
"\n",
"\n",
"def main(unused_argv):\n",
" tf.logging.set_verbosity(tf.logging.INFO)\n",
" # Set up deployment (i.e., multi-GPUs and/or multi-replicas).\n",
" config = model_deploy.DeploymentConfig(\n",
" num_clones=FLAGS.num_clones,\n",
" clone_on_cpu=FLAGS.clone_on_cpu,\n",
" replica_id=FLAGS.task,\n",
" num_replicas=FLAGS.num_replicas,\n",
" num_ps_tasks=FLAGS.num_ps_tasks)\n",
"\n",
" # Split the batch across GPUs.\n",
" assert FLAGS.train_batch_size % config.num_clones == 0, (\n",
" 'Training batch size not divisble by number of clones (GPUs).')\n",
"\n",
" clone_batch_size = FLAGS.train_batch_size // config.num_clones\n",
"\n",
" # Get dataset-dependent information.\n",
" dataset = segmentation_dataset.get_dataset(\n",
" FLAGS.dataset, FLAGS.train_split, dataset_dir=FLAGS.dataset_dir)\n",
"\n",
" tf.gfile.MakeDirs(FLAGS.train_logdir)\n",
" tf.logging.info('Training on %s set', FLAGS.train_split)\n",
"\n",
" with tf.Graph().as_default() as graph:\n",
" with tf.device(config.inputs_device()):\n",
" samples = input_generator.get(\n",
" dataset,\n",
" FLAGS.train_crop_size,\n",
" clone_batch_size,\n",
" min_resize_value=FLAGS.min_resize_value,\n",
" max_resize_value=FLAGS.max_resize_value,\n",
" resize_factor=FLAGS.resize_factor,\n",
" min_scale_factor=FLAGS.min_scale_factor,\n",
" max_scale_factor=FLAGS.max_scale_factor,\n",
" scale_factor_step_size=FLAGS.scale_factor_step_size,\n",
" dataset_split=FLAGS.train_split,\n",
" is_training=True,\n",
" model_variant=FLAGS.model_variant)\n",
" inputs_queue = prefetch_queue.prefetch_queue(\n",
" samples, capacity=128 * config.num_clones)\n",
"\n",
" # Create the global step on the device storing the variables.\n",
" with tf.device(config.variables_device()):\n",
" global_step = tf.train.get_or_create_global_step()\n",
"\n",
" # Define the model and create clones.\n",
" model_fn = _build_deeplab\n",
" model_args = (inputs_queue, {\n",
" common.OUTPUT_TYPE: dataset.num_classes\n",
" }, dataset.ignore_label)\n",
" clones = model_deploy.create_clones(config, model_fn, args=model_args)\n",
"\n",
" # Gather update_ops from the first clone. These contain, for example,\n",
" # the updates for the batch_norm variables created by model_fn.\n",
" first_clone_scope = config.clone_scope(0)\n",
" update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS, first_clone_scope)\n",
"\n",
" # Gather initial summaries.\n",
" summaries = set(tf.get_collection(tf.GraphKeys.SUMMARIES))\n",
"\n",
" # Add summaries for model variables.\n",
" for model_var in slim.get_model_variables():\n",
" summaries.add(tf.summary.histogram(model_var.op.name, model_var))\n",
"\n",
" # Add summaries for images, labels, semantic predictions\n",
" if FLAGS.save_summaries_images:\n",
" summary_image = graph.get_tensor_by_name(\n",
" ('%s/%s:0' % (first_clone_scope, common.IMAGE)).strip('/'))\n",
" summaries.add(\n",
" tf.summary.image('samples/%s' % common.IMAGE, summary_image))\n",
"\n",
" first_clone_label = graph.get_tensor_by_name(\n",
" ('%s/%s:0' % (first_clone_scope, common.LABEL)).strip('/'))\n",
" # Scale up summary image pixel values for better visualization.\n",
" pixel_scaling = max(1, 255 // dataset.num_classes)\n",
" summary_label = tf.cast(first_clone_label * pixel_scaling, tf.uint8)\n",
" summaries.add(\n",
" tf.summary.image('samples/%s' % common.LABEL, summary_label))\n",
"\n",
" first_clone_output = graph.get_tensor_by_name(\n",
" ('%s/%s:0' % (first_clone_scope, common.OUTPUT_TYPE)).strip('/'))\n",
" predictions = tf.expand_dims(tf.argmax(first_clone_output, 3), -1)\n",
"\n",
" summary_predictions = tf.cast(predictions * pixel_scaling, tf.uint8)\n",
" summaries.add(\n",
" tf.summary.image(\n",
" 'samples/%s' % common.OUTPUT_TYPE, summary_predictions))\n",
"\n",
" # Add summaries for losses.\n",
" for loss in tf.get_collection(tf.GraphKeys.LOSSES, first_clone_scope):\n",
" summaries.add(tf.summary.scalar('losses/%s' % loss.op.name, loss))\n",
"\n",
" # Build the optimizer based on the device specification.\n",
" with tf.device(config.optimizer_device()):\n",
" learning_rate = train_utils.get_model_learning_rate(\n",
" FLAGS.learning_policy, FLAGS.base_learning_rate,\n",
" FLAGS.learning_rate_decay_step, FLAGS.learning_rate_decay_factor,\n",
" FLAGS.training_number_of_steps, FLAGS.learning_power,\n",
" FLAGS.slow_start_step, FLAGS.slow_start_learning_rate)\n",
" optimizer = tf.train.MomentumOptimizer(learning_rate, FLAGS.momentum)\n",
" summaries.add(tf.summary.scalar('learning_rate', learning_rate))\n",
"\n",
" startup_delay_steps = FLAGS.task * FLAGS.startup_delay_steps\n",
" for variable in slim.get_model_variables():\n",
" summaries.add(tf.summary.histogram(variable.op.name, variable))\n",
"\n",
" with tf.device(config.variables_device()):\n",
" total_loss, grads_and_vars = model_deploy.optimize_clones(\n",
" clones, optimizer)\n",
" total_loss = tf.check_numerics(total_loss, 'Loss is inf or nan.')\n",
" summaries.add(tf.summary.scalar('total_loss', total_loss))\n",
"\n",
" # Modify the gradients for biases and last layer variables.\n",
" last_layers = model.get_extra_layer_scopes(\n",
" FLAGS.last_layers_contain_logits_only)\n",
" grad_mult = train_utils.get_model_gradient_multipliers(\n",
" last_layers, FLAGS.last_layer_gradient_multiplier)\n",
" if grad_mult:\n",
" grads_and_vars = slim.learning.multiply_gradients(\n",
" grads_and_vars, grad_mult)\n",
"\n",
" # Create gradient update op.\n",
" grad_updates = optimizer.apply_gradients(\n",
" grads_and_vars, global_step=global_step)\n",
" update_ops.append(grad_updates)\n",
" update_op = tf.group(*update_ops)\n",
" with tf.control_dependencies([update_op]):\n",
" train_tensor = tf.identity(total_loss, name='train_op')\n",
"\n",
" # Add the summaries from the first clone. These contain the summaries\n",
" # created by model_fn and either optimize_clones() or _gather_clone_loss().\n",
" summaries |= set(\n",
" tf.get_collection(tf.GraphKeys.SUMMARIES, first_clone_scope))\n",
"\n",
" # Merge all summaries together.\n",
" summary_op = tf.summary.merge(list(summaries))\n",
"\n",
" # Soft placement allows placing on CPU ops without GPU implementation.\n",
" session_config = tf.ConfigProto(\n",
" allow_soft_placement=True, log_device_placement=False)\n",
"\n",
" # Start the training.\n",
" slim.learning.train(\n",
" train_tensor,\n",
" logdir=FLAGS.train_logdir,\n",
" log_every_n_steps=FLAGS.log_steps,\n",
" master=FLAGS.master,\n",
" number_of_steps=FLAGS.training_number_of_steps,\n",
" is_chief=(FLAGS.task == 0),\n",
" session_config=session_config,\n",
" startup_delay_steps=startup_delay_steps,\n",
" init_fn=train_utils.get_model_init_fn(\n",
" FLAGS.train_logdir,\n",
" FLAGS.tf_initial_checkpoint,\n",
" FLAGS.initialize_last_layer,\n",
" last_layers,\n",
" ignore_missing_vars=True),\n",
" summary_op=summary_op,\n",
" save_summaries_secs=FLAGS.save_summaries_secs,\n",
" save_interval_secs=FLAGS.save_interval_secs)\n",
"\n",
"\n",
"if __name__ == '__main__':\n",
" flags.mark_flag_as_required('train_logdir')\n",
" flags.mark_flag_as_required('tf_initial_checkpoint')\n",
" flags.mark_flag_as_required('dataset_dir')\n",
" tf.app.run()\n",
"\"\"\"\n",
"with open(DEEPLAB_DIR+'/train.py', \"w\") as file:\n",
" file.write(TRAIN)\n"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "Jm1MbTLAfHQ8",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"#@title eval.py {display-mode: \"form\"}\n",
"\n",
"# This code will be hidden when the notebook is loaded.\n",
"\n",
"EVAL = \"\"\"\n",
"# Copyright 2018 The TensorFlow Authors All Rights Reserved.\n",
"#\n",
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# http://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License.\n",
"# ==============================================================================\n",
"\n",
"\n",
"import math\n",
"import six\n",
"import tensorflow as tf\n",
"from deeplab import common\n",
"from deeplab import model\n",
"from deeplab.datasets import segmentation_dataset\n",
"from deeplab.utils import input_generator\n",
"\n",
"slim = tf.contrib.slim\n",
"\n",
"flags = tf.app.flags\n",
"\n",
"FLAGS = flags.FLAGS\n",
"\n",
"flags.DEFINE_string('master', '', 'BNS name of the tensorflow server')\n",
"\n",
"# Settings for log directories.\n",
"\n",
"flags.DEFINE_string('eval_logdir', None, 'Where to write the event logs.')\n",
"\n",
"flags.DEFINE_string('checkpoint_dir', None, 'Directory of model checkpoints.')\n",
"\n",
"# Settings for evaluating the model.\n",
"\n",
"flags.DEFINE_integer('eval_batch_size', 1,\n",
" 'The number of images in each batch during evaluation.')\n",
"\n",
"flags.DEFINE_multi_integer('eval_crop_size', [513, 513],\n",
" 'Image crop size [height, width] for evaluation.')\n",
"\n",
"flags.DEFINE_integer('eval_interval_secs', 60 * 5,\n",
" 'How often (in seconds) to run evaluation.')\n",
"\n",
"# For `xception_65`, use atrous_rates = [12, 24, 36] if output_stride = 8, or\n",
"# rates = [6, 12, 18] if output_stride = 16. For `mobilenet_v2`, use None. Note\n",
"# one could use different atrous_rates/output_stride during training/evaluation.\n",
"flags.DEFINE_multi_integer('atrous_rates', None,\n",
" 'Atrous rates for atrous spatial pyramid pooling.')\n",
"\n",
"flags.DEFINE_integer('output_stride', 16,\n",
" 'The ratio of input to output spatial resolution.')\n",
"\n",
"# Change to [0.5, 0.75, 1.0, 1.25, 1.5, 1.75] for multi-scale test.\n",
"flags.DEFINE_multi_float('eval_scales', [1.0],\n",
" 'The scales to resize images for evaluation.')\n",
"\n",
"# Change to True for adding flipped images during test.\n",
"flags.DEFINE_bool('add_flipped_images', False,\n",
" 'Add flipped images for evaluation or not.')\n",
"\n",
"# Dataset settings.\n",
"\n",
"flags.DEFINE_string('dataset', 'capsicum_annuum',\n",
" 'Name of the segmentation dataset.')\n",
"\n",
"flags.DEFINE_string('eval_split', 'val',\n",
" 'Which split of the dataset used for evaluation')\n",
"\n",
"flags.DEFINE_string('dataset_dir', None, 'Where the dataset reside.')\n",
"\n",
"flags.DEFINE_integer('max_number_of_evaluations', 0,\n",
" 'Maximum number of eval iterations. Will loop '\n",
" 'indefinitely upon nonpositive values.')\n",
"\n",
"\n",
"def main(unused_argv):\n",
" tf.logging.set_verbosity(tf.logging.INFO)\n",
" # Get dataset-dependent information.\n",
" dataset = segmentation_dataset.get_dataset(\n",
" FLAGS.dataset, FLAGS.eval_split, dataset_dir=FLAGS.dataset_dir)\n",
"\n",
" tf.gfile.MakeDirs(FLAGS.eval_logdir)\n",
" tf.logging.info('Evaluating on %s set', FLAGS.eval_split)\n",
"\n",
" with tf.Graph().as_default():\n",
" samples = input_generator.get(\n",
" dataset,\n",
" FLAGS.eval_crop_size,\n",
" FLAGS.eval_batch_size,\n",
" min_resize_value=FLAGS.min_resize_value,\n",
" max_resize_value=FLAGS.max_resize_value,\n",
" resize_factor=FLAGS.resize_factor,\n",
" dataset_split=FLAGS.eval_split,\n",
" is_training=False,\n",
" model_variant=FLAGS.model_variant)\n",
"\n",
" model_options = common.ModelOptions(\n",
" outputs_to_num_classes={common.OUTPUT_TYPE: dataset.num_classes},\n",
" crop_size=FLAGS.eval_crop_size,\n",
" atrous_rates=FLAGS.atrous_rates,\n",
" output_stride=FLAGS.output_stride)\n",
"\n",
" if tuple(FLAGS.eval_scales) == (1.0,):\n",
" tf.logging.info('Performing single-scale test.')\n",
" predictions = model.predict_labels(samples[common.IMAGE], model_options,\n",
" image_pyramid=FLAGS.image_pyramid)\n",
" else:\n",
" tf.logging.info('Performing multi-scale test.')\n",
" predictions = model.predict_labels_multi_scale(\n",
" samples[common.IMAGE],\n",
" model_options=model_options,\n",
" eval_scales=FLAGS.eval_scales,\n",
" add_flipped_images=FLAGS.add_flipped_images)\n",
" predictions = predictions[common.OUTPUT_TYPE]\n",
" predictions = tf.reshape(predictions, shape=[-1])\n",
" labels = tf.reshape(samples[common.LABEL], shape=[-1])\n",
" weights = tf.to_float(tf.not_equal(labels, dataset.ignore_label))\n",
"\n",
" # Set ignore_label regions to label 0, because metrics.mean_iou requires\n",
" # range of labels = [0, dataset.num_classes). Note the ignore_label regions\n",
" # are not evaluated since the corresponding regions contain weights = 0.\n",
" labels = tf.where(\n",
" tf.equal(labels, dataset.ignore_label), tf.zeros_like(labels), labels)\n",
"\n",
" predictions_tag = 'miou'\n",
" for eval_scale in FLAGS.eval_scales:\n",
" predictions_tag += '_' + str(eval_scale)\n",
" if FLAGS.add_flipped_images:\n",
" predictions_tag += '_flipped'\n",
"\n",
" # Define the evaluation metric.\n",
" metric_map = {}\n",
" metric_map[predictions_tag] = tf.metrics.mean_iou(\n",
" predictions, labels, dataset.num_classes, weights=weights)\n",
"\n",
" metrics_to_values, metrics_to_updates = (\n",
" tf.contrib.metrics.aggregate_metric_map(metric_map))\n",
"\n",
" for metric_name, metric_value in six.iteritems(metrics_to_values):\n",
" slim.summaries.add_scalar_summary(\n",
" metric_value, metric_name, print_summary=True)\n",
"\n",
" num_batches = int(\n",
" math.ceil(dataset.num_samples / float(FLAGS.eval_batch_size)))\n",
"\n",
" tf.logging.info('Eval num images %d', dataset.num_samples)\n",
" tf.logging.info('Eval batch size %d and num batch %d',\n",
" FLAGS.eval_batch_size, num_batches)\n",
"\n",
" num_eval_iters = None\n",
" if FLAGS.max_number_of_evaluations > 0:\n",
" num_eval_iters = FLAGS.max_number_of_evaluations\n",
" slim.evaluation.evaluation_loop(\n",
" master=FLAGS.master,\n",
" checkpoint_dir=FLAGS.checkpoint_dir,\n",
" logdir=FLAGS.eval_logdir,\n",
" num_evals=num_batches,\n",
" eval_op=list(metrics_to_updates.values()),\n",
" max_number_of_evaluations=num_eval_iters,\n",
" eval_interval_secs=FLAGS.eval_interval_secs)\n",
"\n",
"\n",
"if __name__ == '__main__':\n",
" flags.mark_flag_as_required('checkpoint_dir')\n",
" flags.mark_flag_as_required('eval_logdir')\n",
" flags.mark_flag_as_required('dataset_dir')\n",
" tf.app.run()\n",
"\"\"\"\n",
"with open(DEEPLAB_DIR+'/eval.py', \"w\") as file:\n",
" file.write(EVAL)"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "qgre2l6dfZF2",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"#@title vis.py {display-mode: \"form\"}\n",
"\n",
"# This code will be hidden when the notebook is loaded.\n",
"\n",
"VIS = \"\"\"\n",
"# Copyright 2018 The TensorFlow Authors All Rights Reserved.\n",
"#\n",
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# http://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License.\n",
"# ==============================================================================\n",
"\n",
"\n",
"import math\n",
"import os.path\n",
"import time\n",
"import numpy as np\n",
"import tensorflow as tf\n",
"from deeplab import common\n",
"from deeplab import model\n",
"from deeplab.datasets import segmentation_dataset\n",
"from deeplab.utils import input_generator\n",
"from deeplab.utils import save_annotation\n",
"\n",
"slim = tf.contrib.slim\n",
"\n",
"flags = tf.app.flags\n",
"\n",
"FLAGS = flags.FLAGS\n",
"\n",
"flags.DEFINE_string('master', '', 'BNS name of the tensorflow server')\n",
"\n",
"# Settings for log directories.\n",
"\n",
"flags.DEFINE_string('vis_logdir', None, 'Where to write the event logs.')\n",
"\n",
"flags.DEFINE_string('checkpoint_dir', None, 'Directory of model checkpoints.')\n",
"\n",
"# Settings for visualizing the model.\n",
"\n",
"flags.DEFINE_integer('vis_batch_size', 1,\n",
" 'The number of images in each batch during evaluation.')\n",
"\n",
"flags.DEFINE_multi_integer('vis_crop_size', [513, 513],\n",
" 'Crop size [height, width] for visualization.')\n",
"\n",
"flags.DEFINE_integer('eval_interval_secs', 60 * 5,\n",
" 'How often (in seconds) to run evaluation.')\n",
"\n",
"# For `xception_65`, use atrous_rates = [12, 24, 36] if output_stride = 8, or\n",
"# rates = [6, 12, 18] if output_stride = 16. For `mobilenet_v2`, use None. Note\n",
"# one could use different atrous_rates/output_stride during training/evaluation.\n",
"flags.DEFINE_multi_integer('atrous_rates', None,\n",
" 'Atrous rates for atrous spatial pyramid pooling.')\n",
"\n",
"flags.DEFINE_integer('output_stride', 16,\n",
" 'The ratio of input to output spatial resolution.')\n",
"\n",
"# Change to [0.5, 0.75, 1.0, 1.25, 1.5, 1.75] for multi-scale test.\n",
"flags.DEFINE_multi_float('eval_scales', [1.0],\n",
" 'The scales to resize images for evaluation.')\n",
"\n",
"# Change to True for adding flipped images during test.\n",
"flags.DEFINE_bool('add_flipped_images', False,\n",
" 'Add flipped images for evaluation or not.')\n",
"\n",
"# Dataset settings.\n",
"\n",
"flags.DEFINE_string('dataset', 'capsicum_annuum',\n",
" 'Name of the segmentation dataset.')\n",
"\n",
"flags.DEFINE_string('vis_split', 'val',\n",
" 'Which split of the dataset used for visualizing results')\n",
"\n",
"flags.DEFINE_string('dataset_dir', None, 'Where the dataset reside.')\n",
"\n",
"flags.DEFINE_enum('colormap_type', 'pascal', ['pascal', 'cityscapes'],\n",
" 'Visualization colormap type.')\n",
"\n",
"flags.DEFINE_boolean('also_save_raw_predictions', False,\n",
" 'Also save raw predictions.')\n",
"\n",
"flags.DEFINE_integer('max_number_of_iterations', 0,\n",
" 'Maximum number of visualization iterations. Will loop '\n",
" 'indefinitely upon nonpositive values.')\n",
"\n",
"# The folder where semantic segmentation predictions are saved.\n",
"_SEMANTIC_PREDICTION_SAVE_FOLDER = 'segmentation_results'\n",
"\n",
"# The folder where raw semantic segmentation predictions are saved.\n",
"_RAW_SEMANTIC_PREDICTION_SAVE_FOLDER = 'raw_segmentation_results'\n",
"\n",
"# The format to save image.\n",
"_IMAGE_FORMAT = '%06d_image'\n",
"\n",
"# The format to save prediction\n",
"_PREDICTION_FORMAT = '%06d_prediction'\n",
"\n",
"# To evaluate Cityscapes results on the evaluation server, the labels used\n",
"# during training should be mapped to the labels for evaluation.\n",
"_CITYSCAPES_TRAIN_ID_TO_EVAL_ID = [7, 8, 11, 12, 13, 17, 19, 20, 21, 22,\n",
" 23, 24, 25, 26, 27, 28, 31, 32, 33]\n",
"\n",
"\n",
"def _convert_train_id_to_eval_id(prediction, train_id_to_eval_id):\n",
"\n",
" converted_prediction = prediction.copy()\n",
" for train_id, eval_id in enumerate(train_id_to_eval_id):\n",
" converted_prediction[prediction == train_id] = eval_id\n",
"\n",
" return converted_prediction\n",
"\n",
"\n",
"def _process_batch(sess, original_images, semantic_predictions, image_names,\n",
" image_heights, image_widths, image_id_offset, save_dir,\n",
" raw_save_dir, train_id_to_eval_id=None):\n",
" \n",
" (original_images,\n",
" semantic_predictions,\n",
" image_names,\n",
" image_heights,\n",
" image_widths) = sess.run([original_images, semantic_predictions,\n",
" image_names, image_heights, image_widths])\n",
"\n",
" num_image = semantic_predictions.shape[0]\n",
" for i in range(num_image):\n",
" image_height = np.squeeze(image_heights[i])\n",
" image_width = np.squeeze(image_widths[i])\n",
" original_image = np.squeeze(original_images[i])\n",
" semantic_prediction = np.squeeze(semantic_predictions[i])\n",
" crop_semantic_prediction = semantic_prediction[:image_height, :image_width]\n",
"\n",
" # Save image.\n",
" save_annotation.save_annotation(\n",
" original_image, save_dir, _IMAGE_FORMAT % (image_id_offset + i),\n",
" add_colormap=False)\n",
"\n",
" # Save prediction.\n",
" save_annotation.save_annotation(\n",
" crop_semantic_prediction, save_dir,\n",
" _PREDICTION_FORMAT % (image_id_offset + i), add_colormap=True,\n",
" colormap_type=FLAGS.colormap_type)\n",
"\n",
" if FLAGS.also_save_raw_predictions:\n",
" image_filename = os.path.basename(image_names[i])\n",
"\n",
" if train_id_to_eval_id is not None:\n",
" crop_semantic_prediction = _convert_train_id_to_eval_id(\n",
" crop_semantic_prediction,\n",
" train_id_to_eval_id)\n",
" save_annotation.save_annotation(\n",
" crop_semantic_prediction, raw_save_dir, image_filename,\n",
" add_colormap=False)\n",
"\n",
"\n",
"def main(unused_argv):\n",
" tf.logging.set_verbosity(tf.logging.INFO)\n",
" # Get dataset-dependent information.\n",
" dataset = segmentation_dataset.get_dataset(\n",
" FLAGS.dataset, FLAGS.vis_split, dataset_dir=FLAGS.dataset_dir)\n",
" train_id_to_eval_id = None\n",
" if dataset.name == segmentation_dataset.get_cityscapes_dataset_name():\n",
" tf.logging.info('Cityscapes requires converting train_id to eval_id.')\n",
" train_id_to_eval_id = _CITYSCAPES_TRAIN_ID_TO_EVAL_ID\n",
"\n",
" # Prepare for visualization.\n",
" tf.gfile.MakeDirs(FLAGS.vis_logdir)\n",
" save_dir = os.path.join(FLAGS.vis_logdir, _SEMANTIC_PREDICTION_SAVE_FOLDER)\n",
" tf.gfile.MakeDirs(save_dir)\n",
" raw_save_dir = os.path.join(\n",
" FLAGS.vis_logdir, _RAW_SEMANTIC_PREDICTION_SAVE_FOLDER)\n",
" tf.gfile.MakeDirs(raw_save_dir)\n",
"\n",
" tf.logging.info('Visualizing on %s set', FLAGS.vis_split)\n",
"\n",
" g = tf.Graph()\n",
" with g.as_default():\n",
" samples = input_generator.get(dataset,\n",
" FLAGS.vis_crop_size,\n",
" FLAGS.vis_batch_size,\n",
" min_resize_value=FLAGS.min_resize_value,\n",
" max_resize_value=FLAGS.max_resize_value,\n",
" resize_factor=FLAGS.resize_factor,\n",
" dataset_split=FLAGS.vis_split,\n",
" is_training=False,\n",
" model_variant=FLAGS.model_variant)\n",
"\n",
" model_options = common.ModelOptions(\n",
" outputs_to_num_classes={common.OUTPUT_TYPE: dataset.num_classes},\n",
" crop_size=FLAGS.vis_crop_size,\n",
" atrous_rates=FLAGS.atrous_rates,\n",
" output_stride=FLAGS.output_stride)\n",
"\n",
" if tuple(FLAGS.eval_scales) == (1.0,):\n",
" tf.logging.info('Performing single-scale test.')\n",
" predictions = model.predict_labels(\n",
" samples[common.IMAGE],\n",
" model_options=model_options,\n",
" image_pyramid=FLAGS.image_pyramid)\n",
" else:\n",
" tf.logging.info('Performing multi-scale test.')\n",
" predictions = model.predict_labels_multi_scale(\n",
" samples[common.IMAGE],\n",
" model_options=model_options,\n",
" eval_scales=FLAGS.eval_scales,\n",
" add_flipped_images=FLAGS.add_flipped_images)\n",
" predictions = predictions[common.OUTPUT_TYPE]\n",
"\n",
" if FLAGS.min_resize_value and FLAGS.max_resize_value:\n",
" # Only support batch_size = 1, since we assume the dimensions of original\n",
" # image after tf.squeeze is [height, width, 3].\n",
" assert FLAGS.vis_batch_size == 1\n",
"\n",
" # Reverse the resizing and padding operations performed in preprocessing.\n",
" # First, we slice the valid regions (i.e., remove padded region) and then\n",
" # we reisze the predictions back.\n",
" original_image = tf.squeeze(samples[common.ORIGINAL_IMAGE])\n",
" original_image_shape = tf.shape(original_image)\n",
" predictions = tf.slice(\n",
" predictions,\n",
" [0, 0, 0],\n",
" [1, original_image_shape[0], original_image_shape[1]])\n",
" resized_shape = tf.to_int32([tf.squeeze(samples[common.HEIGHT]),\n",
" tf.squeeze(samples[common.WIDTH])])\n",
" predictions = tf.squeeze(\n",
" tf.image.resize_images(tf.expand_dims(predictions, 3),\n",
" resized_shape,\n",
" method=tf.image.ResizeMethod.NEAREST_NEIGHBOR,\n",
" align_corners=True), 3)\n",
"\n",
" tf.train.get_or_create_global_step()\n",
" saver = tf.train.Saver(slim.get_variables_to_restore())\n",
" sv = tf.train.Supervisor(graph=g,\n",
" logdir=FLAGS.vis_logdir,\n",
" init_op=tf.global_variables_initializer(),\n",
" summary_op=None,\n",
" summary_writer=None,\n",
" global_step=None,\n",
" saver=saver)\n",
" num_batches = int(math.ceil(\n",
" dataset.num_samples / float(FLAGS.vis_batch_size)))\n",
" last_checkpoint = None\n",
"\n",
" # Loop to visualize the results when new checkpoint is created.\n",
" num_iters = 0\n",
" while (FLAGS.max_number_of_iterations <= 0 or\n",
" num_iters < FLAGS.max_number_of_iterations):\n",
" num_iters += 1\n",
" last_checkpoint = slim.evaluation.wait_for_new_checkpoint(\n",
" FLAGS.checkpoint_dir, last_checkpoint)\n",
" start = time.time()\n",
" tf.logging.info(\n",
" 'Starting visualization at ' + time.strftime('%Y-%m-%d-%H:%M:%S',\n",
" time.gmtime()))\n",
" tf.logging.info('Visualizing with model %s', last_checkpoint)\n",
"\n",
" with sv.managed_session(FLAGS.master,\n",
" start_standard_services=False) as sess:\n",
" sv.start_queue_runners(sess)\n",
" sv.saver.restore(sess, last_checkpoint)\n",
"\n",
" image_id_offset = 0\n",
" for batch in range(num_batches):\n",
" tf.logging.info('Visualizing batch %d / %d', batch + 1, num_batches)\n",
" _process_batch(sess=sess,\n",
" original_images=samples[common.ORIGINAL_IMAGE],\n",
" semantic_predictions=predictions,\n",
" image_names=samples[common.IMAGE_NAME],\n",
" image_heights=samples[common.HEIGHT],\n",
" image_widths=samples[common.WIDTH],\n",
" image_id_offset=image_id_offset,\n",
" save_dir=save_dir,\n",
" raw_save_dir=raw_save_dir,\n",
" train_id_to_eval_id=train_id_to_eval_id)\n",
" image_id_offset += FLAGS.vis_batch_size\n",
"\n",
" tf.logging.info(\n",
" 'Finished visualization at ' + time.strftime('%Y-%m-%d-%H:%M:%S',\n",
" time.gmtime()))\n",
" time_to_next_eval = start + FLAGS.eval_interval_secs - time.time()\n",
" if time_to_next_eval > 0:\n",
" time.sleep(time_to_next_eval)\n",
"\n",
"\n",
"if __name__ == '__main__':\n",
" flags.mark_flag_as_required('checkpoint_dir')\n",
" flags.mark_flag_as_required('vis_logdir')\n",
" flags.mark_flag_as_required('dataset_dir')\n",
" tf.app.run()\n",
"\"\"\"\n",
"with open(DEEPLAB_DIR+'/vis.py', \"w\") as file:\n",
" file.write(VIS)"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "XrQIL9Xc1vbC",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"%%bash -s \"$DEEPLAB_DIR\" \"$INIT_DIR\" \"$TRAIN_LOGDIR\" \"$TF_RECORD_DIR\" \"$RESEARCH_DIR\"\n",
"\n",
"DEEPLAB_DIR=$1\n",
"INIT_DIR=$2\n",
"TRAIN_LOGDIR=$3\n",
"TF_RECORD_DIR=$4\n",
"RESEARCH_DIR=$5\n",
"\n",
"cd \"${RESEARCH_DIR}\"\n",
"export PYTHONPATH=$PYTHONPATH:`pwd`/slim\n",
"\n",
"NUM_ITERATIONS=30000\n",
"python \"${DEEPLAB_DIR}\"/train.py \\\n",
" --logtostderr \\\n",
" --train_split=\"trainval\" \\\n",
" --model_variant=\"xception_65\" \\\n",
" --atrous_rates=6 \\\n",
" --atrous_rates=12 \\\n",
" --atrous_rates=18 \\\n",
" --output_stride=16 \\\n",
" --decoder_output_stride=4 \\\n",
" --train_crop_size=513 \\\n",
" --train_crop_size=513 \\\n",
" --train_batch_size=4 \\\n",
" --training_number_of_steps=\"${NUM_ITERATIONS}\" \\\n",
" --fine_tune_batch_norm=true \\\n",
" --tf_initial_checkpoint=\"${INIT_DIR}/deeplabv3_pascal_train_aug/model.ckpt\" \\\n",
" --train_logdir=\"${TRAIN_LOGDIR}\" \\\n",
"--dataset_dir=\"${TF_RECORD_DIR}\""
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "6HYm6gu__tA9",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"%%bash -s \"$DEEPLAB_DIR\" \"$EVAL_LOGDIR\" \"$TRAIN_LOGDIR\" \"$TF_RECORD_DIR\" \"$RESEARCH_DIR\"\n",
"\n",
"DEEPLAB_DIR=$1\n",
"EVAL_LOGDIR=$2\n",
"TRAIN_LOGDIR=$3\n",
"TF_RECORD_DIR=$4\n",
"RESEARCH_DIR=$5\n",
"\n",
"cd \"${RESEARCH_DIR}\"\n",
"export PYTHONPATH=$PYTHONPATH:`pwd`/slim\n",
"\n",
"python \"${DEEPLAB_DIR}\"/eval.py \\\n",
" --logtostderr \\\n",
" --eval_split=\"val\" \\\n",
" --model_variant=\"xception_65\" \\\n",
" --atrous_rates=6 \\\n",
" --atrous_rates=12 \\\n",
" --atrous_rates=18 \\\n",
" --output_stride=16 \\\n",
" --decoder_output_stride=4 \\\n",
" --eval_crop_size=600 \\\n",
" --eval_crop_size=800 \\\n",
" --checkpoint_dir=\"${TRAIN_LOGDIR}\" \\\n",
" --eval_logdir=\"${EVAL_LOGDIR}\" \\\n",
" --dataset_dir=\"${TF_RECORD_DIR}\" \\\n",
"--max_number_of_evaluations=1"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "3b1Ct8uzAfCL",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"%%bash -s \"$DEEPLAB_DIR\" \"$VIS_LOGDIR\" \"$TRAIN_LOGDIR\" \"$TF_RECORD_DIR\" \"$RESEARCH_DIR\"\n",
"\n",
"DEEPLAB_DIR=$1\n",
"VIS_LOGDIR=$2\n",
"TRAIN_LOGDIR=$3\n",
"TF_RECORD_DIR=$4\n",
"RESEARCH_DIR=$5\n",
"\n",
"cd \"${RESEARCH_DIR}\"\n",
"export PYTHONPATH=$PYTHONPATH:`pwd`/slim\n",
"\n",
"python \"${DEEPLAB_DIR}\"/vis.py \\\n",
" --logtostderr \\\n",
" --vis_split=\"val\" \\\n",
" --model_variant=\"xception_65\" \\\n",
" --atrous_rates=6 \\\n",
" --atrous_rates=12 \\\n",
" --atrous_rates=18 \\\n",
" --output_stride=16 \\\n",
" --decoder_output_stride=4 \\\n",
" --vis_crop_size=600 \\\n",
" --vis_crop_size=800 \\\n",
" --checkpoint_dir=\"${TRAIN_LOGDIR}\" \\\n",
" --vis_logdir=\"${VIS_LOGDIR}\" \\\n",
" --dataset_dir=\"${TF_RECORD_DIR}\" \\\n",
"--max_number_of_iterations=1"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "aTtZ0WaiRWi8",
"colab_type": "code",
"cellView": "both",
"colab": {}
},
"cell_type": "code",
"source": [
"!pwd\n",
"from google.colab import files\n",
"\n",
"\n",
"files.download('deeplab/datasets/capsicum_annuum/exp/d/vis/segmentation_results/000003_prediction.png')"
],
"execution_count": 0,
"outputs": []
},
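{
"metadata": {
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"**Display a Prediction Inline (optional)**\n",
"\n",
"A minimal sketch for inspecting results in the notebook without downloading them. It assumes the vis step above has written `000003_image.png` and `000003_prediction.png` (named by the `%06d` formats in vis.py) into the Experiment D `segmentation_results` directory; adjust the paths for other indices or experiments."
]
},
{
"metadata": {
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"import os\n",
"import matplotlib.pyplot as plt\n",
"from PIL import Image\n",
"\n",
"# Assumed output directory of the vis step for Experiment D.\n",
"vis_dir = 'deeplab/datasets/capsicum_annuum/exp/d/vis/segmentation_results'\n",
"\n",
"image = Image.open(os.path.join(vis_dir, '000003_image.png'))\n",
"prediction = Image.open(os.path.join(vis_dir, '000003_prediction.png'))\n",
"\n",
"# Show the input image and its color-mapped prediction side by side.\n",
"plt.figure(figsize=(12, 6))\n",
"plt.subplot(1, 2, 1)\n",
"plt.imshow(image)\n",
"plt.title('image')\n",
"plt.axis('off')\n",
"plt.subplot(1, 2, 2)\n",
"plt.imshow(prediction)\n",
"plt.title('prediction')\n",
"plt.axis('off')\n",
"plt.show()"
],
"execution_count": 0,
"outputs": []
},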
{
"metadata": {
"id": "sglLqSQBIux3",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"%%bash -s \"$DEEPLAB_DIR\" \"$EXPORT_DIR\" \"$TRAIN_LOGDIR\" \"$TF_RECORD_DIR\" \"$RESEARCH_DIR\"\n",
"\n",
"DEEPLAB_DIR=$1\n",
"EXPORT_DIR=$2\n",
"TRAIN_LOGDIR=$3\n",
"TF_RECORD_DIR=$4\n",
"RESEARCH_DIR=$5\n",
"\n",
"NUM_ITERATIONS=10\n",
"\n",
"cd \"${RESEARCH_DIR}\"\n",
"export PYTHONPATH=$PYTHONPATH:`pwd`/slim\n",
"\n",
"CKPT_PATH=\"${TRAIN_LOGDIR}/model.ckpt-${NUM_ITERATIONS}\"\n",
"EXPORT_PATH=\"${EXPORT_DIR}/frozen_inference_graph.pb\"\n",
"\n",
"python \"${DEEPLAB_DIR}\"/export_model.py \\\n",
" --logtostderr \\\n",
" --checkpoint_path=\"${CKPT_PATH}\" \\\n",
" --export_path=\"${EXPORT_PATH}\" \\\n",
" --model_variant=\"xception_65\" \\\n",
" --atrous_rates=6 \\\n",
" --atrous_rates=12 \\\n",
" --atrous_rates=18 \\\n",
" --output_stride=16 \\\n",
" --decoder_output_stride=4 \\\n",
" --num_classes=21 \\\n",
" --crop_size=513 \\\n",
" --crop_size=513 \\\n",
"--inference_scales=1.0"
],
"execution_count": 0,
"outputs": []
}
]
}