Jupyter Notebook for Experiment D in 'Data synthesis methods for semantic segmentation in agriculture: A Capsicum annuum dataset' paper
{ | |
"nbformat": 4, | |
"nbformat_minor": 0, | |
"metadata": { | |
"colab": { | |
"name": "capsicum_annuum.ipynb", | |
"version": "0.3.2", | |
"provenance": [], | |
"collapsed_sections": [], | |
"toc_visible": true | |
}, | |
"kernelspec": { | |
"name": "python2", | |
"display_name": "Python 2" | |
}, | |
"accelerator": "GPU" | |
}, | |
"cells": [ | |
{ | |
"metadata": { | |
"id": "vvk4jDgC70VO", | |
"colab_type": "text" | |
}, | |
"cell_type": "markdown", | |
"source": [ | |
"### Jupyter Notebook for Experiments in [Capsicum Annuum Dataset](https://doi.org/10.1016/j.compag.2017.12.001) Paper using Deeplab Models\n" | |
] | |
}, | |
{ | |
"metadata": { | |
"id": "J8HY15Mv9RAb", | |
"colab_type": "text" | |
}, | |
"cell_type": "markdown", | |
"source": [ | |
"**Import Libraries and Download Tensorflow Models**" | |
] | |
}, | |
{ | |
"metadata": { | |
"id": "jBg6BPGZ1vWd", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"import os\n", | |
"import math\n", | |
"import sys\n", | |
"from IPython.display import HTML" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "Wfzu6S35WZDj", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"import tensorflow as tf\n", | |
"device_name = tf.test.gpu_device_name()\n", | |
"if device_name != '/device:GPU:0':\n", | |
" raise SystemError('GPU device not found')\n", | |
"print('Found GPU at: {}'.format(device_name))" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "dTVhVb4SRN0J", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"# Install the PyDrive wrapper & import libraries.\n", | |
"# This only needs to be done once in a notebook.\n", | |
"!pip install -U -q PyDrive\n", | |
"from pydrive.auth import GoogleAuth\n", | |
"from pydrive.drive import GoogleDrive\n", | |
"from google.colab import auth\n", | |
"from oauth2client.client import GoogleCredentials\n", | |
"\n", | |
"# Authenticate and create the PyDrive client.\n", | |
"# This only needs to be done once in a notebook.\n", | |
"auth.authenticate_user()\n", | |
"gauth = GoogleAuth()\n", | |
"gauth.credentials = GoogleCredentials.get_application_default()\n", | |
"drive = GoogleDrive(gauth)" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "5FknKhTZ1vW4", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"!git clone https://github.com/tensorflow/models.git" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "Q0NboZCxIQ97", | |
"colab_type": "text" | |
}, | |
"cell_type": "markdown", | |
"source": [ | |
"### Experiment A \n", | |
"> Train: synthetic (1–8750). Test: synthetic (8851–8900).\n", | |
">> *This experiment was run to obtain a performance reference point of the model when having access to a large and detailed annotated dataset for this domain.*\n", | |
" \n", | |
"### Experiment B\n", | |
">Train: synthetic (1–8750). Test: empirical (41–50).\n", | |
">> *To determine to what extent a synthetically trained model can generalise to a similar set in the same domain without fine-tuning.*\n", | |
"\n", | |
"### Experiment C\n", | |
">Train: empirical (1–30). Test: empirical (41–50).\n", | |
">> *As a reference to see if the model can learn using a small dataset, using empirical data.*\n", | |
"\n", | |
"### Experiment D.\n", | |
">Train: PASCAL VOC. Fine-tune: empirical (1–30). Test: empirical (41–50).\n", | |
">> *To compare the effect of bootstrapping with a non-related dataset.*\n", | |
"\n", | |
"### Experiment E.\n", | |
">Train: synthetic (1–8750). Fine-tune: empirical (1–30). Test: empirical (41–50).\n", | |
">> *To assess the effect of bootstrapping with a related dataset.*\n" | |
] | |
}, | |
{ | |
"metadata": { | |
"id": "qc8O5pszIV5d", | |
"colab_type": "code", | |
"cellView": "form", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"#@title Select Experiment\n", | |
"\n", | |
"EXP = 'Exp. A' #@param [\"Pascal Voc Test\", \"Exp. A\", \"Exp. B\", \"Exp. C\", \"Exp. D\", \"Exp. E\"]\n", | |
"\n", | |
"print(EXP)" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "L_R3xxs3RbnA", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"print (EXP)" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "Y6OIRkFPMMhN", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"%%bash\n", | |
"pwd\n", | |
"cd models/research/deeplab\n", | |
"pwd\n", | |
"sh ./local_test.sh" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "PH8kC47d9dO7", | |
"colab_type": "text" | |
}, | |
"cell_type": "markdown", | |
"source": [ | |
"**Setup Directories and Download Dataset Data and Uncompress Files** \n", | |
"Dataset: https://data.4tu.nl/repository/uuid:884958f5-b868-46e1-b3d8-a0b5d91b02c0 \n", | |
"Execute in /tensorflow/models/research/deeplab/datasets" | |
] | |
}, | |
{ | |
"metadata": { | |
"id": "2sX-yrNx1vXI", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"# empirical data\n", | |
"BASE_URL = 'https://data.4tu.nl/bulk/uuid_884958f5-b868-46e1-b3d8-a0b5d91b02c0'\n", | |
"FILENAME_DATA = 'empirical_image_color.zip'\n", | |
"FILENAME_LABELS = 'empirical_label_class_grayscale.zip'" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "KRN7Jozf1vXZ", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"# synthetic data\n", | |
"BASE_URL = 'https://data.4tu.nl/bulk/uuid_884958f5-b868-46e1-b3d8-a0b5d91b02c0'\n", | |
"FILENAME_DATA = 'synthetic_image_color.zip'\n", | |
"FILENAME_LABELS = 'synthetic_label_class_grayscale.zip'" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "CWwAGNA91vXx", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"# setup directories\n", | |
"RESEARCH_DIR = os.getcwd()+'/models/research'\n", | |
"DEEPLAB_DIR = RESEARCH_DIR+'/deeplab'\n", | |
"DATASET_DIR = DEEPLAB_DIR+'/datasets'\n", | |
"CAPSICUM_ANNUUM_DIR = DATASET_DIR+'/capsicum_annuum'\n", | |
"LIST_DIR = CAPSICUM_ANNUUM_DIR+'/image_sets'\n", | |
"ANNOTATED_DIR = CAPSICUM_ANNUUM_DIR+'/segmentation_class'\n", | |
"INIT_DIR = CAPSICUM_ANNUUM_DIR+'/init_models'\n", | |
"EXP_DIR = CAPSICUM_ANNUUM_DIR+'/exp'\n", | |
"\n", | |
"if(EXP == 'Exp. A'):\n", | |
" IMAGE_DIR = CAPSICUM_ANNUUM_DIR+'/synthetic_image_color'\n", | |
" GROUND_TRUTH_DIR = CAPSICUM_ANNUUM_DIR+'/synthetic_label_class_grayscale/synthetic_label_class_all_grayscale'\n", | |
" EXP_ID = EXP_DIR+'/a'\n", | |
"elif(EXP == 'Exp. B'):\n", | |
" IMAGE_DIR = CAPSICUM_ANNUUM_DIR+'/x'\n", | |
" GROUND_TRUTH_DIR = CAPSICUM_ANNUUM_DIR+'/x'\n", | |
" EXP_ID = EXP_DIR+'/b'\n", | |
"elif(EXP == 'Exp. C'):\n", | |
" IMAGE_DIR = CAPSICUM_ANNUUM_DIR+'/x'\n", | |
" GROUND_TRUTH_DIR = CAPSICUM_ANNUUM_DIR+'/x'\n", | |
" EXP_ID = EXP_DIR+'/c'\n", | |
"elif(EXP == 'Exp. D'):\n", | |
" IMAGE_DIR = CAPSICUM_ANNUUM_DIR+'/empirical_image_color'\n", | |
" GROUND_TRUTH_DIR = CAPSICUM_ANNUUM_DIR+'/empirical_label_class_grayscale/empirical_label_class_all_grayscale'\n", | |
" EXP_ID = EXP_DIR+'/d'\n", | |
"else:\n", | |
" IMAGE_DIR = CAPSICUM_ANNUUM_DIR+'/x'\n", | |
" GROUND_TRUTH_DIR = CAPSICUM_ANNUUM_DIR+'/x'\n", | |
" EXP_ID = EXP_DIR+'/e'\n", | |
"\n", | |
"TRAIN_LOGDIR = EXP_ID+'/train'\n", | |
"EVAL_LOGDIR = EXP_ID+'/eval'\n", | |
"VIS_LOGDIR = EXP_ID+'/vis'\n", | |
"EXPORT_DIR = EXP_ID+'/export'\n", | |
"TF_RECORD_DIR = CAPSICUM_ANNUUM_DIR+'/tfrecord'\n", | |
" \n", | |
"%mkdir -p \"$CAPSICUM_ANNUUM_DIR\"\n", | |
"%mkdir -p \"$LIST_DIR\"\n", | |
"%mkdir -p \"$ANNOTATED_DIR\"\n", | |
"%mkdir -p \"$INIT_DIR\"\n", | |
"%mkdir -p \"$TRAIN_LOGDIR\"\n", | |
"%mkdir -p \"$EVAL_LOGDIR\"\n", | |
"%mkdir -p \"$VIS_LOGDIR\"\n", | |
"%mkdir -p \"$EXPORT_DIR\"\n", | |
"%mkdir -p \"$TF_RECORD_DIR\"" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "oBdtIkgn1vX8", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"%%bash -s \"$CAPSICUM_ANNUUM_DIR\" \"$BASE_URL\" \"$FILENAME_DATA\" \"$FILENAME_LABELS\" \n", | |
"CAPSICUM_ANNUUM_DIR=$1\n", | |
"cd \"${CAPSICUM_ANNUUM_DIR}\"\n", | |
"\n", | |
"# file urls\n", | |
"BASE_URL=$2\n", | |
"FILENAME_DATA=$3\n", | |
"FILENAME_LABELS=$4\n", | |
"\n", | |
"# Helper function to download dataset.\n", | |
"download(){\n", | |
" local BASE_URL=${1}\n", | |
" local FILENAME=${2}\n", | |
"\n", | |
" if [ ! -f \"${FILENAME}\" ]; then\n", | |
" echo \"Downloading ${FILENAME} to ${CAPSICUM_ANNUUM_DIR}\"\n", | |
" wget -q -nd -c \"${BASE_URL}/${FILENAME}\"\n", | |
" fi\n", | |
"}\n", | |
"\n", | |
"# Download the images.\n", | |
"download \"${BASE_URL}\" \"${FILENAME_DATA}\"\n", | |
"download \"${BASE_URL}\" \"${FILENAME_LABELS}\"" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "5bqeRvnyDtIY", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"%%bash -s \"$CAPSICUM_ANNUUM_DIR\" \"$BASE_URL\" \"$FILENAME_DATA\" \"$FILENAME_LABELS\" \n", | |
"CAPSICUM_ANNUUM_DIR=$1\n", | |
"cd \"${CAPSICUM_ANNUUM_DIR}\"\n", | |
"\n", | |
"# file urls\n", | |
"BASE_URL=$2\n", | |
"FILENAME_DATA=$3\n", | |
"FILENAME_LABELS=$4\n", | |
"\n", | |
"# Helper function to unpack dataset.\n", | |
"uncompress() {\n", | |
" local BASE_URL=${1}\n", | |
" local FILENAME=${2}\n", | |
"\n", | |
" echo \"Uncompressing ${FILENAME}\"\n", | |
" unzip \"${FILENAME}\"\n", | |
"}\n", | |
"\n", | |
"# Uncompress the images.\n", | |
"uncompress \"${BASE_URL}\" \"${FILENAME_DATA}\"\n", | |
"uncompress \"${BASE_URL}\" \"${FILENAME_LABELS}\"" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "H89SXXi310zl", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"%%bash -s \"$CAPSICUM_ANNUUM_DIR\" \"$GROUND_TRUTH_DIR\" \"$ANNOTATED_DIR\" \n", | |
"CAPSICUM_ANNUUM_DIR=$1\n", | |
"GROUND_TRUTH_DIR=$2\n", | |
"ANNOTATED_DIR=$3\n", | |
"\n", | |
"cd \"${CAPSICUM_ANNUUM_DIR}\"\n", | |
"\n", | |
"echo \"Removing the color map in ground truth annotations...\"\n", | |
"echo \"Ground truth directory: $GROUND_TRUTH_DIR\"\n", | |
"\n", | |
"python ../remove_gt_colormap.py \\\n", | |
" --original_gt_folder=\"$GROUND_TRUTH_DIR\" \\\n", | |
"--output_dir=\"$ANNOTATED_DIR/raw\"" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "xekr_spl-CKu", | |
"colab_type": "text" | |
}, | |
"cell_type": "markdown", | |
"source": [ | |
"**Define Training and Evaluation Images**" | |
] | |
}, | |
{ | |
"metadata": { | |
"id": "g4x87JyT1vYN", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"%%bash -s \"$IMAGE_DIR\" \"$LIST_DIR\"\n", | |
"\n", | |
"IMAGE_DIR=$1\n", | |
"LIST_DIR=$2\n", | |
"\n", | |
"cd \"${IMAGE_DIR}\"\n", | |
"\n", | |
"ls -v | head -30 | cut -d '.' -f 1 > ${LIST_DIR}/train.txt\n", | |
"ls -v | tail -9 | cut -d '.' -f 1 > ${LIST_DIR}/val.txt\n", | |
"cat ${LIST_DIR}/train.txt ${LIST_DIR}/val.txt > ${LIST_DIR}/trainval.txt" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "M_Z5uT4tV7sb", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"%%bash -s \"$IMAGE_DIR\" \"$LIST_DIR\"\n", | |
"\n", | |
"IMAGE_DIR=$1\n", | |
"LIST_DIR=$2\n", | |
"\n", | |
"cd \"${IMAGE_DIR}\"\n", | |
"\n", | |
"ls -v | head -8750 | cut -d '.' -f 1 > ${LIST_DIR}/train.txt\n", | |
"ls -v | tail -9 | cut -d '.' -f 1 > ${LIST_DIR}/val.txt\n", | |
"cat ${LIST_DIR}/train.txt ${LIST_DIR}/val.txt > ${LIST_DIR}/trainval.txt\n", | |
"\n", | |
"> Train: synthetic (1–8750). Test: synthetic (8851–8900).\n", | |
"Fine-tune: empirical (1–30). Test: empirical (41–50)\n" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "d82aFYsy-Nj6", | |
"colab_type": "text" | |
}, | |
"cell_type": "markdown", | |
"source": [ | |
"**Set Environment Path for Colab and Linux**" | |
] | |
}, | |
{ | |
"metadata": { | |
"id": "Q10uAcnJ4It7", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"os.environ['PYTHONPATH'] += \":/content/models/research\"\n", | |
"os.environ['PYTHONPATH'] += \":/content/models/research/slim\"" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "BKirt22E1vYi", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"%%bash -s \"$RESEARCH_DIR\" \"$DATASET_DIR\"\n", | |
"\n", | |
"RESEARCH_DIR=$1\n", | |
"DATASET_DIR=$2\n", | |
"\n", | |
"cd \"${RESEARCH_DIR}\"\n", | |
"pwd\n", | |
"echo \"${PYTHONPATH}\"\n", | |
"export PYTHONPATH=$PYTHONPATH:`pwd`/slim\n", | |
"python deeplab/model_test.py\n", | |
"echo \"${DATASET_DIR}\"\n", | |
"cd \"${DATASET_DIR}\"\n", | |
"pwd" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "fmhJJpMA-aZB", | |
"colab_type": "text" | |
}, | |
"cell_type": "markdown", | |
"source": [ | |
"**Clean and Rename Annotated Data**" | |
] | |
}, | |
{ | |
"metadata": { | |
"scrolled": true, | |
"id": "tX1Kw5lM1vZE", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"%%bash -s \"$ANNOTATED_DIR\"\n", | |
"\n", | |
"ANNOTATED_DIR=$1\n", | |
"\n", | |
"cd \"${ANNOTATED_DIR}/raw\"\n", | |
"echo \"${ANNOTATED_DIR}\"\n", | |
"cp *.png ../\n", | |
"cd \"${ANNOTATED_DIR}\"\n", | |
"rename 's/label_class_all_grayscale/image_color/' *.png\n" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "GTXPro0I-g73", | |
"colab_type": "text" | |
}, | |
"cell_type": "markdown", | |
"source": [ | |
"**Generate TFRecords for Tensorflow**" | |
] | |
}, | |
{ | |
"metadata": { | |
"id": "lzw_yRQc1vY6", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"%cd \"{DATASET_DIR}\"\n", | |
"!pwd" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "nxHolq7c1vZT", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"import build_data\n", | |
"\n", | |
"FLAGS = tf.app.flags.FLAGS\n", | |
"\n", | |
"####Delete all flags before declare#####\n", | |
"\n", | |
"def del_all_flags(FLAGS):\n", | |
" flags_dict = FLAGS._flags() \n", | |
" keys_list = [keys for keys in flags_dict] \n", | |
" for keys in keys_list:\n", | |
" FLAGS.__delattr__(keys)\n", | |
"\n", | |
"del_all_flags(tf.flags.FLAGS)\n", | |
"\n", | |
"tf.app.flags.DEFINE_string('image_folder',\n", | |
" IMAGE_DIR,\n", | |
" 'Folder containing images.')\n", | |
"\n", | |
"tf.app.flags.DEFINE_string(\n", | |
" 'semantic_segmentation_folder',\n", | |
" ANNOTATED_DIR,\n", | |
" 'Folder containing semantic segmentation annotations.')\n", | |
"\n", | |
"tf.app.flags.DEFINE_string(\n", | |
" 'list_folder',\n", | |
" LIST_DIR,\n", | |
" 'Folder containing lists for training and validation')\n", | |
"\n", | |
"tf.app.flags.DEFINE_string(\n", | |
" 'image_format',\n", | |
" \"png\",\n", | |
" 'Format of images.')\n", | |
"\n", | |
"tf.app.flags.DEFINE_string(\n", | |
" 'label_format',\n", | |
" \"png\",\n", | |
" 'Format of labels.')\n", | |
"\n", | |
"tf.app.flags.DEFINE_string(\n", | |
" 'output_dir',\n", | |
" TF_RECORD_DIR,\n", | |
" 'Path to save converted SSTable of TensorFlow examples.')\n", | |
"\n", | |
"\n", | |
"_NUM_SHARDS = 4\n", | |
"\n", | |
"\n", | |
"def _convert_dataset(dataset_split):\n", | |
" \"\"\"Converts the specified dataset split to TFRecord format.\n", | |
"\n", | |
" Args:\n", | |
" dataset_split: The dataset split (e.g., train, test).\n", | |
"\n", | |
" Raises:\n", | |
" RuntimeError: If loaded image and label have different shape.\n", | |
" \"\"\"\n", | |
" dataset = os.path.basename(dataset_split)[:-4]\n", | |
" sys.stdout.write('Processing ' + dataset)\n", | |
" filenames = [x.strip('\\n') for x in open(dataset_split, 'r')]\n", | |
" num_images = len(filenames)\n", | |
" num_per_shard = int(math.ceil(num_images / float(_NUM_SHARDS)))\n", | |
"\n", | |
" image_reader = build_data.ImageReader('png', channels=3)\n", | |
" label_reader = build_data.ImageReader('png', channels=1)\n", | |
"\n", | |
" for shard_id in range(_NUM_SHARDS):\n", | |
" output_filename = os.path.join(\n", | |
" FLAGS.output_dir,\n", | |
" '%s-%05d-of-%05d.tfrecord' % (dataset, shard_id, _NUM_SHARDS))\n", | |
" with tf.python_io.TFRecordWriter(output_filename) as tfrecord_writer:\n", | |
" start_idx = shard_id * num_per_shard\n", | |
" end_idx = min((shard_id + 1) * num_per_shard, num_images)\n", | |
" for i in range(start_idx, end_idx):\n", | |
" sys.stdout.write('\\r>> Converting image %d/%d shard %d' % (\n", | |
" i + 1, len(filenames), shard_id))\n", | |
" sys.stdout.flush()\n", | |
" # Read the image.\n", | |
" image_filename = os.path.join(\n", | |
" FLAGS.image_folder, filenames[i] + '.png') #+ FLAGS.image_format)\n", | |
" image_data = tf.gfile.FastGFile(image_filename, 'rb').read()\n", | |
" height, width = image_reader.read_image_dims(image_data)\n", | |
" # Read the semantic segmentation annotation.\n", | |
" seg_filename = os.path.join(\n", | |
" FLAGS.semantic_segmentation_folder,\n", | |
" filenames[i] + '.png') #+ FLAGS.label_format)\n", | |
" seg_data = tf.gfile.FastGFile(seg_filename, 'rb').read()\n", | |
" seg_height, seg_width = label_reader.read_image_dims(seg_data)\n", | |
" if height != seg_height or width != seg_width:\n", | |
" raise RuntimeError('Shape mismatched between image and label.')\n", | |
" # Convert to tf example.\n", | |
" example = build_data.image_seg_to_tfexample(\n", | |
" image_data, filenames[i], height, width, seg_data)\n", | |
" tfrecord_writer.write(example.SerializeToString())\n", | |
" sys.stdout.write('\\n')\n", | |
" sys.stdout.flush()\n", | |
"\n", | |
"\n", | |
"def main(unused_argv):\n", | |
" dataset_splits = tf.gfile.Glob(os.path.join(FLAGS.list_folder, '*.txt'))\n", | |
" for dataset_split in dataset_splits:\n", | |
" _convert_dataset(dataset_split)\n", | |
"\n", | |
"\n", | |
"if __name__ == '__main__':\n", | |
" tf.app.run()\n" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "Gdv4geJ3-m4b", | |
"colab_type": "text" | |
}, | |
"cell_type": "markdown", | |
"source": [ | |
"**Download PASCAL VOC Checkpoint for Exp. D**" | |
] | |
}, | |
{ | |
"metadata": { | |
"id": "rfzDtOg41vZo", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"%%bash -s \"$INIT_DIR\" \n", | |
"\n", | |
"INIT_DIR=$1\n", | |
"\n", | |
"TF_INIT_ROOT=\"http://download.tensorflow.org/models\"\n", | |
"TF_INIT_CKPT=\"deeplabv3_pascal_train_aug_2018_01_04.tar.gz\"\n", | |
"cd \"${INIT_DIR}\"\n", | |
"pwd\n", | |
"wget -nd -c \"${TF_INIT_ROOT}/${TF_INIT_CKPT}\"\n", | |
"tar -xf \"${TF_INIT_CKPT}\"" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "OLx7rbvu-0Gh", | |
"colab_type": "text" | |
}, | |
"cell_type": "markdown", | |
"source": [ | |
"**Modify Segmentation Dataset Config**" | |
] | |
}, | |
{ | |
"metadata": { | |
"id": "35KklVKf1vZ4", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"!pwd\n", | |
"%cd /content/models/research/" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "1jrAUwB71vap", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"#@title segmentation_dataset.py {display-mode: \"form\"}\n", | |
"\n", | |
"# This code will be hidden when the notebook is loaded.\n", | |
"SEG_DATA = \"\"\"import collections\n", | |
"import os.path\n", | |
"import tensorflow as tf\n", | |
"\n", | |
"slim = tf.contrib.slim\n", | |
"\n", | |
"dataset = slim.dataset\n", | |
"\n", | |
"tfexample_decoder = slim.tfexample_decoder\n", | |
"\n", | |
"\n", | |
"_ITEMS_TO_DESCRIPTIONS = {\n", | |
" 'image': 'A color image of varying height and width.',\n", | |
" 'labels_class': ('A semantic segmentation label whose size matches image.'\n", | |
" 'Its values range from 0 (background) to num_classes.'),\n", | |
"}\n", | |
"\n", | |
"# Named tuple to describe the dataset properties.\n", | |
"DatasetDescriptor = collections.namedtuple(\n", | |
" 'DatasetDescriptor',\n", | |
" ['splits_to_sizes', # Splits of the dataset into training, val, and test.\n", | |
" 'num_classes', # Number of semantic classes, including the background\n", | |
" # class (if exists). For example, there are 20\n", | |
" # foreground classes + 1 background class in the PASCAL\n", | |
" # VOC 2012 dataset. Thus, we set num_classes=21.\n", | |
" 'ignore_label', # Ignore label value.\n", | |
" ]\n", | |
")\n", | |
"\n", | |
"_CITYSCAPES_INFORMATION = DatasetDescriptor(\n", | |
" splits_to_sizes={\n", | |
" 'train': 2975,\n", | |
" 'val': 500,\n", | |
" },\n", | |
" num_classes=19,\n", | |
" ignore_label=0,\n", | |
")\n", | |
"\n", | |
"_PASCAL_VOC_SEG_INFORMATION = DatasetDescriptor(\n", | |
" splits_to_sizes={\n", | |
" 'train': 1464,\n", | |
" 'train_aug': 10582,\n", | |
" 'trainval': 2913,\n", | |
" 'val': 1449,\n", | |
" },\n", | |
" num_classes=21,\n", | |
" ignore_label=255,\n", | |
")\n", | |
"\n", | |
"_CAPSICUM_ANNUUM_INFORMATION = DatasetDescriptor(\n", | |
" splits_to_sizes={\n", | |
" 'train': 30,\n", | |
" 'trainval': 39,\n", | |
" 'val': 9,\n", | |
" },\n", | |
" num_classes=9,\n", | |
" ignore_label=255,\n", | |
")\n", | |
"# These number (i.e., 'train'/'test') seems to have to be hard coded\n", | |
"# You are required to figure it out for your training/testing example.\n", | |
"_ADE20K_INFORMATION = DatasetDescriptor(\n", | |
" splits_to_sizes={\n", | |
" 'train': 20210, # num of samples in images/training\n", | |
" 'val': 2000, # num of samples in images/validation\n", | |
" },\n", | |
" num_classes=151,\n", | |
" ignore_label=0,\n", | |
")\n", | |
"\n", | |
"\n", | |
"_DATASETS_INFORMATION = {\n", | |
" 'cityscapes': _CITYSCAPES_INFORMATION,\n", | |
" 'pascal_voc_seg': _PASCAL_VOC_SEG_INFORMATION,\n", | |
" 'ade20k': _ADE20K_INFORMATION,\n", | |
" 'capsicum_annuum': _CAPSICUM_ANNUUM_INFORMATION,\n", | |
"}\n", | |
"\n", | |
"# Default file pattern of TFRecord of TensorFlow Example.\n", | |
"_FILE_PATTERN = '%s-*'\n", | |
"\n", | |
"\n", | |
"def get_cityscapes_dataset_name():\n", | |
" return 'cityscapes'\n", | |
"\n", | |
"\n", | |
"def get_dataset(dataset_name, split_name, dataset_dir):\n", | |
" \n", | |
" if dataset_name not in _DATASETS_INFORMATION:\n", | |
" raise ValueError('The specified dataset is not supported yet.')\n", | |
"\n", | |
" splits_to_sizes = _DATASETS_INFORMATION[dataset_name].splits_to_sizes\n", | |
"\n", | |
" if split_name not in splits_to_sizes:\n", | |
" raise ValueError('data split name %s not recognized' % split_name)\n", | |
"\n", | |
" # Prepare the variables for different datasets.\n", | |
" num_classes = _DATASETS_INFORMATION[dataset_name].num_classes\n", | |
" ignore_label = _DATASETS_INFORMATION[dataset_name].ignore_label\n", | |
"\n", | |
" file_pattern = _FILE_PATTERN\n", | |
" file_pattern = os.path.join(dataset_dir, file_pattern % split_name)\n", | |
"\n", | |
" # Specify how the TF-Examples are decoded.\n", | |
" keys_to_features = {\n", | |
" 'image/encoded': tf.FixedLenFeature(\n", | |
" (), tf.string, default_value=''),\n", | |
" 'image/filename': tf.FixedLenFeature(\n", | |
" (), tf.string, default_value=''),\n", | |
" 'image/format': tf.FixedLenFeature(\n", | |
" (), tf.string, default_value='jpeg'),\n", | |
" 'image/height': tf.FixedLenFeature(\n", | |
" (), tf.int64, default_value=0),\n", | |
" 'image/width': tf.FixedLenFeature(\n", | |
" (), tf.int64, default_value=0),\n", | |
" 'image/segmentation/class/encoded': tf.FixedLenFeature(\n", | |
" (), tf.string, default_value=''),\n", | |
" 'image/segmentation/class/format': tf.FixedLenFeature(\n", | |
" (), tf.string, default_value='png'),\n", | |
" }\n", | |
" items_to_handlers = {\n", | |
" 'image': tfexample_decoder.Image(\n", | |
" image_key='image/encoded',\n", | |
" format_key='image/format',\n", | |
" channels=3),\n", | |
" 'image_name': tfexample_decoder.Tensor('image/filename'),\n", | |
" 'height': tfexample_decoder.Tensor('image/height'),\n", | |
" 'width': tfexample_decoder.Tensor('image/width'),\n", | |
" 'labels_class': tfexample_decoder.Image(\n", | |
" image_key='image/segmentation/class/encoded',\n", | |
" format_key='image/segmentation/class/format',\n", | |
" channels=1),\n", | |
" }\n", | |
"\n", | |
" decoder = tfexample_decoder.TFExampleDecoder(\n", | |
" keys_to_features, items_to_handlers)\n", | |
"\n", | |
" return dataset.Dataset(\n", | |
" data_sources=file_pattern,\n", | |
" reader=tf.TFRecordReader,\n", | |
" decoder=decoder,\n", | |
" num_samples=splits_to_sizes[split_name],\n", | |
" items_to_descriptions=_ITEMS_TO_DESCRIPTIONS,\n", | |
" ignore_label=ignore_label,\n", | |
" num_classes=num_classes,\n", | |
" name=dataset_name,\n", | |
" multi_label=True)\n", | |
"\"\"\"\n", | |
"with open(DATASET_DIR+'/segmentation_dataset.py', \"w\") as file:\n", | |
" file.write(SEG_DATA)" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "ZiPTvA2fbc4T", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"#@title train_utils.py {display-mode: \"form\"}\n", | |
"\n", | |
"# This code will be hidden when the notebook is loaded.\n", | |
"\n", | |
"TRAIN_UTILS = \"\"\"\n", | |
"import six\n", | |
"\n", | |
"import tensorflow as tf\n", | |
"from deeplab.core import preprocess_utils\n", | |
"\n", | |
"slim = tf.contrib.slim\n", | |
"\n", | |
"\n", | |
"def add_softmax_cross_entropy_loss_for_each_scale(scales_to_logits,\n", | |
" labels,\n", | |
" num_classes,\n", | |
" ignore_label,\n", | |
" loss_weight=1.0,\n", | |
" upsample_logits=True,\n", | |
" scope=None):\n", | |
" if labels is None:\n", | |
" raise ValueError('No label for softmax cross entropy loss.')\n", | |
"\n", | |
" for scale, logits in six.iteritems(scales_to_logits):\n", | |
" loss_scope = None\n", | |
" if scope:\n", | |
" loss_scope = '%s_%s' % (scope, scale)\n", | |
"\n", | |
" if upsample_logits:\n", | |
" # Label is not downsampled, and instead we upsample logits.\n", | |
" logits = tf.image.resize_bilinear(\n", | |
" logits,\n", | |
" preprocess_utils.resolve_shape(labels, 4)[1:3],\n", | |
" align_corners=True)\n", | |
" scaled_labels = labels\n", | |
" else:\n", | |
" # Label is downsampled to the same size as logits.\n", | |
" scaled_labels = tf.image.resize_nearest_neighbor(\n", | |
" labels,\n", | |
" preprocess_utils.resolve_shape(logits, 4)[1:3],\n", | |
" align_corners=True)\n", | |
"\n", | |
" scaled_labels = tf.reshape(scaled_labels, shape=[-1])\n", | |
" not_ignore_mask = tf.to_float(tf.not_equal(scaled_labels,\n", | |
" ignore_label)) * loss_weight\n", | |
" one_hot_labels = slim.one_hot_encoding(\n", | |
" scaled_labels, num_classes, on_value=1.0, off_value=0.0)\n", | |
" tf.losses.softmax_cross_entropy(\n", | |
" one_hot_labels,\n", | |
" tf.reshape(logits, shape=[-1, num_classes]),\n", | |
" weights=not_ignore_mask,\n", | |
" scope=loss_scope)\n", | |
"\n", | |
"\n", | |
"def get_model_init_fn(train_logdir,\n", | |
" tf_initial_checkpoint,\n", | |
" initialize_last_layer,\n", | |
" last_layers,\n", | |
" ignore_missing_vars=False):\n", | |
" \n", | |
" if tf_initial_checkpoint is None:\n", | |
" tf.logging.info('Not initializing the model from a checkpoint.')\n", | |
" return None\n", | |
"\n", | |
" if tf.train.latest_checkpoint(train_logdir):\n", | |
" tf.logging.info('Ignoring initialization; other checkpoint exists')\n", | |
" return None\n", | |
"\n", | |
" tf.logging.info('Initializing model from path: %s', tf_initial_checkpoint)\n", | |
"\n", | |
" # Variables that will not be restored.\n", | |
" exclude_list = ['global_step','logits']\n", | |
" if not initialize_last_layer:\n", | |
" exclude_list.extend(last_layers)\n", | |
"\n", | |
" variables_to_restore = slim.get_variables_to_restore(exclude=exclude_list)\n", | |
"\n", | |
" if variables_to_restore:\n", | |
" return slim.assign_from_checkpoint_fn(\n", | |
" tf_initial_checkpoint,\n", | |
" variables_to_restore,\n", | |
" ignore_missing_vars=ignore_missing_vars)\n", | |
" return None\n", | |
"\n", | |
"\n", | |
"def get_model_gradient_multipliers(last_layers, last_layer_gradient_multiplier):\n", | |
" gradient_multipliers = {}\n", | |
"\n", | |
" for var in slim.get_model_variables():\n", | |
" # Double the learning rate for biases.\n", | |
" if 'biases' in var.op.name:\n", | |
" gradient_multipliers[var.op.name] = 2.\n", | |
"\n", | |
" # Use larger learning rate for last layer variables.\n", | |
" for layer in last_layers:\n", | |
" if layer in var.op.name and 'biases' in var.op.name:\n", | |
" gradient_multipliers[var.op.name] = 2 * last_layer_gradient_multiplier\n", | |
" break\n", | |
" elif layer in var.op.name:\n", | |
" gradient_multipliers[var.op.name] = last_layer_gradient_multiplier\n", | |
" break\n", | |
"\n", | |
" return gradient_multipliers\n", | |
"\n", | |
"\n", | |
"def get_model_learning_rate(\n", | |
" learning_policy, base_learning_rate, learning_rate_decay_step,\n", | |
" learning_rate_decay_factor, training_number_of_steps, learning_power,\n", | |
" slow_start_step, slow_start_learning_rate):\n", | |
" \n", | |
" global_step = tf.train.get_or_create_global_step()\n", | |
" if learning_policy == 'step':\n", | |
" learning_rate = tf.train.exponential_decay(\n", | |
" base_learning_rate,\n", | |
" global_step,\n", | |
" learning_rate_decay_step,\n", | |
" learning_rate_decay_factor,\n", | |
" staircase=True)\n", | |
" elif learning_policy == 'poly':\n", | |
" learning_rate = tf.train.polynomial_decay(\n", | |
" base_learning_rate,\n", | |
" global_step,\n", | |
" training_number_of_steps,\n", | |
" end_learning_rate=0,\n", | |
" power=learning_power)\n", | |
" else:\n", | |
" raise ValueError('Unknown learning policy.')\n", | |
"\n", | |
" # Employ small learning rate at the first few steps for warm start.\n", | |
" return tf.where(global_step < slow_start_step, slow_start_learning_rate,\n", | |
" learning_rate)\n", | |
"\n", | |
"\"\"\"\n", | |
"with open(DEEPLAB_DIR+'/utils/train_utils.py', \"w\") as file:\n", | |
" file.write(TRAIN_UTILS)" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "EvS21exIegWc", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"#@title train.py {display-mode: \"form\"}\n", | |
"\n", | |
"# This code will be hidden when the notebook is loaded.\n", | |
"\n", | |
"TRAIN = \"\"\"\n", | |
"\n", | |
"import six\n", | |
"import tensorflow as tf\n", | |
"from deeplab import common\n", | |
"from deeplab import model\n", | |
"from deeplab.datasets import segmentation_dataset\n", | |
"from deeplab.utils import input_generator\n", | |
"from deeplab.utils import train_utils\n", | |
"from deployment import model_deploy\n", | |
"\n", | |
"slim = tf.contrib.slim\n", | |
"\n", | |
"prefetch_queue = slim.prefetch_queue\n", | |
"\n", | |
"flags = tf.app.flags\n", | |
"\n", | |
"FLAGS = flags.FLAGS\n", | |
"\n", | |
"# Settings for multi-GPUs/multi-replicas training.\n", | |
"\n", | |
"flags.DEFINE_integer('num_clones', 1, 'Number of clones to deploy.')\n", | |
"\n", | |
"flags.DEFINE_boolean('clone_on_cpu', False, 'Use CPUs to deploy clones.')\n", | |
"\n", | |
"flags.DEFINE_integer('num_replicas', 1, 'Number of worker replicas.')\n", | |
"\n", | |
"flags.DEFINE_integer('startup_delay_steps', 15,\n", | |
" 'Number of training steps between replicas startup.')\n", | |
"\n", | |
"flags.DEFINE_integer('num_ps_tasks', 0,\n", | |
" 'The number of parameter servers. If the value is 0, then '\n", | |
" 'the parameters are handled locally by the worker.')\n", | |
"\n", | |
"flags.DEFINE_string('master', '', 'BNS name of the tensorflow server')\n", | |
"\n", | |
"flags.DEFINE_integer('task', 0, 'The task ID.')\n", | |
"\n", | |
"# Settings for logging.\n", | |
"\n", | |
"flags.DEFINE_string('train_logdir', None,\n", | |
" 'Where the checkpoint and logs are stored.')\n", | |
"\n", | |
"flags.DEFINE_integer('log_steps', 10,\n", | |
" 'Display logging information at every log_steps.')\n", | |
"\n", | |
"flags.DEFINE_integer('save_interval_secs', 1200,\n", | |
" 'How often, in seconds, we save the model to disk.')\n", | |
"\n", | |
"flags.DEFINE_integer('save_summaries_secs', 600,\n", | |
" 'How often, in seconds, we compute the summaries.')\n", | |
"\n", | |
"flags.DEFINE_boolean('save_summaries_images', False,\n", | |
" 'Save sample inputs, labels, and semantic predictions as '\n", | |
" 'images to summary.')\n", | |
"\n", | |
"# Settings for training strategy.\n", | |
"\n", | |
"flags.DEFINE_enum('learning_policy', 'poly', ['poly', 'step'],\n", | |
" 'Learning rate policy for training.')\n", | |
"\n", | |
"# Use 0.007 when training on PASCAL augmented training set, train_aug. When\n", | |
"# fine-tuning on PASCAL trainval set, use learning rate=0.0001.\n", | |
"flags.DEFINE_float('base_learning_rate', .00005,\n", | |
" 'The base learning rate for model training.')\n", | |
"\n", | |
"flags.DEFINE_float('learning_rate_decay_factor', 0.1,\n", | |
" 'The rate to decay the base learning rate.')\n", | |
"\n", | |
"flags.DEFINE_integer('learning_rate_decay_step', 2000,\n", | |
" 'Decay the base learning rate at a fixed step.')\n", | |
"\n", | |
"flags.DEFINE_float('learning_power', 0.9,\n", | |
" 'The power value used in the poly learning policy.')\n", | |
"\n", | |
"flags.DEFINE_integer('training_number_of_steps', 30000,\n", | |
" 'The number of steps used for training')\n", | |
"\n", | |
"flags.DEFINE_float('momentum', 0.9, 'The momentum value to use')\n", | |
"\n", | |
"# When fine_tune_batch_norm=True, use at least batch size larger than 12\n", | |
"# (batch size more than 16 is better). Otherwise, one could use smaller batch\n", | |
"# size and set fine_tune_batch_norm=False.\n", | |
"flags.DEFINE_integer('train_batch_size', 10,\n", | |
" 'The number of images in each batch during training.')\n", | |
"\n", | |
"# For weight_decay, use 0.00004 for MobileNet-V2 or Xcpetion model variants.\n", | |
"# Use 0.0001 for ResNet model variants.\n", | |
"flags.DEFINE_float('weight_decay', 0.00004,\n", | |
" 'The value of the weight decay for training.')\n", | |
"\n", | |
"flags.DEFINE_multi_integer('train_crop_size', [513, 513],\n", | |
" 'Image crop size [height, width] during training.')\n", | |
"\n", | |
"flags.DEFINE_float('last_layer_gradient_multiplier', 1.0,\n", | |
" 'The gradient multiplier for last layers, which is used to '\n", | |
" 'boost the gradient of last layers if the value > 1.')\n", | |
"\n", | |
"flags.DEFINE_boolean('upsample_logits', True,\n", | |
" 'Upsample logits during training.')\n", | |
"\n", | |
"# Settings for fine-tuning the network.\n", | |
"\n", | |
"flags.DEFINE_string('tf_initial_checkpoint', None,\n", | |
" 'The initial checkpoint in tensorflow format.')\n", | |
"\n", | |
"# Set to False if one does not want to re-use the trained classifier weights.\n", | |
"flags.DEFINE_boolean('initialize_last_layer', False,\n", | |
" 'Initialize the last layer.')\n", | |
"\n", | |
"flags.DEFINE_boolean('last_layers_contain_logits_only', False,\n", | |
" 'Only consider logits as last layers or not.')\n", | |
"\n", | |
"flags.DEFINE_integer('slow_start_step', 0,\n", | |
" 'Training model with small learning rate for few steps.')\n", | |
"\n", | |
"flags.DEFINE_float('slow_start_learning_rate', 1e-4,\n", | |
" 'Learning rate employed during slow start.')\n", | |
"\n", | |
"# Set to True if one wants to fine-tune the batch norm parameters in DeepLabv3.\n", | |
"# Set to False and use small batch size to save GPU memory.\n", | |
"flags.DEFINE_boolean('fine_tune_batch_norm', False,\n", | |
" 'Fine tune the batch norm parameters or not.')\n", | |
"\n", | |
"flags.DEFINE_float('min_scale_factor', 0.5,\n", | |
" 'Mininum scale factor for data augmentation.')\n", | |
"\n", | |
"flags.DEFINE_float('max_scale_factor', 2.,\n", | |
" 'Maximum scale factor for data augmentation.')\n", | |
"\n", | |
"flags.DEFINE_float('scale_factor_step_size', 0.25,\n", | |
" 'Scale factor step size for data augmentation.')\n", | |
"\n", | |
"# For `xception_65`, use atrous_rates = [12, 24, 36] if output_stride = 8, or\n", | |
"# rates = [6, 12, 18] if output_stride = 16. For `mobilenet_v2`, use None. Note\n", | |
"# one could use different atrous_rates/output_stride during training/evaluation.\n", | |
"flags.DEFINE_multi_integer('atrous_rates', None,\n", | |
" 'Atrous rates for atrous spatial pyramid pooling.')\n", | |
"\n", | |
"flags.DEFINE_integer('output_stride', 16,\n", | |
" 'The ratio of input to output spatial resolution.')\n", | |
"\n", | |
"# Dataset settings.\n", | |
"flags.DEFINE_string('dataset', 'capsicum_annuum',\n", | |
" 'Name of the segmentation dataset.')\n", | |
"\n", | |
"flags.DEFINE_string('train_split', 'train',\n", | |
" 'Which split of the dataset to be used for training')\n", | |
"\n", | |
"flags.DEFINE_string('dataset_dir', None, 'Where the dataset reside.')\n", | |
"\n", | |
"\n", | |
"def _build_deeplab(inputs_queue, outputs_to_num_classes, ignore_label):\n", | |
"\n", | |
" samples = inputs_queue.dequeue()\n", | |
"\n", | |
" # Add name to input and label nodes so we can add to summary.\n", | |
" samples[common.IMAGE] = tf.identity(\n", | |
" samples[common.IMAGE], name=common.IMAGE)\n", | |
" samples[common.LABEL] = tf.identity(\n", | |
" samples[common.LABEL], name=common.LABEL)\n", | |
"\n", | |
" model_options = common.ModelOptions(\n", | |
" outputs_to_num_classes=outputs_to_num_classes,\n", | |
" crop_size=FLAGS.train_crop_size,\n", | |
" atrous_rates=FLAGS.atrous_rates,\n", | |
" output_stride=FLAGS.output_stride)\n", | |
" outputs_to_scales_to_logits = model.multi_scale_logits(\n", | |
" samples[common.IMAGE],\n", | |
" model_options=model_options,\n", | |
" image_pyramid=FLAGS.image_pyramid,\n", | |
" weight_decay=FLAGS.weight_decay,\n", | |
" is_training=True,\n", | |
" fine_tune_batch_norm=FLAGS.fine_tune_batch_norm)\n", | |
"\n", | |
" # Add name to graph node so we can add to summary.\n", | |
" output_type_dict = outputs_to_scales_to_logits[common.OUTPUT_TYPE]\n", | |
" output_type_dict[model.MERGED_LOGITS_SCOPE] = tf.identity(\n", | |
" output_type_dict[model.MERGED_LOGITS_SCOPE],\n", | |
" name=common.OUTPUT_TYPE)\n", | |
"\n", | |
" for output, num_classes in six.iteritems(outputs_to_num_classes):\n", | |
" train_utils.add_softmax_cross_entropy_loss_for_each_scale(\n", | |
" outputs_to_scales_to_logits[output],\n", | |
" samples[common.LABEL],\n", | |
" num_classes,\n", | |
" ignore_label,\n", | |
" loss_weight=1.0,\n", | |
" upsample_logits=FLAGS.upsample_logits,\n", | |
" scope=output)\n", | |
"\n", | |
" return outputs_to_scales_to_logits\n", | |
"\n", | |
"\n", | |
"def main(unused_argv):\n", | |
" tf.logging.set_verbosity(tf.logging.INFO)\n", | |
" # Set up deployment (i.e., multi-GPUs and/or multi-replicas).\n", | |
" config = model_deploy.DeploymentConfig(\n", | |
" num_clones=FLAGS.num_clones,\n", | |
" clone_on_cpu=FLAGS.clone_on_cpu,\n", | |
" replica_id=FLAGS.task,\n", | |
" num_replicas=FLAGS.num_replicas,\n", | |
" num_ps_tasks=FLAGS.num_ps_tasks)\n", | |
"\n", | |
" # Split the batch across GPUs.\n", | |
" assert FLAGS.train_batch_size % config.num_clones == 0, (\n", | |
" 'Training batch size not divisble by number of clones (GPUs).')\n", | |
"\n", | |
" clone_batch_size = FLAGS.train_batch_size // config.num_clones\n", | |
"\n", | |
" # Get dataset-dependent information.\n", | |
" dataset = segmentation_dataset.get_dataset(\n", | |
" FLAGS.dataset, FLAGS.train_split, dataset_dir=FLAGS.dataset_dir)\n", | |
"\n", | |
" tf.gfile.MakeDirs(FLAGS.train_logdir)\n", | |
" tf.logging.info('Training on %s set', FLAGS.train_split)\n", | |
"\n", | |
" with tf.Graph().as_default() as graph:\n", | |
" with tf.device(config.inputs_device()):\n", | |
" samples = input_generator.get(\n", | |
" dataset,\n", | |
" FLAGS.train_crop_size,\n", | |
" clone_batch_size,\n", | |
" min_resize_value=FLAGS.min_resize_value,\n", | |
" max_resize_value=FLAGS.max_resize_value,\n", | |
" resize_factor=FLAGS.resize_factor,\n", | |
" min_scale_factor=FLAGS.min_scale_factor,\n", | |
" max_scale_factor=FLAGS.max_scale_factor,\n", | |
" scale_factor_step_size=FLAGS.scale_factor_step_size,\n", | |
" dataset_split=FLAGS.train_split,\n", | |
" is_training=True,\n", | |
" model_variant=FLAGS.model_variant)\n", | |
" inputs_queue = prefetch_queue.prefetch_queue(\n", | |
" samples, capacity=128 * config.num_clones)\n", | |
"\n", | |
" # Create the global step on the device storing the variables.\n", | |
" with tf.device(config.variables_device()):\n", | |
" global_step = tf.train.get_or_create_global_step()\n", | |
"\n", | |
" # Define the model and create clones.\n", | |
" model_fn = _build_deeplab\n", | |
" model_args = (inputs_queue, {\n", | |
" common.OUTPUT_TYPE: dataset.num_classes\n", | |
" }, dataset.ignore_label)\n", | |
" clones = model_deploy.create_clones(config, model_fn, args=model_args)\n", | |
"\n", | |
" # Gather update_ops from the first clone. These contain, for example,\n", | |
" # the updates for the batch_norm variables created by model_fn.\n", | |
" first_clone_scope = config.clone_scope(0)\n", | |
" update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS, first_clone_scope)\n", | |
"\n", | |
" # Gather initial summaries.\n", | |
" summaries = set(tf.get_collection(tf.GraphKeys.SUMMARIES))\n", | |
"\n", | |
" # Add summaries for model variables.\n", | |
" for model_var in slim.get_model_variables():\n", | |
" summaries.add(tf.summary.histogram(model_var.op.name, model_var))\n", | |
"\n", | |
" # Add summaries for images, labels, semantic predictions\n", | |
" if FLAGS.save_summaries_images:\n", | |
" summary_image = graph.get_tensor_by_name(\n", | |
" ('%s/%s:0' % (first_clone_scope, common.IMAGE)).strip('/'))\n", | |
" summaries.add(\n", | |
" tf.summary.image('samples/%s' % common.IMAGE, summary_image))\n", | |
"\n", | |
" first_clone_label = graph.get_tensor_by_name(\n", | |
" ('%s/%s:0' % (first_clone_scope, common.LABEL)).strip('/'))\n", | |
" # Scale up summary image pixel values for better visualization.\n", | |
" pixel_scaling = max(1, 255 // dataset.num_classes)\n", | |
" summary_label = tf.cast(first_clone_label * pixel_scaling, tf.uint8)\n", | |
" summaries.add(\n", | |
" tf.summary.image('samples/%s' % common.LABEL, summary_label))\n", | |
"\n", | |
" first_clone_output = graph.get_tensor_by_name(\n", | |
" ('%s/%s:0' % (first_clone_scope, common.OUTPUT_TYPE)).strip('/'))\n", | |
" predictions = tf.expand_dims(tf.argmax(first_clone_output, 3), -1)\n", | |
"\n", | |
" summary_predictions = tf.cast(predictions * pixel_scaling, tf.uint8)\n", | |
" summaries.add(\n", | |
" tf.summary.image(\n", | |
" 'samples/%s' % common.OUTPUT_TYPE, summary_predictions))\n", | |
"\n", | |
" # Add summaries for losses.\n", | |
" for loss in tf.get_collection(tf.GraphKeys.LOSSES, first_clone_scope):\n", | |
" summaries.add(tf.summary.scalar('losses/%s' % loss.op.name, loss))\n", | |
"\n", | |
" # Build the optimizer based on the device specification.\n", | |
" with tf.device(config.optimizer_device()):\n", | |
" learning_rate = train_utils.get_model_learning_rate(\n", | |
" FLAGS.learning_policy, FLAGS.base_learning_rate,\n", | |
" FLAGS.learning_rate_decay_step, FLAGS.learning_rate_decay_factor,\n", | |
" FLAGS.training_number_of_steps, FLAGS.learning_power,\n", | |
" FLAGS.slow_start_step, FLAGS.slow_start_learning_rate)\n", | |
" optimizer = tf.train.MomentumOptimizer(learning_rate, FLAGS.momentum)\n", | |
" summaries.add(tf.summary.scalar('learning_rate', learning_rate))\n", | |
"\n", | |
" startup_delay_steps = FLAGS.task * FLAGS.startup_delay_steps\n", | |
" for variable in slim.get_model_variables():\n", | |
" summaries.add(tf.summary.histogram(variable.op.name, variable))\n", | |
"\n", | |
" with tf.device(config.variables_device()):\n", | |
" total_loss, grads_and_vars = model_deploy.optimize_clones(\n", | |
" clones, optimizer)\n", | |
" total_loss = tf.check_numerics(total_loss, 'Loss is inf or nan.')\n", | |
" summaries.add(tf.summary.scalar('total_loss', total_loss))\n", | |
"\n", | |
" # Modify the gradients for biases and last layer variables.\n", | |
" last_layers = model.get_extra_layer_scopes(\n", | |
" FLAGS.last_layers_contain_logits_only)\n", | |
" grad_mult = train_utils.get_model_gradient_multipliers(\n", | |
" last_layers, FLAGS.last_layer_gradient_multiplier)\n", | |
" if grad_mult:\n", | |
" grads_and_vars = slim.learning.multiply_gradients(\n", | |
" grads_and_vars, grad_mult)\n", | |
"\n", | |
" # Create gradient update op.\n", | |
" grad_updates = optimizer.apply_gradients(\n", | |
" grads_and_vars, global_step=global_step)\n", | |
" update_ops.append(grad_updates)\n", | |
" update_op = tf.group(*update_ops)\n", | |
" with tf.control_dependencies([update_op]):\n", | |
" train_tensor = tf.identity(total_loss, name='train_op')\n", | |
"\n", | |
" # Add the summaries from the first clone. These contain the summaries\n", | |
" # created by model_fn and either optimize_clones() or _gather_clone_loss().\n", | |
" summaries |= set(\n", | |
" tf.get_collection(tf.GraphKeys.SUMMARIES, first_clone_scope))\n", | |
"\n", | |
" # Merge all summaries together.\n", | |
" summary_op = tf.summary.merge(list(summaries))\n", | |
"\n", | |
" # Soft placement allows placing on CPU ops without GPU implementation.\n", | |
" session_config = tf.ConfigProto(\n", | |
" allow_soft_placement=True, log_device_placement=False)\n", | |
"\n", | |
" # Start the training.\n", | |
" slim.learning.train(\n", | |
" train_tensor,\n", | |
" logdir=FLAGS.train_logdir,\n", | |
" log_every_n_steps=FLAGS.log_steps,\n", | |
" master=FLAGS.master,\n", | |
" number_of_steps=FLAGS.training_number_of_steps,\n", | |
" is_chief=(FLAGS.task == 0),\n", | |
" session_config=session_config,\n", | |
" startup_delay_steps=startup_delay_steps,\n", | |
" init_fn=train_utils.get_model_init_fn(\n", | |
" FLAGS.train_logdir,\n", | |
" FLAGS.tf_initial_checkpoint,\n", | |
" FLAGS.initialize_last_layer,\n", | |
" last_layers,\n", | |
" ignore_missing_vars=True),\n", | |
" summary_op=summary_op,\n", | |
" save_summaries_secs=FLAGS.save_summaries_secs,\n", | |
" save_interval_secs=FLAGS.save_interval_secs)\n", | |
"\n", | |
"\n", | |
"if __name__ == '__main__':\n", | |
" flags.mark_flag_as_required('train_logdir')\n", | |
" flags.mark_flag_as_required('tf_initial_checkpoint')\n", | |
" flags.mark_flag_as_required('dataset_dir')\n", | |
" tf.app.run()\n", | |
"\"\"\"\n", | |
"with open(DEEPLAB_DIR+'/train.py', \"w\") as file:\n", | |
" file.write(TRAIN)\n" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "Jm1MbTLAfHQ8", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"#@title eval.py {display-mode: \"form\"}\n", | |
"\n", | |
"# This code will be hidden when the notebook is loaded.\n", | |
"\n", | |
"EVAL = \"\"\"\n", | |
"# Copyright 2018 The TensorFlow Authors All Rights Reserved.\n", | |
"#\n", | |
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n", | |
"# you may not use this file except in compliance with the License.\n", | |
"# You may obtain a copy of the License at\n", | |
"#\n", | |
"# http://www.apache.org/licenses/LICENSE-2.0\n", | |
"#\n", | |
"# Unless required by applicable law or agreed to in writing, software\n", | |
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n", | |
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", | |
"# See the License for the specific language governing permissions and\n", | |
"# limitations under the License.\n", | |
"# ==============================================================================\n", | |
"\n", | |
"\n", | |
"import math\n", | |
"import six\n", | |
"import tensorflow as tf\n", | |
"from deeplab import common\n", | |
"from deeplab import model\n", | |
"from deeplab.datasets import segmentation_dataset\n", | |
"from deeplab.utils import input_generator\n", | |
"\n", | |
"slim = tf.contrib.slim\n", | |
"\n", | |
"flags = tf.app.flags\n", | |
"\n", | |
"FLAGS = flags.FLAGS\n", | |
"\n", | |
"flags.DEFINE_string('master', '', 'BNS name of the tensorflow server')\n", | |
"\n", | |
"# Settings for log directories.\n", | |
"\n", | |
"flags.DEFINE_string('eval_logdir', None, 'Where to write the event logs.')\n", | |
"\n", | |
"flags.DEFINE_string('checkpoint_dir', None, 'Directory of model checkpoints.')\n", | |
"\n", | |
"# Settings for evaluating the model.\n", | |
"\n", | |
"flags.DEFINE_integer('eval_batch_size', 1,\n", | |
" 'The number of images in each batch during evaluation.')\n", | |
"\n", | |
"flags.DEFINE_multi_integer('eval_crop_size', [513, 513],\n", | |
" 'Image crop size [height, width] for evaluation.')\n", | |
"\n", | |
"flags.DEFINE_integer('eval_interval_secs', 60 * 5,\n", | |
" 'How often (in seconds) to run evaluation.')\n", | |
"\n", | |
"# For `xception_65`, use atrous_rates = [12, 24, 36] if output_stride = 8, or\n", | |
"# rates = [6, 12, 18] if output_stride = 16. For `mobilenet_v2`, use None. Note\n", | |
"# one could use different atrous_rates/output_stride during training/evaluation.\n", | |
"flags.DEFINE_multi_integer('atrous_rates', None,\n", | |
" 'Atrous rates for atrous spatial pyramid pooling.')\n", | |
"\n", | |
"flags.DEFINE_integer('output_stride', 16,\n", | |
" 'The ratio of input to output spatial resolution.')\n", | |
"\n", | |
"# Change to [0.5, 0.75, 1.0, 1.25, 1.5, 1.75] for multi-scale test.\n", | |
"flags.DEFINE_multi_float('eval_scales', [1.0],\n", | |
" 'The scales to resize images for evaluation.')\n", | |
"\n", | |
"# Change to True for adding flipped images during test.\n", | |
"flags.DEFINE_bool('add_flipped_images', False,\n", | |
" 'Add flipped images for evaluation or not.')\n", | |
"\n", | |
"# Dataset settings.\n", | |
"\n", | |
"flags.DEFINE_string('dataset', 'capsicum_annuum',\n", | |
" 'Name of the segmentation dataset.')\n", | |
"\n", | |
"flags.DEFINE_string('eval_split', 'val',\n", | |
" 'Which split of the dataset used for evaluation')\n", | |
"\n", | |
"flags.DEFINE_string('dataset_dir', None, 'Where the dataset reside.')\n", | |
"\n", | |
"flags.DEFINE_integer('max_number_of_evaluations', 0,\n", | |
" 'Maximum number of eval iterations. Will loop '\n", | |
" 'indefinitely upon nonpositive values.')\n", | |
"\n", | |
"\n", | |
"def main(unused_argv):\n", | |
" tf.logging.set_verbosity(tf.logging.INFO)\n", | |
" # Get dataset-dependent information.\n", | |
" dataset = segmentation_dataset.get_dataset(\n", | |
" FLAGS.dataset, FLAGS.eval_split, dataset_dir=FLAGS.dataset_dir)\n", | |
"\n", | |
" tf.gfile.MakeDirs(FLAGS.eval_logdir)\n", | |
" tf.logging.info('Evaluating on %s set', FLAGS.eval_split)\n", | |
"\n", | |
" with tf.Graph().as_default():\n", | |
" samples = input_generator.get(\n", | |
" dataset,\n", | |
" FLAGS.eval_crop_size,\n", | |
" FLAGS.eval_batch_size,\n", | |
" min_resize_value=FLAGS.min_resize_value,\n", | |
" max_resize_value=FLAGS.max_resize_value,\n", | |
" resize_factor=FLAGS.resize_factor,\n", | |
" dataset_split=FLAGS.eval_split,\n", | |
" is_training=False,\n", | |
" model_variant=FLAGS.model_variant)\n", | |
"\n", | |
" model_options = common.ModelOptions(\n", | |
" outputs_to_num_classes={common.OUTPUT_TYPE: dataset.num_classes},\n", | |
" crop_size=FLAGS.eval_crop_size,\n", | |
" atrous_rates=FLAGS.atrous_rates,\n", | |
" output_stride=FLAGS.output_stride)\n", | |
"\n", | |
" if tuple(FLAGS.eval_scales) == (1.0,):\n", | |
" tf.logging.info('Performing single-scale test.')\n", | |
" predictions = model.predict_labels(samples[common.IMAGE], model_options,\n", | |
" image_pyramid=FLAGS.image_pyramid)\n", | |
" else:\n", | |
" tf.logging.info('Performing multi-scale test.')\n", | |
" predictions = model.predict_labels_multi_scale(\n", | |
" samples[common.IMAGE],\n", | |
" model_options=model_options,\n", | |
" eval_scales=FLAGS.eval_scales,\n", | |
" add_flipped_images=FLAGS.add_flipped_images)\n", | |
" predictions = predictions[common.OUTPUT_TYPE]\n", | |
" predictions = tf.reshape(predictions, shape=[-1])\n", | |
" labels = tf.reshape(samples[common.LABEL], shape=[-1])\n", | |
" weights = tf.to_float(tf.not_equal(labels, dataset.ignore_label))\n", | |
"\n", | |
" # Set ignore_label regions to label 0, because metrics.mean_iou requires\n", | |
" # range of labels = [0, dataset.num_classes). Note the ignore_label regions\n", | |
" # are not evaluated since the corresponding regions contain weights = 0.\n", | |
" labels = tf.where(\n", | |
" tf.equal(labels, dataset.ignore_label), tf.zeros_like(labels), labels)\n", | |
"\n", | |
" predictions_tag = 'miou'\n", | |
" for eval_scale in FLAGS.eval_scales:\n", | |
" predictions_tag += '_' + str(eval_scale)\n", | |
" if FLAGS.add_flipped_images:\n", | |
" predictions_tag += '_flipped'\n", | |
"\n", | |
" # Define the evaluation metric.\n", | |
" metric_map = {}\n", | |
" metric_map[predictions_tag] = tf.metrics.mean_iou(\n", | |
" predictions, labels, dataset.num_classes, weights=weights)\n", | |
"\n", | |
" metrics_to_values, metrics_to_updates = (\n", | |
" tf.contrib.metrics.aggregate_metric_map(metric_map))\n", | |
"\n", | |
" for metric_name, metric_value in six.iteritems(metrics_to_values):\n", | |
" slim.summaries.add_scalar_summary(\n", | |
" metric_value, metric_name, print_summary=True)\n", | |
"\n", | |
" num_batches = int(\n", | |
" math.ceil(dataset.num_samples / float(FLAGS.eval_batch_size)))\n", | |
"\n", | |
" tf.logging.info('Eval num images %d', dataset.num_samples)\n", | |
" tf.logging.info('Eval batch size %d and num batch %d',\n", | |
" FLAGS.eval_batch_size, num_batches)\n", | |
"\n", | |
" num_eval_iters = None\n", | |
" if FLAGS.max_number_of_evaluations > 0:\n", | |
" num_eval_iters = FLAGS.max_number_of_evaluations\n", | |
" slim.evaluation.evaluation_loop(\n", | |
" master=FLAGS.master,\n", | |
" checkpoint_dir=FLAGS.checkpoint_dir,\n", | |
" logdir=FLAGS.eval_logdir,\n", | |
" num_evals=num_batches,\n", | |
" eval_op=list(metrics_to_updates.values()),\n", | |
" max_number_of_evaluations=num_eval_iters,\n", | |
" eval_interval_secs=FLAGS.eval_interval_secs)\n", | |
"\n", | |
"\n", | |
"if __name__ == '__main__':\n", | |
" flags.mark_flag_as_required('checkpoint_dir')\n", | |
" flags.mark_flag_as_required('eval_logdir')\n", | |
" flags.mark_flag_as_required('dataset_dir')\n", | |
" tf.app.run()\n", | |
"\"\"\"\n", | |
"with open(DEEPLAB_DIR+'/eval.py', \"w\") as file:\n", | |
" file.write(EVAL)" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "qgre2l6dfZF2", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"#@title vis.py {display-mode: \"form\"}\n", | |
"\n", | |
"# This code will be hidden when the notebook is loaded.\n", | |
"\n", | |
"VIS = \"\"\"\n", | |
"# Copyright 2018 The TensorFlow Authors All Rights Reserved.\n", | |
"#\n", | |
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n", | |
"# you may not use this file except in compliance with the License.\n", | |
"# You may obtain a copy of the License at\n", | |
"#\n", | |
"# http://www.apache.org/licenses/LICENSE-2.0\n", | |
"#\n", | |
"# Unless required by applicable law or agreed to in writing, software\n", | |
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n", | |
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", | |
"# See the License for the specific language governing permissions and\n", | |
"# limitations under the License.\n", | |
"# ==============================================================================\n", | |
"\n", | |
"\n", | |
"import math\n", | |
"import os.path\n", | |
"import time\n", | |
"import numpy as np\n", | |
"import tensorflow as tf\n", | |
"from deeplab import common\n", | |
"from deeplab import model\n", | |
"from deeplab.datasets import segmentation_dataset\n", | |
"from deeplab.utils import input_generator\n", | |
"from deeplab.utils import save_annotation\n", | |
"\n", | |
"slim = tf.contrib.slim\n", | |
"\n", | |
"flags = tf.app.flags\n", | |
"\n", | |
"FLAGS = flags.FLAGS\n", | |
"\n", | |
"flags.DEFINE_string('master', '', 'BNS name of the tensorflow server')\n", | |
"\n", | |
"# Settings for log directories.\n", | |
"\n", | |
"flags.DEFINE_string('vis_logdir', None, 'Where to write the event logs.')\n", | |
"\n", | |
"flags.DEFINE_string('checkpoint_dir', None, 'Directory of model checkpoints.')\n", | |
"\n", | |
"# Settings for visualizing the model.\n", | |
"\n", | |
"flags.DEFINE_integer('vis_batch_size', 1,\n", | |
" 'The number of images in each batch during evaluation.')\n", | |
"\n", | |
"flags.DEFINE_multi_integer('vis_crop_size', [513, 513],\n", | |
" 'Crop size [height, width] for visualization.')\n", | |
"\n", | |
"flags.DEFINE_integer('eval_interval_secs', 60 * 5,\n", | |
" 'How often (in seconds) to run evaluation.')\n", | |
"\n", | |
"# For `xception_65`, use atrous_rates = [12, 24, 36] if output_stride = 8, or\n", | |
"# rates = [6, 12, 18] if output_stride = 16. For `mobilenet_v2`, use None. Note\n", | |
"# one could use different atrous_rates/output_stride during training/evaluation.\n", | |
"flags.DEFINE_multi_integer('atrous_rates', None,\n", | |
" 'Atrous rates for atrous spatial pyramid pooling.')\n", | |
"\n", | |
"flags.DEFINE_integer('output_stride', 16,\n", | |
" 'The ratio of input to output spatial resolution.')\n", | |
"\n", | |
"# Change to [0.5, 0.75, 1.0, 1.25, 1.5, 1.75] for multi-scale test.\n", | |
"flags.DEFINE_multi_float('eval_scales', [1.0],\n", | |
" 'The scales to resize images for evaluation.')\n", | |
"\n", | |
"# Change to True for adding flipped images during test.\n", | |
"flags.DEFINE_bool('add_flipped_images', False,\n", | |
" 'Add flipped images for evaluation or not.')\n", | |
"\n", | |
"# Dataset settings.\n", | |
"\n", | |
"flags.DEFINE_string('dataset', 'capsicum_annuum',\n", | |
" 'Name of the segmentation dataset.')\n", | |
"\n", | |
"flags.DEFINE_string('vis_split', 'val',\n", | |
" 'Which split of the dataset used for visualizing results')\n", | |
"\n", | |
"flags.DEFINE_string('dataset_dir', None, 'Where the dataset reside.')\n", | |
"\n", | |
"flags.DEFINE_enum('colormap_type', 'pascal', ['pascal', 'cityscapes'],\n", | |
" 'Visualization colormap type.')\n", | |
"\n", | |
"flags.DEFINE_boolean('also_save_raw_predictions', False,\n", | |
" 'Also save raw predictions.')\n", | |
"\n", | |
"flags.DEFINE_integer('max_number_of_iterations', 0,\n", | |
" 'Maximum number of visualization iterations. Will loop '\n", | |
" 'indefinitely upon nonpositive values.')\n", | |
"\n", | |
"# The folder where semantic segmentation predictions are saved.\n", | |
"_SEMANTIC_PREDICTION_SAVE_FOLDER = 'segmentation_results'\n", | |
"\n", | |
"# The folder where raw semantic segmentation predictions are saved.\n", | |
"_RAW_SEMANTIC_PREDICTION_SAVE_FOLDER = 'raw_segmentation_results'\n", | |
"\n", | |
"# The format to save image.\n", | |
"_IMAGE_FORMAT = '%06d_image'\n", | |
"\n", | |
"# The format to save prediction\n", | |
"_PREDICTION_FORMAT = '%06d_prediction'\n", | |
"\n", | |
"# To evaluate Cityscapes results on the evaluation server, the labels used\n", | |
"# during training should be mapped to the labels for evaluation.\n", | |
"_CITYSCAPES_TRAIN_ID_TO_EVAL_ID = [7, 8, 11, 12, 13, 17, 19, 20, 21, 22,\n", | |
" 23, 24, 25, 26, 27, 28, 31, 32, 33]\n", | |
"\n", | |
"\n", | |
"def _convert_train_id_to_eval_id(prediction, train_id_to_eval_id):\n", | |
"\n", | |
" converted_prediction = prediction.copy()\n", | |
" for train_id, eval_id in enumerate(train_id_to_eval_id):\n", | |
" converted_prediction[prediction == train_id] = eval_id\n", | |
"\n", | |
" return converted_prediction\n", | |
"\n", | |
"\n", | |
"def _process_batch(sess, original_images, semantic_predictions, image_names,\n", | |
" image_heights, image_widths, image_id_offset, save_dir,\n", | |
" raw_save_dir, train_id_to_eval_id=None):\n", | |
" \n", | |
" (original_images,\n", | |
" semantic_predictions,\n", | |
" image_names,\n", | |
" image_heights,\n", | |
" image_widths) = sess.run([original_images, semantic_predictions,\n", | |
" image_names, image_heights, image_widths])\n", | |
"\n", | |
" num_image = semantic_predictions.shape[0]\n", | |
" for i in range(num_image):\n", | |
" image_height = np.squeeze(image_heights[i])\n", | |
" image_width = np.squeeze(image_widths[i])\n", | |
" original_image = np.squeeze(original_images[i])\n", | |
" semantic_prediction = np.squeeze(semantic_predictions[i])\n", | |
" crop_semantic_prediction = semantic_prediction[:image_height, :image_width]\n", | |
"\n", | |
" # Save image.\n", | |
" save_annotation.save_annotation(\n", | |
" original_image, save_dir, _IMAGE_FORMAT % (image_id_offset + i),\n", | |
" add_colormap=False)\n", | |
"\n", | |
" # Save prediction.\n", | |
" save_annotation.save_annotation(\n", | |
" crop_semantic_prediction, save_dir,\n", | |
" _PREDICTION_FORMAT % (image_id_offset + i), add_colormap=True,\n", | |
" colormap_type=FLAGS.colormap_type)\n", | |
"\n", | |
" if FLAGS.also_save_raw_predictions:\n", | |
" image_filename = os.path.basename(image_names[i])\n", | |
"\n", | |
" if train_id_to_eval_id is not None:\n", | |
" crop_semantic_prediction = _convert_train_id_to_eval_id(\n", | |
" crop_semantic_prediction,\n", | |
" train_id_to_eval_id)\n", | |
" save_annotation.save_annotation(\n", | |
" crop_semantic_prediction, raw_save_dir, image_filename,\n", | |
" add_colormap=False)\n", | |
"\n", | |
"\n", | |
"def main(unused_argv):\n", | |
" tf.logging.set_verbosity(tf.logging.INFO)\n", | |
" # Get dataset-dependent information.\n", | |
" dataset = segmentation_dataset.get_dataset(\n", | |
" FLAGS.dataset, FLAGS.vis_split, dataset_dir=FLAGS.dataset_dir)\n", | |
" train_id_to_eval_id = None\n", | |
" if dataset.name == segmentation_dataset.get_cityscapes_dataset_name():\n", | |
" tf.logging.info('Cityscapes requires converting train_id to eval_id.')\n", | |
" train_id_to_eval_id = _CITYSCAPES_TRAIN_ID_TO_EVAL_ID\n", | |
"\n", | |
" # Prepare for visualization.\n", | |
" tf.gfile.MakeDirs(FLAGS.vis_logdir)\n", | |
" save_dir = os.path.join(FLAGS.vis_logdir, _SEMANTIC_PREDICTION_SAVE_FOLDER)\n", | |
" tf.gfile.MakeDirs(save_dir)\n", | |
" raw_save_dir = os.path.join(\n", | |
" FLAGS.vis_logdir, _RAW_SEMANTIC_PREDICTION_SAVE_FOLDER)\n", | |
" tf.gfile.MakeDirs(raw_save_dir)\n", | |
"\n", | |
" tf.logging.info('Visualizing on %s set', FLAGS.vis_split)\n", | |
"\n", | |
" g = tf.Graph()\n", | |
" with g.as_default():\n", | |
" samples = input_generator.get(dataset,\n", | |
" FLAGS.vis_crop_size,\n", | |
" FLAGS.vis_batch_size,\n", | |
" min_resize_value=FLAGS.min_resize_value,\n", | |
" max_resize_value=FLAGS.max_resize_value,\n", | |
" resize_factor=FLAGS.resize_factor,\n", | |
" dataset_split=FLAGS.vis_split,\n", | |
" is_training=False,\n", | |
" model_variant=FLAGS.model_variant)\n", | |
"\n", | |
" model_options = common.ModelOptions(\n", | |
" outputs_to_num_classes={common.OUTPUT_TYPE: dataset.num_classes},\n", | |
" crop_size=FLAGS.vis_crop_size,\n", | |
" atrous_rates=FLAGS.atrous_rates,\n", | |
" output_stride=FLAGS.output_stride)\n", | |
"\n", | |
" if tuple(FLAGS.eval_scales) == (1.0,):\n", | |
" tf.logging.info('Performing single-scale test.')\n", | |
" predictions = model.predict_labels(\n", | |
" samples[common.IMAGE],\n", | |
" model_options=model_options,\n", | |
" image_pyramid=FLAGS.image_pyramid)\n", | |
" else:\n", | |
" tf.logging.info('Performing multi-scale test.')\n", | |
" predictions = model.predict_labels_multi_scale(\n", | |
" samples[common.IMAGE],\n", | |
" model_options=model_options,\n", | |
" eval_scales=FLAGS.eval_scales,\n", | |
" add_flipped_images=FLAGS.add_flipped_images)\n", | |
" predictions = predictions[common.OUTPUT_TYPE]\n", | |
"\n", | |
" if FLAGS.min_resize_value and FLAGS.max_resize_value:\n", | |
" # Only support batch_size = 1, since we assume the dimensions of original\n", | |
" # image after tf.squeeze is [height, width, 3].\n", | |
" assert FLAGS.vis_batch_size == 1\n", | |
"\n", | |
" # Reverse the resizing and padding operations performed in preprocessing.\n", | |
" # First, we slice the valid regions (i.e., remove padded region) and then\n", | |
" # we reisze the predictions back.\n", | |
" original_image = tf.squeeze(samples[common.ORIGINAL_IMAGE])\n", | |
" original_image_shape = tf.shape(original_image)\n", | |
" predictions = tf.slice(\n", | |
" predictions,\n", | |
" [0, 0, 0],\n", | |
" [1, original_image_shape[0], original_image_shape[1]])\n", | |
" resized_shape = tf.to_int32([tf.squeeze(samples[common.HEIGHT]),\n", | |
" tf.squeeze(samples[common.WIDTH])])\n", | |
" predictions = tf.squeeze(\n", | |
" tf.image.resize_images(tf.expand_dims(predictions, 3),\n", | |
" resized_shape,\n", | |
" method=tf.image.ResizeMethod.NEAREST_NEIGHBOR,\n", | |
" align_corners=True), 3)\n", | |
"\n", | |
" tf.train.get_or_create_global_step()\n", | |
" saver = tf.train.Saver(slim.get_variables_to_restore())\n", | |
" sv = tf.train.Supervisor(graph=g,\n", | |
" logdir=FLAGS.vis_logdir,\n", | |
" init_op=tf.global_variables_initializer(),\n", | |
" summary_op=None,\n", | |
" summary_writer=None,\n", | |
" global_step=None,\n", | |
" saver=saver)\n", | |
" num_batches = int(math.ceil(\n", | |
" dataset.num_samples / float(FLAGS.vis_batch_size)))\n", | |
" last_checkpoint = None\n", | |
"\n", | |
" # Loop to visualize the results when new checkpoint is created.\n", | |
" num_iters = 0\n", | |
" while (FLAGS.max_number_of_iterations <= 0 or\n", | |
" num_iters < FLAGS.max_number_of_iterations):\n", | |
" num_iters += 1\n", | |
" last_checkpoint = slim.evaluation.wait_for_new_checkpoint(\n", | |
" FLAGS.checkpoint_dir, last_checkpoint)\n", | |
" start = time.time()\n", | |
" tf.logging.info(\n", | |
" 'Starting visualization at ' + time.strftime('%Y-%m-%d-%H:%M:%S',\n", | |
" time.gmtime()))\n", | |
" tf.logging.info('Visualizing with model %s', last_checkpoint)\n", | |
"\n", | |
" with sv.managed_session(FLAGS.master,\n", | |
" start_standard_services=False) as sess:\n", | |
" sv.start_queue_runners(sess)\n", | |
" sv.saver.restore(sess, last_checkpoint)\n", | |
"\n", | |
" image_id_offset = 0\n", | |
" for batch in range(num_batches):\n", | |
" tf.logging.info('Visualizing batch %d / %d', batch + 1, num_batches)\n", | |
" _process_batch(sess=sess,\n", | |
" original_images=samples[common.ORIGINAL_IMAGE],\n", | |
" semantic_predictions=predictions,\n", | |
" image_names=samples[common.IMAGE_NAME],\n", | |
" image_heights=samples[common.HEIGHT],\n", | |
" image_widths=samples[common.WIDTH],\n", | |
" image_id_offset=image_id_offset,\n", | |
" save_dir=save_dir,\n", | |
" raw_save_dir=raw_save_dir,\n", | |
" train_id_to_eval_id=train_id_to_eval_id)\n", | |
" image_id_offset += FLAGS.vis_batch_size\n", | |
"\n", | |
" tf.logging.info(\n", | |
" 'Finished visualization at ' + time.strftime('%Y-%m-%d-%H:%M:%S',\n", | |
" time.gmtime()))\n", | |
" time_to_next_eval = start + FLAGS.eval_interval_secs - time.time()\n", | |
" if time_to_next_eval > 0:\n", | |
" time.sleep(time_to_next_eval)\n", | |
"\n", | |
"\n", | |
"if __name__ == '__main__':\n", | |
" flags.mark_flag_as_required('checkpoint_dir')\n", | |
" flags.mark_flag_as_required('vis_logdir')\n", | |
" flags.mark_flag_as_required('dataset_dir')\n", | |
" tf.app.run()\n", | |
"\"\"\"\n", | |
"with open(DEEPLAB_DIR+'/vis.py', \"w\") as file:\n", | |
" file.write(VIS)" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
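{ | |
"metadata": { | |
"id": "remap_sketch_md", | |
"colab_type": "text" | |
}, | |
"cell_type": "markdown", | |
"source": [ | |
"**Aside: the train-id to eval-id remap in `vis.py`.** A standalone toy example of `_convert_train_id_to_eval_id` with a hypothetical 3-entry map; the real `_CITYSCAPES_TRAIN_ID_TO_EVAL_ID` above has 19 entries and is only applied when the dataset is Cityscapes." | |
] | |
}, | |
{ | |
"metadata": { | |
"id": "remap_sketch_code", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"# Toy remap matching _convert_train_id_to_eval_id in vis.py; the 3-entry\n", | |
"# map here is hypothetical (Cityscapes uses 19 entries).\n", | |
"import numpy as np\n", | |
"\n", | |
"train_id_to_eval_id = [7, 8, 11]  # eval id for train ids 0, 1, 2\n", | |
"prediction = np.array([[0, 1],\n", | |
"                       [2, 0]])\n", | |
"\n", | |
"converted = prediction.copy()\n", | |
"for train_id, eval_id in enumerate(train_id_to_eval_id):\n", | |
"    converted[prediction == train_id] = eval_id\n", | |
"\n", | |
"print(converted)  # [[ 7  8] [11  7]]" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |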
{ | |
"metadata": { | |
"id": "XrQIL9Xc1vbC", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"%%bash -s \"$DEEPLAB_DIR\" \"$INIT_DIR\" \"$TRAIN_LOGDIR\" \"$TF_RECORD_DIR\" \"$RESEARCH_DIR\"\n", | |
"\n", | |
"DEEPLAB_DIR=$1\n", | |
"INIT_DIR=$2\n", | |
"TRAIN_LOGDIR=$3\n", | |
"TF_RECORD_DIR=$4\n", | |
"RESEARCH_DIR=$5\n", | |
"\n", | |
"cd \"${RESEARCH_DIR}\"\n", | |
"export PYTHONPATH=$PYTHONPATH:`pwd`/slim\n", | |
"\n", | |
"NUM_ITERATIONS=30000\n", | |
"python \"${DEEPLAB_DIR}\"/train.py \\\n", | |
" --logtostderr \\\n", | |
" --train_split=\"trainval\" \\\n", | |
" --model_variant=\"xception_65\" \\\n", | |
" --atrous_rates=6 \\\n", | |
" --atrous_rates=12 \\\n", | |
" --atrous_rates=18 \\\n", | |
" --output_stride=16 \\\n", | |
" --decoder_output_stride=4 \\\n", | |
" --train_crop_size=513 \\\n", | |
" --train_crop_size=513 \\\n", | |
" --train_batch_size=4 \\\n", | |
" --training_number_of_steps=\"${NUM_ITERATIONS}\" \\\n", | |
" --fine_tune_batch_norm=true \\\n", | |
" --tf_initial_checkpoint=\"${INIT_DIR}/deeplabv3_pascal_train_aug/model.ckpt\" \\\n", | |
" --train_logdir=\"${TRAIN_LOGDIR}\" \\\n", | |
"--dataset_dir=\"${TF_RECORD_DIR}\"" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "6HYm6gu__tA9", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"%%bash -s \"$DEEPLAB_DIR\" \"$EVAL_LOGDIR\" \"$TRAIN_LOGDIR\" \"$TF_RECORD_DIR\" \"$RESEARCH_DIR\"\n", | |
"\n", | |
"DEEPLAB_DIR=$1\n", | |
"EVAL_LOGDIR=$2\n", | |
"TRAIN_LOGDIR=$3\n", | |
"TF_RECORD_DIR=$4\n", | |
"RESEARCH_DIR=$5\n", | |
"\n", | |
"cd \"${RESEARCH_DIR}\"\n", | |
"export PYTHONPATH=$PYTHONPATH:`pwd`/slim\n", | |
"\n", | |
"python \"${DEEPLAB_DIR}\"/eval.py \\\n", | |
" --logtostderr \\\n", | |
" --eval_split=\"val\" \\\n", | |
" --model_variant=\"xception_65\" \\\n", | |
" --atrous_rates=6 \\\n", | |
" --atrous_rates=12 \\\n", | |
" --atrous_rates=18 \\\n", | |
" --output_stride=16 \\\n", | |
" --decoder_output_stride=4 \\\n", | |
" --eval_crop_size=600 \\\n", | |
" --eval_crop_size=800 \\\n", | |
" --checkpoint_dir=\"${TRAIN_LOGDIR}\" \\\n", | |
" --eval_logdir=\"${EVAL_LOGDIR}\" \\\n", | |
" --dataset_dir=\"${TF_RECORD_DIR}\" \\\n", | |
"--max_number_of_evaluations=1" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "3b1Ct8uzAfCL", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"%%bash -s \"$DEEPLAB_DIR\" \"$VIS_LOGDIR\" \"$TRAIN_LOGDIR\" \"$TF_RECORD_DIR\" \"$RESEARCH_DIR\"\n", | |
"\n", | |
"DEEPLAB_DIR=$1\n", | |
"VIS_LOGDIR=$2\n", | |
"TRAIN_LOGDIR=$3\n", | |
"TF_RECORD_DIR=$4\n", | |
"RESEARCH_DIR=$5\n", | |
"\n", | |
"cd \"${RESEARCH_DIR}\"\n", | |
"export PYTHONPATH=$PYTHONPATH:`pwd`/slim\n", | |
"\n", | |
"python \"${DEEPLAB_DIR}\"/vis.py \\\n", | |
" --logtostderr \\\n", | |
" --vis_split=\"val\" \\\n", | |
" --model_variant=\"xception_65\" \\\n", | |
" --atrous_rates=6 \\\n", | |
" --atrous_rates=12 \\\n", | |
" --atrous_rates=18 \\\n", | |
" --output_stride=16 \\\n", | |
" --decoder_output_stride=4 \\\n", | |
" --vis_crop_size=600 \\\n", | |
" --vis_crop_size=800 \\\n", | |
" --checkpoint_dir=\"${TRAIN_LOGDIR}\" \\\n", | |
" --vis_logdir=\"${VIS_LOGDIR}\" \\\n", | |
" --dataset_dir=\"${TF_RECORD_DIR}\" \\\n", | |
"--max_number_of_iterations=1" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "aTtZ0WaiRWi8", | |
"colab_type": "code", | |
"cellView": "both", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"!pwd\n", | |
"from google.colab import files\n", | |
"\n", | |
"\n", | |
"files.download('deeplab/datasets/capsicum_annuum/exp/d/vis/segmentation_results/000003_prediction.png')" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"metadata": { | |
"id": "sglLqSQBIux3", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"%%bash -s \"$DEEPLAB_DIR\" \"$EXPORT_DIR\" \"$TRAIN_LOGDIR\" \"$TF_RECORD_DIR\" \"$RESEARCH_DIR\"\n", | |
"\n", | |
"DEEPLAB_DIR=$1\n", | |
"EXPORT_DIR=$2\n", | |
"TRAIN_LOGDIR=$3\n", | |
"TF_RECORD_DIR=$4\n", | |
"RESEARCH_DIR=$5\n", | |
"\n", | |
"NUM_ITERATIONS=10\n", | |
"\n", | |
"cd \"${RESEARCH_DIR}\"\n", | |
"export PYTHONPATH=$PYTHONPATH:`pwd`/slim\n", | |
"\n", | |
"CKPT_PATH=\"${TRAIN_LOGDIR}/model.ckpt-${NUM_ITERATIONS}\"\n", | |
"EXPORT_PATH=\"${EXPORT_DIR}/frozen_inference_graph.pb\"\n", | |
"\n", | |
"python \"${DEEPLAB_DIR}\"/export_model.py \\\n", | |
" --logtostderr \\\n", | |
" --checkpoint_path=\"${CKPT_PATH}\" \\\n", | |
" --export_path=\"${EXPORT_PATH}\" \\\n", | |
" --model_variant=\"xception_65\" \\\n", | |
" --atrous_rates=6 \\\n", | |
" --atrous_rates=12 \\\n", | |
" --atrous_rates=18 \\\n", | |
" --output_stride=16 \\\n", | |
" --decoder_output_stride=4 \\\n", | |
" --num_classes=21 \\\n", | |
" --crop_size=513 \\\n", | |
" --crop_size=513 \\\n", | |
"--inference_scales=1.0" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
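{ | |
"metadata": { | |
"id": "frozen_graph_md", | |
"colab_type": "text" | |
}, | |
"cell_type": "markdown", | |
"source": [ | |
"**Aside: quick inference with the exported frozen graph.** A minimal sketch, assuming `EXPORT_DIR` from earlier in this notebook and a hypothetical `example.png`; `ImageTensor:0` and `SemanticPredictions:0` are the input/output tensor names used by `deeplab/export_model.py`." | |
] | |
}, | |
{ | |
"metadata": { | |
"id": "frozen_graph_code", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"cell_type": "code", | |
"source": [ | |
"# Sketch: run the frozen graph exported above on one image.\n", | |
"# 'example.png' is a placeholder path; EXPORT_DIR is defined earlier in the\n", | |
"# notebook. Tensor names come from deeplab/export_model.py.\n", | |
"import numpy as np\n", | |
"import tensorflow as tf\n", | |
"from PIL import Image\n", | |
"\n", | |
"graph_def = tf.GraphDef()\n", | |
"with tf.gfile.GFile(EXPORT_DIR + '/frozen_inference_graph.pb', 'rb') as f:\n", | |
"    graph_def.ParseFromString(f.read())\n", | |
"\n", | |
"graph = tf.Graph()\n", | |
"with graph.as_default():\n", | |
"    tf.import_graph_def(graph_def, name='')\n", | |
"\n", | |
"# The exported graph expects a uint8 batch of shape [1, height, width, 3].\n", | |
"image = np.asarray(Image.open('example.png').convert('RGB'))\n", | |
"with tf.Session(graph=graph) as sess:\n", | |
"    seg_map = sess.run('SemanticPredictions:0',\n", | |
"                       feed_dict={'ImageTensor:0': image[np.newaxis, ...]})\n", | |
"print(seg_map.shape)  # (1, height, width) of integer class ids" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
} | |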
] | |
} |