Finding similarities with a neural network trained for object classification.
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First we load the network and check that fc7 has been separated into its own blobs, so that the in-place ReLU pass does not overwrite our input."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "cellView": "both",
    "colab_type": "code",
    "collapsed": false,
    "id": "i9hkSm1IOZNR"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "data: 1505280 = (10, 3, 224, 224)\n",
      "conv1_1: 32112640 = (10, 64, 224, 224)\n",
      "conv1_2: 32112640 = (10, 64, 224, 224)\n",
      "pool1: 8028160 = (10, 64, 112, 112)\n",
      "conv2_1: 16056320 = (10, 128, 112, 112)\n",
      "conv2_2: 16056320 = (10, 128, 112, 112)\n",
      "pool2: 4014080 = (10, 128, 56, 56)\n",
      "conv3_1: 8028160 = (10, 256, 56, 56)\n",
      "conv3_2: 8028160 = (10, 256, 56, 56)\n",
      "conv3_3: 8028160 = (10, 256, 56, 56)\n",
      "conv3_4: 8028160 = (10, 256, 56, 56)\n",
      "pool3: 2007040 = (10, 256, 28, 28)\n",
      "conv4_1: 4014080 = (10, 512, 28, 28)\n",
      "conv4_2: 4014080 = (10, 512, 28, 28)\n",
      "conv4_3: 4014080 = (10, 512, 28, 28)\n",
      "conv4_4: 4014080 = (10, 512, 28, 28)\n",
      "pool4: 1003520 = (10, 512, 14, 14)\n",
      "conv5_1: 1003520 = (10, 512, 14, 14)\n",
      "conv5_2: 1003520 = (10, 512, 14, 14)\n",
      "conv5_3: 1003520 = (10, 512, 14, 14)\n",
      "conv5_4: 1003520 = (10, 512, 14, 14)\n",
      "pool5: 250880 = (10, 512, 7, 7)\n",
      "fc6: 40960 = (10, 4096)\n",
      "fc6_fc6_0_split_0: 40960 = (10, 4096)\n",
      "fc6_fc6_0_split_1: 40960 = (10, 4096)\n",
      "fc6_fc6_0_split_2: 40960 = (10, 4096)\n",
      "relu6: 40960 = (10, 4096)\n",
      "drop6: 40960 = (10, 4096)\n",
      "fc7: 40960 = (10, 4096)\n",
      "relu7: 40960 = (10, 4096)\n",
      "drop7: 40960 = (10, 4096)\n",
      "fc8: 10000 = (10, 1000)\n",
      "prob: 10000 = (10, 1000)\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "from google.protobuf import text_format\n",
    "import caffe\n",
    "\n",
    "# load the network\n",
    "# the prototxt has force_backward: true, and fc7 is separated into multiple blobs\n",
    "model_name = 'vgg_ilsvrc_19'\n",
    "model_path = '../caffe/models/' + model_name + '/'\n",
    "net_fn = model_path + 'deploy-expanded.prototxt'\n",
    "param_fn = model_path + 'net.caffemodel'\n",
    "net = caffe.Classifier(net_fn, param_fn)\n",
    "\n",
    "# print blob names and sizes\n",
    "for end in net.blobs.keys():\n",
    "    cur = net.blobs[end]\n",
    "    print end + ': {} = {}'.format(cur.count, cur.data.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then we define a function that optimizes one layer (fc7) to produce a one-hot vector at the output (prob), running the forward pass from the layer immediately after fc7 (relu7). We use Nesterov momentum, but it is probably overkill: this generally converges quickly even without it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "def optimize(net,\n",
    "             hot=0,\n",
    "             step_size=.01,\n",
    "             iter_n=100,\n",
    "             mu=.9,\n",
    "             basename='fc7',\n",
    "             start='relu7',\n",
    "             end='prob'):\n",
    "    base = net.blobs[basename]\n",
    "    first = net.blobs[start]\n",
    "    last = net.blobs[end]\n",
    "    base.data[0] = np.random.normal(.5, .1, base.data[0].shape)\n",
    "    base.diff[0] = 0.\n",
    "    velocity = np.zeros_like(base.data[0])\n",
    "    velocity_previous = np.zeros_like(base.data[0])\n",
    "    for i in range(iter_n):\n",
    "        net.forward(start=start, end=end)\n",
    "        target = np.zeros_like(last.data[0])\n",
    "        target.flat[hot] = 1.\n",
    "        error = target - last.data[0]\n",
    "        last.diff[0] = error\n",
    "        net.backward(start=end, end=start)\n",
    "        grad = base.diff[0]\n",
    "        learning_rate = (step_size / np.abs(grad).mean())\n",
    "        velocity_previous = velocity\n",
    "        velocity = mu * velocity + learning_rate * grad\n",
    "        base.data[0] += -mu * velocity_previous + (1 + mu) * velocity\n",
    "        base.data[0] = np.clip(base.data[0], 0, +1)\n",
    "    return base.data[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Check that we get different vectors for different \"hot\" choices, and that the `optimize()` function is actually doing what we expect to the net."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "in: [ 1.  0.  1.  0.  0.  0.  0.  0.]\n",
      "out: [  9.91814017e-01   3.98620468e-05   9.90041463e-06   9.02536340e-06\n",
      "   1.13274482e-05   1.72698910e-05   1.00160096e-05   3.25881274e-06]\n",
      "in: [ 0.  0.  1.  0.  1.  1.  0.  1.]\n",
      "out: [  2.45179963e-05   9.93825078e-01   4.71601061e-06   8.41968267e-06\n",
      "   8.20327932e-06   1.34181846e-05   5.93410823e-06   7.48563616e-06]\n"
     ]
    }
   ],
   "source": [
    "print 'in:', optimize(net, hot=0)[0:8]\n",
    "print 'out:', net.blobs['prob'].data[0,0:8]\n",
    "print 'in:', optimize(net, hot=1)[0:8]\n",
    "print 'out:', net.blobs['prob'].data[0,0:8]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Run `optimize()` for every class and save the vectors to disk in a format `bh_tsne` can parse."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100% (1000 of 1000) |#####################| Elapsed Time: 0:07:12 Time: 0:07:12\n"
     ]
    }
   ],
   "source": [
    "from progressbar import ProgressBar\n",
    "vectors = []\n",
    "pbar = ProgressBar()\n",
    "for i in pbar(range(1000)):\n",
    "    vectors.append(optimize(net, hot=i).copy())\n",
    "np.savetxt('vectors', vectors, fmt='%.2f', delimiter='\\t')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load a list of labels and print them with their associated vectors."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ 1.  0.  1. ...,  1.  0.  1.] tench\n",
      "[ 0.  0.  1. ...,  1.  1.  1.] goldfish\n",
      "[ 0.  0.  0. ...,  0.  0.  0.] great white shark\n",
      "[ 1.  1.  0. ...,  0.  0.  0.] tiger shark\n",
      "[ 0.  0.  0. ...,  1.  0.  0.] hammerhead\n",
      "[ 1.  1.  1. ...,  1.  0.  0.] electric ray\n",
      "[ 0.  0.  0. ...,  0.  0.  1.] stingray\n",
      "[ 0.  0.  0. ...,  1.  1.  0.] cock\n",
      "[ 0.  0.  1. ...,  0.  1.  0.] hen\n",
      "[ 0.  1.  0. ...,  1.  0.  0.] ostrich\n"
     ]
    }
   ],
   "source": [
    "labels = []\n",
    "with open('words') as f:\n",
    "    for line in f:\n",
    "        labels.append(line.strip())\n",
    "for i in range(10):\n",
    "    print vectors[i], labels[i]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To double-check that the vectors capture some notion of similarity, we set up a nearest-neighbor search."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "NearestNeighbors(algorithm='auto', leaf_size=30, metric='minkowski',\n",
       "         metric_params=None, n_neighbors=10, p=2, radius=100)"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn.neighbors import NearestNeighbors\n",
    "neigh = NearestNeighbors(n_neighbors=10, radius=100)\n",
    "neigh.fit(vectors)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then we run it on one category to find its nearest neighbors, and indeed \"electric ray\" is close to \"stingray\" and other sea creatures."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.0 electric ray\n",
      "39.6456756784 stingray\n",
      "40.1148800322 dugong\n",
      "40.6637639674 jellyfish\n",
      "40.8964570593 tiger shark\n",
      "41.2111950809 hammerhead\n",
      "41.2288285063 flatworm\n",
      "41.3787372934 sea slug\n",
      "41.5759257263 loggerhead\n",
      "41.6082491821 grey whale\n"
     ]
    }
   ],
   "source": [
    "neighbors = neigh.kneighbors([vectors[5]], n_neighbors=10, return_distance=True)\n",
    "for distance, i in zip(neighbors[0][0], neighbors[1][0]):\n",
    "    print distance, labels[i]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "colabVersion": "0.3.1",
  "default_view": {},
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.10"
  },
  "views": {}
 },
 "nbformat": 4,
 "nbformat_minor": 0
} |
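The Nesterov-momentum input optimization used in `optimize()` can be sketched without Caffe. Below is a minimal NumPy version of the same idea, with a random linear-plus-softmax "classifier" standing in for VGG's fc8 layer (the weight matrix `W`, the layer sizes, and `optimize_toy` are illustrative assumptions, not part of the notebook): it searches for a clipped input vector that drives the softmax output toward a one-hot target, using the same error signal, gradient-normalized step size, momentum update, and clipping as the notebook.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax
    e = np.exp(z - z.max())
    return e / e.sum()

def optimize_toy(W, hot=0, step_size=0.01, iter_n=200, mu=0.9, seed=0):
    """Find x in [0, 1]^n_in so that softmax(W.T x) is close to one-hot at `hot`."""
    rng = np.random.RandomState(seed)
    n_in, n_out = W.shape
    x = rng.normal(0.5, 0.1, n_in)            # same initialization as the notebook
    velocity = np.zeros_like(x)
    for _ in range(iter_n):
        prob = softmax(W.T @ x)               # "forward pass"
        target = np.zeros(n_out)
        target[hot] = 1.0
        error = target - prob                 # same error signal as the notebook
        # "backward pass": error through the softmax Jacobian and the linear layer
        jac = np.diag(prob) - np.outer(prob, prob)
        grad = W @ (jac @ error)
        learning_rate = step_size / (np.abs(grad).mean() + 1e-12)
        velocity_previous = velocity
        velocity = mu * velocity + learning_rate * grad
        x += -mu * velocity_previous + (1 + mu) * velocity  # Nesterov step
        x = np.clip(x, 0, 1)                  # same clipping as the notebook
    return x

rng = np.random.RandomState(1)
W = rng.normal(0, 1, (64, 8))                 # hypothetical 64-dim input, 8 classes
x = optimize_toy(W, hot=3)
print(softmax(W.T @ x))                       # mass should concentrate on index 3
```

Because the step size is normalized by the mean gradient magnitude, the update keeps moving at a steady pace even as the softmax saturates and raw gradients shrink, which is one reason the notebook's version converges quickly.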