{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Let's start by importing the required libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import gym\n",
    "import numpy as np\n",
    "import tensorflow as tf"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Loading the Frozen Lake Environment from OpenAI Gym"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env = gym.make('FrozenLake-v0')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Define the Placeholder for the Input Data and Variables for the Weights"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The network output is passed through argmax, which returns the index of the maximum value (the chosen action), not the maximum value itself."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import gym\n",
    "import numpy as np\n",
    "import tensorflow as tf\n",
    "env = gym.make('FrozenLake-v0')\n",
    "# Note: tf.placeholder and tf.random_uniform are TensorFlow 1.x APIs.\n",
    "# The 4x4 grid gives 16 possible states, so the input is a 1x16 row vector\n",
    "# (typically a one-hot encoding of the current state).\n",
    "inputs = tf.placeholder(shape=[1,16],dtype=tf.float32)\n",
    "# Each state has 4 possible actions, so the weights form a 16x4 matrix,\n",
    "# initialized uniformly in [0, 0.1).\n",
    "weights = tf.Variable(tf.random_uniform([16,4],0,0.1))\n",
    "# Multiply the input by the weights to get one Q-value per action.\n",
    "Q1 = tf.matmul(inputs,weights)\n",
    "# Choose the action with the highest Q-value; argmax returns its index.\n",
    "output = tf.argmax(Q1,1)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
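
The last code cell of the notebook multiplies a 1x16 input by a 16x4 weight matrix and takes the argmax of the result. The following is a minimal NumPy sketch of that computation, not the notebook's own code: it assumes the state is fed in as a one-hot vector, and the helper `one_hot`, the seed, and the example state `5` are hypothetical choices made only for illustration.

```python
import numpy as np

n_states, n_actions = 16, 4

# Weights drawn uniformly from [0, 0.1), mirroring the notebook's initializer.
# The fixed seed is an assumption, used only to make the sketch repeatable.
rng = np.random.default_rng(0)
weights = rng.uniform(0.0, 0.1, size=(n_states, n_actions))


def one_hot(state, n=n_states):
    """Return a 1 x n row vector with a 1.0 at the given state index."""
    vec = np.zeros((1, n), dtype=np.float32)
    vec[0, state] = 1.0
    return vec


state = 5  # hypothetical current state

# With a one-hot input, the matrix product simply selects row `state`
# of the weight matrix: one Q-value per action.
q_values = one_hot(state) @ weights

# argmax returns the index of the largest Q-value, i.e. the chosen action.
action = int(np.argmax(q_values, axis=1)[0])

print("Q-values:", q_values)
print("chosen action index:", action)
```

Under the one-hot assumption, the weight matrix acts as a Q-table: row `s` holds the four action values for state `s`, and argmax picks the column index of the largest one.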