OpenAI gym tutorial

Getting Set Up: Follow the instructions at https://gym.openai.com/docs

git clone https://github.com/openai/gym
cd gym
pip install -e . # minimal install

Basic Example using CartPole-v0:

Level 1: Getting environment up and running

import gym
env = gym.make('CartPole-v0')
env.reset()
for _ in range(1000): # run for 1000 steps
    env.render()
    action = env.action_space.sample() # pick a random action
    env.step(action) # take action

Level 2: Running trials (AKA episodes)

import gym
env = gym.make('CartPole-v0')
for i_episode in range(20):
    observation = env.reset() # reset for each new trial
    for t in range(100): # run for 100 timesteps or until done, whichever is first
        env.render()
        action = env.action_space.sample() # select a random action (see https://github.com/openai/gym/wiki/CartPole-v0)
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break
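
The loop above depends only on the reset/step contract: reset() returns an initial observation, and step(action) returns (observation, reward, done, info). As a minimal sketch of that contract, the snippet below runs the same episode loop against a stand-in environment so it works without gym or a display. StubEnv and run_episode are hypothetical names made up for this illustration, not part of gym, and the stub simply ends every episode after a fixed number of steps.

```python
import random


class StubEnv:
    """Hypothetical stand-in for a gym environment (not part of gym).

    It mimics the reset()/step() interface used in Level 2 and ends each
    episode after a fixed number of steps.
    """

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.steps = 0

    def reset(self):
        # Start a new episode and return a dummy 4-value observation.
        self.steps = 0
        return [0.0, 0.0, 0.0, 0.0]

    def step(self, action):
        # Advance one timestep; reward is 1.0 per step, as an assumption.
        self.steps += 1
        observation = [0.0, 0.0, 0.0, 0.0]
        reward = 1.0
        done = self.steps >= self.episode_length
        return observation, reward, done, {}


def run_episode(env, max_steps=100):
    """Run one episode with random actions; return (timesteps, total_reward)."""
    env.reset()
    total_reward = 0.0
    for t in range(max_steps):
        action = random.choice([0, 1])  # random action, like action_space.sample()
        observation, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            return t + 1, total_reward
    # The step limit was reached before the environment reported done.
    return max_steps, total_reward
```

Calling run_episode(StubEnv(episode_length=5)) stops as soon as the stub reports done, while a stub with a longer episode is cut off at max_steps, which mirrors the "100 timesteps or until done" rule in the loop above.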

Level 3: Non-random actions

import gym
env = gym.make('CartPole-v0')
highscore = 0
for i_episode in range(20): # run 20 episodes
  observation = env.reset()
  points = 0 # keep track of the reward each episode
  while True: # run until episode is done
    env.render()
    action = 1 if observation[2] > 0 else 0 # if the pole angle is positive, push right; otherwise push left
    observation, reward, done, info = env.step(action)
    points += reward
    if done:
      if points > highscore: # record high score
        highscore = points
      break # leave the episode loop whether or not a new high score was set
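
The action rule in Level 3 can also be pulled out into a small function so it can be checked without running the environment. This is only a sketch: choose_action is a hypothetical helper name, and it assumes the CartPole observation layout in which index 2 is the pole angle, with action 1 pushing the cart right and action 0 pushing it left.

```python
def choose_action(observation):
    """Return 1 (push right) if the pole angle is positive, else 0 (push left).

    Assumes observation[2] is the pole angle, as in the Level 3 loop.
    """
    pole_angle = observation[2]
    return 1 if pole_angle > 0 else 0
```

Inside the Level 3 loop, the line that sets action could then read action = choose_action(observation). Note that an angle of exactly zero falls into the "push left" branch, the same as the inline expression.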