check the drlnd_py310 env setup notes
check the p1 env setup notes
course curriculum
Colab notebooks
from gym.wrappers import Monitor caused ImportError: cannot import name 'Monitor' from 'gym.wrappers'.
- solution (2022):
    import gym
    from gym.wrappers.record_video import RecordVideo
    env = gym.make('CartPole-v1', render_mode="rgb_array")
    env = RecordVideo(env, './video', episode_trigger=lambda episode_number: True)
    env.reset()
20240218_pong-PPO.ipynb
training log for reference
1000 episodes, T4 GPU, Wall time: 1h 38min 14s

Episode: 20, score: -15.750000
[-16. -16. -16. -16. -16. -16. -16. -14.]
Episode: 40, score: -12.625000
20240217_pong_REINFORCE.ipynb
training log for reference
1200 episodes on T4 GPU, Wall time: 2h 12min 12s
Episode: 20, score: -14.500000
[-14. -15. -16. -13. -14. -16. -16. -12.]
Episode: 40, score: -14.500000
20240217_pong_REINFORCE.ipynb
training log for reference
800 episodes on T4 GPU, Wall time: 1h 17min 44s
Episode: 20, score: -14.000000
[-15. -17. -15. -14. -13. -13. -16. -9.]
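In the logs above, each `Episode:` line is followed by an array of eight values, and the reported score matches their mean (seven scores of -16 and one of -14 average to -15.75). That suggests eight environments running in parallel, though the notes do not say so explicitly. Below is a minimal sketch for checking this on a saved log, assuming the two-line format shown here. `parse_training_log` is a hypothetical helper, not something taken from the notebooks.

```python
import re

# Matches header lines such as "Episode: 20, score: -15.750000".
HEADER = re.compile(r"Episode:\s*(\d+),\s*score:\s*(-?\d+(?:\.\d+)?)")


def parse_training_log(text):
    """Parse log text into dicts with 'episode', 'score' and 'per_env' keys."""
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = HEADER.match(line)
        if match:
            entries.append({
                "episode": int(match.group(1)),
                "score": float(match.group(2)),
                "per_env": [],  # filled in if an array line follows
            })
        elif entries and line.startswith("[") and line.endswith("]"):
            # Array line such as "[-16. -16. ... -14.]": whitespace-separated floats.
            entries[-1]["per_env"] = [float(tok) for tok in line[1:-1].split()]
    return entries


# Sample copied from the PPO log above; the second entry has no array line.
sample = """\
Episode: 20, score: -15.750000
[-16. -16. -16. -16. -16. -16. -16. -14.]
Episode: 40, score: -12.625000
"""

for entry in parse_training_log(sample):
    scores = entry["per_env"]
    mean = sum(scores) / len(scores) if scores else None
    print(entry["episode"], entry["score"], mean)
```

Comparing the printed mean with the reported score is a quick sanity check that a log was copied without dropping values.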
Ankit404butfound/PyWhatKit#313
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/Xlib/support/unix_connect.py in get_socket(dname, host, dno)
75 s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
---> 76 s.connect('/tmp/.X11-unix/X%d' % dno)
77 except OSError as val:
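The traceback above comes from python-xlib failing to open the Unix socket `/tmp/.X11-unix/X%d`. That usually means no X server is running, as on a headless machine such as a Colab VM. Below is a minimal sketch of a pre-flight check that could run before importing a GUI-dependent package. `local_x_display_available` is a hypothetical helper, and handling only local `:N` displays is a simplifying assumption.

```python
import os


def local_x_display_available(display=None, socket_dir="/tmp/.X11-unix"):
    """Return True if the Unix socket for a local X display exists.

    `display` defaults to the DISPLAY environment variable. Only local
    displays of the form ":N" or ":N.S" are handled; anything else
    (unset, empty, or "host:N") returns False.
    """
    if display is None:
        display = os.environ.get("DISPLAY", "")
    if not display.startswith(":"):
        return False
    number = display[1:].split(".")[0]
    if not number.isdigit():
        return False
    # python-xlib connects to <socket_dir>/X<number>, as in the traceback above.
    return os.path.exists(os.path.join(socket_dir, "X" + number))


if local_x_display_available():
    print("X display found; GUI-dependent imports should be able to connect.")
else:
    print("No local X display; GUI-dependent imports will likely fail.")
```

When the check fails, a virtual display such as Xvfb is a common workaround. It was not tested for these notes.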
- For this toy game, two Deep Q-network methods were tried. Because the observations (states) are simple (not pixels), no convolutional layers are used, and the evaluation results confirm that linear layers are sufficient to solve the problem.
  - Double DQN, with 3 linear layers (hidden dims: 256*64, later tried with 64*64)
  - Dueling DQN, with 2 linear layers + 2 split linear layers (hidden dims: 64*64)
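The two variants listed above differ in where they change standard DQN. Double DQN changes how the bootstrap target is computed, while Dueling DQN changes how the network head combines a state value with per-action advantages. Below is a framework-free sketch of both pieces of arithmetic, using plain Python lists in place of tensors. The function names are hypothetical, and the mean-subtracted aggregation is the commonly used form, assumed here rather than confirmed from the notebooks.

```python
def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Double DQN target for one transition.

    The online network picks the next action; the target network scores it.
    `q_online_next` and `q_target_next` are per-action Q-values for the next
    state from the online and target networks respectively.
    """
    if done:
        return reward  # no bootstrapping past a terminal state
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


def dueling_q_values(state_value, advantages):
    """Combine the two split heads: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


# The online net prefers action 1, so the target net's value for action 1 is used.
target = double_dqn_target(
    reward=1.0, done=False, gamma=0.5,
    q_online_next=[0.1, 0.9], q_target_next=[4.0, 2.0],
)
print(target)

# One state value plus three advantages yields three Q-values.
print(dueling_q_values(1.0, [1.0, 2.0, 3.0]))
```

Subtracting the mean advantage keeps the value and advantage streams identifiable. Without it, a constant could shift between the two heads without changing the resulting Q-values.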
▪️ The Dueling DQN architecture is shown below.
- go to the cell in the notebook
- pip install the latest mlagents version, using banana.yaml in the new format.
mlagents_envs.exception.UnityEnvironmentException: Environment shut down with return code -6 (SIGABRT).
check the colab notebook
go to the cell
mono_gdb_render_native_backtraces not supported on this platform, unable to find gdb or lldb
- installed mlagents release 1, using trainer_config.yaml in the old format.
for the course projects, Unity MLAgents - Banana Collector, etc.
go to the Banana and VisualBanana notebooks
go to the course repo
check course curriculum
Windows 11, VSCode, Miniconda, PowerShell

