nov05


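⚠️ issue: `from gym.wrappers import Monitor` raises `ImportError: cannot import name 'Monitor' from 'gym.wrappers'`, because newer gym releases no longer ship the `Monitor` wrapper.

  • solution (2022):
    import gym
    from gym.wrappers.record_video import RecordVideo
    # RecordVideo needs frames as arrays, hence render_mode="rgb_array"
    env = gym.make('CartPole-v1', render_mode="rgb_array")
    # record every episode into ./video
    env = RecordVideo(env, './video', episode_trigger=lambda episode_number: True)
    env.reset()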
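
In the snippet above, `episode_trigger` receives the episode index and returns whether that episode should be recorded, so `lambda episode_number: True` records every episode. For a long training run it may be preferable to record only every Nth episode. A minimal sketch, where `every_n_episodes` is a hypothetical helper and not part of gym:

```python
from typing import Callable


def every_n_episodes(n: int) -> Callable[[int], bool]:
    """Return a trigger that is True for episodes 0, n, 2n, ..."""
    if n <= 0:
        raise ValueError("n must be a positive integer")

    def trigger(episode_number: int) -> bool:
        return episode_number % n == 0

    return trigger


# Hypothetical usage with the wrapper from the snippet above:
# env = RecordVideo(env, './video', episode_trigger=every_n_episodes(50))
record = every_n_episodes(50)
print([ep for ep in range(200) if record(ep)])  # [0, 50, 100, 150]
```

The trigger is a plain function of the episode index, so it can be checked without creating an environment.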
    

20240218_pong-PPO.ipynb
👉 training log for reference
1000 episodes, T4 GPU, Wall time: 1h 38min 14s

Episode: 20, score: -15.750000
[-16. -16. -16. -16. -16. -16. -16. -14.]
Episode: 40, score: -12.625000
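
In this excerpt, the score on the `Episode: 20` line equals the mean of the eight values printed beneath it. That suggests the notebook averages the scores of eight parallel environments, although the log itself does not say so. A small sketch that checks the relationship on the lines shown above (`parse_scores` is a hypothetical helper):

```python
import re
from statistics import mean
from typing import List


def parse_scores(array_line: str) -> List[float]:
    """Extract the floats from a printed array such as '[-16. -16. -14.]'."""
    return [float(tok) for tok in re.findall(r"-?\d+\.?\d*", array_line)]


# Two lines copied from the log excerpt above
header = "Episode: 20, score: -15.750000"
array_line = "[-16. -16. -16. -16. -16. -16. -16. -14.]"

reported = float(header.rsplit("score:", 1)[1])
scores = parse_scores(array_line)

print(len(scores), mean(scores), reported)  # 8 -15.75 -15.75
```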
@nov05
nov05 / 20240218_reinforcement learning_pong training log 1200e.md
Created February 19, 2024 06:00
20240218_reinforcement learning_pong training log 1200e

20240217_pong_REINFORCE.ipynb
👉 training log for reference
1200 episodes on T4 GPU, Wall time: 2h 12min 12s

Episode: 20, score: -14.500000
[-14. -15. -16. -13. -14. -16. -16. -12.]
Episode: 40, score: -14.500000
@nov05
nov05 / 20240218_reinforcement learning_pong training log for reference.md
Last active February 19, 2024 04:05
20240218_reinforcement learning_pong training log for reference

20240217_pong_REINFORCE.ipynb
👉 training log for reference
800 episodes on T4 GPU, Wall time: 1h 17min 44s

Episode: 20, score: -14.000000
[-15. -17. -15. -14. -13. -13. -16.  -9.]
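
The three training logs on this page report episode counts and wall times, so an average time per episode can be derived from them. The sketch below does that arithmetic. The labels are taken from the notebook file names, and `wall_time_seconds` is a hypothetical helper that handles only the `h`/`min`/`s` format seen here:

```python
import re


def wall_time_seconds(text: str) -> int:
    """Convert a wall time such as '2h 12min 12s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)h\s*)?(?:(\d+)min\s*)?(?:(\d+)s)?", text.strip())
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognised wall time: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


# (label, episodes, wall time) copied from the three logs on this page
runs = [
    ("PPO", 1000, "1h 38min 14s"),
    ("REINFORCE", 1200, "2h 12min 12s"),
    ("REINFORCE", 800, "1h 17min 44s"),
]

for label, episodes, wall in runs:
    per_episode = wall_time_seconds(wall) / episodes
    print(f"{label:<10} {episodes:>5} episodes  {per_episode:.2f} s/episode")
```

This works out to roughly 5.9, 6.6 and 5.8 seconds per episode. Episode length changes as the agent improves, so these figures are not a like-for-like speed comparison between the two algorithms.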
@nov05
nov05 / 20240218_python_PyWhatKit_issue_313.md
Created February 19, 2024 00:21
20240218 python PyWhatKit issue 313

Ankit404butfound/PyWhatKit#313

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/Xlib/support/unix_connect.py in get_socket(dname, host, dno)
     75             s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
---> 76             s.connect('/tmp/.X11-unix/X%d' % dno)
     77     except OSError as val:
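
The traceback excerpt shows python-xlib failing to connect to the Unix socket `/tmp/.X11-unix/X<display number>`. Assuming the cause is that no X server is running (typical of a headless notebook host, though the truncated excerpt does not confirm this), the condition can be detected before importing a GUI-automation package. A minimal sketch using only the standard library, with hypothetical helper names:

```python
import os
import re
from typing import Optional


def x11_socket_path(display: Optional[str]) -> Optional[str]:
    """Map a local DISPLAY value such as ':0' or ':1.0' to its Unix socket path.

    Returns None when DISPLAY is unset or names a remote host.
    """
    if not display:
        return None
    match = re.fullmatch(r"(?:unix)?:(\d+)(?:\.\d+)?", display)
    if match is None:
        return None
    return f"/tmp/.X11-unix/X{match.group(1)}"


def has_local_x_server(display: Optional[str] = None) -> bool:
    """True if the socket file for the given (or current) DISPLAY exists."""
    if display is None:
        display = os.environ.get("DISPLAY")
    path = x11_socket_path(display)
    return path is not None and os.path.exists(path)


if __name__ == "__main__":
    if not has_local_x_server():
        print("No local X server socket found; GUI-automation imports are likely to fail.")
```

The check only tests that the socket file exists. It does not verify that a server is actually accepting connections on it.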
@nov05
nov05 / 20240215_udacity reinforcement learning_DQN project submission.md
Last active February 15, 2024 17:32
👉 Unity ML-Agents `Banana Collectors` Project Submission

  1. For this toy game, two deep Q-network methods were tried. Because the observations (states) are simple and not pixel-based, no convolutional layers are used, and the evaluation results confirm that linear layers are sufficient to solve the problem.
    • Double DQN, with 3 linear layers (hidden dims: 256*64, later tried with 64*64)
    • Dueling DQN, with 2 linear layers + 2 split linear layers (hidden dims: 64*64)
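
For readers unfamiliar with the two variants listed above, the sketch below shows their core ideas in plain Python, independent of any network code. It assumes the commonly used mean-subtracted aggregation for the dueling heads. The submission does not state which aggregation it uses, and all numbers are made up for illustration:

```python
from typing import List, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Combine the two split heads: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + adv - mean_adv for adv in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    done: bool,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
) -> float:
    """Select the next action with the online net, evaluate it with the target net."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Illustrative values only
print(dueling_q_values(1.0, [0.5, -0.5, 0.0, 0.0]))                # [1.5, 0.5, 1.0, 1.0]
print(double_dqn_target(1.0, 0.5, False, [0.1, 0.9], [2.0, 4.0]))  # 3.0
```

In the second call the online estimates pick action 1, and the target estimate for that action (4.0) is used in the bootstrap term. This decoupling is what distinguishes Double DQN from the plain DQN target.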

▪️ The Dueling DQN architecture is shown below.

@nov05
nov05 / 20240211_stream unity alagents from colab to twitch.md
Last active February 12, 2024 03:23
20240211 【error】stream unity mlagents from colab to twitch

⚠️ error

mlagents_envs.exception.UnityEnvironmentException: Environment shut down with return code -6 (SIGABRT).
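
The message pairs a negative return code with a signal name. Under the POSIX convention used by Python's `subprocess` module, a return code of `-N` means the child process was terminated by signal `N`. A small standard-library sketch of that mapping follows. `describe_return_code` is a hypothetical helper, not part of ML-Agents:

```python
import signal


def describe_return_code(code: int) -> str:
    """Describe a subprocess return code using the POSIX -N == signal N convention."""
    if code >= 0:
        return f"exit status {code}"
    try:
        return f"killed by {signal.Signals(-code).name}"
    except ValueError:
        return f"killed by unknown signal {-code}"


print(describe_return_code(-6))
```

Knowing that the environment aborted rather than exiting normally narrows the search to crashes inside the Unity player. It does not identify the cause.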
@nov05
nov05 / 20240211_stream unity mlagents from colab to twitch.md
Last active February 12, 2024 03:27
20240211 stream unity mlagents display from google colab to twitch

👉 check the colab notebook
👉 go to the cell

⚠️ issue

mono_gdb_render_native_backtraces not supported on this platform, unable to find gdb or lldb
  • Installed ML-Agents Release 1 and used trainer_config.yaml in the old format.
@nov05
nov05 / 20240211_udacity_drlnd_mlagents.md
Last active February 25, 2024 11:52
20240211_udacity reinforcement learning unity mlagents env setup