`Python 3.11` has to be downgraded to `Python 3.10`, or `multiprocessing` will raise `TypeError: code() argument 13 must be str, not int` on both Windows and Linux. Google Colab currently uses Python 3.10 as well.
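This can be caught early with a small version guard (an illustrative helper, not part of the repo) instead of waiting for the cryptic `TypeError` from inside `multiprocessing`:

```python
import sys

def is_supported_python(version_info, max_minor=10):
    """True for CPython 3.x up to 3.<max_minor>, which this setup
    is known to work with."""
    major, minor = version_info[:2]
    return major == 3 and minor <= max_minor

# Check the running interpreter before any multiprocessing work.
print(is_supported_python(sys.version_info))
```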
Windows 11 (64-bit), VSCode, PowerShell, Miniconda3, Python 3.10
- repo: https://github.com/Nov05/udacity-deep-reinforcement-learning
- working dir: `D:\github\udacity-deep-reinforcement-learning\python`
- package `deeprl` is copied and modified from https://github.com/ShangtongZhang/DeepRL/tree/master/deep_rl into `.\python`.
- package `baselines` is installed from https://github.com/openai/baselines, branch `tf2`.
- package `mujoco-py` is installed from `mujoco-py-1.50.1.68`.
- Unity ML-Agents builds are stored in `..\data`, e.g. `..\data\Banana_Windows_x86_64`.
```shell
conda create --name drlnd_py310 python=3.10
conda activate drlnd_py310
cd python
nvidia-smi
conda install cuda --channel "nvidia/label/cuda-12.1.0"
nvcc --version
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements_py310.txt
python -m ipykernel install --user --name=drlnd_py310
jupyter notebook ../p1_navigation/Navigation.ipynb
jupyter notebook ../p1_navigation/Navigation_Pixels.ipynb
```
After the above steps, the env should be ready for project p1, and you should be able to run the p1 notebooks from your browser. (Check the original setup notes.)
- P.S. `requirements_py310.txt` (`tensorflow==2.16.1` would cause errors):

```text
tensorflow==2.15.0
bleach
matplotlib
pytest
pyyaml
protobuf==3.20.3
grpcio
numpy==1.23.5
pandas
scipy
ipykernel
jupyter
```
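To double-check the pins before installing, a small parser (an illustrative helper, names are mine) can map each requirement line to its pinned version:

```python
def parse_requirements(text):
    """Map package name -> pinned version (None if unpinned).
    Handles only the simple `name` and `name==version` forms used above."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "==" in line:
            name, _, version = line.partition("==")
            pins[name] = version
        else:
            pins[line] = None
    return pins

sample = "tensorflow==2.15.0\nnumpy==1.23.5\npandas"
print(parse_requirements(sample))
# -> {'tensorflow': '2.15.0', 'numpy': '1.23.5', 'pandas': None}
```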
- get `mjpro150 win64`, place `mjpro150` in `%USERPROFILE%/.mujoco`.
- get `mjkey.txt`, place it in `%USERPROFILE%/.mujoco`.
```powershell
$env:path -split ";"
$env:PATH += ";C:\Users\*\.mujoco\mjpro150\bin"
```
- download `mujoco-py-1.50.1.68.tar.gz` from https://pypi.org/project/mujoco-py/1.50.1.68/#files, unzip the file, and place `mujoco-py-1.50.1.68` in `%USERPROFILE%/.mujoco` (or anywhere you can manage). After this version, `mujoco-py` no longer supports Windows.

```shell
tar -xvzf mujoco-py-1.50.1.68.tar.gz
```
- add the following two lines of code to the file `%USERPROFILE%/.mujoco/mujoco-py-1.50.1.68/setup.py` for the installation of `mujoco-py`:

```python
os.add_dll_directory(r"C:/Users/guido/.mujoco/mjpro150/bin")
os.add_dll_directory(r"C:/Users/guido/.mujoco/mujoco-py-1.50.1.68/mujoco_py")
```
- add the code below to the file `.\deeprl\__init__.py`:

```python
import os, platform
if platform.system() == 'Windows':
    os.add_dll_directory(r"C:/Users/guido/.mujoco/mjpro150/bin")
    os.add_dll_directory(r"C:/Users/guido/.mujoco/mujoco-py-1.50.1.68/mujoco_py")
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
```
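The hardcoded `guido` paths can instead be derived from `%USERPROFILE%`; a sketch (the helper name is mine, and the directory layout is assumed to match the steps above):

```python
import os
import platform

def mujoco_dll_dirs(userprofile, mjpro="mjpro150", mujoco_py="mujoco-py-1.50.1.68"):
    """Return the two DLL directories mujoco-py needs on Windows."""
    root = os.path.join(userprofile, ".mujoco")
    return [
        os.path.join(root, mjpro, "bin"),
        os.path.join(root, mujoco_py, "mujoco_py"),
    ]

# os.add_dll_directory exists (and is needed) only on Windows, Python 3.8+.
if platform.system() == "Windows":
    for d in mujoco_dll_dirs(os.environ["USERPROFILE"]):
        os.add_dll_directory(d)
```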
- install the following packages, then run an example:

```shell
pip install lockfile glfw imageio
pip install "cython<3"
pip install C:\Users\guido\.mujoco\mujoco-py-1.50.1.68
python mujoco-py\examples\body_interaction.py
```
Now you should be able to see something like this in the video.
- install OpenAI `Baselines` (for `TensorFlow 2`). After the installation, delete the folder `baselines\.git`.

```shell
pip install --upgrade pip setuptools wheel
pip install opencv-python
git clone --branch tf2 https://github.com/openai/baselines.git
cd baselines
pip install -e .
cd ..
```
- install OpenAI `Gym` after the installation of `Baselines`. Baselines installs `gym==0.13.1`, which would cause errors.

```shell
pip install gym==0.14.0
pip install "gym[atari,accept-rom-license]"
pip install bleach==1.5.0
```
- download the file `Atari-2600-VCS-ROM-Collection.zip`, unzip it, and place the folder `ROMS` in the `.\data` dir. If you encounter `FileNotFoundError: Could not find module ale_c.dll`, download `ale_c.dll` and place it at `D:\Users\*\miniconda3\envs\drlnd_py310\Lib\site-packages\atari_py\ale_interface\ale_c.dll` (yours will differ). Then run the following commands again.

```shell
python -m atari_py.import_roms D:\github\udacity-deep-reinforcement-learning\data\Atari-2600-VCS-ROM-Collection\ROMS
python -c "import atari_py; print(atari_py.list_games())"
```
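Roughly speaking, the ROM import scans the directory for `.bin` files; a simplified sketch of that scan (the function name is mine) can help verify the unzipped folder before importing:

```python
import os

def list_rom_files(roms_dir):
    """Return the sorted names of .bin ROM files anywhere under roms_dir."""
    roms = []
    for _root, _dirs, files in os.walk(roms_dir):
        roms.extend(f for f in files if f.lower().endswith(".bin"))
    return sorted(roms)

# e.g. list_rom_files(r"..\data\Atari-2600-VCS-ROM-Collection\ROMS")
```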
You should be able to see the following list of games.

Run a `Baselines` example, and you should be able to see the following output.

```shell
python -m baselines.run --alg=ppo2 --env=PongNoFrameskip-v4 --save_path=~/models/PongNoFrameskip-v4_1M_ppo2 --log_path=~/log
```
- install the rest of the packages in `.\deeprl_files\requirements.txt` for the `deeprl` package.

```shell
pip install -r .\deeprl_files\requirements.txt
jupyter notebook D:\github\udacity-deep-reinforcement-learning\p2_continuous-control\Continuous_Control.ipynb
jupyter notebook D:\github\udacity-deep-reinforcement-learning\p2_continuous-control\Crawler.ipynb
python -m tests2.test_deeprl_envs
```

Now you can use `deeprl` and should be able to see some output. (Check the original p2 & deeprl env setup notes.)
- if you see `WARNING:tensorflow:From D:\Users\*\miniconda3\envs\drlnd_py310\lib\site-packages\keras\src\losses.py:2976: The name tf.losses.sparse_softmax_cross_entropy is deprecated. Please use tf.compat.v1.losses.sparse_softmax_cross_entropy instead.`, suppress the warning function in `D:\Users\*\miniconda3\envs\drlnd_py310\Lib\site-packages\tensorflow\python\util\module_wrapper.py:133`:

```python
def _tfmw_add_deprecation_warning(self, name, attr):
    return False  ## add this line to bypass the whole function
```
- run the `deeprl` examples in `deeprl_files\examples.py`, e.g. run `Reacher-v2` with the `DDPG` algorithm:

```shell
python -m deeprl_files.examples
```

```python
if __name__ == '__main__':
    mkdir('log')
    mkdir('tf_log')
    set_one_thread()
    random_seed()
    # -1 is CPU, a non-negative integer is the index of a GPU
    # select_device(-1)
    select_device(0)  ## GPU
    game = 'Reacher-v2'
    # a2c_continuous(game=game)
    # ppo_continuous(game=game)
    ddpg_continuous(game=game)
```
You should be able to see something like this in the video.
- create a backup env `drlnd_py310_backup` for `drlnd_py310`.
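One way to create such a backup is `conda`'s clone option (assuming that is the intent of the note above):

```shell
# Snapshot the working env; restoring is just another clone in reverse.
conda create --name drlnd_py310_backup --clone drlnd_py310
```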