Udacity Deep Reinforcement Learning - p2 & `deeprl` env setup

👉 check the drlnd_py310 env setup notes
👉 check the p1 env setup notes
👉 course curriculum
👉 Colab notebooks

Window 11, VSCode, Minicoda, Powershell

👉 copy from the env where cuda and pytorch have been installed
🟢 conda create --name drlnd_p2 --clone drlnd (Python 3.6)

(base) PS D:\github\udacity-deep-reinforcement-learning\python> conda create --name drlnd_p2 --clone drlnd
Source:      D:\Users\*\miniconda3\envs\drlnd
Destination: D:\Users\*\miniconda3\envs\drlnd_p2
Packages: 159
Files: 13970

or check how to install cuda + pytorch in windows 11
conda install cuda --channel "nvidia/label/cuda-12.1.0"
or go to https://pytorch.org/, and select the right version to install
❌ pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
🟢 conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

pip install torchmeta
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidi

🟢 Follow these steps to install mujoco-py on Windows

get mjpro150 win64
get mjkey.txt

🟢 Powershell $env:PATH += ";C:\Users\*\.mujoco\mjpro150\bin"
Powershell $env:path -split ";" to display path variables

🟢 download mujoco-py-1.50.1.68.tar.gz from https://pypi.org/project/mujoco-py/1.50.1.68/#files

pip install "cython<3"  
pip install mujoco-py-1.50.1.68.tar.gz  
python D:\github\udacity-deep-reinforcement-learning\python\mujoco-py\examples\body_interaction.py

you might need this pip install lockfile and some other packages. install them according to the error messages.
a worse case is that your python version is too high (maybe >=3.9?), you might need to install mujoco_py manually.
now you should be able to see this.

👉 install gym atari and lincense
https://stackoverflow.com/a/69602242

pip install -U gym
pip install -U gym[atari,accept-rom-license]
pip install bleach==1.5.0  
pip install --upgrade numpy   
pip install --upgrade tensorboard

👉 install OpenAI Baselines

pip install --upgrade pip setuptools wheel   
pip install opencv-python==4.5.5.64  
git clone https://github.com/openai/baselines.git
cd baselines
pip install -e .

for python 3.11, you can pip install opencv-python.
and i Successfully installed opencv-python-4.9.0.80.

👉 intall the rest packages for the deeprl folder.
pip install -r .\deeprl_files\requirements.txt

requirements.txt

# torch
# torchvision
# torchmeta 
# gym==0.15.7
# tensorflow==1.15.0
# opencv-python==4.0.0.21
atari-py
scikit-image==0.14.2
tqdm
pandas
pathlib
seaborn
# roboschool==1.0.34
dm-control2gym  
tensorflow-io

for python 3.11, losen the version requirement scikit-image.
I got scikit-image-0.22.0 installed.

👉 test the env setup

run notebooks

python -m ipykernel install --user --name=drlnd_p2
jupyter notebook D:\github\udacity-deep-reinforcement-learning\p2_continuous-control\Continuous_Control.ipynb  
jupyter notebook D:\github\udacity-deep-reinforcement-learning\p2_continuous-control\Crawler.ipynb

🟢 python -m deeprl.component.envs

if __name__ == '__main__':
    import time
    ## num_envs=5 will only create 3 env and cause error
    ## "results = _flatten_list(results)"
    ## in "baselines\baselines\common\vec_env\subproc_vec_env.py"
    task = Task('Hopper-v2', num_envs=3, single_process=False)
    state = task.reset()

    ## This might be helpful for custom env debugging
    # env_dict = gym.envs.registration.registry.env_specs.copy()
    # for item in env_dict.items():
    #     print(item)

    start_time = time.time()
    while True:
        action = np.random.rand(task.action_space.shape[0])
        next_state, reward, done, _ = task.step(action)
        print(done)
        if time.time()-start_time > 10: ## run about 10s
            break  
    task.close()

🟢 run examples:
D:\github\udacity-deep-reinforcement-learning\python\deeprl_files\examples.py

if __name__ == '__main__':
    mkdir('log')
    mkdir('tf_log')
    set_one_thread()
    random_seed()
    # -1 is CPU, an non-negative integer is the index of GPU
    # select_device(-1)
    select_device(0) ## GPU
    
    game = 'Reacher-v2'
    # a2c_continuous(game=game)
    # ppo_continuous(game=game)
    ddpg_continuous(game=game)

you should be able to see something like this in the video.

folder `./python/deeprl` structure

https://github.com/ShangtongZhang/DeepRL
https://github.com/ChalamPVS/Unity-Reacher

🟢 copied python files from repo @ShangtongZhang/DeepRL to repo @Nov05/udacity-deep-reinforcement-learning under the './python' folder.

DeepRL/template_jobs.py

ddpg_continuous(game='Reacher-v2', run=0, env=env,
	remark=ddpg_continuous.__name__)

DeepRL/examples.py

def ddpg_continuous(**kwargs):
	config.task_fn = lambda: Task(config.game, env=env)
	run_steps(DDPGAgent(config))

deep_rl/utils/config.py

class Config:
	def __init__(self):
		self.task_fn = None

DeepRL/deep_rl/utils/misc.py

def run_steps(agent):
    config = agent.config
    agent.step()

deep_rl/agent/DDPG_agent.py

class DDPGAgent(BaseAgent):
	self.task = config.task_fn()
	def step(self):

deep_rl/component/envs.py

def make_env(env_id, seed, rank, episode_life=True):
class Task:
    def __init__(self,
                 name,
                 num_envs=1,
		 env=env,
if __name__ == '__main__':
    task = Task('Hopper-v2', 5, single_process=False)

🟢⚠️ issue solved: google colab, matd3 notebook

!pip install protobuf==3.19.0
!export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
[<ipython-input-5-03a59240fed2>](https://localhost:8080/#) in <cell line: 13>()
     11 import numpy
     12 import torch
---> 13 from unityagents import UnityEnvironment
     14 
     15 import matplotlib.pyplot as plt

4 frames
[/usr/local/lib/python3.10/dist-packages/google/protobuf/descriptor.py](https://localhost:8080/#) in __new__(cls, name, full_name, index, number, type, cpp_type, label, default_value, message_type, enum_type, containing_type, is_extension, extension_scope, options, serialized_options, has_default_value, containing_oneof, json_name, file, create_key)
    551                 has_default_value=True, containing_oneof=None, json_name=None,
    552                 file=None, create_key=None):  # pylint: disable=redefined-builtin
--> 553       _message.Message._CheckCalledFromGeneratedFile()
    554       if is_extension:
    555         return _message.default_pool.FindExtensionByName(full_name)

TypeError: Descriptors cannot be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates

Nov05/20240223_udacity_drlnd_p2_env.md

Udacity Deep Reinforcement Learning - p2 & `deeprl` env setup

folder `./python/deeprl` structure

Nov05 commented Nov 16, 2024 •

edited

Loading

Nov05/20240223_udacity_drlnd_p2_env.md

Udacity Deep Reinforcement Learning - p2 & deeprl env setup

folder ./python/deeprl structure

Nov05 commented Nov 16, 2024 • edited Loading

Udacity Deep Reinforcement Learning - p2 & `deeprl` env setup

folder `./python/deeprl` structure

Nov05 commented Nov 16, 2024 •

edited

Loading