Nov05/20240223_udacity_drlnd_p2_env.md

Last active November 16, 2024 15:05


Udacity Deep Reinforcement Learning - p2 & `deeprl` env setup

👉 check the drlnd_py310 env setup notes
👉 check the p1 env setup notes
👉 course curriculum
👉 Colab notebooks

Windows 11, VSCode, Miniconda, PowerShell

👉 copy from the env where cuda and pytorch have been installed
🟢 conda create --name drlnd_p2 --clone drlnd (Python 3.6)

(base) PS D:\github\udacity-deep-reinforcement-learning\python> conda create --name drlnd_p2 --clone drlnd
Source:      D:\Users\*\miniconda3\envs\drlnd
Destination: D:\Users\*\miniconda3\envs\drlnd_p2
Packages: 159
Files: 13970

or check how to install cuda + pytorch in windows 11
conda install cuda --channel "nvidia/label/cuda-12.1.0"
or go to https://pytorch.org/, and select the right version to install
❌ pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
🟢 conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

pip install torchmeta
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

🟢 Follow these steps to install mujoco-py on Windows

get mjpro150 win64
get mjkey.txt

🟢 PowerShell: $env:PATH += ";C:\Users\*\.mujoco\mjpro150\bin" adds the MuJoCo binaries to PATH for the current session only
PowerShell: $env:PATH -split ";" lists the PATH entries one per line
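To confirm from Python that the MuJoCo `bin` folder really is on PATH, a small check like the one below can help. It is a minimal sketch: the helper name `is_on_path` and the example user folder are made up for illustration, and Windows path rules are applied through the standard-library `ntpath` module.

```python
import ntpath
import os


def is_on_path(directory, path_value, sep=";"):
    """Return True if `directory` is one of the entries in `path_value`.

    Windows path rules are applied (case-insensitive, trailing slashes
    ignored) by using ntpath explicitly.
    """
    def norm(p):
        return ntpath.normcase(ntpath.normpath(p))

    target = norm(directory)
    return any(norm(entry) == target
               for entry in path_value.split(sep) if entry)


if __name__ == "__main__":
    # Hypothetical location; replace with your own user folder.
    mujoco_bin = r"C:\Users\me\.mujoco\mjpro150\bin"
    print(is_on_path(mujoco_bin, os.environ.get("PATH", ""), sep=os.pathsep))
```

If this prints `False` in a new terminal, the `$env:PATH +=` line has to be run again, because that change does not persist across sessions.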

🟢 download mujoco-py-1.50.1.68.tar.gz from https://pypi.org/project/mujoco-py/1.50.1.68/#files

pip install "cython<3"  
pip install mujoco-py-1.50.1.68.tar.gz  
python D:\github\udacity-deep-reinforcement-learning\python\mujoco-py\examples\body_interaction.py

You might need pip install lockfile and some other packages; install them according to the error messages.
In the worst case your Python version is too new (maybe >=3.9?), and you might need to install mujoco_py manually.
now you should be able to see this.

👉 install gym atari and license
https://stackoverflow.com/a/69602242

pip install -U gym
pip install -U gym[atari,accept-rom-license]
pip install bleach==1.5.0  
pip install --upgrade numpy   
pip install --upgrade tensorboard

👉 install OpenAI Baselines

pip install --upgrade pip setuptools wheel   
pip install opencv-python==4.5.5.64  
git clone https://github.com/openai/baselines.git
cd baselines
pip install -e .

For Python 3.11, you can simply pip install opencv-python without pinning a version.
In my case opencv-python-4.9.0.80 was installed successfully.

👉 install the remaining packages for the deeprl folder.
pip install -r .\deeprl_files\requirements.txt

requirements.txt

# torch
# torchvision
# torchmeta 
# gym==0.15.7
# tensorflow==1.15.0
# opencv-python==4.0.0.21
atari-py
scikit-image==0.14.2
tqdm
pandas
pathlib
seaborn
# roboschool==1.0.34
dm-control2gym  
tensorflow-io

For Python 3.11, loosen the scikit-image version requirement.
I got scikit-image-0.22.0 installed.
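Before opening the notebooks, it can save time to check which of the packages installed above are visible in the active environment. The sketch below uses only the standard library. The list of import names is an assumption (for example `cv2` for opencv-python and `skimage` for scikit-image), and `missing_packages` is a helper name made up for this note.

```python
import importlib.util


def missing_packages(import_names):
    """Return the names that cannot be located in the current environment.

    find_spec only locates a top-level package; it does not import it, so
    heavy packages are not loaded just to run this check.
    """
    return [name for name in import_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names assumed for the packages installed above.
    wanted = ["torch", "gym", "mujoco_py", "cv2", "baselines",
              "skimage", "tqdm", "pandas", "seaborn"]
    missing = missing_packages(wanted)
    print("missing:", missing if missing else "none")
```

A package that is found can still fail at import time (mujoco_py compiles on first import, for instance), so this only catches packages that are absent altogether.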

👉 test the env setup

run notebooks

python -m ipykernel install --user --name=drlnd_p2
jupyter notebook D:\github\udacity-deep-reinforcement-learning\p2_continuous-control\Continuous_Control.ipynb  
jupyter notebook D:\github\udacity-deep-reinforcement-learning\p2_continuous-control\Crawler.ipynb

🟢 python -m deeprl.component.envs

if __name__ == '__main__':
    import time
    ## num_envs=5 will only create 3 env and cause error
    ## "results = _flatten_list(results)"
    ## in "baselines\baselines\common\vec_env\subproc_vec_env.py"
    task = Task('Hopper-v2', num_envs=3, single_process=False)
    state = task.reset()

    ## This might be helpful for custom env debugging
    # env_dict = gym.envs.registration.registry.env_specs.copy()
    # for item in env_dict.items():
    #     print(item)

    start_time = time.time()
    while True:
        action = np.random.rand(task.action_space.shape[0])
        next_state, reward, done, _ = task.step(action)
        print(done)
        if time.time()-start_time > 10: ## run about 10s
            break  
    task.close()

🟢 run examples:
D:\github\udacity-deep-reinforcement-learning\python\deeprl_files\examples.py

if __name__ == '__main__':
    mkdir('log')
    mkdir('tf_log')
    set_one_thread()
    random_seed()
    # -1 is CPU, a non-negative integer is the index of GPU
    # select_device(-1)
    select_device(0) ## GPU
    
    game = 'Reacher-v2'
    # a2c_continuous(game=game)
    # ppo_continuous(game=game)
    ddpg_continuous(game=game)

you should be able to see something like this in the video.

folder `./python/deeprl` structure

https://github.com/ShangtongZhang/DeepRL
https://github.com/ChalamPVS/Unity-Reacher

🟢 copied python files from repo @ShangtongZhang/DeepRL to repo @Nov05/udacity-deep-reinforcement-learning under the './python' folder.

DeepRL/template_jobs.py

ddpg_continuous(game='Reacher-v2', run=0, env=env,
	remark=ddpg_continuous.__name__)

DeepRL/examples.py

def ddpg_continuous(**kwargs):
	config.task_fn = lambda: Task(config.game, env=env)
	run_steps(DDPGAgent(config))

deep_rl/utils/config.py

class Config:
	def __init__(self):
		self.task_fn = None

DeepRL/deep_rl/utils/misc.py

def run_steps(agent):
    config = agent.config
    agent.step()

deep_rl/agent/DDPG_agent.py

class DDPGAgent(BaseAgent):
	self.task = config.task_fn()
	def step(self):

deep_rl/component/envs.py

def make_env(env_id, seed, rank, episode_life=True):
class Task:
    def __init__(self,
                 name,
                 num_envs=1,
		 env=env,
if __name__ == '__main__':
    task = Task('Hopper-v2', 5, single_process=False)
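The excerpts above trace one idea: the config stores a zero-argument function (`task_fn`) and the agent calls it to build its own environment. The sketch below reproduces only that hand-off with stand-in classes. `Task`, `Config`, and `Agent` here are simplified placeholders, not the real DeepRL classes, and `build_agent` is a hypothetical helper.

```python
class Task:
    """Stand-in for deep_rl's Task; only stores its arguments."""
    def __init__(self, name, num_envs=1):
        self.name = name
        self.num_envs = num_envs


class Config:
    def __init__(self):
        self.game = None
        self.task_fn = None  # a zero-argument callable that builds a Task


class Agent:
    """Stand-in for DDPGAgent: the task is built inside the agent."""
    def __init__(self, config):
        self.config = config
        self.task = config.task_fn()  # the environment is created here


def build_agent(game):
    config = Config()
    config.game = game
    # The lambda defers Task creation until the agent calls it.
    config.task_fn = lambda: Task(config.game)
    return Agent(config)


if __name__ == "__main__":
    agent = build_agent("Reacher-v2")
    print(agent.task.name, agent.task.num_envs)
```

Because the environment is created lazily inside the agent, swapping MuJoCo for a Unity environment only requires changing what `task_fn` returns.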

Nov05 commented Apr 7, 2024 (edited)

🟢⚠️ issue solved: training had been slow, so I added torch.nn.BatchNorm1d, but then got the following error. My task has multiple Unity envs, and each env has multiple agents; torch.Size([1, 1, 33]) means there is 1 env with 1 agent.

2024-04-07 03:32:29,914 - root - INFO: Episode 0, Step 0, 0.00 s/episode
🟢 Unity environment has been resetted.
👉 torch.Size([1, 1, 33]) BatchNorm1d(33, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

refer to this set of model training hyperparameters
solution: the original code is for MuJoCo, 1 env with 1 agent, hence tensors such as actions and states are 2-dimensional. For Unity, 1 env has multiple agents, so the tensors carry an extra agent dimension (e.g. [1, 1, 33]) and need to be reduced by 1 dimension before they go through the neural networks.
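The dimension reduction can be pictured as merging the env and agent axes into one batch axis. The sketch below does this on plain nested lists so it runs without PyTorch; the helper names and the sizes (2 envs, 3 agents) are illustrative assumptions. On a torch tensor the same effect is usually obtained with `states.reshape(-1, states.shape[-1])`.

```python
def merge_env_and_agent_dims(batch):
    """Flatten [num_envs][num_agents][state_size] into
    [num_envs * num_agents][state_size]."""
    return [agent_state for env_states in batch for agent_state in env_states]


def shape(nested):
    """Return the shape of a regular nested list, e.g. (2, 3, 33)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


if __name__ == "__main__":
    # 2 envs, 3 agents per env, 33 state values per agent (illustrative sizes)
    states = [[[0.0] * 33 for _ in range(3)] for _ in range(2)]
    print(shape(states))                            # (2, 3, 33)
    print(shape(merge_env_and_agent_dims(states)))  # (6, 33)
```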

Nov05 commented Apr 9, 2024 (edited)

🟢⚠️ issue solved: the nn.BatchNorm1d layer threw ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, state_size]) while the agent was actually evaluating. During training, tensor sizes are usually [mini_batch_size, state_size], so no error is raised. It turned out that I forgot to switch the network to eval mode. It makes sense that you can't normalize a channel that holds a single value. In eval mode the layer does not compute batch statistics; it normalizes with its stored running mean and variance, so a batch of size 1 works.

    ## neural network
    config.network_fn = lambda: DeterministicActorCriticNet(
        config.state_dim,  
        config.action_dim,  
        actor_body=FCBody(config.state_dim, (128,128), gate=nn.LeakyReLU, 
                          init_method='uniform_fan_in', 
                          batch_norm=nn.BatchNorm1d,),
        critic_body=FCBody(config.state_dim+config.action_dim, (128,128), gate=nn.LeakyReLU, 
                           init_method='uniform_fan_in', batch_norm=nn.BatchNorm1d),
        actor_opt_fn=lambda params: torch.optim.Adam(params, lr=1e-4),
        ## for the critic optimizer, it seems that 1e-3 won't converge
        critic_opt_fn=lambda params: torch.optim.Adam(params, lr=1e-4, weight_decay=1e-5),  
        # batch_norm=nn.BatchNorm1d,
        )

DeterministicActorCriticNet(
  (phi_body): DummyBody()
  (actor_body): FCBody(
    (layers): ModuleList(
      (0): Linear(in_features=33, out_features=128, bias=True)
      (1): LeakyReLU(negative_slope=0.01)
      (2): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (3): Linear(in_features=128, out_features=128, bias=True)
      (4): LeakyReLU(negative_slope=0.01)
      (5): Linear(in_features=128, out_features=4, bias=True)
      (6): Tanh()
    )
  )
  (critic_body): FCBody(
    (layers): ModuleList(
      (0): Linear(in_features=37, out_features=128, bias=True)
      (1): LeakyReLU(negative_slope=0.01)
      (2): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (3): Linear(in_features=128, out_features=128, bias=True)
      (4): LeakyReLU(negative_slope=0.01)
      (5): Linear(in_features=128, out_features=1, bias=True)
    )
  )
)
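Why a batch of one fails in training mode but works in eval mode can be seen from what batch normalization computes. The function below is a pure-Python sketch, not PyTorch's implementation: it omits the learnable scale and shift and never updates the running statistics, and the name `batch_norm_1d` is made up for this note.

```python
def batch_norm_1d(batch, running_mean, running_var, training, eps=1e-5):
    """Normalize each feature column of `batch` (a list of rows).

    training=True  -> use the mean/variance of the current batch
    training=False -> use the stored running statistics
    The learnable scale/shift and the running-statistics update are omitted.
    """
    num_features = len(batch[0])
    if training:
        if len(batch) < 2:
            # One value per feature has zero variance: nothing to normalize.
            raise ValueError("Expected more than 1 value per channel when training")
        n = len(batch)
        mean = [sum(row[j] for row in batch) / n for j in range(num_features)]
        var = [sum((row[j] - mean[j]) ** 2 for row in batch) / n
               for j in range(num_features)]
    else:
        mean, var = running_mean, running_var
    return [[(row[j] - mean[j]) / (var[j] + eps) ** 0.5
             for j in range(num_features)] for row in batch]


if __name__ == "__main__":
    single_state = [[1.0, -2.0]]  # batch of size 1, as during evaluation
    running_mean, running_var = [0.0, 0.0], [1.0, 1.0]
    try:
        batch_norm_1d(single_state, running_mean, running_var, training=True)
    except ValueError as err:
        print("train mode:", err)
    print("eval mode:", batch_norm_1d(single_state, running_mean, running_var,
                                      training=False))
```

In PyTorch the mode is switched with `module.eval()` before evaluation and `module.train()` before the next training step.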

Nov05 commented Apr 15, 2024 (edited)

🟢⚠️ issue solved: in the alphazero folder Jupyter notebook, %matplotlib notebook threw Javascript Error: IPython is not defined.

$ jupyter notebook ..\alphazero\alphazero-TicTacToe-advanced.ipynb

✅ solution 1: downgrade with $ pip install "notebook<7",
then run $ conda install -c conda-forge nbconvert mistune to fix the resulting Jupyter Notebook "500 Internal Server Error".
❌ solution 2: upgrade jupyterlab (did not work)

jupyter lab --version
pip install --upgrade jupyterlab
ipython --version
pip install --upgrade ipython

My env drlnd_py310 upgraded jupyterlab from 4.1.4 to 4.1.6 and ipython from 8.22.2 to 8.23.0.

Nov05 commented Oct 19, 2024 (edited)

🟢⚠️ issue solved: p3 Unity Tennis game (MADDPG), an error was raised when forwarding states through the local neural network to get actions.

$ python -m experiments.deeprl_maddpg_continuous --is_training True
RuntimeError: mat1 and mat2 shapes cannot be multiplied (4x24 and 8x128)

  File "D:\github\udacity-deep-reinforcement-learning\python\experiments\deeprl_maddpg_continuous.py", line 133, in <module>
    maddpg_continuous(game='unity-tennis',
  File "D:\github\udacity-deep-reinforcement-learning\python\experiments\deeprl_maddpg_continuous.py", line 76, in maddpg_continuous
    run_episodes(DDPGAgent(config))  ## log by episodes
  File "D:\github\udacity-deep-reinforcement-learning\python\deeprl\utils\misc.py", line 97, in run_episodes
    agent.eval_episodes(by_episode=config.by_episode)
  File "D:\github\udacity-deep-reinforcement-learning\python\deeprl\agent\BaseAgent.py", line 80, in eval_episodes
    episodic_returns = self.eval_episode()
  File "D:\github\udacity-deep-reinforcement-learning\python\deeprl\agent\BaseAgent.py", line 60, in eval_episode
    actions = self.eval_step(self.eval_states)
  File "D:\github\udacity-deep-reinforcement-learning\python\deeprl\agent\DDPG_agent.py", line 134, in eval_step
    actions = to_np(self.network(states))  ## get actions from the local network
  File "D:\Users\guido\miniconda3\envs\drlnd_py310\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Users\guido\miniconda3\envs\drlnd_py310\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\github\udacity-deep-reinforcement-learning\python\deeprl\network\network_heads.py", line 169, in forward
    action = self.actor(phi)
  File "D:\github\udacity-deep-reinforcement-learning\python\deeprl\network\network_heads.py", line 175, in actor
    x = layer(x)
  File "D:\Users\guido\miniconda3\envs\drlnd_py310\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Users\guido\miniconda3\envs\drlnd_py310\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Users\guido\miniconda3\envs\drlnd_py310\lib\site-packages\torch\nn\modules\linear.py", line 116, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (4x24 and 8x128)

◼️ Debug: Set breakpoints in ..\python\experiments\deeprl_maddpg_continuous.py to check the shapes of actions and states. Both are fine. However, the shape of mat1 is 4x24 for 2 eval envs and 10x24 for 5 eval envs (2 agents per env), while mat2 is 8x128, so the first linear layer was built with the wrong input size.

The input length is supposed to be 24 (8 values × 3 stacked observations), but the network was built with 8.
Check the params in file ..\python\unityagents\brain.py.

class BrainParameters:
    def __init__(self, brain_name, brain_param):
        self.vector_observation_space_size = brain_param["vectorObservationSize"]  ## 8
        self.num_stacked_vector_observations = brain_param["numStackedVectorObservations"]  ## 3

Change the code in file ..\python\deeprl\component\envs.py
from brain_params.vector_observation_space_size to brain_params.vector_observation_space_size*brain_params.num_stacked_vector_observations:

def get_unity_spaces(brain_params: BrainParameters): 
    """
    translate Unity ML-Agents spaces to gym spaces for compatibility with the deeprl and Baselines packages
    """
    if brain_params.vector_observation_space_type=='continuous':
        observation_space = Box(
            float('-inf'), float('inf'), 
            (brain_params.vector_observation_space_size*brain_params.num_stacked_vector_observations,), 
            np.float64)
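The arithmetic behind the fix is small enough to check on its own. The sketch below uses a stand-in class holding only the two fields quoted above. `BrainParamsStub` and `flat_observation_size` are hypothetical names, not part of unityagents.

```python
class BrainParamsStub:
    """Stand-in holding the two BrainParameters fields used here."""
    def __init__(self, vector_observation_space_size,
                 num_stacked_vector_observations):
        self.vector_observation_space_size = vector_observation_space_size
        self.num_stacked_vector_observations = num_stacked_vector_observations


def flat_observation_size(brain_params):
    """Length of the observation vector each agent actually receives."""
    return (brain_params.vector_observation_space_size
            * brain_params.num_stacked_vector_observations)


if __name__ == "__main__":
    # Values reported for Unity Tennis in the comment above
    tennis = BrainParamsStub(vector_observation_space_size=8,
                             num_stacked_vector_observations=3)
    print(flat_observation_size(tennis))  # 24
```

The first linear layer of the actor must take this product (24) as its input size, which matches the second dimension of mat1 in the error message.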

Nov05 commented Oct 27, 2024 (edited)

🟢⚠️ issue solved: "Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed)." Fix: add .detach() to a (the local actors' output tensors concatenated together).

$ python -m experiments.deeprl_maddpg_continuous --is_training True
..\python\deeprl\agent\MADDPG_agent.py

Traceback (most recent call last):
  File "D:\Users\guido\miniconda3\envs\drlnd_py310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "D:\Users\guido\miniconda3\envs\drlnd_py310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "D:\github\udacity-deep-reinforcement-learning\python\experiments\deeprl_maddpg_continuous.py", line 142, in <module>
    maddpg_continuous(game='unity-tennis',
  File "D:\github\udacity-deep-reinforcement-learning\python\experiments\deeprl_maddpg_continuous.py", line 85, in maddpg_continuous
    run_episodes(MADDPGAgent(config))  ## log by episodes
  File "D:\github\udacity-deep-reinforcement-learning\python\deeprl\utils\misc.py", line 101, in run_episodes
    agent.step()
  File "D:\github\udacity-deep-reinforcement-learning\python\deeprl\agent\MADDPG_agent.py", line 159, in step
    actor_loss.backward()
  File "D:\Users\guido\miniconda3\envs\drlnd_py310\lib\site-packages\torch\_tensor.py", line 522, in backward
    torch.autograd.backward(
  File "D:\Users\guido\miniconda3\envs\drlnd_py310\lib\site-packages\torch\autograd\__init__.py", line 266, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.

actor_loss = -self.networks[i].critic(
                    states_.reshape(self.config.mini_batch_size, -1), 
                    a.reshape(self.config.mini_batch_size, -1).detach()
                    ).mean(dim=0)

Nov05 commented Oct 27, 2024 (edited)

🟢⚠️ issue solved: Tennis game, more than 1 env to train and test. Reset the envs once they are all done.

solution: when `self.states` is `None`, the agent will reset the envs. Hence make sure to set `self.states = None  ## reset`.

Max episodes:  39%|███████████████████████████████████▉                                                        | 78/200 [00:31<00:41,  2.91it/s]2024-10-27 02:36:37,299 - root - INFO: Episode 78, Step 1456, 0.04 s/episode
2024-10-27 02:36:37,368 - root - INFO: Episode 78, Step 1483, episodic_return_train 0.05000000074505806
2024-10-27 02:36:37,368 - root - INFO: Episode 79, Step 1484, 0.07 s/episode
Process SpawnProcess-2:
Process SpawnProcess-1:
Traceback (most recent call last):
Traceback (most recent call last):
  File "D:\Users\guido\miniconda3\envs\drlnd_py310\lib\multiprocessing\process.py", line 314, in _bootstrap
    self.run()
  File "D:\Users\guido\miniconda3\envs\drlnd_py310\lib\multiprocessing\process.py", line 314, in _bootstrap
    self.run()
  File "D:\Users\guido\miniconda3\envs\drlnd_py310\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "D:\Users\guido\miniconda3\envs\drlnd_py310\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "D:\github\udacity-deep-reinforcement-learning\python\deeprl\component\envs.py", line 372, in unity_worker
    brain_info = env.step(data)[brain_name] ## info type ".unityagents.brain.BrainInfo"
  File "D:\github\udacity-deep-reinforcement-learning\python\deeprl\component\envs.py", line 372, in unity_worker
    brain_info = env.step(data)[brain_name] ## info type ".unityagents.brain.BrainInfo"
  File "D:\github\udacity-deep-reinforcement-learning\python\unityagents\environment.py", line 384, in step
    raise UnityActionException("⚠️ The episode is completed. Reset the environment with 'reset()'")
  File "D:\github\udacity-deep-reinforcement-learning\python\unityagents\environment.py", line 384, in step
    raise UnityActionException("⚠️ The episode is completed. Reset the environment with 'reset()'")
unityagents.exception.UnityActionException: ⚠️ The episode is completed. Reset the environment with 'reset()'
unityagents.exception.UnityActionException: ⚠️ The episode is completed. Reset the environment with 'reset()'
Max episodes:  40%|████████████████████████████████████▎                                                       | 79/200 [00:38<00:58,  2.06it/s]
Traceback (most recent call last):
  File "D:\Users\guido\miniconda3\envs\drlnd_py310\lib\multiprocessing\connection.py", line 312, in _recv_bytes
    nread, err = ov.GetOverlappedResult(True)
BrokenPipeError: [WinError 109] The pipe has been ended

It seems that for the Unity Reacher game (p2) all episodes take the same number of steps to finish, whereas for the Unity Tennis game the number of steps per episode varies.
In python\deeprl\agent\BaseAgent.py:

        if self.config.num_workers > 0:  ## agent could have no task when eval
            self.total_episodic_returns = [None] * self.config.task.num_envs   ## added by nov05
            self.episode_dones = [False] * self.config.task.num_envs  ## added by nov05

And in MADDPGAgent and DDPGAgent, change the logic that decides whether all envs are done. Do the same for the eval logic:

        ## check whether the episode is done
        for i,(done,info) in enumerate(zip(dones,infos)):
            if np.any(done):  ## or np.all(done) which should be the same
                self.episode_dones[i] = True
                self.total_episodic_returns[i] = info['episodic_return']
        if all(self.episode_dones): ## all envs finish one episode
            ## reset self.episode_dones in "python\deeprl\utils\misc.py"
            ## log train returns
            self.record_online_return(self.total_episodic_returns, 
                                      by_episode=self.config.by_episode)  
            self.states = None  ## reset
            self.total_episodic_returns = [None] * self.task.num_envs  ## reset
            self.total_episodes += 1
        self.total_steps += 1
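The bookkeeping above can be tried in isolation with a fake stream of `dones` flags. This is a simplified sketch under the assumption of one boolean per env per step (the real code uses `np.any(done)` over the agents of each env). `reset_points` is a made-up helper name.

```python
def reset_points(done_stream, num_envs):
    """Return the step indices at which every env has finished an episode.

    done_stream: iterable of per-step lists, one boolean per env.
    """
    episode_dones = [False] * num_envs
    reset_steps = []
    for step, dones in enumerate(done_stream):
        for i, done in enumerate(dones):
            if done:
                episode_dones[i] = True   # remember that env i has finished
        if all(episode_dones):            # all envs finished one episode
            reset_steps.append(step)      # this is where self.states = None
            episode_dones = [False] * num_envs
    return reset_steps


if __name__ == "__main__":
    # Two envs whose episodes end at different steps
    stream = [[False, False],
              [True, False],    # env 0 finishes first
              [False, False],
              [False, True],    # env 1 finishes -> reset both
              [True, True]]     # both finish on the same step -> reset
    print(reset_points(stream, num_envs=2))  # [3, 4]
```

The flags are remembered across steps, so an env that finishes early still counts when the slower env finishes later.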

Nov05 commented Oct 28, 2024 (edited)

⚠️ issue: p3 Unity Tennis game, MADDPG agent. When it uses the PrioritizedReplay buffer, the sampled states etc. contain NaNs, which cause all the neural network outputs, such as a_target (action), q_target (Q-value), a, q, etc., to be NaNs.

debug: the local critic outputs NaNs, hence the actor loss is NaN during training. However, the target critic and the previous local critic forward pass seem fine. states_ can range over [-20, 20] or more, and a (actions) over [-1, 1].
actor_loss = -self.networks[i].critic(states_.reshape(self.config.mini_batch_size,-1), a).mean(dim=0)
try to clip the actions to be within the action space, which is [-1,1] for Unity Tennis.
try to clip the states to be within the range of [-10,10].
try to normalize the states: config.state_normalizer = MeanStdNormalizer()
try to clip the gradients before the optimizers step.
torch.nn.utils.clip_grad_norm_(self.networks[i].critic_body.parameters(), max_norm=1.0)
torch.nn.utils.clip_grad_norm_(self.networks[i].actor_body.parameters(), max_norm=1.0)

debug network parameters:

              q = self.networks[i].critic(states_.reshape(self.config.mini_batch_size, -1), a)
              if torch.isnan(q).any():
                  print('🙄 q', q)
                  for param in self.networks[i].critic_body.parameters():
                      if torch.isnan(param).any():
                          print("🙄 NaN found in parameters")
                      if torch.isinf(param).any():
                          print("🙄 Inf found in parameters")

Then I found a bug. I had written the wrong field, sampling_probs_ = tensor(transitions.mask). The corrected code:

          sampling_probs_ = tensor(transitions.sampling_prob).unsqueeze(-1).transpose(0, 1)
          sample_weights_ = 1.0 / (sampling_probs_ * self.replay.size())  ## Caution: it might create Inf
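The "might create Inf" caution comes from sampling probabilities that are zero. One common guard is to floor the probability before dividing. The sketch below shows that idea on plain Python floats; the floor value `min_prob` and the function name are assumptions, not part of the deeprl code. On torch tensors a zero probability yields Inf, while in plain Python the same division raises ZeroDivisionError.

```python
import math


def importance_weights(sampling_probs, buffer_size, min_prob=1e-8):
    """w_i = 1 / (N * P(i)), with P(i) floored at min_prob
    so that a zero probability cannot cause a division by zero."""
    return [1.0 / (max(p, min_prob) * buffer_size) for p in sampling_probs]


if __name__ == "__main__":
    probs = [0.5, 0.25, 0.0]  # the last entry would fail without the floor
    weights = importance_weights(probs, buffer_size=4)
    print(weights)
    print(all(math.isfinite(w) for w in weights))
```

The floored weight is very large, so in practice it is usually worth checking why a sampled transition has zero probability in the first place.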

Nov05 commented Nov 2, 2024 (edited)

🟢⚠️ issue: pip install torchrl failed. Uninstalled it, then got an error. Tried to reinstall torchvision and got another error.

No time for this, so I just reinstalled the drlnd_py310 local environment.
https://gist.github.com/Nov05/36ed6fff08f16f29c364090844eb1d24
Then created a backup env drlnd_py310_backup for drlnd_py310.

(drlnd_py310) PS D:\github\udacity-deep-reinforcement-learning\python> python -m experiments.deeprl_maddpg_continuous --is_training True                  
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.

(drlnd_py310) PS D:\github\udacity-deep-reinforcement-learning> conda deactivate drlnd_py310
(base) PS D:\github\udacity-deep-reinforcement-learning> conda create --name drlnd_py310_backup --clone drlnd_py310
Source:      D:\Users\guido\miniconda3\envs\drlnd_py310
Destination: D:\Users\guido\miniconda3\envs\drlnd_py310_backup
Packages: 115
Files: 40037

Downloading and Extracting Packages:


Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate drlnd_py310_backup
#
# To deactivate an active environment, use
#

Nov05 commented Nov 16, 2024 (edited)

🟢⚠️ issue solved: google colab, matd3 notebook

!pip install protobuf==3.19.0
!export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
[<ipython-input-5-03a59240fed2>](https://localhost:8080/#) in <cell line: 13>()
     11 import numpy
     12 import torch
---> 13 from unityagents import UnityEnvironment
     14 
     15 import matplotlib.pyplot as plt

4 frames
[/usr/local/lib/python3.10/dist-packages/google/protobuf/descriptor.py](https://localhost:8080/#) in __new__(cls, name, full_name, index, number, type, cpp_type, label, default_value, message_type, enum_type, containing_type, is_extension, extension_scope, options, serialized_options, has_default_value, containing_oneof, json_name, file, create_key)
    551                 has_default_value=True, containing_oneof=None, json_name=None,
    552                 file=None, create_key=None):  # pylint: disable=redefined-builtin
--> 553       _message.Message._CheckCalledFromGeneratedFile()
    554       if is_extension:
    555         return _message.default_pool.FindExtensionByName(full_name)

TypeError: Descriptors cannot be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
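One caveat about the second workaround in the message: in a notebook, a `!export ...` line runs in a temporary subshell, so the variable never reaches the Python kernel. If the pure-Python parser is wanted instead of (or in addition to) the protobuf downgrade, a minimal sketch is to set the variable from Python before anything imports protobuf:

```python
import os

# Must run before the first import that pulls in google.protobuf
# (for example `from unityagents import UnityEnvironment`).
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"

print(os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"])
```

If protobuf has already been imported in the session, the runtime needs a restart before this takes effect.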
