import glob
import os
from gym import envs
import shutil
# collect all valid env names
envnames = []
for env_spec in envs.registry.all():
    name = env_spec.id
    if 'ram' not in name and "-v4" in name and 'Deterministic' not in name:
import re
x = r""
x = re.sub("~", "", x)
x = re.sub(r"\$[\\0-9a-zA-Z\{\},()\[\]_+-=\^|\s*%']*\$", "EQ", x)
x = re.sub(r"emph{", "", x)
x = re.sub(r"ref\s*}", "1", x)
x = re.sub(r"\\citep\{[\w\s\d,-]*\}", "", x)
x = re.sub(r"\\cite\{[\w\s\d,]*\}", "", x)
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.visible_device_list = '0'
config.gpu_options.allow_growth = True
tf.enable_eager_execution(config=config)
PRIOR_SCALE = 2.
# Paper: Szita I, Lörincz A. Learning Tetris using the noisy cross-entropy method[J]. Neural Computation, 2006, 18(12):2936.
# Code: https://gym.openai.com/evaluations/eval_HIz0KjtWSvW06yKKPiaF5A
# Notes:
# 1. Linear function approximation: the number of parameters to learn is the feature dimension of the state s, plus 1 for the bias.
# 2. The cross-entropy method is an evolutionary algorithm that searches for the optimal parameters iteratively.
#    First, batch_size parameter vectors are sampled from a normal distribution with initial parameters \mu and \sigma.
#    Each vector is then scored by an evaluation function, and the top n_elite vectors, ordered by score, are picked.
#    These n_elite vectors are used to estimate the new \mu and \sigma.
# 3. Repeat step 2 to keep updating \mu and \sigma.
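The notes above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code from the linked evaluation: the function names (`linear_value`, `cross_entropy_method`), the default hyperparameters, and the toy objective (negative squared distance to a made-up `target_theta`) are all hypothetical, and a real evaluation function would instead run episodes in an environment and return the total reward. The sketch also omits the extra variance noise that the cited paper adds to avoid premature convergence.

```python
import numpy as np


def linear_value(theta, state):
    """Linear function approximation: one weight per state feature plus a bias."""
    weights, bias = theta[:-1], theta[-1]
    return float(np.dot(weights, state) + bias)


def cross_entropy_method(evaluate, n_params, batch_size=50, n_elite=10,
                         n_iters=30, init_sigma=1.0, seed=0):
    """Plain cross-entropy method with a diagonal Gaussian (hypothetical defaults)."""
    rng = np.random.default_rng(seed)
    mu = np.zeros(n_params)
    sigma = np.full(n_params, init_sigma)
    for _ in range(n_iters):
        # Step 2a: sample batch_size parameter vectors from N(mu, sigma^2).
        thetas = mu + sigma * rng.standard_normal((batch_size, n_params))
        # Step 2b: score every vector with the evaluation function.
        scores = np.array([evaluate(theta) for theta in thetas])
        # Step 2c: keep the n_elite vectors with the highest scores.
        elite = thetas[np.argsort(scores)[-n_elite:]]
        # Step 2d: re-estimate mu and sigma from the elite set.
        mu = elite.mean(axis=0)
        sigma = elite.std(axis=0)
    return mu, sigma


# Toy problem (assumption): a 2-feature state, so 2 weights + 1 bias to learn.
state_dim = 2
n_params = state_dim + 1
target_theta = np.array([0.5, -0.3, 0.2])


def toy_evaluate(theta):
    # Higher is better; the maximum is reached at theta == target_theta.
    return -float(np.sum((theta - target_theta) ** 2))


mu_final, sigma_final = cross_entropy_method(toy_evaluate, n_params)
print("estimated parameters:", mu_final)
print("value for state [1, 1]:", linear_value(mu_final, np.ones(state_dim)))
```

Because `sigma` is refit from only the elite samples, it shrinks at every iteration; on harder objectives this can freeze the search before it reaches a good region, which is the problem the noise term in the cited paper is meant to address.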