@pythonlessons
Created November 10, 2020 14:57
get_gaes
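import copy
import numpy as np

def get_gaes(self, rewards, dones, values, next_values, gamma = 0.99, lamda = 0.9, normalize=True):
    # TD residuals: delta_t = r_t + gamma * (1 - done_t) * V(s_{t+1}) - V(s_t)
    deltas = [r + gamma * (1 - d) * nv - v for r, d, nv, v in zip(rewards, dones, next_values, values)]
    deltas = np.stack(deltas)
    gaes = copy.deepcopy(deltas)
    # Accumulate backwards: A_t = delta_t + gamma * lamda * (1 - done_t) * A_{t+1}
    for t in reversed(range(len(deltas) - 1)):
        gaes[t] = gaes[t] + (1 - dones[t]) * gamma * lamda * gaes[t + 1]
    # Value targets are computed before the advantages are normalized
    target = gaes + values
    if normalize:
        gaes = (gaes - gaes.mean()) / (gaes.std() + 1e-8)
    return np.vstack(gaes), np.vstack(target)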
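
The backward recursion in get_gaes computes generalized advantage estimates (GAE). The sketch below restates it as a standalone function and checks it on a toy episode against the direct sum of discounted TD residuals. The helper name `gae_sketch`, the 1-D array inputs, and the reward and value numbers are assumptions made for illustration; they are not part of the gist. Normalization is left out so the numbers stay easy to verify by hand.

```python
import numpy as np

def gae_sketch(rewards, dones, values, next_values, gamma=0.99, lam=0.9):
    """Standalone restatement of the recursion (hypothetical helper, not the gist's method)."""
    rewards, dones, values, next_values = (
        np.asarray(x, dtype=np.float64) for x in (rewards, dones, values, next_values)
    )
    # TD residuals; the bootstrap term is dropped on terminal steps.
    deltas = rewards + gamma * (1.0 - dones) * next_values - values
    gaes = deltas.copy()
    # Walk backwards so each step reuses the already accumulated advantage of the next one.
    for t in reversed(range(len(deltas) - 1)):
        gaes[t] += (1.0 - dones[t]) * gamma * lam * gaes[t + 1]
    # Return unnormalized advantages and the value targets.
    return gaes, gaes + values

# Toy 3-step episode that terminates on the last step (made-up numbers).
rewards = [1.0, 1.0, 1.0]
dones = [0, 0, 1]
values = [0.5, 0.5, 0.5]
next_values = [0.5, 0.5, 0.0]

advantages, targets = gae_sketch(rewards, dones, values, next_values)

# Direct form for t = 0: sum over k of (gamma * lam)^k * delta_k.
deltas = np.array([1.0 + 0.99 * 0.5 - 0.5, 1.0 + 0.99 * 0.5 - 0.5, 1.0 - 0.5])
direct = sum((0.99 * 0.9) ** k * deltas[k] for k in range(3))
assert np.isclose(advantages[0], direct)
```

With lam set to 0 the advantages reduce to the one-step TD residuals, and with lam set to 1 they become discounted Monte Carlo returns minus the value baseline, which is the bias/variance trade-off the `lamda` parameter controls.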
cnhuhao commented Aug 31, 2021

What does "gaes" stand for? Which words is it an abbreviation of? I have seen this abbreviation in several places, but I don't know what it means.
