I used the cross-entropy method (an evolutionary algorithm / derivative-free optimization method) to optimize small two-layer neural networks.
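For concreteness, here is a minimal sketch of the cross-entropy method in this setting. It assumes numpy and a hypothetical `evaluate(theta)` that returns the mean episode return of the MLP whose weights are the flat vector `theta`; it illustrates the idea and is not the code from the repository.

```python
import numpy as np

def cem(evaluate, dim, n_iters=100, batch_size=25, elite_frac=0.2,
        initial_std=1.0):
    """Cross-entropy method over a diagonal Gaussian in parameter space.

    A sketch under the stated assumptions; hyperparameter defaults here
    are illustrative, not the repository's settings.
    """
    mean = np.zeros(dim)
    std = np.full(dim, initial_std)
    n_elite = max(1, int(round(batch_size * elite_frac)))
    for _ in range(n_iters):
        # Sample a population of flat parameter vectors from the
        # current search distribution.
        thetas = mean + std * np.random.randn(batch_size, dim)
        # Score each sample, e.g. by its mean episode return.
        returns = np.array([evaluate(th) for th in thetas])
        # Refit the Gaussian to the top elite_frac of samples.
        elite = thetas[np.argsort(returns)[-n_elite:]]
        mean, std = elite.mean(axis=0), elite.std(axis=0)
    return mean
```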
Code used to obtain these results can be found at https://github.com/joschu/modular_rl, commit 3324639f82a81288e9d21ddcb6c2a37957cdd361. The command-line invocations used for all the environments can be found in the text file below. Note that the exact same parameters were used for every task. The important parameters are:
hid_sizes=10,5
: hidden layer sizes of the MLP
extra_std=0.01
: noise added to the variance (see [1] and the sketch below)
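The `extra_std` parameter can be folded into the sketch above by perturbing the refit step. A plausible form, assuming the noise is added to the variance as in [1] (the repository may instead use a decaying schedule), is:

```python
# Replace the plain refit `std = elite.std(axis=0)` in the sketch with
# a noisy one: adding extra_std**2 to the variance keeps the search
# distribution from collapsing prematurely (assumed form; cf. [1]).
std = np.sqrt(elite.var(axis=0) + extra_std ** 2)
```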