basic_rl.py provides a simple implementation of the SARSA and Q-learning algorithms (selected with the -a flag) with epsilon-greedy or softmax policies (selected with the -p flag). You can also choose an environment other than the default Roulette-v0 with the -e flag. The script also generates a graphical summary of your simulation.
Type the following commands in your console to run the simulation with the default settings.
chmod +x basic_rl.py
./basic_rl.py
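As a rough sketch of how the three flags described above might be parsed, the following uses argparse; the long option names, accepted values, and defaults are assumptions based on the description, not taken from basic_rl.py itself.

```python
import argparse

def build_parser():
    # Hypothetical parser mirroring the flags described above:
    # -a (algorithm), -p (policy), -e (environment). Choices and
    # defaults are illustrative assumptions.
    parser = argparse.ArgumentParser(description="Tabular RL demo")
    parser.add_argument("-a", "--algorithm",
                        choices=["sarsa", "q_learning"],
                        default="q_learning",
                        help="learning algorithm")
    parser.add_argument("-p", "--policy",
                        choices=["epsilon_greedy", "softmax"],
                        default="epsilon_greedy",
                        help="action-selection policy")
    parser.add_argument("-e", "--environment",
                        default="Roulette-v0",
                        help="Gym environment id")
    return parser

if __name__ == "__main__":
    # Parsing an empty argument list yields the default setting.
    args = build_parser().parse_args([])
    print(args.algorithm, args.policy, args.environment)
```

Under this sketch, a non-default run would look like `./basic_rl.py -a sarsa -p softmax -e FrozenLake-v0` (flag values here are illustrative).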