pmineiro/opecsbern.ipynb

Last active June 9, 2022 16:38


The latest in OPE-CS (confidence sequences for off-policy evaluation). The construction can track the running mean of a predictable policy sequence in a nonstationary environment, and it does not require an explicit upper bound on the importance weights. For a fixed policy in a stationary environment, the running intersection of the intervals can be used to shrink the interval monotonically.
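
The running intersection mentioned above can be sketched as follows. This is a minimal illustration, not code from the notebook: the function name `running_intersection` and the example bounds are hypothetical, and it assumes the per-step lower and upper bounds have already been computed by a confidence sequence that holds uniformly over time.

```python
import numpy as np

def running_intersection(lower, upper):
    """Intersect all intervals seen so far at each time step.

    lower[t], upper[t] are the (hypothetical) per-step bounds of a
    confidence sequence. The intersected lower bound is the running
    maximum and the intersected upper bound is the running minimum.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return np.maximum.accumulate(lower), np.minimum.accumulate(upper)

# Made-up bounds that widen slightly at step 2.
lo = [0.0, 0.2, 0.1, 0.3]
hi = [1.0, 0.9, 0.95, 0.7]
run_lo, run_hi = running_intersection(lo, hi)

# The width of the intersected interval never increases.
widths = run_hi - run_lo
```

The intersection is only meaningful when the target quantity is fixed, which is why it applies to a fixed policy in a stationary environment; when the running mean itself moves, earlier intervals constrain a different quantity and should not be intersected.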


Author

pmineiro commented Apr 18, 2022

Convergence is more rapid when evaluating policies nearer to the logging policy.
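
One plausible way to see this, offered as an illustration rather than as the notebook's analysis: the second moment of the importance weights under the logging policy grows as the target policy moves away from it. The sketch below assumes a context-free setting with four actions, a uniform logging policy, and a hypothetical family of target policies that mix the logging policy with a greedy one; all names are made up for the example.

```python
import numpy as np

def importance_weight_second_moment(target, logging):
    """E_logging[w^2] for w = target(a) / logging(a), with discrete actions."""
    target = np.asarray(target, dtype=float)
    logging = np.asarray(logging, dtype=float)
    return float(np.sum(target ** 2 / logging))

# Assumed setup: uniform logging policy over 4 actions.
logging = np.full(4, 0.25)
greedy = np.array([1.0, 0.0, 0.0, 0.0])

# eps = 1.0 is the logging policy itself; smaller eps is further away.
moments = {}
for eps in (1.0, 0.5, 0.1):
    target = eps * logging + (1.0 - eps) * greedy
    moments[eps] = importance_weight_second_moment(target, logging)
    print(f"eps={eps:.1f}  E[w^2]={moments[eps]:.4f}")
```

When the target equals the logging policy every weight is 1 and the second moment is 1; as the target concentrates on a single action the second moment rises, so more samples are needed for an interval of the same width.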
