@pmineiro
pmineiro / opecsbern1f1.ipynb
Last active October 17, 2022 18:56
Updated version of https://gist.github.com/pmineiro/5863eb0ba0b1f6963447f8f500bf0f1c that uses 1F1(...) so that the process is a well-defined test supermartingale (computationally, it is almost identical).
pmineiro / approxpolygammaone.ipynb
Last active August 17, 2022 18:40
Approximation of PolyGamma[1, b] for b >= 1 ... for the closed-form pdrop portion of the CS
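PolyGamma[1, b] is the trigamma function. The notebook's own approximation is not shown on this page, so the following is only a minimal sketch of one standard way to evaluate it, not the author's method: shift the argument upward with the recurrence psi1(x) = psi1(x + 1) + 1/x^2, then apply the usual asymptotic series. The function name `trigamma` and the shift threshold of 6 are assumptions made for illustration.

```python
import math


def trigamma(b: float) -> float:
    """Approximate PolyGamma[1, b] (the trigamma function) for b > 0.

    Illustrative sketch only; not the approximation used in the notebook.
    """
    if b <= 0:
        raise ValueError("this sketch only handles b > 0")

    total = 0.0
    x = b
    # Recurrence psi1(x) = psi1(x + 1) + 1 / x**2 moves the argument
    # into the region where the asymptotic series is accurate.
    while x < 6.0:  # threshold chosen for illustration
        total += 1.0 / (x * x)
        x += 1.0

    inv = 1.0 / x
    inv2 = inv * inv
    # Asymptotic series:
    # 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7) - 1/(30x^9)
    series = inv + 0.5 * inv2 + inv * inv2 * (
        1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 * (1.0 / 42.0 - inv2 / 30.0))
    )
    return total + series


if __name__ == "__main__":
    # Sanity check against the known value psi1(1) = pi^2 / 6.
    print(trigamma(1.0), math.pi ** 2 / 6)
```

A closed-form fit restricted to b >= 1, as the description suggests, would avoid the loop entirely; the recurrence here trades that for simplicity.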
pmineiro / opecsnsbet.ipynb
Last active August 16, 2022 00:29
An improvement over the empirical-Bernstein-based OPE confidence sequence. The nonstationary generalization of a betting e-process provably dominates the empirical Bernstein e-process and does not require finite variance.
pmineiro / opecsdynrangepdrop.ipynb
Last active August 3, 2022 19:20
An off-policy confidence sequence suitable for general purposes which supports oblivious data censorship.
pmineiro / opecsdynrange.ipynb
Last active July 28, 2022 01:05
OPE CS with reward range robustness
pmineiro / opecsbern.ipynb
Last active June 9, 2022 16:38
The latest in OPE-CS. This can track the running mean of a predictable policy sequence in a nonstationary environment and does not require an explicit importance weight upper bound. For a fixed policy in a stationary environment the running intersection can be used to shrink the interval monotonically.
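The running intersection mentioned above can be sketched as follows. This is a generic illustration, not code from the notebook: it assumes the confidence sequence yields a stream of `(lower, upper)` pairs, and the name `running_intersection` is hypothetical. Intersecting is only justified when every interval targets the same fixed quantity, which is why the description restricts it to a fixed policy in a stationary environment.

```python
from typing import Iterable, Iterator, Tuple

Interval = Tuple[float, float]


def running_intersection(intervals: Iterable[Interval]) -> Iterator[Interval]:
    """Yield the intersection of all intervals seen so far.

    The lower bound never decreases and the upper bound never increases,
    so the reported interval shrinks monotonically.
    """
    lo, hi = float("-inf"), float("inf")
    for lower, upper in intervals:
        lo = max(lo, lower)  # keep the tightest lower bound so far
        hi = min(hi, upper)  # keep the tightest upper bound so far
        yield lo, hi


if __name__ == "__main__":
    raw = [(0.0, 1.0), (0.2, 0.9), (0.1, 0.95)]
    for interval in running_intersection(raw):
        print(interval)
```

In the example, the third raw interval is wider than the second, but the intersected interval stays at the tighter bounds already established.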
pmineiro / empBSew.ipynb
Created April 15, 2022 16:32
Empirical Bernstein CS with E[w]=1 constraint works well
pmineiro / mnistactiondependence.ipynb
Last active February 17, 2022 17:39
IGL with action-dependent feedback, MNIST demo