Last active February 17, 2021 10:55
Create and test a set of sockets over a single test run.
class SocketTester():
    """ create and test a set of sockets over a single test run """

    def __init__(self, socket=PowerSocket, socket_order=socket_order, **kwargs ):
        # create supplied socket type with a mean value defined by socket order
        self.sockets = [socket((q*2)+2, **kwargs) for q in socket_order]

    def charge_and_update(self,socket_index):
        """ charge from the chosen socket and update its mean reward value """
        reward = self.sockets[socket_index].charge()
        self.sockets[socket_index].update(reward)

    def select_socket( self, t ):
        """ choose the socket with the current highest mean reward
            or arbitrarily select a socket in the case of a tie """
        socket_index = random_argmax([socket.sample() for socket in self.sockets])
        return socket_index

    def run( self, number_of_steps ):
        """ perform a single run for the defined number of steps """
        for t in range(number_of_steps):
            # select a socket
            socket_index = self.select_socket(t)
            # charge from the chosen socket and update its mean reward value
            self.charge_and_update(socket_index)
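SocketTester only needs three methods from whatever socket class it is given: charge(), update(reward) and sample(). The real PowerSocket lives in the main code file and is not shown here, so the class below is a hypothetical stand-in. The name StubSocket, the unit-variance Gaussian reward and the sample-average update are all assumptions made for illustration, not the author's implementation.

```python
import random

class StubSocket:
    """Hypothetical stand-in for PowerSocket (not the author's class)."""

    def __init__(self, q):
        self.q = q      # true mean reward, hidden from the agent
        self.Q = 0.0    # running estimate of the mean reward
        self.n = 0      # number of times this socket has been charged from

    def charge(self):
        # assumption: rewards are Gaussian around the true mean, variance 1
        return random.gauss(self.q, 1.0)

    def update(self, reward):
        # incremental sample-average update of the estimate
        self.n += 1
        self.Q += (reward - self.Q) / self.n

    def sample(self):
        # a greedy agent simply reports its current estimate
        return self.Q

# charge repeatedly from one socket and watch the estimate approach q
random.seed(0)
socket = StubSocket(q=6)
for _ in range(1000):
    socket.update(socket.charge())
```

Any class with these three methods could be passed as the socket argument of SocketTester, which is why select_socket calls sample() rather than reading an estimate directly.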
WhatIThinkAbout (Author) commented on Feb 17, 2021 via email
Hi,
random_argmax is defined in the main code file and is the correct function to use here. When several choices share the same maximum value, it picks one of them at random.
The standard argmax, on the other hand, always returns the first of the tied maximum values, which isn’t good when you want to encourage exploration in the bandit problem.
The “Part 2 – The Bandit Framework” notebook explains this further.
Cheers,
Steve
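The difference described in the reply above can be seen in a few lines. The function below is a minimal sketch of a tie-breaking argmax, assuming NumPy. The actual random_argmax is defined in the main code file and may be written differently, so treat this body as an illustration rather than the author's definition.

```python
import numpy as np

def random_argmax(values):
    """Return the index of a maximum value, breaking ties at random.

    Sketch only: the real function lives in the author's main code file.
    """
    values = np.asarray(values)
    # indices of every element equal to the maximum
    best = np.flatnonzero(values == values.max())
    return int(np.random.choice(best))

estimates = [0.0, 5.0, 5.0, 1.0]     # sockets 1 and 2 are tied

# the standard argmax always returns the first tied index
first = int(np.argmax(estimates))

# the tie-breaking version spreads its choices over both tied sockets
picks = {random_argmax(estimates) for _ in range(200)}
```

With the standard argmax, socket 2 would never be chosen while the tie lasts. With random tie-breaking, both tied sockets get tried, which matters early in a run when all estimates start out equal.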
In reply to a notification sent 17 February 2021 10:02, in which @ho4040 commented on this gist:
sir, I think random_argmax should be argmax.