This is UNVERIFIED. Need to mock time.time for Eventlet to verify.

Analysis of Eventlet sleep(0) yields and its implications
=========================================================

Author:
Date: 2014-02-02 23:53:11 EST

Table of Contents
=================
1 Summary
2 Background
    2.1 Hubs
    2.2 BaseHub
        2.2.1 Timers
        2.2.2 Scheduling
3 Implications

1 Summary
==========

It appears that Eventlet's sleep(0), the canonical way to "yield" the
current execution context, is not guaranteed to always yield the CPU
under CPython.

Two timers created sequentially with the same absolute expiry time can
have Timer() objects whose "id" values are in the reverse order of
their creation, causing the second timer created to fire before the
first.

This can manifest as a spawn() followed by a sleep(0) in which the
spawned coroutine has not run by the time the spawning coroutine
resumes.

2 Background
=============

The canonical way to "yield" the current execution context is to call
sleep(0). The eventlet.sleep() function is defined in
eventlet/greenthread.py[1] to schedule the current greenlet to run
again "seconds" seconds from now. It uses the current hub's
schedule_call_global method to record the request. For all hubs
built on the base class BaseHub[2], schedule_call_global creates a
Timer object and adds it to the list of timers.

The OpenStack Swift project uses the "poll" Eventlet hub by
default[3], with a fallback to the "selects" hub if poll is not
available. Most of the analysis here flows from that context.

2.1 Hubs
---------

Eventlet provides a set of hubs using different underlying
technologies for cooperative scheduling. Four of them use the provided
BaseHub class for scheduling[4]:

 Hub        BaseHub   libevent   Twisted
----------+---------+----------+---------
 epolls     X
 kqueue     X
 poll       X
 pyevent               X
 selects    X
 twistedr                         X

2.2 BaseHub
------------

The BaseHub class provides all the functionality a given hub
implementation needs, except for how the hub waits for the next event
to occur.

2.2.1 Timers
~~~~~~~~~~~~~

The BaseHub maintains its timers in a list of (abstime, Timer()[5])
tuples, ordered and managed by the heapq module functions. Tuples are
ordered lexicographically, so Timer() objects are compared only when
two abstime values in the list are equal.

It appears that Timer() objects define a __lt__ method
which in turn compares the id() of the objects in question.

If two timers in the list have the same absolute time, then under
CPython, where memory addresses are used as the "id" of an object, the
timers will be ordered by where the objects are in memory.

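The tuple ordering described above can be reproduced without Eventlet.
The following is a minimal sketch, assuming a hypothetical FakeTimer
class that mimics the id()-based __lt__ described above; FakeTimer and
pop_order are illustrative names, not part of Eventlet.

```python
import heapq


class FakeTimer:
    """Hypothetical stand-in for a timer; NOT Eventlet's Timer class."""

    comparisons = 0  # counts how often __lt__ is consulted

    def __init__(self, name):
        self.name = name

    def __lt__(self, other):
        # Assumption: mimic the behaviour described above, ordering by id().
        FakeTimer.comparisons += 1
        return id(self) < id(other)


def pop_order(entries):
    """Push (abstime, timer) tuples onto a heap; return names in pop order."""
    heap = []
    for entry in entries:
        heapq.heappush(heap, entry)
    return [heapq.heappop(heap)[1].name for _ in range(len(entries))]


first, second = FakeTimer("first"), FakeTimer("second")

# Distinct abstime values: the floats decide, the timers are never compared.
FakeTimer.comparisons = 0
distinct = pop_order([(2.0, first), (1.0, second)])
distinct_comparisons = FakeTimer.comparisons

# Equal abstime values: tuple comparison falls through to FakeTimer.__lt__,
# so the pop order follows id() rather than creation order.
FakeTimer.comparisons = 0
tied = pop_order([(1.0, first), (1.0, second)])
tied_comparisons = FakeTimer.comparisons
expected_tied = [t.name for t in sorted([first, second], key=id)]
```

In the tied case the result matches expected_tied, which may or may not
be creation order depending on where the two objects landed in memory.
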
2.2.2 Scheduling
~~~~~~~~~~~~~~~~~

The BaseHub[2] schedules all greenlets using a combination of timers
and file descriptor (FD) "listeners". Timers are clocked using wall
clock time by default (one can specify the clock as a parameter on hub
creation). Newly created greenlets are scheduled with a delta timeout
of 0.

The hub is always in a loop, defined in BaseHub.run(), which services
all timers first, then waits for events until the next timeout using
the particular hub instance's wait method for FDs. If no FDs are ready
before the next timeout, the hub's wait() method will block the
process waiting for an event or the approximate time of the next
timeout.

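The loop described above can be sketched roughly as follows. This is a
simplified model under stated assumptions, not BaseHub.run() itself:
run_once, default_wait and the string timer names are hypothetical, and
FD handling is reduced to a wait callable that only receives the
timeout.

```python
import heapq


def run_once(timers, clock, wait, default_wait=60.0):
    """One pass of a simplified hub loop (hypothetical, not Eventlet code).

    timers: heap of (abstime, name) tuples
    clock:  callable returning the current time
    wait:   callable(seconds) standing in for the hub-specific FD wait
    """
    fired = []
    now = clock()
    # 1. Service every timer whose absolute time has been reached.
    while timers and timers[0][0] <= now:
        _, name = heapq.heappop(timers)
        fired.append(name)
    # 2. Block in wait() until the next timer is due, or for a default
    #    period (an assumed value) when no timers remain.
    if timers:
        timeout = max(0.0, timers[0][0] - clock())
    else:
        timeout = default_wait
    wait(timeout)
    return fired


# Usage with a fake clock fixed at t=100.0 and a wait() that only records
# the timeout it was asked to block for.
now = [100.0]
waits = []
timers = []
for entry in [(100.0, "a"), (100.5, "b"), (99.0, "c")]:
    heapq.heappush(timers, entry)

fired = run_once(timers, lambda: now[0], waits.append)
```

Here the two timers that are already due fire in abstime order, and the
loop then asks to block for the half second remaining until "b" is due.
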
3 Implications
===============

On sufficiently fast systems, it is possible for the parent creating
a child coroutine to be scheduled before its child runs even when
the parent cooperatively yields using sleep(0).

This can occur because of how Eventlet sorts its timer list. When a
system is fast enough that the timestamp taken when the new coroutine
is scheduled and the timestamp taken when the parent schedules its
yield are identical, the two timers are ordered by id() rather than by
creation order, and it is possible for the parent to resume before the
child has run.

For example:

    from eventlet import GreenPool, sleep

    thelist = []

    def doset(i):
        # Runs in the spawned (child) coroutine.
        global thelist
        thelist.append(i)

    p = GreenPool()
    for i in range(1000000):
        thelist = []
        p.spawn(doset, 10)
        sleep(0)  # intended to let the child run before continuing
        thelist.append(11)
        if thelist[0] == 11:
            # The parent resumed before the child appended its value.
            print(thelist)
    print("Done")

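Because the example above depends on two timestamps colliding, it is
hard to reproduce on demand. The following is a minimal sketch of the
same situation with an injectable clock standing in for a mocked
time.time. It is a toy model under stated assumptions, not Eventlet
code: ToyTimer and spawn_then_yield are hypothetical names, and
ToyTimer only mimics the id()-based ordering described in the Timers
section.

```python
import heapq
import itertools


class ToyTimer:
    """Hypothetical timer ordered by id(); NOT Eventlet's Timer class."""

    def __init__(self, label):
        self.label = label

    def __lt__(self, other):
        return id(self) < id(other)


def spawn_then_yield(clock):
    """Model spawn() followed by sleep(0); return timers in firing order.

    clock is a callable returning the current time. Both calls schedule a
    timer with a delay of 0, so each abstime is simply clock().
    """
    timers = []
    child = ToyTimer("child")    # timer scheduled by spawn()
    heapq.heappush(timers, (clock() + 0, child))
    parent = ToyTimer("parent")  # timer scheduled by sleep(0)
    heapq.heappush(timers, (clock() + 0, parent))
    return [heapq.heappop(timers)[1] for _ in range(2)]


# A clock that advances between calls: the child always fires first.
ticking = itertools.count(start=1000)
advancing_order = spawn_then_yield(lambda: next(ticking))

# A frozen clock: both timers share one abstime, so id() decides.
frozen_order = spawn_then_yield(lambda: 1000.0)
id_order = sorted(frozen_order, key=id)
```

With the frozen clock, frozen_order always equals id_order; whether that
puts the child or the parent first depends on where the two objects were
allocated.
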
[1] [https://github.com/eventlet/eventlet/blob/master/eventlet/greenthread.py]
[2] [https://github.com/eventlet/eventlet/blob/master/eventlet/hubs/hub.py]
[3] [https://github.com/openstack/swift/blob/master/swift/common/utils.py] get_hub() method
[4] [https://github.com/eventlet/eventlet/blob/master/eventlet/hubs]
[5] [https://github.com/eventlet/eventlet/blob/master/eventlet/hubs/timer.py]