Last active
January 24, 2025 19:58
-
-
Save everilae/9697228 to your computer and use it in GitHub Desktop.
Threaded generator wrapper for Python
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# A simple generator wrapper, not sure if it's good for anything at all. | |
# With basic python threading | |
from threading import Thread | |
try: | |
from queue import Queue | |
except ImportError: | |
from Queue import Queue | |
# ... or use multiprocessing versions | |
# WARNING: use sentinel based on value, not identity | |
from multiprocessing import Process, Queue as MpQueue | |
class ThreadedGenerator(object): | |
""" | |
Generator that runs on a separate thread, returning values to calling | |
thread. Care must be taken that the iterator does not mutate any shared | |
variables referenced in the calling thread. | |
""" | |
def __init__(self, iterator, | |
sentinel=object(), | |
queue_maxsize=0, | |
daemon=False, | |
Thread=Thread, | |
Queue=Queue): | |
self._iterator = iterator | |
self._sentinel = sentinel | |
self._queue = Queue(maxsize=queue_maxsize) | |
self._thread = Thread( | |
name=repr(iterator), | |
target=self._run | |
) | |
self._thread.daemon = daemon | |
def __repr__(self): | |
return 'ThreadedGenerator({!r})'.format(self._iterator) | |
def _run(self): | |
try: | |
for value in self._iterator: | |
self._queue.put(value) | |
finally: | |
self._queue.put(self._sentinel) | |
def __iter__(self): | |
self._thread.start() | |
for value in iter(self._queue.get, self._sentinel): | |
yield value | |
self._thread.join() |
Hi @everilae - I know this gist is quite a few years old, but I still wanted to thank you for sharing it! I was working on something similar when I found this gist, and it was helpful to see what you had done. In case you are interested I have just posted mine here.
@Kaiserouo - I found your comments very useful. I was already keen to make things as watertight as possible, especially if consuming from the generator in a way that might not exhaust it. Because the code cannot know how many values will be consumed, the only way is to provide a close method - preferebly with the class acting as a context manager to invoke it.
My version would therefore write your example as
with ThreadedGenerator(range(1000000), maxq=1) as tg:
print(list(zip(range(6), tg)))
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
This is useful, but will have problem for e.g. infinite iterators, queue_maxsize>0, or other scenarios in which you don't exhaust the iterator.
For example, for this code:
This will never halt, since the thread is still running. This also (sort of) happens if you set queue_maxsize=0, it will try to exhaust the iterator.
The main problem is that for this kind of cases where we just stop iterating in the middle, the thread wouldn't be closed. But I use this code to zip 2 generators, so if my 2 generators don't have exact same length, I will be in trouble, which in my use case is a bit troublesome to deal with.
I don't think there is a simple solution for this, since for the above code,
__del__
won't be called so you don't know if this kind of scenario occurs. Stoppable thread is another part of implementation that needs (copy & paste) effort.So TL;DR: