Created
August 22, 2014 16:17
-
-
Save drincruz/b3904cc338fc60a19f44 to your computer and use it in GitHub Desktop.
Simple example of using a Python generator to fetch a row from a MySQL database
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#!/usr/bin/env python | |
""" | |
Here is a simple example of how you can | |
use a generator to fetch a row from a database | |
""" | |
import MySQLdb | |
def subscriber_generator(): | |
""" | |
Simple generator example to return a row | |
from a MySQL database | |
""" | |
# Simplicity purposes, we're connecting locally | |
db = MySQLdb.connect(user='root',db='your_db_name') | |
# Get a cursor | |
cursor = db.cursor() | |
# Execute your MySQL query | |
cursor.execute( | |
'SELECT id,first_name FROM subscribers_subscribers ' | |
+ 'WHERE id > %s AND id < %s ' | |
+ 'ORDER BY first_name ASC', (1234,1334)) | |
# Iterate through fetchall(), but yield the row | |
for row in cursor.fetchall(): | |
yield row | |
def main(): | |
""" | |
Main | |
""" | |
# You can use the next() method to get a row | |
print(subscriber_generator().next()) | |
# Or you can iterate through all with a for loop | |
for row in subscriber_generator(): | |
print(row) | |
if '__main__' == __name__: | |
main() |
for Python 3.12 I used:
print(next(subscriber_generator()))
I think it's not the best use of memory resources
How about replacing this
for row in cursor.fetchall():
yield row
with this
batch_size = 1000 # your size
while True:
rows = cursor.fetchmany(batch_size)
if not rows:
break
yield rows
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
@odravison thank you! 🙏 it's a bit old, but i'm glad it's useful.
This is a very good question! 💯 I think the important parts for generators are that they get lazy loaded; meaning if we didn't have lines 44:45, they won't get called at all. a more better-versed python expert can probably explain it better than me, but here's a simple example i'll try to explain it with below.
Consider this piece of code, yield_data.py here:
yield_data.py
Since we're using a generator here, we're lazy loading the data (on demand), so we actually don't use much. We are caling
yield_all_rows
, but we're not doing anything with the data. So, our profiler gives us the following:But if we update the call to actually print out values, we actually see the query calls go through. Consider this updated code for yield_data.py
If we run the program now with the profiler, we'll see the following
yield_data.py profiler results
To get back to your point, this last snippet here, will look like a similar code call as if we were just using a list to store the data. So, I guess it really just depends on your needs for data if you wanted to store them in memory or on-demand.