Last active
November 19, 2018 14:29
-
-
Save jasonhdavis/73664baf24bdb595ffe0b66d01703c01 to your computer and use it in GitHub Desktop.
Efficient MongoDB query in Python to group results by day. This uses a single, simple find query to get all results within a date range. We then build a nested dictionary of [year][month][day] and add results to the dictionary structure
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
################################ | |
########## WHY & How ########### | |
################################ | |
# I had written a query inside of a loop that pulled results for every day | |
# This was very ineffecient | |
# While I belive this could be solved with a MongoDB Aggrigation, the syntax seemed complicated | |
# I could not find good examples for what I needed to do (aggrigate results of a day based on timestamp in one query) | |
# This is likely due to my ignorance | |
# Nonetheless, I thought this method may be helpful to someone else having the same issues as me | |
# Finally, dealing with nested dictionaries in python is slightly difficult as a nested key needs to be added with 'update' | |
# In a non-nested dictionary, the syntax is simple, however, when nesting things get complicated | |
# you can not use update for each iteration because it overwrites the previous value | |
# This was solved by building a key list for Year and Month as they are added to the dictionary | |
# This keylist is converted to a String because if your date range is over 12 months, you will duplicate the month key | |
################################ | |
########## Import ############## | |
################################ | |
from datetime import datetime, timedelta | |
from pymongo import MongoClient | |
from pprint import pprint | |
################################ | |
###### Build Dictionary ######## | |
################################ | |
# In my app, user supplies a date range | |
end = datetime.today() | |
start = end - timedelta(days=600) | |
delta = end - start | |
date_dict={} | |
date_dict.update({start.year:{start.month:{start.day: start}}}) | |
# Incriment over days in the range, setting each key value to 0 | |
y_keys = [str(start.year)] | |
m_keys = [str(start.month)+"-"+str(start.year)] | |
for i in range(delta.days + 1): | |
iter_date = start+timedelta(days=i) | |
year = iter_date.year | |
month = iter_date.month | |
day = iter_date.day | |
if str(year) in y_keys and str(month)+"-"+str(year) in m_keys : | |
date_dict[year][month][day] = 0 | |
elif str(year) in y_keys : | |
date_dict[year].update({month:{day:0}}) | |
m_keys.append(str(month)+"-"+str(year)) | |
else : | |
date_dict.update({year:{month:{day:0}}}) | |
y_keys.append(str(year)) | |
m_keys.append(str(month)+"-"+str(year)) | |
################################ | |
# MongDB Query & Data Handling # | |
################################ | |
mongo = MongoClient('mongodb://localhost:27017') | |
range_search = mongo.sales.orders.find({'paymentDate':{'$lt': end, '$gt':start}}) | |
for item in range_search : | |
date = item['paymentDate'] | |
date_dict[date.year][date.month][date.day]+= item['orderTotal'] | |
################################ | |
########## Results ############# | |
################################ | |
## A dictionary of days inside of the requested date range | |
## order totals are summed based on the payment date timestamp |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment