from sqlalchemy import func

def get_count(q):
    count_q = q.statement.with_only_columns([func.count()]).order_by(None)
    count = q.session.execute(count_q).scalar()
    return count

q = session.query(TestModel).filter(...).order_by(...)

# Slow: SELECT COUNT(*) FROM (SELECT ... FROM TestModel WHERE ...) ...
print(q.count())

# Fast: SELECT COUNT(*) FROM TestModel WHERE ...
print(get_count(q))
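For context, a minimal, self-contained sketch of the kind of setup the snippet assumes (SQLAlchemy 1.3-era API; the model, column names, and in-memory SQLite URL are illustrative, not part of the original gist):

from sqlalchemy import Column, Integer, String, create_engine, func
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class TestModel(Base):
    __tablename__ = "test_model"
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine("sqlite://", echo=True)  # echo=True prints both COUNT forms
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

q = session.query(TestModel).filter(TestModel.name == "x").order_by(TestModel.id)
assert q.count() == get_count(q)  # same result, different SQL (visible in the echo)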
This worked perfectly for me after I upgraded from SQLAlchemy 1.3 to 1.4 and a count query went from 80ms to 800ms.
By switching from the built-in .count()
to the first snippet in the gist, the query went back to 80ms. I believe the problem was .count()
loading all columns into Python, which is not required and is very slow for thousands of rows; using with_only_columns
and removing the subquery took it back to 80ms. No idea what broke it in 1.4.
count_q = q.statement.with_only_columns([func.count()]).order_by(None)
count = q.session.execute(count_q).scalar()
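For SQLAlchemy 1.4 and later, where passing a list to with_only_columns() is deprecated in favour of positional column arguments, roughly the same helper can be written as:

from sqlalchemy import func

def get_count(q):
    # 1.4+ signature: columns are passed positionally, not as a list. The FROM
    # clause is then inferred from what remains (e.g. the WHERE clause), so this
    # assumes the query still filters on the model's table.
    count_q = q.statement.with_only_columns(func.count()).order_by(None)
    return q.session.execute(count_q).scalar()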
I'm not sure why people are putting warnings on this gist or writing very long replies expecting help. If you are stuck, ask on Stack Overflow!
My thanks to @hest
This thread is largely misinformation at this point. If you check the dba.stackexchange link in this thread, these two statements produce exactly the same execution plan:
-- on Postgres
EXPLAIN ANALYZE SELECT COUNT(*) FROM some_big_table WHERE some_col = 'some_val';
EXPLAIN ANALYZE SELECT COUNT(*) FROM (SELECT col1, col2, col3, col4 FROM some_big_table WHERE some_col = 'some_val') AS subq;
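To check this on your own database, something along these lines should work; the engine URL, table, and column names are placeholders:

from sqlalchemy import create_engine, text

engine = create_engine("postgresql://user:pass@localhost/mydb")  # placeholder URL

with engine.connect() as conn:
    for sql in (
        "EXPLAIN ANALYZE SELECT COUNT(*) FROM some_big_table "
        "WHERE some_col = 'some_val'",
        "EXPLAIN ANALYZE SELECT COUNT(*) FROM "
        "(SELECT col1, col2, col3, col4 FROM some_big_table "
        "WHERE some_col = 'some_val') AS subq",
    ):
        print(sql)
        for (plan_line,) in conn.execute(text(sql)):
            print("   ", plan_line)
        print()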
If you find your query executing slowly, the first thing to try is removing the ORDER BY.
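With an ORM query, the existing ordering can be stripped without rebuilding the query; a small sketch, assuming q is an ordered Query as in the gist above:

# order_by(None) cancels any ORDER BY already applied to the query, so the
# count (even in its subquery form) no longer pays for a pointless sort.
count = q.order_by(None).count()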
"If you find your query executing slowly, the first thing to try is removing the ORDER BY."
I tried this, and the methods outlined in the Stack Exchange link, and they did not work. Please don't say I'm providing misinformation
by outlining what worked for me; that is rude.
Sorry for the misunderstanding. I meant that the gist (which was posted in 2015) is likely misinformation now, not your comment.
This is the method I'm using:

def get_count(self, model_fields, filter_clause):
    """Note: filter_clause should not be None (or a null filter) for this method to work."""
    query = self.session.query().with_entities(*model_fields)
    query = query.filter(filter_clause)
    count_query = query.statement \
        .with_only_columns([func.count()]) \
        .order_by(None)
    result = query.session.execute(count_query).scalar()
    return result
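A hypothetical call site, just to show the expected arguments (the repository object, model, and filter below are placeholders, not part of the comment above):

# Assumes get_count() is defined on some repository/DAO class that holds a
# SQLAlchemy session as self.session.
count = repo.get_count(
    model_fields=[TestModel.id, TestModel.name],
    filter_clause=TestModel.name == "x",
)
print(count)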
https://dba.stackexchange.com/questions/168022/performance-of-count-in-subquery