Comparing Two MySQL Queries for Top Solution Authors

This summarizes two queries to get the top users by solutions count, their corrected forms, simulated execution plans, and recommendations. For this video: Laravel Performance: Careful with withCount() by Povilas Korop.

Query 1 – Subquery in SELECT

Corrected Query

SELECT
    users.*,
    (
        SELECT COUNT(*)
        FROM replies
        INNER JOIN threads ON threads.solution_reply_id = replies.id
            AND threads.author_id != replies.author_id
        WHERE
            users.id = replies.author_id
            AND replies.replyable_type = 'threads'
            AND replies.created_at > '2024-09-10 11:29:36'
            AND replies.deleted_at IS NULL
    ) AS solutions_count
FROM users
HAVING
    solutions_count > 0
ORDER BY
    solutions_count DESC
LIMIT 5;

Simulated Execution Plan

id	select_type	table	type	possible_keys	key	rows	Extra
1	PRIMARY	users	ALL	NULL	NULL	1000000	Using temporary; Using filesort
2	DEPENDENT SUBQUERY	replies	ref	idx_replies_filters,author_id	author_id	10	Using where
2	DEPENDENT SUBQUERY	threads	ref	idx_threads_solution,solution_reply_id	idx_threads_solution	5	Using where

Execution Flow Diagram

        +------------+
        |   users    |
        +------------+
              |
              v
     For each user, run subquery
              |
        +------------+
        |  replies   |
        +------------+
              |
              v
        +------------+
        |  threads   |
        +------------+
              |
              v
       Count matching replies

Notes:

Performs a full scan of users.
Executes a dependent subquery per user → N+1 pattern.
Can be costly on large tables.

Query 2 – JOIN + GROUP BY

Corrected Query

SELECT
    users.*,
    COUNT(DISTINCT replies.id) AS solutions_count
FROM users
INNER JOIN replies ON replies.author_id = users.id
INNER JOIN threads ON threads.solution_reply_id = replies.id
    AND threads.author_id != replies.author_id
WHERE
    replies.replyable_type = 'threads'
    AND replies.deleted_at IS NULL
    AND replies.created_at > '2024-09-10 11:29:36'
GROUP BY
    users.id
HAVING
    solutions_count > 0
ORDER BY
    solutions_count DESC
LIMIT 5;

Simulated Execution Plan

id	select_type	table	type	possible_keys	key	rows	Extra
1	SIMPLE	replies	range	idx_replies_filters,author_id	idx_replies_filters	500000	Using where; Using temporary; Using filesort
1	SIMPLE	threads	ref	idx_threads_solution,solution_reply_id	idx_threads_solution	5	Using where
1	SIMPLE	users	eq_ref	PRIMARY	PRIMARY	1

Execution Flow Diagram

        +------------+
        |  replies   | <- filtered by date, type, deleted_at
        +------------+
              |
              v
        +------------+
        |  threads   | <- join on solution_reply_id
        +------------+
              |
              v
        +------------+
        |   users    | <- join on author_id
        +------------+
              |
              v
   GROUP BY user, COUNT(distinct replies)

Notes:

Efficient for leaderboards or global top N queries.
Uses GROUP BY with temporary table and filesort.
Works well with proper indexes.

Recommended Indexes

-- For filtering replies efficiently
CREATE INDEX idx_replies_filters
    ON replies (replyable_type, deleted_at, created_at, author_id);

-- For solution lookups
CREATE INDEX idx_threads_solution
    ON threads (solution_reply_id, author_id);

Comparison & Recommendation

Query	Pros	Cons
Subquery in SELECT	Simple to read, per-user count	Executes subquery per user (N+1), expensive on large tables
JOIN + GROUP BY	Fast for global top N, single pass, leverages indexes	Temporary table + filesort may use memory on large datasets

Verdict:

For a leaderboard / top 5 authors → JOIN + GROUP BY (Query 2) is strongly recommended.
For solutions_count alongside a small filtered set of users → Subquery (Query 1) is readable but risky at scale.

Notes

Always test with EXPLAIN to see row estimates and index usage.
Proper indexing is critical for performance.
Query 2 scales better with millions of users and replies.

tegos/comparing-two-my-sql-queries-for-top-solution-authors.md

Comparing Two MySQL Queries for Top Solution Authors

This summarizes two queries to get the top users by solutions count, their corrected forms, simulated execution plans, and recommendations. For this video: Laravel Performance: Careful with withCount() by Povilas Korop.

Query 1 – Subquery in SELECT

Corrected Query

Simulated Execution Plan

Execution Flow Diagram

Query 2 – JOIN + GROUP BY

Corrected Query

Simulated Execution Plan

Execution Flow Diagram

Recommended Indexes

Comparison & Recommendation

Notes