Some queries on the data
mysql> select * from userdata order by average desc limit 0,10;
+-------+---------+--------+
| id    | average | number |
+-------+---------+--------+
| 15617 | 5.00000 |      1 |
| 12047 | 5.00000 |      1 |
|  9551 | 5.00000 |     26 |
| 29199 | 5.00000 |      6 |
| 18881 | 5.00000 |     13 |
| 38940 | 5.00000 |      3 |
| 27064 | 5.00000 |      2 |
| 44081 | 5.00000 |      2 |
| 12842 | 5.00000 |     25 |
| 45163 | 5.00000 |     14 |
+-------+---------+--------+
10 rows in set (0.17 sec)
Observe that the users with an average rating of 5 have given very few ratings, which skews their averages.
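One way to reduce this skew is to require a minimum number of ratings before ranking by average. The sketch below is only an illustration: it loads the ten rows printed above into an in-memory SQLite table (the column types and the threshold of 20 are assumptions, and `top_users` is a hypothetical helper), then reruns the query with a `number >= ?` filter.

```python
import sqlite3

# The ten rows printed above: (id, average, number of ratings).
ROWS = [
    (15617, 5.0, 1), (12047, 5.0, 1), (9551, 5.0, 26), (29199, 5.0, 6),
    (18881, 5.0, 13), (38940, 5.0, 3), (27064, 5.0, 2), (44081, 5.0, 2),
    (12842, 5.0, 25), (45163, 5.0, 14),
]


def top_users(conn, min_ratings, limit=10):
    """Return the highest-average users with at least min_ratings ratings."""
    query = (
        "SELECT id, average, number FROM userdata "
        "WHERE number >= ? "
        "ORDER BY average DESC, number DESC LIMIT ?"
    )
    return conn.execute(query, (min_ratings, limit)).fetchall()


conn = sqlite3.connect(":memory:")
# Assumed schema; the real column types are not shown in the original.
conn.execute("CREATE TABLE userdata (id INTEGER, average REAL, number INTEGER)")
conn.executemany("INSERT INTO userdata VALUES (?, ?, ?)", ROWS)

# The threshold of 20 is arbitrary and chosen only for illustration.
filtered = top_users(conn, min_ratings=20)
print(filtered)
```

On these ten rows only users 9551 and 12842 pass the filter. The same `WHERE number >= ...` clause can be added to the MySQL query directly.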
mysql> select * from moviedata order by average desc limit 0,10;
+-------+---------+--------+
| id    | average | number |
+-------+---------+--------+
| 14961 | 4.72330 |  73335 |
|  7230 | 4.71660 |  73422 |
|  7057 | 4.70260 |  74912 |
|  3456 | 4.67100 |   7249 |
|  9864 | 4.63880 |   1747 |
| 15538 | 4.60500 |   1633 |
|  8964 | 4.60000 |     25 |
| 14791 | 4.60000 |     75 |
| 10464 | 4.59550 |     89 |
| 14550 | 4.59340 | 139660 |
+-------+---------+--------+
10 rows in set (0.01 sec)
mysql> select count(*) from userdata where number>=1 and number<=40;
+----------+
| count(*) |
+----------+
|   126413 |
+----------+
1 row in set (0.04 sec)
This shows that almost 25% of the users in the database have given 40 or fewer ratings.
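The percentage can also be computed in a single query instead of dividing the count by the table size by hand. The sketch below assumes the same `userdata` layout and uses made-up rows, so the printed share says nothing about the real data; `share_in_range` is a hypothetical helper. It relies on a comparison evaluating to 0 or 1, so its average is the fraction of matching rows.

```python
import sqlite3


def share_in_range(conn, low, high):
    """Fraction of users whose rating count lies in [low, high]."""
    # BETWEEN yields 0 or 1 per row, so AVG gives the matching fraction.
    query = "SELECT AVG(number BETWEEN ? AND ?) FROM userdata"
    return conn.execute(query, (low, high)).fetchone()[0]


conn = sqlite3.connect(":memory:")
# Assumed schema; the real column types are not shown in the original.
conn.execute("CREATE TABLE userdata (id INTEGER, average REAL, number INTEGER)")
# Made-up rows for illustration only, not taken from the real dataset.
conn.executemany(
    "INSERT INTO userdata VALUES (?, ?, ?)",
    [(1, 3.5, 12), (2, 4.0, 40), (3, 2.8, 41), (4, 3.9, 500)],
)

share = share_in_range(conn, 1, 40)
print(f"{share:.0%} of users have between 1 and 40 ratings")
```

In MySQL the equivalent would be `select avg(number between 1 and 40) from userdata;`, since MySQL also evaluates the comparison to 0 or 1.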
mysql> select * from moviedata order by number desc limit 0,10;
+-------+---------+--------+
| id    | average | number |
+-------+---------+--------+
|  5317 | 3.36130 | 232944 |
| 15124 | 3.72420 | 216596 |
| 14313 | 3.78390 | 200832 |
| 15205 | 3.44220 | 196397 |
|  1905 | 4.15390 | 193941 |
|  6287 | 3.90500 | 193295 |
| 11283 | 4.29990 | 181508 |
| 16377 | 4.30690 | 181426 |
| 16242 | 3.45440 | 178068 |
| 12470 | 3.41190 | 177556 |
+-------+---------+--------+
10 rows in set (0.00 sec)