Skip to content

Instantly share code, notes, and snippets.

before index on opens
adrocket.opens| 9.58M | 0.89G | 1.97G | 2.85G | 2.21
SELECT count(*) AS count_all FROM opens USE INDEX (IX_OPEN_OPENED) WHERE (group_id = '9' and opened between '2009-03-16 07:00:00' and '2009-03-17 07:00:00' and ad_id is not null and reason_id = 13);
+-----------+
| count_all |
+-----------+
| 1358 |
+-----------+
1 row in set (2.79 sec)
tablespace usage b4/after
+-------------------------------------+---------+--------+--------+------------+
| concat(table_schema,'.',table_name) | rows | DATA | idx | total_size |
+-------------------------------------+---------+--------+--------+------------+
| adrocket.delivered_ads | 273.21M | 13.63G | 19.22G | 32.85G |
| adrocket.ads | 5.58M | 8.85G | 9.48G | 18.33G |
| adrocket.responses | 61.88M | 4.32G | 11.24G | 15.57G |
| warehouse.ad_responses | 95.32M | 9.86G | 5.24G | 15.10G |
| adrocket.offline_keyword_database | 44.61M | 4.93G | 3.46G | 8.39G |
SELECT concat(table_schema,'.',table_name),concat(round(table_rows/1000000,2),'M') rows,concat(round(data_length/(1024*1024*1024),2),'G') DATA,concat(round(index_length/(1024*1024*1024),2),'G') idx,concat(round((data_length+index_length)/(1024*1024*1024),2),'G') total_size,round(index_length/data_length,2) idxfrac FROM information_schema.TABLES ORDER BY data_length+index_length DESC LIMIT 10;
sql for finding noimage open rate
total clicks for a group:
select count(distinct clicks.id) from clicks where group_id = 9 and scrubbed=0 and cookie_id is not null;
total clicks that have a corresponding impression
select count(distinct c.id) from clicks as c, opens as o where c.group_id = 9 and scrubbed=0 and cookie_id is not null and c.group_id = o.group_id and c.ad_id = o.ad_id and c.email_address_id = o.email_address_id and c.batch_id = o.batch_id;
estimated run time 1min
without apache in the way (haproxy -> tomcat)
transactions: 10000 hits
Availability: 100.00 %
Elapsed time: 14.01 secs
Data transferred: 22.56 MB
Response time: 0.30 secs
Transaction rate: 713.78 trans/sec
Throughput: 1.61 MB/sec
Concurrency: 213.90
time for dumping entire 154 g database:
real 50m52.932s
user 12m6.586s
sys 2m46.209s
sql size: 36 g
time for then gzipping it with --best
clicks by retrievers withn last xxx hours
select count(*),source from clicks,delivered_ads,ads where clicks.delivered_ad_id = delivered_ads.id and delivered_ads.ad_id = ads.id and clicked > date_add(now(), interval -168 HOUR) and clicks.scrubbed=0 group by ads.source;
clicks for shopping
select count(*) from clicks as c,delivered_ads as d,ads as a where c.delivered_ad_id = d.id and d.ad_id = a.id and clicked > date_add(now(), interval -48 HOUR) and c.scrubbed=0 and a.source = 'ShoppingAdsRetriever';
clicks for shopping for those with has_products