I ws thinking on using the new TABLESAMPLE feature included in postgres 9.5 in order to calculate clusters for visualization. Obvioulsy doing sampling adds some error to the final clusters but in the other hand provides (theorically) some time improvemens (avoid full scans with SYSTEM method and reduces CPU usage).
I only tested with this table, it should be tested with different data distributions and datasizes but looks promising for some clustering methods.
BERNOULLI
cartodb_dev_user_accf0647-d942-4e37-b129-8287c117e687_db=# select CDB_JenksBins(array_agg(vel::numeric), 7) from sensor_log_2013_10_27_12_01_1 TABLESAMPLE BERNOULLI (10);
cdb_jenksbins
-------------------------------------------------------------------------
{4.3573446,10.612988,15.145245,18.386246,21.200254,24.348658,29.323336}
(1 row)
Time: 1617.736 ms
SYSTEM
cartodb_dev_user_accf0647-d942-4e37-b129-8287c117e687_db=# select CDB_JenksBins(array_agg(vel::numeric), 7) from sensor_log_2013_10_27_12_01_1 TABLESAMPLE SYSTEM (10); LOG: statement: select CDB_JenksBins(array_agg(vel::numeric), 7) from sensor_log_2013_10_27_12_01_1 TABLESAMPLE SYSTEM (10);
cdb_jenksbins
------------------------------------------------------------------------
{1.7491112,14.543345,17.362501,20.191946,21.745567,23.99369,28.706001}
(1 row)
Time: 1710.083 ms
100% sampling
cartodb_dev_user_accf0647-d942-4e37-b129-8287c117e687_db=# select CDB_JenksBins(array_agg(vel::numeric), 7) from sensor_log_2013_10_27_12_01_1;
cdb_jenksbins
-------------------------------------------------------------------------
{5.1650224,13.926011,16.328466,18.545721,21.298002,23.700457,29.858356}
(1 row)
Time: 2261.816 ms
cartodb_dev_user_accf0647-d942-4e37-b129-8287c117e687_db=# select CDB_QuantileBins(array_agg(vel::numeric), 4) from sensor_log_2013_10_27_12_01_1;
cdb_quantilebins
-------------------------------------------
{13.236655,17.239033,21.843311,29.858356}
(1 row)
Time: 23.375 ms
cartodb_dev_user_accf0647-d942-4e37-b129-8287c117e687_db=# select CDB_QuantileBins(array_agg(vel::numeric), 4) from sensor_log_2013_10_27_12_01_1 TABLESAMPLE BERNOULLI (10);
cdb_quantilebins
---------------------------------------
{13.236655,17.3162,21.8999,29.498245}
(1 row)
Time: 3.200 ms
cartodb_dev_user_accf0647-d942-4e37-b129-8287c117e687_db=# select CDB_QuantileBins(array_agg(vel::numeric), 4) from sensor_log_2013_10_27_12_01_1 TABLESAMPLE SYSTEM (10);
cdb_quantilebins
-----------------------------------------------------
{11.662456,17.8255015000000000,22.712723,29.652576}
(1 row)
Time: 2.816 ms
avg percentage and std deviation on 10% sampling for jenks, quartiles and head and tails methods
cartodb_dev_user_accf0647-d942-4e37-b129-8287c117e687_db=# select avg(100*abs(s10 - s100)/s100), stddev(abs(s10 - s100)/s100) from unnest((select CDB_QuantileBins(array_agg(vel::numeric), 4) from sensor_log_2013_10_27_12_01_1), (select CDB_QuantileBins(array_agg(vel::numeric), 4) from sensor_log_2013_10_27_12_01_1 TABLESAMPLE SYSTEM (10))) x(s100, s10);
avg | stddev
--------------------+--------------------------------------------
3.2430976170828547 | 0.0103033362384431124316161531298645641616
cartodb_dev_user_accf0647-d942-4e37-b129-8287c117e687_db=# select avg(100*abs(s10 - s100)/s100), stddev(abs(s10 - s100)/s100) from unnest((select CDB_JenksBins(array_agg(vel::numeric), 4) from sensor_log_2013_10_27_12_01_1), (select CDB_JenksBins(array_agg(vel::numeric), 4) from sensor_log_2013_10_27_12_01_1 TABLESAMPLE SYSTEM (10))) x(s100, s10);
LOG: statement: select avg(100*abs(s10 - s100)/s100), stddev(abs(s10 - s100)/s100) from unnest((select CDB_JenksBins(array_agg(vel::numeric), 4) from sensor_log_2013_10_27_12_01_1), (select CDB_JenksBins(array_agg(vel::numeric), 4) from sensor_log_2013_10_27_12_01_1 TABLESAMPLE SYSTEM (10))) x(s100, s10);
avg | stddev
-------------------------+--------------------------------------------
32.37922940131385264271 | 0.5877181351833277149595473155945874389842
(1 row)
Time: 3558.188 ms
cartodb_dev_user_accf0647-d942-4e37-b129-8287c117e687_db=# select avg(100*abs(s10 - s100)/s100), stddev(abs(s10 - s100)/s100) from unnest((select CDB_HeadsTailsBins(array_agg(vel::numeric), 4) from sensor_log_2013_10_27_12_01_1), (select CDB_HeadsTailsBins(array_agg(vel::numeric), 4) from sensor_log_2013_10_27_12_01_1 TABLESAMPLE SYSTEM(10))) x(s100, s10);
LOG: statement: select avg(100*abs(s10 - s100)/s100), stddev(abs(s10 - s100)/s100) from unnest((select CDB_HeadsTailsBins(array_agg(vel::numeric), 4) from sensor_log_2013_10_27_12_01_1), (select CDB_HeadsTailsBins(array_agg(vel::numeric), 4) from sensor_log_2013_10_27_12_01_1 TABLESAMPLE SYSTEM(10))) x(s100, s10);
avg | stddev
------------------------+--------------------------------------------
1.85379607859246236981 | 0.0094798681590236172686651446763114294001
With BERNOULLI the results are better