RethinkDB count issue and solutions

Problem

count() is O(n): it reads every document in the table or selection, so it gets slower as the table grows.

This can send a new developer running for the hills, since counting looks like a trivial problem, but it is not. While we hope this gets addressed in the future (even in a non-ideal way), there are workarounds.

Relevant Issues:

Solutions

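Use the table's info command if an estimate is enough. doc_count_estimates holds one estimate per shard, so nth(0) only covers the first shard; on a sharded table use sum() in its place.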

r.db('DB').table('TABLE').info()('doc_count_estimates').nth(0)

Upgrade your cluster: a sharded cluster with strong servers (SSD, more memory, etc.) helps a lot. You can also increase --cache-size.
Add a table that stores your counts. You can:

increment the stored count on every insert
use a changefeed, preferably with squash
just save the count() result now and again.
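The changefeed option could look roughly like the sketch below, written against the official JavaScript driver. It assumes a `counts` table (a hypothetical name) that already holds a document `{ id: 'TABLE', count: N }` seeded from a one-off count(). `countDelta` and `trackCount` are illustrative helpers, and `r` and `conn` are the driver module and an open connection supplied by the caller.

```javascript
// Delta that one changefeed notification contributes to the row count.
// Inserts arrive with old_val === null, deletes with new_val === null.
function countDelta(change) {
  const hadOld = change.old_val != null;
  const hasNew = change.new_val != null;
  if (!hadOld && hasNew) return 1;   // insert
  if (hadOld && !hasNew) return -1;  // delete
  return 0;                          // in-place update: count unchanged
}

// r: the rethinkdb driver module, conn: an open connection (both passed in).
// 'DB', 'TABLE' and the 'counts' table are placeholders, not fixed names.
async function trackCount(r, conn) {
  const feed = await r.db('DB').table('TABLE')
    .changes({ squash: true })
    .run(conn);

  feed.each((err, change) => {
    if (err) { console.error(err); return; }
    const delta = countDelta(change);
    if (delta === 0) return;
    // Atomic server-side increment of the stored counter.
    r.db('DB').table('counts').get('TABLE')
      .update({ count: r.row('count').add(delta) })
      .run(conn)
      .catch(console.error);
  });
}
```

With this in place, reading the count becomes a primary-key lookup: r.db('DB').table('counts').get('TABLE')('count'). Changes made while the feed is disconnected are missed, so saving the real count() result now and again is still a useful way to correct drift.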

Add a "position/i/inserted" field to the table and maintain the running value in memory on inserts. That way the last record sorted by index has the count as its "position/i/inserted" property. This only matches the real count as long as documents are never deleted.
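A minimal sketch of the position-field idea, assuming the field is called `position`, a secondary index of the same name exists, and a single process performs all inserts. `makePositionStamper` and `readCount` are hypothetical helper names; `r` and `conn` are again passed in by the caller.

```javascript
// Returns a function that stamps documents with 1-based insert positions,
// continuing after `lastPosition` (the highest position already stored).
function makePositionStamper(lastPosition) {
  let current = lastPosition;
  return (doc) => ({ ...doc, position: ++current });
}

// Reads the highest stored position, which equals the number of inserts so far.
// Assumes a secondary index created once with
//   r.db('DB').table('TABLE').indexCreate('position')
async function readCount(r, conn) {
  return r.db('DB').table('TABLE')
    .max({ index: 'position' })('position')
    .default(0) // empty table: max() raises a non-existence error
    .run(conn);
}
```

At startup the writer would call readCount once, pass the result to makePositionStamper, and then run every document through the returned function before insert(). Because the maximum is read through the index, the lookup does not scan the table.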

Comment

To the best of my knowledge, if the bulk of your work is processing tables with millions of rows and analyzing them, RethinkDB is probably not your best solution. You could also combine it with another DB.

sagivf/RethinkDB count issue and solutions.md
