Last active
August 27, 2015 07:14
Blog: RethinkDB for system-statistics of freifunk-nodes

This story is about how we fetch, process and display nice statistics for
freifunk-nodes, which run a free, open WiFi network.
We also want to implement push notifications for the owner of each
freifunk-node when something goes wrong or a node goes offline.

Status Quo:
Currently we use some hacky bash scripts[1] which generate a text file and curl
it over HTTPS to a server running PHP, which drops the latest values into a
MySQL database.
This means we do many updates on the database but do not keep any historical
information. To work around this, we write some RRDs and parse them with Perl,
generating images for the statistics page of each freifunk-node[2].

Now, today we start something new:
We have set up a RethinkDB 2.1 cluster on 5 v-servers in 3 data centers to
maximize reliability and durability. We plan to add two more nodes in two
additional data centers.
With the new Raft implementation, RethinkDB 2.1 "Forbidden Planet" has, we
think, become mature enough to use in our environment.

Today we are going to implement the backend for the (nearly finished)
completely new, written-from-scratch bash implementation of the statistics
scripts[3] on the node. These scripts are fully functional, pseudo
object-oriented and usable as a library for different use cases.

The backend is going to be written in Python. It accepts the bsdiff binary
transmissions, secured over a fastd 1:1 tunnel with the latest
salsa2012-poly1305-umac public-private crypto. The diff is going to be
decompressed and added to the RethinkDB database.

We hope to implement the rest of the calculation nearly or entirely as
map&reduce in the RethinkDB cluster, to spread the workload across different
servers and locations.

We currently have an average of 1.5 transmissions per second, with a diff size
of 300 KB and an uncompressed JSON file size of 4 KB. This means we get nearly
4 million transmissions per month with an uncompressed size of nearly 15 GB.
Since we are getting more and more freifunk-nodes, we have to think about an
implementation that scales well, and we think we have found it in a RethinkDB
cluster.
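
The monthly numbers can be checked with a few lines of Python. This is only a
back-of-the-envelope sketch: the 30-day month is an assumption, and the
function names are made up for illustration. It gives about 3.9 million
transmissions and roughly 15 GB (about 15.6 GB in decimal units, about
14.8 GiB in binary units).

```python
# Back-of-the-envelope check of the monthly volume.
TRANSMISSIONS_PER_SECOND = 1.5  # average rate quoted in the text
JSON_SIZE_KB = 4                # uncompressed JSON size quoted in the text
DAYS_PER_MONTH = 30             # assumption: a 30-day month

SECONDS_PER_MONTH = DAYS_PER_MONTH * 24 * 60 * 60


def monthly_transmissions(rate_per_second=TRANSMISSIONS_PER_SECOND):
    """Number of transmissions arriving in one month."""
    return rate_per_second * SECONDS_PER_MONTH


def monthly_volume_gb(rate_per_second=TRANSMISSIONS_PER_SECOND,
                      size_kb=JSON_SIZE_KB):
    """Uncompressed volume per month in decimal GB (1 GB = 1,000,000 KB)."""
    return monthly_transmissions(rate_per_second) * size_kb / 1_000_000


if __name__ == "__main__":
    print("transmissions per month:", monthly_transmissions())
    print("uncompressed GB per month:", monthly_volume_gb())
```

Doubling the number of freifunk-nodes doubles both numbers, which is why the
growth of the network matters for the storage design.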

Since 15 GB per month is a huge amount for a database, and we do not need a
5-minute resolution for data from months ago, we have to reduce the data size
each hour, day, week, month and so on. We hope to implement this calculation
entirely as map&reduce in the cluster as well, since this would scale best
across more than one server.
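
To make the reduction step more concrete, here is a minimal sketch of the
map&reduce idea in plain Python. It is not ReQL and not our final
implementation: the field names "node", "ts" and "value" are hypothetical, and
averaging is just one possible way to merge samples. The map step turns each
sample into a (sum, count) pair keyed by node and time bucket, and the reduce
step adds those pairs up. Because adding pairs is associative, partial results
computed on different servers could be merged in any order.

```python
from collections import defaultdict


def downsample(samples, bucket_seconds=3600):
    """Average 'value' per node and time bucket (default: one hour).

    Each sample is a dict with the hypothetical keys
    'node' (str), 'ts' (Unix seconds, int) and 'value' (number).
    """
    # Map: key every sample by (node, start of its bucket) and
    # emit a (sum, count) pair for it.
    mapped = (
        ((s["node"], s["ts"] - s["ts"] % bucket_seconds), (s["value"], 1))
        for s in samples
    )

    # Reduce: add up sums and counts per key.
    totals = defaultdict(lambda: (0.0, 0))
    for key, (value, count) in mapped:
        total, n = totals[key]
        totals[key] = (total + value, n + count)

    # Finalise: one averaged document per node and bucket.
    return [
        {"node": node, "ts": bucket, "value": total / n, "samples": n}
        for (node, bucket), (total, n) in sorted(totals.items())
    ]


if __name__ == "__main__":
    five_minute_samples = [
        {"node": "a", "ts": 0, "value": 10},
        {"node": "a", "ts": 300, "value": 20},
        {"node": "a", "ts": 3600, "value": 40},
    ]
    for doc in downsample(five_minute_samples):
        print(doc)
```

Running the same function again with a larger bucket_seconds on already
reduced data would give the coarser daily or weekly resolution, although the
"samples" count would then have to be used as a weight to keep the averages
exact.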

I'm going to report updates here, so feel free to follow.

[1] https://github.com/VfN-NRW/legacy-stats/blob/master/gluon-legacy-https-stats/files/usr/sbin/ff-stats
[2] e.g. http://oldmap.vfn-nrw.de/nodes/24a43c43a8eb
[3] https://github.com/VfN-NRW/fnetstat/blob/complete_refactor/files/usr/sbin/fnetstat_stat