I thought it might be fun to put together a horizontally scalable, distributed micro-services approach to this problem. This would allow the simulation to be run either locally (with multiple threads to make use of features on modern processors) or in a scalable cluster environment with any number of hosts, where vastly more resources would be available to simulate a much more populous environment.
The basic plan would be to use the following dockerised services to run the simulation:
- 4 Redis instances/clusters
- 1 `master` node; to control the clock, save state and monitor stats
- N `r_worker` nodes; to process the rules for each rabbit
- M `w_worker` nodes; to process the rules for each wolf
- L `render` nodes; to pull the state and render the map
Redis (the fast key/value datastore) v3.2 includes GeoHash-based structures and a `GEORADIUS` command, which searches for members within a given radius of a position. Since this command covers a decent portion of the trig required for the simulation, Redis is a good choice for storing the simulation data.
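As a rough illustration of the two calls that do most of the heavy lifting, the sketch below adds a couple of rabbits to a geo set and then asks for everything within a wolf's sight radius in a single round trip. It assumes the Node.js `ioredis` client and the `geo:rabbits` key described further down; the coordinates, radius and hostname are made-up values.

```js
const Redis = require('ioredis');
const redis = new Redis({ host: 'redis-rabbits', port: 6379 }); // hypothetical shard hostname

async function demo() {
  // Place two rabbits on the map: GEOADD key longitude latitude member
  await redis.geoadd('geo:rabbits', 151.20, -33.86, 'rabbit:1');
  await redis.geoadd('geo:rabbits', 151.21, -33.87, 'rabbit:2');

  // Find every rabbit within 2 km of a wolf's position, with distances and coordinates.
  const nearby = await redis.georadius(
    'geo:rabbits', 151.205, -33.865, 2000, 'm', 'WITHDIST', 'WITHCOORD'
  );
  console.log(nearby); // e.g. [ [ 'rabbit:1', '<distance in m>', [ lon, lat ] ], ... ]
}

demo().then(() => redis.quit());
```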
These services could be orchestrated with Kubernetes or Rancher to run threads across any number of hosts, with work queues for managing and distributing work between `r_worker` and `w_worker` nodes.
Splitting the Redis instances/clusters into 4 shards (one each for `geo:wolves`, `geo:rabbits`, `geo:carrots`, and then the rest) allows the expensive (O(log(N))) `GEO` transactions to be run on different hosts. These shards themselves could be single instances, or clusters using the master-slave model to further distribute reads and writes.
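A minimal sketch of that routing, assuming the `ioredis` client and hypothetical service hostnames that Docker/Kubernetes DNS would resolve, is just a map of connections keyed by population plus a helper that picks a shard from a key prefix:

```js
const Redis = require('ioredis');

// One connection per shard; the hostnames are hypothetical service names.
const shards = {
  wolves:  new Redis({ host: 'redis-wolves' }),   // geo:wolves* keys
  rabbits: new Redis({ host: 'redis-rabbits' }),  // geo:rabbits* keys
  carrots: new Redis({ host: 'redis-carrots' }),  // geo:carrots* keys
  misc:    new Redis({ host: 'redis-misc' }),     // queues, stats, everything else
};

// Route a key to its shard so the expensive GEO commands for each
// population land on a different host.
function shardFor(key) {
  if (key.startsWith('geo:wolves')) return shards.wolves;
  if (key.startsWith('geo:rabbits')) return shards.rabbits;
  if (key.startsWith('geo:carrots')) return shards.carrots;
  return shards.misc;
}
```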
The limitations of this architecture would be as follows:
- The simulation area would be limited by the Redis GeoHash implementation to longitudes between -180 and 180 degrees and latitudes between -85.05112878 and 85.05112878 degrees; so practically, planet Earth.
- The population of any single group (wolves, rabbits or carrots) would be limited to what could be stored in the memory of the host that Redis instance lands on; since each GeoHash is stored as a 52-bit integer, this translates to populations on the order of 10 billion or so using AWS `m4.10xlarge` instances.
- The simulation rate would probably be limited by the latency that the TCP/IP stack and the network introduce between the nodes and the Redis instances/clusters (since even on a single machine, Docker services communicate over an overlay network).
The following keys would be Redis lists used as processing queues for each tick:

- `wolves`
- `rabbits`
The following sorted sets would be created and accessed with the Redis `GEO` commands to track the position of each member:

- `geo:wolves`
- `geo:wolves:near:rabbits`
- `geo:rabbits`
- `geo:rabbits:near:carrots`
- `geo:carrots`
These keys would store information and stats for each member (a brief usage sketch follows the list):

- `<wolf>:count`
- `<rabbit>:count`
- `<wolf>:nearby:rabbit`
- `<rabbit>:nearby:carrots`
- `<rabbit>:carrot:quota`
- `<wolf>:rabbit:quota`
- `<carrot>:patch:qty`
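These are ordinary string/counter keys, so workers can maintain them with plain `INCR`/`DECR`/`SET` calls. A trivial sketch (again assuming `ioredis`; the exact semantics of each key and the ids are my guesses from the names):

```js
const Redis = require('ioredis');
const redis = new Redis({ host: 'redis-misc' }); // hypothetical "everything else" shard

async function example() {
  // A rabbit nibbles a carrot patch: decrement the patch, bump the rabbit's quota.
  await redis.decr('carrot:7:patch:qty');
  await redis.incr('rabbit:42:carrot:quota');

  // A worker records how many carrots it last saw near this rabbit.
  await redis.set('rabbit:42:nearby:carrots', 3);

  console.log(await redis.mget('carrot:7:patch:qty', 'rabbit:42:carrot:quota'));
}

example().then(() => redis.quit());
```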
The `master` node would be responsible for setting up the initial conditions for the simulation, and for filling the work queues with the contents of each sorted set at the start of each tick. It would also store/log/serve stats and the homeostatic conditions on a regular basis, and could probably even be used to automatically scale the number of hosts and worker nodes in the cluster as required, using the AWS and Kubernetes APIs (or similar).
See `master.js` for a brief pseudocode overview of the functionality of a basic `master` node.
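`master.js` is the real reference; purely to give a feel for the shape of that loop, here is a rough sketch assuming the `ioredis` client, a single Redis connection for brevity, and made-up seed counts and tick length:

```js
const Redis = require('ioredis');
const redis = new Redis({ host: 'redis-misc' }); // single connection for brevity

const TICK_MS = 1000; // hypothetical tick length
const randLon = () => Math.random() * 360 - 180;
const randLat = () => Math.random() * 170 - 85;

// Initial conditions: scatter a handful of each species (made-up numbers).
async function seed() {
  for (let i = 0; i < 100; i++) {
    await redis.geoadd('geo:rabbits', randLon(), randLat(), `rabbit:${i}`);
  }
  for (let i = 0; i < 10; i++) {
    await redis.geoadd('geo:wolves', randLon(), randLat(), `wolf:${i}`);
  }
}

// One tick: copy the current members of each geo set into the work queues,
// then log simple stats. A real master would also check the homeostatic
// conditions and could call cloud/orchestrator APIs to scale the workers.
async function tick() {
  const rabbits = await redis.zrange('geo:rabbits', 0, -1); // geo sets are sorted sets
  const wolves = await redis.zrange('geo:wolves', 0, -1);
  if (rabbits.length) await redis.lpush('rabbits', ...rabbits);
  if (wolves.length) await redis.lpush('wolves', ...wolves);
  console.log(`tick: ${rabbits.length} rabbits, ${wolves.length} wolves`);
}

seed().then(() => setInterval(tick, TICK_MS));
```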
The `r_worker` nodes would process the rules for the simulated rabbits: update their positions, update their eaten-carrot quotas, drop them from `geo:rabbits` when eaten, and spawn duplicates when full.
See `r_worker.js` for a brief pseudocode overview of the functionality of a basic `r_worker` node.
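As with the master, `r_worker.js` is the authoritative pseudocode; the sketch below only illustrates the pattern (assuming `ioredis`, a single connection, and invented thresholds and movement rules): pop a rabbit id off the queue, look for carrots in range, then eat, spawn or wander.

```js
const Redis = require('ioredis');
const redis = new Redis({ host: 'redis-misc' }); // single connection for brevity

const EAT_RADIUS_M = 10; // hypothetical "close enough to eat" distance
const SPAWN_QUOTA = 5;   // hypothetical carrots needed to reproduce

async function processRabbit(id) {
  const pos = await redis.geopos('geo:rabbits', id);
  if (!pos[0]) return; // already eaten and removed by a w_worker
  const [lon, lat] = pos[0].map(Number);

  // Any carrots within eating range?
  const carrots = await redis.georadius('geo:carrots', lon, lat, EAT_RADIUS_M, 'm', 'ASC');
  if (carrots.length) {
    await redis.zrem('geo:carrots', carrots[0]);          // the carrot is eaten
    const quota = await redis.incr(`${id}:carrot:quota`); // track how much it has eaten
    if (quota >= SPAWN_QUOTA) {
      await redis.set(`${id}:carrot:quota`, 0);
      await redis.geoadd('geo:rabbits', lon, lat, `rabbit:${Date.now()}`); // spawn a duplicate
    }
  } else {
    // Nothing to eat here: hop a small random distance (a very rough movement rule).
    await redis.geoadd('geo:rabbits',
      lon + (Math.random() - 0.5) * 0.001,
      lat + (Math.random() - 0.5) * 0.001, id);
  }
}

// Main loop: consume rabbit ids from the queue the master filled for this tick.
async function run() {
  for (;;) {
    const id = await redis.rpop('rabbits');
    if (!id) { await new Promise(r => setTimeout(r, 50)); continue; }
    await processRabbit(id);
  }
}

run();
```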
The `w_worker` nodes would process the rules for the simulated wolves: update their positions, update their eaten-rabbit quotas, and spawn duplicates when full.
See `w_worker.js` for a brief pseudocode overview of the functionality of a basic `w_worker` node.
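A `w_worker` sketch would be almost identical to the `r_worker` one above, just hunting in `geo:rabbits` and tracking a rabbit quota instead; under the same assumptions (ioredis, invented thresholds), only the per-member rule changes:

```js
const Redis = require('ioredis');
const redis = new Redis({ host: 'redis-misc' }); // as in the r_worker sketch

// Driven by the same kind of run() loop as the r_worker sketch,
// but consuming ids from the 'wolves' queue.
async function processWolf(id) {
  const pos = await redis.geopos('geo:wolves', id);
  if (!pos[0]) return;
  const [lon, lat] = pos[0].map(Number);

  // Eat the nearest rabbit in range, if any, and count it against the quota.
  const rabbits = await redis.georadius('geo:rabbits', lon, lat, 20, 'm', 'ASC');
  if (rabbits.length) {
    await redis.zrem('geo:rabbits', rabbits[0]);          // the rabbit is eaten
    const quota = await redis.incr(`${id}:rabbit:quota`);
    if (quota >= 3) {                                     // hypothetical spawn threshold
      await redis.set(`${id}:rabbit:quota`, 0);
      await redis.geoadd('geo:wolves', lon, lat, `wolf:${Date.now()}`);
    }
  } else {
    // Wander when there is nothing to eat.
    await redis.geoadd('geo:wolves',
      lon + (Math.random() - 0.5) * 0.001,
      lat + (Math.random() - 0.5) * 0.001, id);
  }
}
```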