- Change `cassandra-rackdc.properties` to:
  - dc=datacenter1
  - rack=rack1
- Change the snitch in `cassandra.yaml` to `GossipingPropertyFileSnitch`
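  The relevant `cassandra.yaml` setting:

  ```yaml
  # Read DC/rack assignment from cassandra-rackdc.properties
  endpoint_snitch: GossipingPropertyFileSnitch
  ```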
- Rolling restart of the nodes, one at a time: `nodetool flush && nodetool drain && service cassandra stop && service cassandra start`
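  A minimal sketch of the rolling restart, assuming SSH access; the hostnames are placeholders:

  ```bash
  #!/usr/bin/env bash
  # Rolling restart of the existing cluster, one node at a time.
  # node1..node3 are placeholders -- substitute your real node list.
  for host in node1 node2 node3; do
    ssh "$host" 'nodetool flush && nodetool drain && service cassandra stop && service cassandra start'
    sleep 120  # give the node time to rejoin; confirm it shows UN in `nodetool status` before moving on
  done
  ```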
- Update application-specific keyspaces to use NetworkTopologyStrategy with only the existing DC:
ALTER KEYSPACE {keyspace} WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3'} AND durable_writes = true;
- Create new instances and install DSE 5 on all of them, but don't start the dse service yet
- Elect a single node as the new DC's seed
- Update `cassandra.yaml` on all new nodes to the same settings as the existing cluster (including `cluster_name`, etc.) plus any optimized settings, and point the seeds of all but one node at the elected DC seed (see the snippet below)
- On the elected seed, set the `cassandra.yaml` seeds to 1-2 IPs in datacenter1 (the existing DC)
- Change `cassandra-rackdc.properties` to:
  - dc=cassandra
  - rack=rack1
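  For reference, seeds live under `seed_provider` in `cassandra.yaml`; the IPs below are placeholders:

  ```yaml
  # cassandra.yaml on every new node EXCEPT the elected seed:
  seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
        - seeds: "10.0.1.10"            # placeholder: the elected new-DC seed's IP

  # On the elected seed itself, point instead at 1-2 existing datacenter1 nodes:
  #      - seeds: "10.0.0.1,10.0.0.2"   # placeholders: existing datacenter1 IPs
  ```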
- Start the dse service (`service dse start`) on the elected DC seed node
- Check `nodetool status` to confirm the new node joins the cassandra DC correctly
- Start the other nodes one by one, with ~2 minutes between each start
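  A sketch of the staggered start, again with placeholder hostnames:

  ```bash
  # Start the remaining new-DC nodes one at a time, ~2 minutes apart.
  for host in node2 node3 node4; do
    ssh "$host" 'service dse start'
    sleep 120   # let gossip settle; check `nodetool status` for UN before continuing
  done
  ```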
DSE regularly queries each node to determine its workload type (Cassandra/Search/SearchAnalytics). If the result is null, this will fill up the logs and cause performance issues on each DSE node. An indication this is occurring is a log message like `Couldn't determine workload for /10.10.6.255 from value NULL`. To avoid this:
- On each DSE node, update the local `system.peers` table, adding `Cassandra` to the workload column for each OSS node
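  For example, via cqlsh on each DSE node (the IP is the one from the sample log line above; repeat for every OSS node's address):

  ```bash
  # system.peers is node-local, so run this on every DSE node,
  # once per OSS (datacenter1) peer IP.
  cqlsh -e "UPDATE system.peers SET workload = 'Cassandra' WHERE peer = '10.10.6.255';"
  ```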
- Alter keyspaces to use NetworkTopologyStrategy:
ALTER KEYSPACE dse_perf WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3', 'cassandra': '3'} AND durable_writes = true;
ALTER KEYSPACE system_auth WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3', 'cassandra': '3'} AND durable_writes = true;
ALTER KEYSPACE dse_security WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3', 'cassandra': '3'} AND durable_writes = true;
ALTER KEYSPACE system_distributed WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3', 'cassandra': '3'} AND durable_writes = true;
ALTER KEYSPACE dse_system WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3', 'cassandra': '3'} AND durable_writes = true;
ALTER KEYSPACE dse_leases WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3', 'cassandra': '3'} AND durable_writes = true;
- Verify that `nodetool describecluster` shows all nodes sharing the same schema, e.g.: `Schema versions: d08412f3-1cba-3a9f-995d-9d97f007d329: [172.31.10.63, 172.31.10.68, 172.31.10.67, 172.31.10.66, 172.31.10.65, 172.31.10.64]`
- Alter all remaining application-specific keyspaces to be replicated in both the existing and the new DC, e.g.:
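  Following the same pattern as the keyspaces above, with a placeholder keyspace name:

  ```bash
  cqlsh -e "ALTER KEYSPACE my_app_keyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3', 'cassandra': '3'} AND durable_writes = true;"
  ```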
- Rebuild each node in the new DC (cassandra): `nodetool rebuild -- datacenter1`
  - Note: rebuild a maximum of 2-3 nodes at a time (see the sketch below)
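  A sketch of one rebuild batch, with placeholder hostnames:

  ```bash
  # Stream data from datacenter1 into this batch of new-DC nodes.
  # Keep batches to 2-3 nodes to avoid overloading the source DC.
  for host in node1 node2; do
    ssh "$host" 'nohup nodetool rebuild -- datacenter1 > rebuild.log 2>&1 &'
  done
  # Watch streaming progress with `nodetool netstats`; start the next
  # batch only after this one completes.
  ```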
- Verify that each node's load/data looks correct via `nodetool status` as well as cqlsh queries
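  For example (keyspace and table names are placeholders):

  ```bash
  nodetool status   # Load column should look sane and roughly even across the new DC
  cqlsh -e "CONSISTENCY LOCAL_QUORUM; SELECT * FROM my_app_keyspace.my_table LIMIT 10;"
  ```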
- Make sure applications and Spark jobs are using the new DC and its IPs
- Run a full repair across DCs on the new DC
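  For example, on each node in the new DC:

  ```bash
  # Full (not incremental) repair so the new DC is fully consistent
  # with datacenter1 before datacenter1 is removed.
  nodetool repair --full
  ```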
- Alter all the keyspaces listed above to remove datacenter1 from the topology
  - i.e. `ALTER KEYSPACE {keyspace} WITH replication = {'class': 'NetworkTopologyStrategy', 'cassandra': '3'} AND durable_writes = true;`
- On each old-DC (datacenter1) node, use nodetool to remove the node from the cluster: `nodetool decommission`
- Verify on the new DC that only its IPs exist and a single DC is shown in `nodetool status`
- Restore the `dse_leases` keyspace to EverywhereStrategy (optional: `system_auth` as well): `ALTER KEYSPACE dse_leases WITH replication = {'class': 'EverywhereStrategy'} AND durable_writes = true;`