Testing Alternator

What is Alternator

A drop-in replacement for AWS DynamoDB: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Introduction.html
Announced last year as part of Scylla Open Source: https://www.scylladb.com/2019/09/11/scylla-alternator-the-open-source-dynamodb-compatible-api/

Testing

unit tests

During development, Nadav wrote py.test-based tests that use boto3 (the AWS Python client) and run against a single Scylla server. https://github.com/scylladb/scylla/tree/master/test/alternator
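
As a rough illustration of what such a test exercises, here is a minimal sketch of a boto3 round trip against one node. The port (8000), the dummy credentials, the table and attribute names, and the helper names are all assumptions for illustration and are not taken from the linked test suite.

```python
import uuid


def alternator_endpoint(host, port=8000):
    """Build the endpoint URL for boto3 (port 8000 is an assumption)."""
    return f"http://{host}:{port}"


def connect(host):
    """Open a boto3 DynamoDB resource that points at a single Scylla node."""
    import boto3  # imported lazily so the other helpers work without boto3

    return boto3.resource(
        "dynamodb",
        endpoint_url=alternator_endpoint(host),
        region_name="us-east-1",         # boto3 requires a region; value is arbitrary
        aws_access_key_id="alternator",  # dummy credentials (assumption)
        aws_secret_access_key="secret",
    )


def roundtrip_item(dynamodb):
    """Create a table, write one item, read it back, then drop the table."""
    table = dynamodb.create_table(
        TableName=f"test_{uuid.uuid4().hex}",
        KeySchema=[{"AttributeName": "p", "KeyType": "HASH"}],
        AttributeDefinitions=[{"AttributeName": "p", "AttributeType": "S"}],
        BillingMode="PAY_PER_REQUEST",
    )
    try:
        table.wait_until_exists()
        table.put_item(Item={"p": "key1", "value": "hello"})
        return table.get_item(Key={"p": "key1"}, ConsistentRead=True)["Item"]
    finally:
        table.delete()
```

A py.test test would then be a one-liner along the lines of `assert roundtrip_item(connect("127.0.0.1")) == {"p": "key1", "value": "hello"}`, run against a live server.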

Highlights

There were tests from almost day one of development.
The tests sit next to the Scylla code.

Lowlights

There was no automation/CI for them until a very late stage.
Only a single server was tested; topology and clustering were not covered by the unit tests.

scylla-cluster-test

Around the time of the announcement, we started working on a simple longevity test. The main challenge was finding and adapting a stress tool that can work with the DynamoDB API. YCSB was selected, and a 3h test was set up with a minimal collection of stats. The dev team used it a bit, mainly to produce screenshots of the monitor and the new Alternator dashboards. Only when we started working on 4.0 did we actually start expanding this and utilizing YCSB better.

Now we have in SCT:

3h basic scenario (equivalent to 4h longevity)
48h longevity with authentication (equivalent to 48h longevity)
performance benchmark - throughput and latency
multi-region longevity - still WIP

Highlights

We started early with SCT, which uncovered important issues regarding clustering and replication factor, neither of which is covered in the unit tests at all.
YCSB proved to be a very helpful tool, even though it took a while to figure out how to enable data integrity checks. We now support both CQL and DynamoDB with it.
We introduced the Docker-based loader, which opens lots of possibilities for SCT.
The work helped create a new report that compares multiple types of stress (subtests) from the same run (mainly Alex's work for CDC).
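
To make the YCSB point above more concrete, here is a minimal sketch of how a YCSB command line with data integrity checks enabled might be assembled for the DynamoDB API. The property names (`dynamodb.endpoint`, `dynamodb.awsCredentialsFile`, `dynamodb.primaryKey`, `dataintegrity`) are assumed from YCSB's DynamoDB binding and core workload and should be verified against the YCSB version in use. The file name, key name, and the `ycsb_command` helper are hypothetical, and this is not how SCT itself builds the command.

```python
def ycsb_command(phase, endpoint, workload="workloads/workloada",
                 threads=10, data_integrity=True):
    """Return the argv list for one YCSB phase ('load' or 'run')."""
    if phase not in ("load", "run"):
        raise ValueError(f"unknown YCSB phase: {phase!r}")
    properties = {
        "dynamodb.endpoint": endpoint,
        "dynamodb.awsCredentialsFile": "aws.properties",  # hypothetical file name
        "dynamodb.primaryKey": "p",                       # hypothetical key name
        "dataintegrity": str(data_integrity).lower(),     # verify values read back
    }
    argv = ["bin/ycsb", phase, "dynamodb", "-P", workload, "-threads", str(threads)]
    for key, value in properties.items():
        argv += ["-p", f"{key}={value}"]
    return argv
```

The same helper could be pointed at a CQL binding by swapping the binding name and properties, which is roughly why one tool can cover both APIs.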

Lowlights

DynamoDB clients are not cluster-aware, which made using nemesis a bit of a pain. In the end we are using the "DNS" solution; in scylla-cloud a load balancer would be used.
Creating the performance benchmark was very complex; it has lots of moving parts that must all work correctly:
- how we store the cassandra-stress information, and how we retrieve it, is hidden deep in the SCT code
- we needed an ability to suppress events, since LWT was causing error prints at high throughput
- we had to run each case 3 times: with CQL, without LWT, and with LWT.
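
The "DNS" workaround for clients that are not cluster-aware can be sketched as follows, assuming it means that one hostname resolves to every live node and requests are spread across the resulting addresses. The notes do not describe the actual mechanism, so the helper names, the port, and the round-robin policy are assumptions.

```python
import itertools
import socket


def resolve_nodes(hostname, resolver=socket.gethostbyname_ex):
    """Return the sorted, de-duplicated IPv4 addresses behind a hostname."""
    _name, _aliases, addresses = resolver(hostname)
    return sorted(set(addresses))


def endpoint_cycle(hostname, port=8000, resolver=socket.gethostbyname_ex):
    """Return an endless iterator of endpoint URLs, rotating over all nodes."""
    addresses = resolve_nodes(hostname, resolver)
    if not addresses:
        raise RuntimeError(f"{hostname} did not resolve to any address")
    return itertools.cycle([f"http://{ip}:{port}" for ip in addresses])
```

This sketch resolves the name only once; a client running alongside a nemesis would need to re-resolve periodically so that removed or restarted nodes drop out of the rotation.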

dtest

Very close to the release of 4.0, we started pushing for having tests in dtest, assuming that lots of the functionality is covered by the unit tests and by the mileage we had with SCT. In a ~3-week effort we got 14 tests, all of which utilize 3 or more nodes, run nodetool commands, and add/decommission nodes.

Highlights

Writing the first test was quite easy, since it takes only one configuration flag, and boto3 had already been introduced by Shlomo for manager backup (only one small change was needed in ccm, for supplying the alternator_address).
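
A minimal sketch of the shape such a multi-node test might take: enable the API on each node, write through one node, and read the item back through every node. `alternator_port` is assumed to be the single configuration flag meant above, and the port value, helper names, and item layout are hypothetical. This is not dtest's actual code.

```python
def alternator_node_options(node_ip, port=8000):
    """Per-node Scylla options that turn on the Alternator API (assumed values)."""
    return {"alternator_port": port, "alternator_address": node_ip}


def write_once_read_everywhere(tables, key):
    """Write via the first node's table handle, then read back via every node.

    `tables` holds one boto3 Table resource per node, all for the same table.
    """
    item = dict(key, value="written-via-first-node")
    tables[0].put_item(Item=item)
    return [t.get_item(Key=key, ConsistentRead=True).get("Item") for t in tables]
```

A test would assert that every entry in the returned list equals the written item, before and after a topology change such as decommissioning a node.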

Lowlights

Since we are not very used to working in "close quarters", it was challenging not to break each other's code all the time.
Since we were not writing the tests while the development was happening, we weren't very specific in our testing. Hopefully in the next step we'll be able to follow the Alternator development closely.
