- Build and test a feature in a mattermost-server branch.
- Scan through the markdown docs in mattermost-load-test-ng. Make sure you understand what a coordinator, an agent, a controller, and bounded and unbounded load-tests are, and why metrics collection (a Prometheus deployment) is needed in this setup.
This would include
- Writing new load testing actions to mattermost-load-test-ng
- Testing the changes locally.
- Testing the changes in a Terraform deployment, whose purpose is to load-test under a heavier load/dataset.
- Analysing load-test results
- Getting the changes merged into the load-test repository, so the release manager can test the same changes for unexpected behaviors during upcoming releases.
- Go through coverage.md and make a list of the changes needed to load-test the new feature.
- Optionally check out this video walkthrough by @claudio.costa
- Make the necessary changes in `loadtest/`
- Go through local_loadtest.md
- Some additional information on the above document
  - In the step where configs are copied from samples, `simplecontroller.json` doesn't have to be generated, since by default `config.json` uses a `simulative` controller.
  - Updates to `config.json`
    - Make sure to change the `ConnectionConfiguration` section in `config.json` to match the local deployment of mattermost-server.
    - For `InstanceConfiguration`, refer to the doc; it is used by the `init` command later to initially populate the Mattermost database.
  - Increase the frequency of the new action so it's easier to debug while running locally (sample).
  - Expect some failures in running-a-basic-loadtest; if one sees more than 10-15 error logs, there's a troubleshooting guide.
  - Check `ltagent.log` in the `mattermost-load-test-ng` directory, and the server logs, for details on errors, if any.
  - The further sections of the document, i.e. using the load-test agent API server and using the load-test coordinator, are highly recommended, since the Terraform deployment uses the latter method to execute load-tests and the setup is easier to debug if the developer understands these underlying principles.
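The `ConnectionConfiguration` change described above can also be scripted. The following is a minimal Python sketch, assuming `ConnectionConfiguration` holds `ServerURL` and `WebSocketURL` keys as in the sample config (verify the key names against your checkout). The helper name `point_config_at_server` is hypothetical and not part of the load-test tool.

```python
import json
from urllib.parse import urlsplit


def point_config_at_server(config: dict, server_url: str) -> dict:
    """Return a copy of a load-test config aimed at the given Mattermost server.

    Assumption: the config has a ConnectionConfiguration object with
    ServerURL and WebSocketURL keys. Other keys are left untouched.
    """
    parts = urlsplit(server_url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"expected an http(s) URL, got {server_url!r}")
    # The WebSocket scheme mirrors the HTTP scheme (http -> ws, https -> wss).
    ws_scheme = "wss" if parts.scheme == "https" else "ws"

    updated = json.loads(json.dumps(config))  # cheap deep copy of JSON data
    conn = updated.setdefault("ConnectionConfiguration", {})
    conn["ServerURL"] = f"{parts.scheme}://{parts.netloc}"
    conn["WebSocketURL"] = f"{ws_scheme}://{parts.netloc}"
    return updated
```

A typical use would be to load `config.json` with `json.load`, pass the result through this helper with the URL of the local server, and write it back with `json.dump`.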
Even without the framework, a general load-test workflow in the cloud will be similar to the following.
- Create a database (to be used by mattermost servers)
- Create deployments of mattermost servers - let's call them app servers for convenience.
- Create 'agent' machines to ping app-servers using controllers
- Populate the database, and start the app and agents (i.e. the load-test)
- Collect metrics from the app and agent deployments to either
  - manage an unbounded load test, or
  - analyse API performance after the load-test completes.

Loadtest instances created with this framework achieve the same goals as above; the difference is that steps such as creating a deployment and running a load-test are automated.
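To make the idea of using metrics to "manage an unbounded load test" more concrete, here is a small Python sketch of a feedback rule: add users while the observed error rate stays healthy, and back off when it does not. This is only an illustration of the principle. The thresholds, step size, and function names are hypothetical, and it is not the coordinator's actual algorithm.

```python
def next_user_count(current_users, error_rate, max_error_rate=0.01,
                    step=50, max_users=10_000):
    """Decide how many users to run next, given the last observed error rate."""
    if error_rate > max_error_rate:
        # Deployment is struggling: remove users, but never go below zero.
        return max(0, current_users - step)
    # Deployment is healthy: add users, up to the configured ceiling.
    return min(max_users, current_users + step)


def find_supported_users(measure_error_rate, rounds=100, max_error_rate=0.01,
                         step=50, max_users=10_000):
    """Ramp users up and down for a number of rounds.

    measure_error_rate is a callable taking a user count and returning the
    error rate observed at that load (in a real setup this would come from
    the metrics store). Returns the highest user count seen while healthy.
    """
    users, best = 0, 0
    for _ in range(rounds):
        rate = measure_error_rate(users)
        if rate <= max_error_rate:
            best = max(best, users)
        users = next_user_count(users, rate, max_error_rate, step, max_users)
    return best
```

In a bounded load-test there is no such loop: the user count is fixed up front and the metrics are only analysed afterwards.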
The following are some steps to load-test a new feature in production, after testing the new actions locally.
- Go through terraform_loadtest.md
- Some additional information on the above document
  - AWS credentials are to be fetched from onelogin and added to the Terraform config.
  - The Enterprise license is to be fetched from Developers:private.
  - When performing this step, edit the `MattermostLicenseFile` value to the path containing the license.
    - The fields `MattermostDownloadURL` and `LoadTestDownloadURL` point to the latest mattermost-server and load-test package, respectively, to be used in the load-test.
    - That's the default option. When there are unmerged changes to `mattermost-server` or `mattermost-load-test-ng`:
      - Run `make build-linux` in the `mattermost-server` directory, then change the `MattermostDownloadURL` value to the path containing the mattermost executable. For example, `file:///somepath/mattermost-server/bin/linux_amd64/mattermost`
      - Run `make package` in the `mattermost-load-test-ng` directory, then change the `LoadTestDownloadURL` value to the path containing the gzip of the load-test package. For example, `file:///somepath/mattermost-load-test-ng/dist/v1.5.0-8-gd4f18cf/mattermost-load-test-ng-v1.5.0-8-gd4f18cf-linux-amd64.tar.gz`
  - Edit `SSHPublicKey` in deployer.json after setting up ssh.
  - `go run ./cmd/ltctl deployment create`
    - Limit operations of `deployment` to a single shell window.
    - If the deployment gets stuck, check for running Terraform processes with `ps -ef | grep terraform`; if there are any, restart the computer and start again.
    - Terraform actions are idempotent, so one would rarely have to destroy the deployment if things go wrong while creating resources.
  - Once the deployment is created successfully, stdout will list the server addresses for the app, agent, coordinator, and Grafana deployments.
    - Open the Mattermost URL in a browser to check that the app is working as required.
    - At this point, one might check the server logs by ssh-ing into the app instance. Once in there, open `/opt/mattermost/logs/mattermost.log`.
  - Gearing up to start the load-test
    - Use the `agents'` URL and the Prometheus URL in the `coordinator.json` file generated here.
    - Change `ConnectionConfiguration` in the `config.json` generated here.
    - Configure `InstanceConfiguration` in the same `config.json` (which, as mentioned earlier, seeds the mm-server's database with the data required for the load-test). Note that a heavier config (e.g. a large `NumPosts`) would take a very long time to seed. Please refer to the NB section below to manually seed the database from a backup, in order to bypass data generation.
  - Start the load-test with `go run ./cmd/ltctl loadtest start`
  - Once a load-test is running, its status can be checked with `go run ./cmd/ltctl loadtest status`.
  - `ssh` into one of the agent machines and `cat ~/mattermost-load-test-ng/logs/ltagent.log` to verify that the load-test is working without errors.
  - Open the Grafana deployment with the URL from `go run ./cmd/ltctl deployment info`.
  - It takes some time for the deployment to stabilize: while the load-test tool connects `MaxActiveUsers` users to the app, there might be a large number of HTTP 4xx errors.
  - "My load test is running, now what?"
    - If it's a bounded load-test, it has to be stopped manually with `go run ./cmd/ltctl loadtest stop` after an hour.
    - If it's an unbounded load-test, it will finish with stdout listing the maximum number of concurrent users the deployment supports. The load-test status check command will report the status as `Done` when it's complete.
  - "My load tests ran successfully, what to make of it?"
    - In the case of unbounded load-tests, when they finish, `go run ./cmd/ltctl loadtest status` gives the maximum number of concurrent users, which is a metric for comparing the performance of that version of mattermost-server.
    - In the case of two bounded load-tests with the same `MaxConcurrentUsers` count, one can generate a report comparing performance across various server metrics.
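For the step above where `MattermostDownloadURL` and `LoadTestDownloadURL` are pointed at locally built artifacts, the `file://` URLs can be derived from filesystem paths rather than typed by hand. The following is a minimal Python sketch, assuming both fields are top-level keys of the deployer config (check your `deployer.json`). The helper name `use_local_builds` is hypothetical.

```python
from pathlib import Path


def use_local_builds(deployer, server_binary=None, loadtest_package=None):
    """Return a copy of the deployer config pointing at local build outputs.

    Assumption: MattermostDownloadURL and LoadTestDownloadURL are top-level
    keys. A field is only overwritten when its path argument is given, so
    the default (latest release) URLs stay in place otherwise.
    """
    updated = dict(deployer)  # shallow copy; only top-level keys are replaced
    if server_binary is not None:
        # resolve() makes the path absolute, which as_uri() requires.
        updated["MattermostDownloadURL"] = Path(server_binary).resolve().as_uri()
    if loadtest_package is not None:
        updated["LoadTestDownloadURL"] = Path(loadtest_package).resolve().as_uri()
    return updated
```

Passing only one of the two paths covers the common case where just one of the repositories has unmerged changes.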
NB:
- For seeding the database manually:
  - Keep the `InstanceConfiguration` section as minimal as possible to reduce `db init` time.
  - Post a message in Developers:Performance to ask for a migration file.
  - `ssh` into the app machine, and `psql` into the connected database.
  - Drop all tables and log out of psql. Run the migration, which might take a while.
  - Now, the app service needs to be restarted so the server can run the necessary migrations.
  - `ssh` into the app-instance(s) and run `sudo systemctl restart mattermost && until curl -sSf http://localhost:8065 --output /dev/null; do sleep 1; done`
- If the feature is behind a feature flag: link to Claudio's message to add environment variables to app-service
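The restart-and-wait step in the manual seeding notes above polls the server until it answers. As a hedged illustration of the same idea with an upper bound on the wait, here is a small Python sketch. The function names, the default timeout values, and the choice to give up after `max_wait` seconds are assumptions made for this example; the shell loop in the notes retries indefinitely.

```python
import http.client
import time
import urllib.error
import urllib.request


def server_is_up(url, timeout=2.0):
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, timeout, or HTTP 4xx/5xx: not ready yet.
        return False


def wait_until(probe, interval=1.0, max_wait=300.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call probe() every `interval` seconds until it returns True.

    Returns True on success, or False once `max_wait` seconds have passed.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + max_wait
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

On the app instance this could be used as `wait_until(lambda: server_is_up("http://localhost:8065"))` right after restarting the service.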