
@jsquire
Last active July 22, 2020 18:45
Messaging: Stress Requirements Notes

Stress Thoughts

Requirements

Hosting

  • The platform hosts multiple developer-authored test scenarios, each encapsulating an end-to-end use case for the client library at varying levels of complexity.

  • The platform can execute test scenarios for a prolonged duration spanning multiple days, monitoring scenarios and managing those that become unresponsive or unhealthy.

  • The platform allows access to Azure services.

  • The platform allows scheduled runs and ad-hoc runs.

  • The version of the SDK package can be specified for a given run, defaulting to the latest nightly build.

Configuration

  • Test scenarios can be configured per-run, using configuration items specific to the given scenario.

  • Configuration elements support at least strings as a primitive type; test scenarios are responsible for any deserialization needed to support complex types or other primitives.
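
A minimal sketch of how a scenario might consume per-run configuration under these requirements; IScenarioConfiguration and all other names here are hypothetical, not an existing API:

    using System;

    // Hypothetical configuration surface: the platform hands the scenario string
    // values, and the scenario owns any parsing into richer types.
    public interface IScenarioConfiguration
    {
        string GetValue(string key);
    }

    public class PublishEventsScenario
    {
        public void Configure(IScenarioConfiguration configuration)
        {
            // Strings are the guaranteed primitive; the scenario deserializes as needed.
            int batchSize = int.Parse(configuration.GetValue("BatchSize"));
            TimeSpan interval = TimeSpan.Parse(configuration.GetValue("PublishInterval"));
        }
    }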

Metrics and Reporting

  • The platform allows test scenarios to report metrics in a recurring fashion, where each report is considered a snapshot for that specific point in time.

  • The platform collects host environment metrics scoped to a test scenario, including CPU and memory usage, on a recurring basis.

  • Test scenarios can define custom metrics, which support at least string-based key-value pairs.

  • The platform provides reports for scenario metrics in real time on request and at the end of a run.

  • Scenario metric reports can be sent via email and/or posted to Teams upon completion.

  • Scenarios have a means to surface errors for logging, capturing exception details such as the message and stack trace, along with a textual description of the context that is independent of the exception itself.
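
A minimal sketch of the reporting surface these requirements imply; all of the names here are hypothetical, not an existing API:

    using System;
    using System.Collections.Generic;

    // Hypothetical reporting surface: each metrics call is a point-in-time
    // snapshot, and errors carry the exception plus scenario-supplied context.
    public interface IScenarioReporter
    {
        // Recurring snapshot of string-based key-value metrics.
        void ReportMetrics(IReadOnlyDictionary<string, string> metrics);

        // Surfaces an error, capturing the exception's message and stack trace
        // along with a textual description of context independent of the exception.
        void ReportError(Exception exception, string context);
    }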

Nice to Have

Hosting

  • The platform allows for configurable fault injection with respect to network connectivity, DNS, latency, and other "chaos monkey"-type factors.

  • The platform allows for Azure resource setup/clean-up in a similar manner to test infrastructure (see the sketch after this list).

  • The version of client libraries for a run can be based on a direct upload, allowing for a private build to be used for testing.
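
A sketch of the resource setup/clean-up hooks mentioned above, mirroring the shape of typical test fixtures; the type and member names are hypothetical:

    using System.Threading.Tasks;

    // Hypothetical lifecycle hooks, similar in spirit to test infrastructure:
    // resources are created before the run and removed when it ends.
    public abstract class ScenarioResourceFixture
    {
        // e.g., provision an Event Hubs namespace for the run
        public abstract Task SetUpAsync();

        // e.g., delete the namespace, even when the run fails
        public abstract Task CleanUpAsync();
    }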

Configuration

  • Test scenario configuration supports a common set of elements, such as run duration, and is extensible with elements specific to that scenario.

  • Test scenario configuration can be defined as a set containing one or more elements; configuration sets may be applied to one or more scenarios to reduce duplication.
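
One possible shape for a reusable configuration set, sketched with hypothetical names; common elements are defined once, with scenario-specific elements layered on top:

    using System.Collections.Generic;

    // Hypothetical: a named set of common configuration elements that can be
    // applied to multiple scenarios, reducing duplication.
    var longHaulDefaults = new Dictionary<string, string>
    {
        ["RunDuration"]  = "3.00:00:00",   // common element: run for three days
        ["ReportPeriod"] = "00:05:00"      // common element: snapshot every five minutes
    };

    // A scenario copies the set and adds elements specific to itself.
    var scenarioConfig = new Dictionary<string, string>(longHaulDefaults)
    {
        ["BatchSize"] = "100"
    };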

Metrics and Reporting

  • Scenarios may define a threshold for pass/fail based on individual metrics and on the final state of metrics when the scenario run has completed.

  • The platform supports custom scenario metrics of different types, ideally allowing for a label, raw value, and percentage component. Example:
    Events Read: 12,000 (99.89%)

  • The platform supports formulas for custom metrics, which may be based on other metrics (sketched after this list). Example:
    Events Read: { this } ({ this / totalSent }%)

  • Metrics can be assigned to a display category; when displaying metrics, the platform groups them by category. Example:

    Processing
    ==========================================
    Events Published :  12,229
    Events Read      :  12,228 (99.99%)
    
    Errors
    ==========================================
    General Exceptions : 100
    Send Exceptions    :  10 (10.00%)
    Read Exceptions    :  90 (90.00%)
    
  • The platform allows opting into SDK log events using the AzureEventSourceListener or a similar construct, allowing the log level to be specified (see the sketch after this list).

  • The platform allows routing informational events and error events to different sinks, with the error-based events defaulting to the same sink as scenario exceptions.

  • The platform allows for observing logs using a tail-style approach.
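
A sketch of how a categorized, formula-based metric with a pass/fail threshold might be declared, per the bullets above; MetricDefinition and its parameters are hypothetical, not an existing type:

    // Hypothetical declaration combining a label, display category, rendering
    // formula, and a pass/fail threshold evaluated at the end of the run.
    var eventsRead = new MetricDefinition(
        Label: "Events Read",
        Category: "Processing",
        Formula: "{ this } ({ this / totalSent }%)",
        FailThreshold: "{ this / totalSent } < 0.95");

    // Illustrative shape only.
    public record MetricDefinition(string Label, string Category, string Formula, string? FailThreshold = null);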
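For the SDK log opt-in, the .NET Azure SDK already provides AzureEventSourceListener in Azure.Core.Diagnostics; a minimal sketch of enabling it at a chosen level:

    using System.Diagnostics.Tracing;
    using Azure.Core.Diagnostics;

    // Route Azure SDK log events at or above the chosen level to the console
    // for the duration of the run.
    using AzureEventSourceListener listener =
        AzureEventSourceListener.CreateConsoleLogger(EventLevel.Informational);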

@MiYanni commented Jul 22, 2020

I mentioned this before, and I don't know if it is doable, but integrating BenchmarkDotNet information into our .NET tests might be useful. Mike mentioned that, without it, he had a way to get memory information for the tests. Personally, I haven't used BenchmarkDotNet, so I don't know what metrics would be useful to have from there. Thoughts?

@heaths commented Jul 22, 2020

For metrics, I would also like to see managed heap allocations and heap size across GC generations. I want to start driving practical performance improvements in more expensive code paths (both in terms of speed and memory).
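
For context on the two comments above: BenchmarkDotNet's MemoryDiagnoser reports allocated bytes and GC collection counts per generation, which covers much of what is being asked for; a minimal sketch (the benchmark body is illustrative):

    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Running;

    [MemoryDiagnoser]  // adds allocated memory and Gen 0/1/2 collection counts
    public class SendBenchmarks
    {
        [Benchmark]
        public byte[] AllocateBody() => new byte[1024];  // illustrative work
    }

    public class Program
    {
        public static void Main() => BenchmarkRunner.Run<SendBenchmarks>();
    }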
