@puerco @Claudiu Belu @leonardpahlke @krol @RobKielty @arsh @hsalahi @mkorbi @rajula96reddy
- The meeting was attended by a mix of SIG Release CI Signalers, people interested in applying to work on CI Signal, and developers.
- Robert Kielty gave an overview of the CI Signal tooling, demoing how to detect flakes using TestGrid.
- Demo notes
- Although some training is required, it is not that difficult to find flaky tests using TestGrid. CI jobs that have flaky tests are highlighted in the summary pages such as https://testgrid.k8s.io/sig-release-master-blocking and https://testgrid.k8s.io/sig-release-master-informing. When you drill down into a job that is flaking (at the time of the meeting, say), seek out a flaky failure by mousing over a red cell, which shows where a test has failed, and then move to neighboring runs where the test passed. If the k/k git commit is the same for both a passing run and a failing run, congratulations, you have found a flaking test.
- Post-meeting, during the write-up, @RobKielty notes that some grids on TestGrid call out flakiness in the UI. This is an excellent UX improvement! Ref: https://testgrid.k8s.io/sig-release-master-informing#periodic-conformance-main-k8s-main
- Fixing flaky tests is the technically challenging work.
- Rob noted that the starting point for this work is Jordan Liggitt's video on this topic: https://www.youtube.com/watch?v=Ewp8LNY_qTg
- Leonard Pahlke pointed out the existence of the CI Signal one-page document, which acts as a top-level page that introduces the CI Signal work. Shoutout to Leonard for this, I love this page. https://github.com/kubernetes/sig-release/blob/master/release-team/role-handbooks/ci-signal/one-pager.md
- Spyglass, Prow’s job result web app, could do with some improvement. To see Spyglass in action, click on a cell in a grid in TestGrid.
- Usefully, Spyglass is extensible via a plugin mechanism where an on-screen component can be developed as a plugin. Such components are called lenses. We briefly visited this lens, https://github.com/kubernetes/test-infra/tree/master/prow/spyglass/lenses/junit, which displays the test results.
- Robert suggested that there are lenses that could stand being switched over to using a mono font; this would be a good first issue for a new contributor.
- Claudiu suggested that it would be great for test maintainers if Spyglass could give a clearer indication of when tests are being run concurrently, and how many. (Outside the meeting, at the social event, Rob and Claudiu did an informal design session discussing the design of such a new lens; it would be great to continue that conversation.)
- Triage Board: https://storage.googleapis.com/k8s-triage/index.html
- Latest Flakes: https://storage.googleapis.com/k8s-metrics/flakes-latest.json (needs an issue logged, will do this today)
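The same-commit check from the demo (a test that both passes and fails at one k/k commit is flaking) can be sketched in Python. This is only a minimal sketch under stated assumptions, not how TestGrid itself flags flakes: `RunResult` and `find_flaky_tests` are hypothetical names, and the run records are assumed to have already been collected by hand or from some export.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """One test outcome in one CI job run (hypothetical record shape)."""
    test_name: str
    commit: str   # the k/k git commit the run was built from
    passed: bool


def find_flaky_tests(runs):
    """Map each flaky test name to the commits where it both passed and failed."""
    # Collect the set of outcomes seen for every (test, commit) pair.
    outcomes = defaultdict(set)
    for run in runs:
        outcomes[(run.test_name, run.commit)].add(run.passed)

    # A pair with both True and False outcomes is a flake at that commit.
    flaky = defaultdict(list)
    for (test_name, commit), seen in outcomes.items():
        if seen == {True, False}:
            flaky[test_name].append(commit)
    return {name: sorted(commits) for name, commits in flaky.items()}
```

A test that fails at one commit and passes at a later one is not reported, since that pattern is also consistent with a real regression being fixed.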
- Discussions
- Sippy
- There is interest in the CI Signal team in exploring the use of Sippy https://kube-sippy.dptools.openshift.org/sippy-ng/release/kube-master
- Sippy provides a dashboard of CI Job Result data.
- CI Signal are happy to provide feedback to the Sippy dev team and possibly develop use cases for Sippy if they are interested in that.
- We discussed the best way of incorporating Sippy usage into a Kubernetes release cycle. There are multiple options that could be explored; @puerco mentioned creating a Sippy instance that is dedicated to summarizing the k8s jobs on the SIG Testing operated Prow instance.
- Kettle subscribes to change events on the GCS buckets that store CI job results, and job data is added to a BigQuery dataset. Latest Flakes are query results that use this dataset. There is an issue to skin this with a front end for easier viewing, which Robert is working on now.
- Next steps for CI Signal are to roll out Leonard’s CI Signal report so that it is run by Prow periodically; Rob is working on that now, ETA this week. The next step after that is to demo that work to the Reliability WG and decide as a community what the next steps could be in relation to backing up the Reliability WG with data gathered by CI Signal over the course of releases.
This write-up was done one day later, so if I missed anything feel free to continue the discussion here in Slack.