- Name : Prasoon Shukla
- Email : [email protected]
- IRC : prasoon2211
Buildbot gathers large amounts of data from its builds. This captured data is usually a quantity that provides information about the build; a prominent example is the build time. At the moment, the user can only access this data for an individual build and not collectively, which is vital for performing any kind of analysis on the data. Also, there is no way for the user to make Buildbot gather specialized data from the build; so, for example, timing a single function in the test suite is not possible.
This project aims to rectify both these problems. Specifically, it aims to:
- Allow the user to gather arbitrary statistics about a project (such as the build size).
- Allow collective access to build statistics via InfluxDB. Additionally, provide graphical analysis of this data via Grafana.
This project, if implemented, can be very beneficial for Buildbot users as it would:
- Allow users to measure arbitrary quantities from the build. So, a user can measure, for each build:
- The final build size
- Runtime of a single function
- Total lines of code
- Number of function calls made by a test, etc.
- Let users perform statistical analysis on this data using InfluxDB. InfluxDB has several statistical functions that operate on the collected data via its SQL-like query language and produce results in table form. This can help provide insights into the development process. For example, the user can calculate the mean build time over a certain time period.
- Produce graphs from gathered test-data which are great for visualizing changes to the project.
The project will consist of the following tasks. Note that these tasks are also logically separate so they can act as microtasks that can be completed one step at a time.
- Updating the existing metrics module to unify data capture. Also, creating new data-reporting methods.

Currently, there are three different `MetricEvent`s in the metrics module which can be invoked to capture data. These are:
- `MetricCountEvent`: for keeping a count of a quantity.
- `MetricTimeEvent`: for keeping track of the time a process takes. This `MetricEvent` keeps track of the last 10 reported values and reports the average of those values.
- `MetricAlarmEvent`: for checking whether an event completed successfully. This has three states, `OK`, `WARN` and `CRITICAL`, for representing the state of the system.
The first thing to be done is to remove both `MetricCountEvent` and `MetricTimeEvent`, along with their `MetricHandler`s, and replace them with a single `captureData` function. This function will be callable everywhere in Buildbot to allow capturing and storing data. The captured data will then be sent to InfluxDB to be stored as a time series. Examples:
- `metrics.captureData('some_func_called', 1, accumulate=True)` for an accumulator.
- `metrics.captureData('some_func_called', 1, accumulate=True, abs_value=True)` for an accumulator that records absolute values.
- `metrics.captureData('some_func_called', 1, diff=True)` for storing the difference w.r.t. the last reported value.
- `metrics.captureData('some_func_called', 1)` for counting absolutely (no accumulating).
- `metrics.captureData('runtime_of_foo', 20)` for capturing time.

We will also need to remove the ad-hoc, in-memory storage methods implemented in these `MetricEvent`s. Instead, `captureData` will simply export the captured data to InfluxDB. We'll also need to provide methods to retrieve this data. Currently, the metrics module simply reports data to the twisted log file. Instead, we will add several statistical data-reporting methods. These methods will query InfluxDB for the required values via the InfluxDB Python API. Finally, we will need to expose these data-getter methods via the JSON API so that they are usable in the frontend. We will also add links in the frontend that can be used for viewing the results of these methods directly via the Buildbot web interface.

We'll add the following `getData` methods (called with a time-series name), which are analogs to standard statistical functions:
- `getAverage`: gathers the reported statistic and produces the average of all values ever reported. For this to work, it will need to interact with InfluxDB via its API; InfluxDB will do the actual job of storing and reporting the average. It can also be called with a time range to get the average over builds during that range.
- `getDeviation`: exactly like `getAverage`, but reports the standard deviation instead.
- `getMinimum`: stores all reported values in InfluxDB but reports the minimum using an InfluxDB aggregate function.
- `getMaximum`: similar to `getMinimum`, except it reports the maximum value.
These are some basic data-getters that act as analogs for standard statistical methods. More such analogous methods could be added later on, as needed. The process for adding a new method should be quite simple.
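To make the data-getter idea concrete, here is a minimal sketch of how `getAverage` might be built on top of the InfluxDB Python client. The host, credentials, database name and function signature are placeholder assumptions rather than a final design; only `InfluxDBClient` and its `query()` method are actual influxdb-python calls.

```python
from influxdb import InfluxDBClient

_client = InfluxDBClient('localhost', 8086, 'buildbot', 'secret', 'buildbot_metrics')

def getAverage(series_name, time_range=None):
    """Return the mean of all values ever reported for `series_name`.

    `time_range` is an optional InfluxQL duration such as '7d'; if given,
    only points newer than now() - time_range are averaged.
    """
    query = "SELECT MEAN(value) FROM {}".format(series_name)
    if time_range:
        query += " WHERE time > now() - {}".format(time_range)
    points = list(_client.query(query).get_points())
    return points[0]['mean'] if points else None
```

`getDeviation`, `getMinimum` and `getMaximum` would follow the same pattern, using InfluxDB's `STDDEV()`, `MIN()` and `MAX()` aggregate functions.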
- Changing `Step`s for collection of statistics.

At the moment, Steps run the commands provided by users and try to collect as much data from the output of the command as possible. Also, Steps measure how much time it took them to run. All of this data gets stored in the build properties (except for the Step runtimes). For all this data to get to InfluxDB, we will need to modify Steps by adding a call into the metrics module for storage. For example, let us say that we're modifying `steps.PyLint`. `PyLint.createSummary` calls `setProperty` to set build properties. We'll therefore modify this method to also make a call into metrics: for example, we can call `metrics.captureData('Pylint warnings', warnings)` to store the number of warnings caught by PyLint (a rough sketch of such a change is shown below).
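To illustrate the kind of change involved, here is a rough sketch. This is not the actual PyLint source; the warning-counting logic is simplified and `metrics.captureData` is the proposed, not yet existing, API:

```python
from buildbot.steps.shell import ShellCommand
from buildbot.process import metrics  # captureData is the proposed new API


class PyLint(ShellCommand):

    def createSummary(self, log):
        # Simplified: count lines in the pylint output that look like warnings.
        warnings = sum(1 for line in log.getText().splitlines() if ': W' in line)
        # Existing behaviour: record the count as a build property.
        self.setProperty('pylint-warnings', warnings, 'PyLint')
        # New behaviour: also report the count to the metrics module, which
        # forwards it to InfluxDB.
        metrics.captureData('Pylint warnings', warnings)
```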
- Implementing a way for users to gather arbitrary statistics.
This is yet another objective of this project. As Buildbot cannot measure every conceivable build statistic on its own, we want to let the user report such statistics to us. Let us take the example of the final build size to see how this will happen:
- First, the user will need to create a small script (shell/batch/python or anything else) that, when run, outputs the build size to stdout.
- Then, the user will run this script in a new Step, a `MeasureShellCommand`, which will be subclassed from `ShellCommand`. This Step will take the `name` keyword argument as required; this name will be a unique string identifier for this data. So, for example, here the user would add this to their `master.cfg`: `f.addStep(steps.MeasureShellCommand(command=["sh", "scripts/size.sh"], name="build_size"))`
- Then, once the script finishes running and writes to stdout, we'll capture this output inside the `MeasureShellCommand` and pass it on to the metrics middleware for storage in InfluxDB.
I've provided another such example in this thread.
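Here is a rough sketch of what such a Step could look like. The exact hook used to read the script's stdout (`commandComplete` and `cmd.logs['stdio']` below) is an assumption based on existing ShellCommand subclasses, and `metrics.captureData` is the proposed API, not existing code:

```python
from buildbot.steps.shell import ShellCommand
from buildbot.process import metrics  # captureData is the proposed new API


class MeasureShellCommand(ShellCommand):
    """Runs a user-supplied script and reports its stdout to the metrics module."""

    def __init__(self, name=None, **kwargs):
        if not name:
            raise ValueError("MeasureShellCommand requires a unique 'name'")
        ShellCommand.__init__(self, name=name, **kwargs)

    def commandComplete(self, cmd):
        # The user's script is expected to print a single numeric value to stdout.
        output = cmd.logs['stdio'].getText().strip()
        try:
            value = float(output)
        except ValueError:
            return  # real code would mark the step as failed instead
        metrics.captureData(self.name, value)
```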
- Changing the frontend to include links to the InfluxDB and Grafana instances.

We'll pick up the InfluxDB URL as well as the URL for the Grafana instance from `master.cfg` and add links to both in the Buildbot frontend. We can also make use of Grafana's hash-bang URLs to direct the user to the right graph once they have set up Grafana correctly (since the user will need to set up their own Grafana dashboard initially). We might even allow custom URLs (from their dashboard) to be put in `master.cfg`, which we could dynamically place in the frontend. Also, we'll need to add links to the data-getter methods added in metrics (see above).

Note: if the user opts out of this service, these links will not be visible in the frontend. This can be achieved with a simple flag passed to the frontend: if the flag is true, we show the links; if not, Buildbot will be displayed as it usually is.
- Storage via metrics in InfluxDB.

InfluxDB employs a schema-less DB for storage, so data is stored internally as key-value pairs. For our storage policy, we will:
- either take the key (the name of a statistic) from the user, as in the case of `MeasureShellCommand`, or
- provide a key (such as the build-property name) from within Buildbot itself.

Then, storage is as simple as making a call to the InfluxDB Python API.
- Here's a simple diagram that illustrates the flow of data:

Internally, we will use the tags feature (new in InfluxDB 0.9.0) to keep track of data. Here's how: suppose we have a couple of steps named `compile` for several different builders. Then, we can tag each data point in the InfluxDB time series named `compile` with the name of the builder as well as the build number. This way, we can quickly retrieve data in the data-getter methods inside the metrics module.
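For example, a data point for a `compile` step might be written and later filtered by builder roughly like this (a sketch only; the client settings, database name and field layout are assumptions, while `write_points()` and `query()` are actual influxdb-python calls):

```python
from influxdb import InfluxDBClient

client = InfluxDBClient('localhost', 8086, 'buildbot', 'secret', 'buildbot_metrics')

# Write one point to the 'compile' series, tagged with builder name and build number.
client.write_points([{
    "measurement": "compile",
    "tags": {"builder": "first_builder", "build_number": "42"},
    "fields": {"value": 153.2},   # e.g. step runtime in seconds
}])

# Later, a data-getter can filter on those tags.
points = client.query(
    "SELECT value FROM compile WHERE builder = 'first_builder'"
).get_points()
```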
Format of data provided in
master.cfg
: The user will need to provide us URLs to both the influxDB instance as well as the Grafana instance (running on apache/nginx). The configuration might look something like this (taking from Dustin's suggestion and this comment):b1 = util.BuilderConfig(name='first_builder', slavenames=['bot1', 'bot2'], factory=f1), b1.trackStepTimes('fetch', 'compile', 'big_test_2') # See [1] b1.trackProperties('lint_warnings', 'lint_errors', 'test_skips') # See [2] b1.trackCustomStep('build_size', 'total_SLOC') # See [3] c['builders'].append(b1) INFLUXDB_URL = 'http://www.example.com:8083' GRAFANA_URL = 'http://grafana.somewebsite.com' c['service'].append( InfluxDBInstance( url=INFLUXDB_URL, username=<username>, # as needed password=<password>, # as needed trackBuildTimes=True, # tacks all build times ) ) c['service'].append( GrafanaInstance( url=GRAFANA_URL, username=<username>, # as needed password=<password> # as needed ) )
[1] : The names need to be defined for Steps first using the
name
field of the step. Getting a non-existent Step name will raise aNameError
. [2] : Again, these properties names need to exist before hand or should be set manually. [3] : Again, these are names of theMeasureShellCommand
Step.Of course, we will need to add these new methods on
BuilderConfig
. -
- Option to opt out of metrics collection.

Not all users may want this functionality. As such, there should be an option to opt out of this service. For this, we can simply check whether `c['services']` is `None` or not. If it is `None`, then the metrics module will not make any calls to InfluxDB. Instead, we will fall back on in-memory storage, as is done now. We can even use the existing methods for implementing the fall-back behaviour.
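A minimal sketch of how that fall-back check might look inside the proposed `captureData` (all names here are illustrative assumptions; the real hook point would depend on how the InfluxDB service gets registered):

```python
_in_memory = {}            # existing-style in-memory storage
_influx_service = None     # set only when the user has configured an InfluxDBInstance


def captureData(name, value, **kwargs):
    if _influx_service is None:
        # User opted out of InfluxDB: keep the current in-memory behaviour.
        _in_memory.setdefault(name, []).append(value)
        return
    # Otherwise forward the value to InfluxDB via the configured service.
    _influx_service.store(name, value, **kwargs)
```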
This project will require a lot of documentation to be usable. As such, here are the topics for which I'll provide documentation:
- Installation of InfluxDB and Grafana
- Configuring InfluxDB, Grafana and Apache/Nginx (to run Grafana)
- How to gather arbitrary statistics along with an example or two.
- Configuration of `master.cfg`, along with examples.
- Documentation of the new Step, `MeasureShellCommand`.
Developer documentation will need to cover all of the configuration options, all changes to the Steps, all the new MetricEvents and documentation on how to add new Steps that can be used to extend support to new testing frameworks/linters etc.
Also, documentation will be done as I complete features. So, for every feature that I implement, I shall provide documentation along with it. (See timeline).
Tests will need to be added for each MetricEvent, checking whether data being fed to influxDB is being stored. These will check the insertion of data using influxDB's HTTP API to retrieve data and match it against what was inserted.
Also, we'll need tests to confirm that Steps after modification are working properly. For this, we can check whether the data being relayed to metrics module from Steps is being stored in influxDB. Again, for this, we'll use the influx HTTP API to confirm whether data was inserted in influx's database.
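As a rough illustration of what such a test might look like (assuming a test InfluxDB instance is reachable and that `metrics.captureData` is the new entry point; host, database and series names are placeholders):

```python
from influxdb import InfluxDBClient
from buildbot.process import metrics  # captureData is the proposed new API


def test_captured_data_reaches_influxdb():
    # Report a known value through the metrics module...
    metrics.captureData('test_series', 42)

    # ...and confirm it can be read back from InfluxDB.
    client = InfluxDBClient('localhost', 8086, 'buildbot', 'secret', 'buildbot_test')
    points = list(client.query('SELECT value FROM test_series').get_points())
    assert points and points[-1]['value'] == 42
```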
This is purely a stretch goal. If there's enough time towards the end, the implementation is possible. That said, here are my thoughts on the porting:
I've briefly read the Eight documentation for Steps and also ran a git diff on `master/buildbot/steps/`. From this, I gather that the underlying architecture does not change for Steps. So, in Eight as well as in Nine, Steps capture data and feed it to the build's `setProperty`. We can therefore update Steps in Eight, just as we do for Nine, and have the Steps pass the captured values to the metrics module (as I've mentioned under the Tasks heading above).
The metrics module itself is relatively unchanged going from Eight to Nine. As such, we should be able to simply copy all the work from Nine (including tests and documentation) into Eight, as far as the metrics module is concerned.
I think that once the project is implemented for Nine, it can be ported to Eight without too much hassle. Still, I cannot provide an estimate of how long it will take. I have nevertheless made a concession for porting in the timeline.
Here's a week-by-week division of tasks (mentioned under the Tasks heading). Also, before any work starts, during the community bonding period (27th Apr - 25th May), I will work with my mentor to finalize any small details that might remain. I'll also read the documentation of Buildbot, InfluxDB and Grafana extensively.
Time | Work to be done |
---|---|
May 25 - June 2 | Work on updating the metrics module (as described in Tasks) to work with InfluxDB. Write tests and documentation. |
June 3 - June 19 | Update all existing Steps to be able to make calls into the metrics module middleware for feeding build statistics to InfluxDB. Write tests and documentation. |
June 20 - June 28 | Implement `MeasureShellCommand` for capturing arbitrary statistics. Write tests and documentation. |
Midterm Evaluation | Milestones: 1) Metrics module can now interface with InfluxDB to store data. 2) Steps can store gathered statistics in InfluxDB via the metrics module middleware. 3) `MeasureShellCommand` implemented, which can be used for gathering arbitrary statistics. 4) Documentation as well as tests have been written for the work done until now. |
June 28 - July 5 | Set up a local Grafana instance to work with InfluxDB. Also, write documentation for setting up Grafana with apache (and nginx too). |
July 6 - July 13 | Modify the frontend to include links to the Grafana instance if the user has opted in to this service. |
July 13 - July 19 | At this point, the project is nearly done. I'll take this week to deploy the project for the Buildbot master and fix any bugs that are encountered. |
July 20 - August 9 | Port to Eight (see the heading 'Porting to Eight') [1]: 1) Port metrics in 2-3 days along with tests. 2) Port updated Steps as well as the `MeasureShellCommand`, with tests. 3) Test deployment for the Buildbot master. |
August 10 - August 21 | Buffer time for any unexpected eventualities. Also, use this time to polish the code, make it meet Buildbot coding standards, and make it production-ready. |
Final evaluation | All project goals (except perhaps porting) finished. Also, a working instance deployed for the Buildbot master. |
[1] : I realize that two weeks might be too little time for porting. That said, I would like to mention again that porting is a stretch goal - it might not finish entirely during the summer itself but as I've mentioned under the 'Porting to Eight' heading above, porting shouldn't take too long.
How much time can you devote to your work on Buildbot on a daily basis? Do you have another summer job or other obligations?
I can devote 7-8 hours a day on average, excluding weekends, when I'll work less (3-4 hours). I don't have any summer internships lined up. I will, however, be relatively occupied during 26th May - 29th May.
Do you have any other time commitments over the summer?
Except for some light travelling (during which time I'll still work) for a week in July, I will be free for the rest of the summer. One more thing to note is that my classes will resume around 21st July (by which time most of the work will be done) so I'll be able to work somewhat less after that. I have, however, made my timeline keeping this in mind.
I have been working with Python as my primary language for the last three years and I believe that I know it quite well. Seeing as this project is almost entirely in Python, I am certain that this past experience with Python will come in quite handy. I am also comfortable with JavaScript (though I don't know it as well as Python), which will be part of this project towards the end, when I have to modify the frontend to include links to the Grafana instance.
I have previously worked on both open source and closed source projects, both mostly in Python.
I've contributed code to a couple of open-source projects before. Chief among those is SymPy[1] in which I participated as a student in last year's summer of code (GSoC 2013)[2]. This project went well. I've also contributed code to mercurial[3], Servo[4] - (Mozilla's new browser engine in Rust), Joomla[5], WordPress[6], Django[7], SimpleCV[8], django-browserid[9], sugarlabs[10]. I also participated as a GSoC 2014 student for sugarlabs[11] which didn't go quite as well as my first GSoC - the project was rather small and I completed it successfully but it never got deployed because of problems with getting web hosting (required to run a docker container).
My closed-source work consists of three Python/Django applications made for my college's intranet network: one is a small used-goods buying/selling online market, another is a college-wide website for reporting lost/found items, and the third is a response portal for taking student responses about faculty and courses. I did this work as part of a campus group that makes several web applications for the residents of our college campus. We've also made our institute's main website.
I am studying Applied Mathematics at the Indian Institute of Technology, Roorkee. I am doing pretty okay here.
My GitHub profile : @prasoon2211
Links: [1]: Some important patches: One, Two, Three, Four [2]: Majority of code, Official link [3]: One, Two, Three, Merged Patch [4]: Merged PR [5]: Several PRs [6] : One, Two [7]: Django, Patch [8]: One, Two, Three, Four [9]: django-browserid, Patch [10]: One, Two [11]: Official link
I've created a Buildbot instance running on AWS for SymPy on the public IP http://52.74.109.165:8020/. Right now, I'm only running SymPy standard tests but I plan to add coverage as well. The Buildbot instance is open (as in I don't have any users) so anyone is welcome to force a build.
This section is in response to points raised in ticket 3234.
Identify what kind of metric "entities" need to be implemented beside those that already exist and implement them
I have instead chosen to unify all metric gathering into one function and then expose several data-getter methods that can perform analysis on the gathered data. All details for these methods are under the Tasks heading.
Implement a way to store metrics in an external metrics storage (my inclination would be InfluxDB)
InfluxDB is a great idea and I have provided a detailed data storage scheme in the main body of the proposal including a flow of movement of data through the code.
Identify what kind of metrics we'd like to produce (entities (e.g. builds), parts of them (e.g. steps), etc); propose a naming scheme and list of "missing" functionality for metric generation (if any)
I'll try to summarize what I wrote in the proposal as a response to this point.
- Buildbot already gathers a lot of build statistics through Steps for supported test frameworks/linters etc. This data gets set as build properties.
- Additionally, all the Step times are measured and stored internally in the Buildbot database.
- But, a user might want to produce metrics for arbitrary build statistics. For this, he/she can either:
  - Implement a new Step for gathering a specific type of data. For example, a user might have to write a JSLint step to extract data from JSLint, just as the existing PyLint step extracts data from pylint.
  - Use a script-based approach with `MeasureShellCommand`.
Both of these methods are documented in the main body of the proposal. This way, a user is free to produce any kind of metrics from build data.
From this collected data, I've proposed to implement a few statistical analogs for metrics generation as part of the metrics module (see under the Tasks heading above). This is the functionality missing in the metrics module, that is, data reduction on the gathered build statistics. As I said, I'll be adding some basic statistical methods to the metrics module which will work through InfluxDB's aggregate functions.
One additional point to keep in mind is that InfluxDB (and so Grafana as well) has its own SQL-like query language and several aggregate functions. If a user is willing to learn it, then he/she can perform much more complicated queries on the InfluxDB database.
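For instance, a user could compute the weekly mean build time of one builder over the last month directly through the Python client (series and tag names here are just examples):

```python
from influxdb import InfluxDBClient

client = InfluxDBClient('localhost', 8086, 'buildbot', 'secret', 'buildbot_metrics')
result = client.query(
    "SELECT MEAN(value) FROM build_time "
    "WHERE builder = 'first_builder' AND time > now() - 30d "
    "GROUP BY time(7d)"
)
for point in result.get_points():
    print(point['time'], point['mean'])
```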
Implement metric generation in the agreed order (this will be agreed after the list in the previous step is produced)
Covered in last point.
Use an existing metric visualisation tool (my inclination would be Grafana) to see metrics for a test installation
Grafana is an excellent visualization tool for InfluxDB, with one caveat: it needs a running apache instance. To remedy this, I'll write comprehensive documentation for configuring Grafana with both apache and nginx.
Deploy the whole thing for nine.buildbot.net
I've provided one week's time for this in the timeline/schedule.
Here are some thoughts...
You should think "bigger" in terms of the configuration mechanism. Although there might be a default, the influxdb series name in all cases should be definable by us. I'd suggest that it maybe isn't even valuable to have a default. Consider that in a big installation, there are perhaps 100's of builders. Which steps and what metrics to collect, the names of the resulting series', and what tags to use should all be associated with builders rather than being global. You can't assume that the step names are uniform across different builders. Perhaps this could be configured via new parameters to addStep.
Given the timing, you should expect to use influxdb 0.9.0 or higher, and utilize the tag concept that it introduces. It looks like the current concepts in 0.8.8 will soon be obsolete.
When plugging this into the current metric system, I believe that we need a clearer separation of the kind of metric (a count, a duration, or a string) and the processing of that metric for the purposes of reporting (value, average, stddev, min, max, etc). For example, if we've measured the duration of our build steps, we probably want to get reports on min, max, avg of that same value. You don't want to have to constrain the report type when you measure the values, those two concepts are mostly orthogonal.