With the next MLflow release, we'd like to update the Python RunData
interface to better support querying metrics, params, and tags. In particular, the current RunData interface, which exposes flat lists of Metric, Param, and RunTag instances, falls short in that it:
- a) requires an O(n) scan to find a metric/param/tag by key, and
- b) forces users to then make an additional field access on a Metric/Param/RunTag instance to get at the actual metric/param/tag value.
We can address point a) by migrating the metrics, params, and tags fields of RunData to dictionaries whose keys are metric, param, and tag keys. This is viable because RunData contains at most one metric, param, and tag per key - for metrics, RunData is expected to contain the maximum metric value at the maximum timestamp for a given key.
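As a sketch of the difference (using a hypothetical stand-in class, not the actual MLflow entities), the dict-based fields turn the O(n) scan plus field access into a single O(1) lookup:

```python
# Hypothetical stand-in for the Metric entity (illustrative only).
class Metric:
    def __init__(self, key, value, timestamp):
        self.key, self.value, self.timestamp = key, value, timestamp

metrics_list = [Metric("loss", 0.12, 1000), Metric("acc", 0.97, 1000)]

# Current interface: O(n) scan over a flat list, then a field access.
loss = next(m for m in metrics_list if m.key == "loss").value

# Proposed interface: dict mapping metric key -> scalar value, O(1) lookup.
metrics_dict = {m.key: m.value for m in metrics_list}
loss_new = metrics_dict["loss"]

assert loss == loss_new == 0.12
```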
Given this dict-based approach, we must also choose a value type for the dictionaries - two reasonable options are:
- Option 1 (proposed): use simple scalar values - strings for params and tags, and floats for metrics.
- Option 2 (alternative): use Python entity classes (Metric, Param, RunTag).
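The two options can be contrasted with a small sketch (again using hypothetical stand-in classes rather than the actual MLflow entities):

```python
# Hypothetical stand-in for the Param entity (illustrative only).
class Param:
    def __init__(self, key, value):
        self.key, self.value = key, value

params = [Param("lr", "0.01"), Param("epochs", "10")]

# Option 1: dict of scalars - value access is direct.
params_scalar = {p.key: p.value for p in params}
lr1 = params_scalar["lr"]

# Option 2: dict of entity objects - reading a value needs an extra
# field access, but any non-value fields added later stay reachable.
params_entity = {p.key: p for p in params}
lr2 = params_entity["lr"].value

assert lr1 == lr2 == "0.01"
```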
Option 1) makes it easier to access scalar metric/param/tag values, but harder to access non-value fields of metrics, params, and tags (e.g. the timestamp & eventually x-coordinate of metrics). Note additionally that option 1) doesn't preclude exposing full Metric/Param/RunTag objects via the RunData interface in the future (i.e. we could introduce new fields in RunData that expose such information). For example, we could handle the addition of new fields to params or tags by introducing lower-level metric_objs, param_objs, and tag_objs fields to RunData that expose flat lists of metrics/params/tags (as the current API does) - however, we expect changes to params & tags to be unlikely. In brief, option 1) trades off flexibility of the RunData interface in exchange for user-friendliness, a compromise that seems worthwhile given the unlikeliness of API changes to Param or RunTag.
A small detail: with option 1) and the current set of fluent/client APIs, it won't be possible to access metric timestamps in Python. We can address this by exposing the AbstractStore's get_metric_history API via the Python client API, e.g. via a get_metric_history method on MlflowClient.
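A toy in-memory sketch of what such a client-side method might look like (all names here are assumptions, not the final API): get_metric_history returns every logged (value, timestamp) pair for a key, recovering the timestamps that the scalar metrics dict drops.

```python
# Toy in-memory client sketching the proposed get_metric_history shape.
class ToyClient:
    def __init__(self):
        # (run_id, metric_key) -> list of (value, timestamp) pairs
        self._history = {}

    def log_metric(self, run_id, key, value, timestamp):
        self._history.setdefault((run_id, key), []).append((value, timestamp))

    def get_metric_history(self, run_id, key):
        # Full history for a key, not just the single latest/maximum value.
        return list(self._history.get((run_id, key), []))

client = ToyClient()
client.log_metric("run1", "loss", 0.5, 1000)
client.log_metric("run1", "loss", 0.3, 2000)

history = client.get_metric_history("run1", "loss")
assert history == [(0.5, 1000), (0.3, 2000)]
```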
Option 2)'s strengths are option 1)'s weaknesses - 2) allows us to add new fields to metrics/params/tags (e.g. metric x-coordinates) and access them via RunData without adding new fields to RunData, at the cost of requiring an extra field access to read metric/param/tag values.
Any feedback on the proposed APIs is much appreciated :) - see the gists below for an example of how user workflows might look with the new APIs. It'd be particularly helpful to hear if there are any use cases that would be helped/hurt by the APIs proposed above. Note also that we expect many query use cases to be addressed by the search_runs API, which allows for filtering & searching runs by metric/param/tag. We'd also eventually like to add support for getting a handle to all logged run data as a Pandas or Spark DataFrame, which we expect will also simplify query use cases as well as data export from a server.
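As a rough sketch of the DataFrame-export idea (pure-Python rows standing in for a real Pandas/Spark DataFrame; the per-run data and column naming here are assumptions for illustration):

```python
# Hypothetical per-run data under the proposed dict-based interface.
runs = {
    "run1": {"metrics": {"loss": 0.12}, "params": {"lr": "0.01"}},
    "run2": {"metrics": {"loss": 0.08}, "params": {"lr": "0.001"}},
}

# Flatten runs into tabular rows - roughly the shape a runs-as-DataFrame
# API might return (e.g. these rows could feed pandas.DataFrame(rows)).
rows = [
    {
        "run_id": run_id,
        **data["metrics"],
        **{f"param.{k}": v for k, v in data["params"].items()},
    }
    for run_id, data in runs.items()
]

assert rows[0] == {"run_id": "run1", "loss": 0.12, "param.lr": "0.01"}
```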