@imrehg
Last active March 10, 2020 12:11
HiPlot with the Faculty Platform

Visualising experiments with HiPlot on the Faculty Platform

HiPlot is a tool for high-dimensional visualisation that comes in handy, for example, when running hyperparameter optimisation of machine learning models. This readme collects some usage notes for making the most of both experiment tracking on the Faculty Platform and HiPlot.

See the HiPlot documentation for more details on anything HiPlot-related.

This readme has two main sections: using HiPlot in notebooks and as an app.

Using in a Jupyter Notebook

Here's an example that can be used in a Jupyter notebook to plot all the logged parameters and metrics. To use it:

  • install hiplot,
  • paste this section into a notebook,
  • adjust the number in the experiments variable (which can be retrieved from the "export to code" section in the experiments view)

The snippet below plots absolutely everything (params, metrics, tags, etc.). This is most likely not what is needed, but it might be a good way to start exploring:

import uuid
import hiplot as hip
import mlflow

# Change this for your experiment
experiments = [0]
df = mlflow.search_runs(experiments)

exp = hip.Experiment()

for _, row in df.iterrows():
    dp = hip.Datapoint(uid=str(uuid.UUID(row["run_id"])), values=row.to_dict())
    exp.datapoints.append(dp)
exp.display(force_full_width=True)

Also included in a notebook.

Note that columns can be dragged to the side of the plot to remove them from the display.

The snippet below plots only the logged parameters and metrics (ignoring all tags, etc.), which might be a better starting point:

import uuid
import hiplot as hip
import mlflow

# Change the numerical value for your experiment
experiments = [0]
df = mlflow.search_runs(experiments)

exp = hip.Experiment()

for _, row in df.iterrows():
    values = {}
    # Add parameters first
    params = [p.replace("params.", "") for p in row.keys() if p.startswith("params.")]
    for p in params:
        values[p] = row[f"params.{p}"]
    # Add metrics next
    metrics = [
        m.replace("metrics.", "") for m in row.keys() if m.startswith("metrics.")
    ]
    for m in metrics:
        # Only metric values below 10 are kept (NaN fails the comparison too);
        # adjust or remove this threshold to suit your data
        if row[f"metrics.{m}"] < 10:
            values[m] = row[f"metrics.{m}"]
    dp = hip.Datapoint(uid=str(uuid.UUID(row["run_id"])), values=values)
    exp.datapoints.append(dp)
exp.display(force_full_width=True)

Also included in a notebook.
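The prefix handling in the snippet above can be pulled out into a small helper and checked without MLflow or HiPlot installed. This is only a sketch: `params_and_metrics` and `example_row` are hypothetical names, the sample record is made up to resemble one row of the `mlflow.search_runs` output, and the sketch skips missing values instead of applying the `< 10` threshold.

```python
from typing import Any, Mapping


def params_and_metrics(row: Mapping[str, Any]) -> dict:
    """Keep only the params.* and metrics.* entries of a run, without prefixes."""
    values = {}
    for prefix in ("params.", "metrics."):
        for key in row.keys():
            if not key.startswith(prefix):
                continue
            value = row[key]
            # Skip entries missing for this run: None, or NaN (which is != itself)
            if value is None or value != value:
                continue
            # Strip only the leading prefix from the column name
            values[key[len(prefix):]] = value
    return values


# Hypothetical run record, shaped like one row of the search_runs output
example_row = {
    "run_id": "0123456789abcdef0123456789abcdef",
    "params.learning_rate": "0.01",
    "metrics.test_rmse": 0.42,
    "metrics.val_rmse": float("nan"),
    "tags.mlflow.user": "someone",
}
example_values = params_and_metrics(example_row)
```

Because the helper only relies on `keys()` and item access, it should accept a DataFrame row as well as a plain dict. Note that a parameter and a metric sharing the same name would overwrite each other once the prefixes are stripped.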

After removing some columns, reorganising the rest, and setting the colouring to the "test_rmse" column (test set root-mean-squared error for our dataset):

Filtering by one of the columns:

Here's an example of setting up exactly what entries to plot:

import hiplot as hip
import mlflow
import pandas as pd

# Change this for your experiment
experiments = [0]
df = mlflow.search_runs(experiments)

exp = hip.Experiment()

# Columns to plot, named as they appear in the search_runs output
variables = [
    "params.learning_rate",
    "params.momentum",
    "metrics.train_rmse",
    "metrics.test_rmse",
    "metrics.val_rmse",
]

for _, row in df.iterrows():
    values = {}
    for v in variables:
        # Skip values missing for this run (None, NaN or empty string),
        # but keep legitimate zeros
        if v in row and pd.notna(row[v]) and row[v] != "":
            values[v] = float(row[v])
    dp = hip.Datapoint(uid=str(row["run_id"]), values=values)
    exp.datapoints.append(dp)
exp.display(force_full_width=True)

Also included in a notebook.

Many of the settings can also be fixed in code (e.g. the order of the columns on the parallel plot); see this part of the docs.
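As a rough illustration of fixing such settings in code, the sketch below builds an options dictionary for the parallel plot. The helper name `parallel_plot_options` is hypothetical, and the `order` and `hide` keys are written as I understand them from the HiPlot documentation, so check them against the version you have installed.

```python
from typing import Iterable


def parallel_plot_options(order: Iterable[str], hide: Iterable[str] = ()) -> dict:
    """Build a settings dict listing the column order and the hidden columns."""
    order, hide = list(order), list(hide)
    # A column cannot sensibly be both positioned and hidden
    clash = sorted(set(order) & set(hide))
    if clash:
        raise ValueError(f"columns both ordered and hidden: {clash}")
    return {"order": order, "hide": hide}


# Column names assumed to match the "variables" list used earlier
options = parallel_plot_options(
    order=["params.learning_rate", "params.momentum", "metrics.test_rmse"],
    hide=["uid"],
)
```

With an experiment `exp` built as in the earlier snippets, the options would then be applied before calling `exp.display(...)`, for example with `exp.display_data(hip.Displays.PARALLEL_PLOT).update(options)`.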

Using as an app

To run HiPlot as an app:

  • Save faculty_hiplot_fetcher.py and start-hiplot-server.sh from this gist into your workspace, in the same folder.
  • Create an environment that installs hiplot
  • Set up the app to use that environment, and run the start-hiplot-server.sh script

If you navigate to the app's interface, you can load experiments directly by using a single experiment's ID number with the faculty:// prefix:

faculty://2

or name:

faculty://Training
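To make the two forms concrete, here is one way such a URI could be split into either a numeric ID or a name. This is a hypothetical sketch, not the code in faculty_hiplot_fetcher.py: the function name `parse_faculty_uri` and the rule "all digits means an ID" are assumptions made for illustration.

```python
from typing import Union

PREFIX = "faculty://"


def parse_faculty_uri(uri: str) -> Union[int, str]:
    """Return the experiment ID (int) or the experiment name (str) from a URI."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not a faculty:// URI: {uri!r}")
    target = uri[len(PREFIX):].strip()
    if not target:
        raise ValueError("missing experiment ID or name")
    # Assumption: a purely numeric target is an ID, anything else is a name
    if target.isascii() and target.isdigit():
        return int(target)
    return target
```

Under this rule an experiment whose name consists only of digits could not be addressed by name. A real HiPlot fetcher would more likely signal a non-matching prefix with `hip.ExperimentFetcherDoesntApply`, so that other fetchers get a chance to handle the URI.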

It can also be used in a multi-experiment setting:

multi://{
    "new model": "faculty://3",
    "old model": "faculty://Training"
}

After removing some of the columns (by dragging them to the left or right side) and reorganising the rest, it's easy to compare the two experiments (an "exp" column is automatically added to distinguish the two sets of experiments).

Check the HiPlot docs regarding multiple experiments.
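Since the multi-experiment URI is a multi:// prefix followed by a JSON object that maps display names to single-experiment URIs, it can be generated instead of typed by hand. A minimal sketch, where `multi_uri` is a hypothetical helper name:

```python
import json
from typing import Mapping


def multi_uri(experiments: Mapping[str, str]) -> str:
    """Build a multi:// URI from a mapping of display name to experiment URI."""
    # json.dumps takes care of quoting and escaping the names and URIs
    return "multi://" + json.dumps(dict(experiments))


uri = multi_uri({"new model": "faculty://3", "old model": "faculty://Training"})
```

The resulting string can be pasted into the app's input box in place of the hand-written version shown earlier.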
