Siddharth Murching smurching

Background

With the next MLflow release, we'd like to update the Python RunData interface to better support querying metrics, params, and tags. In particular, the current RunData interface, which exposes flat lists of Metric, Param, and RunTag instances, falls short in that it:

a) requires an O(n) scan to find a metric/param/tag by key, and
b) forces users to then make an additional field access on the Metric/Param/RunTag instance to get at the actual metric/param/tag value.
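
To make the two shortcomings concrete, here is a minimal sketch contrasting flat-list access with key-based access. The `Metric` namedtuple is a hypothetical stand-in for the MLflow entity, not the real class, and the dictionary shape at the end is an assumption used for illustration only, not the proposed API.

```python
from collections import namedtuple

# Hypothetical stand-in for an MLflow Metric entity (not the real class).
Metric = namedtuple("Metric", ["key", "value", "timestamp"])

# Flat list of entities, similar in shape to what the current interface exposes.
metrics_list = [
    Metric("loss", 0.25, 1000),
    Metric("accuracy", 0.9, 1000),
]


def find_metric_value(metrics, key):
    """Flat-list access: an O(n) scan, followed by a field access."""
    for metric in metrics:
        if metric.key == key:
            return metric.value
    raise KeyError(key)


# Assumed alternative shape: a dictionary mapping each key directly to its value.
metrics_by_key = {m.key: m.value for m in metrics_list}

scan_result = find_metric_value(metrics_list, "accuracy")
dict_result = metrics_by_key["accuracy"]
```

Both lookups return the same value; the dictionary form replaces the scan and the extra `.value` access with a single keyed lookup.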

Design Decisions & Proposed API

We can address point a) by migrating the metrics, params, tags fields of

	import numpy as np
	from matplotlib import pyplot as plt

	def make_plot(X, y, clf, title, filename):
	    '''
	    Plots the decision boundary of the classifier <clf> (assumed to have been fitted
	    to X via clf.fit()) against the matrix of examples X with corresponding labels y.

	    Uses <title> as the title of the plot, saving the plot to <filename>.
	    '''

	# Import transformer and the vector helpers used to build the input data
	from pyspark.ml.linalg import Vectors
	from sparkdl.transformers import KerasVectorTransformer

	# Create input DataFrame (assumes an existing SparkSession named `spark`)
	data = [(Vectors.sparse(5, [(1, 1.0), (3, 7.0)]),),
	        (Vectors.dense([2.0, 0.0, 3.0, 4.0, 5.0]),),
	        (Vectors.dense([4.0, 0.0, 0.0, 6.0, 7.0]),)]
	df = spark.createDataFrame(data, ["features"])

	# Create KerasVectorTransformer

	// You can run this gist in a scala console
	import java.awt.image.{BufferedImage}
	import java.awt.Color

	// Set image pixel at (0, 0)
	val image = new BufferedImage(1, 1, BufferedImage.TYPE_BYTE_GRAY)
	val b: Byte = 26
	val color = new Color(b & 0xff, b & 0xff, b & 0xff)
	image.setRGB(0, 0, color.getRGB)

	# Copyright 2018 Uber Technologies, Inc. All Rights Reserved.
	# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
	#
	# Licensed under the Apache License, Version 2.0 (the "License");
	# you may not use this file except in compliance with the License.
	# You may obtain a copy of the License at
	#
	# http://www.apache.org/licenses/LICENSE-2.0
	#
	# Unless required by applicable law or agreed to in writing, software

	"""
	Core deployment plugin API methods.
	These methods declared under ``mlflow.deployments``, with an analogous ``mlflow deployments`` CLI
	"""

	from abc import ABC

	class BaseDeploymentClient(ABC):
	    """
	    Base class exposing Python model deployment APIs. Plugin implementors should define target-specific

	### Two different flows

	"""
	Flow 1: ensure there's a run accessible to the local machine prior to running
	the project.

	Pros:
	1. The user is guaranteed access to the run created for the project from the machine that triggered
	   project execution.

	from mlflow.tracking.context.abstract_context import RunContextProvider
	from mlflow.utils import databricks_utils
	from mlflow.entities import SourceType
	from mlflow.utils.mlflow_tags import (
	    MLFLOW_SOURCE_TYPE,
	    MLFLOW_SOURCE_NAME,
	    MLFLOW_DATABRICKS_WEBAPP_URL,
	    MLFLOW_DATABRICKS_NOTEBOOK_PATH,
	    MLFLOW_DATABRICKS_NOTEBOOK_ID,
	)

	import mlflow

	# There are two ways to create parent/child runs in MLflow.

	# (1) The most common way is to use the fluent
	# mlflow.start_run API, passing nested=True:
	with mlflow.start_run():
	    num_trials = 10
	    mlflow.log_param("num_trials", num_trials)
	    best_loss = 1e100