@devops-school
Last active April 5, 2026 06:05
Databricks Tutorials – Part 2 – SQL & AI Foundations Lab using Serverless Starter Warehouse

Databricks AI/ML End-to-End Lab

Fresh workspace, self-contained, step by step

This lab is designed for a fresh Databricks environment with no preexisting catalog, schema, table, or custom data. It uses only Databricks-provided sample data and an official Databricks agent notebook, so the student does not need to generate any sample dataset manually. Databricks provides sample data in the samples catalog and the /databricks-datasets directory, and its official retrieval-agent tutorial notebook is described as standalone and ready to run with no setup or data required. (Databricks Documentation)

This tutorial intentionally uses two compute paths because that is the correct Databricks design today:

  • Serverless Starter Warehouse for SQL exploration
  • Serverless notebook compute for Python, MLflow, training, and experiments

A notebook attached to a SQL warehouse can run only SQL and Markdown cells, while serverless notebook compute supports interactive Python and SQL execution without infrastructure management. (Databricks Documentation)

This lab also uses Unity Catalog model registration, which requires the schema permissions needed to create tables and models. Databricks documents that you need USE CATALOG, USE SCHEMA, CREATE TABLE, and CREATE MODEL privileges in the target schema to follow this pattern. (Databricks Documentation)


1. What the student will build

By the end of this lab, the student will:

  • query built-in sample data from a Serverless Starter Warehouse
  • create a schema
  • load built-in wine-quality sample files into Unity Catalog tables
  • write Python in a serverless notebook
  • train and compare models with MLflow
  • register the best model in Unity Catalog
  • deploy it with Mosaic AI Model Serving
  • test the endpoint
  • prototype an AI agent in AI Playground
  • run an official Databricks retrieval-agent notebook

Databricks positions these workflows under its current AI and machine learning platform, Mosaic AI, which unifies model development, deployment, and AI application workflows. (Databricks Documentation)


2. High-level architecture

flowchart LR
    A[Serverless Starter Warehouse<br/>SQL exploration] --> B[main.ai_ml_starter schema]
    B --> C[Serverless Notebook<br/>load sample data]
    C --> D[MLflow Experiment<br/>train + compare]
    D --> E[Unity Catalog Model<br/>register best model]
    E --> F[Model Serving<br/>deploy endpoint]
    F --> G[Test predictions]
    C --> H[AI Playground<br/>LLM + tools]
    H --> I[Official Retrieval Agent Notebook]

Instructor note This diagram is the teaching backbone of the course. Keep reminding students that Databricks is one platform, but not all tasks run on the same compute type.


3. Before you start

Required workspace capabilities

The workspace should have:

  • Unity Catalog enabled
  • Serverless Starter Warehouse working
  • Serverless notebook compute available
  • Serving visible in the left navigation
  • Playground visible in the left navigation

Databricks documents serverless notebooks and Model Serving as separate capabilities, and AI Playground is the no-code entry point for LLM and agent prototyping. (Databricks Documentation)

Permissions needed

For the schema you will use, the student needs:

  • USE CATALOG
  • USE SCHEMA
  • CREATE TABLE
  • CREATE MODEL

Databricks explicitly lists these privileges for Unity Catalog–based model workflows. (Databricks Documentation)

Instructor note If a student cannot create a schema in main, do not let them get stuck. Tell them to use any writable catalog and replace main everywhere in the code.


4. Lab section A — Start with SQL on the Starter Warehouse

Step A1 — Open SQL Editor

  1. In the left navigation, click SQL Editor
  2. Attach Serverless Starter Warehouse
  3. Run this query
SELECT *
FROM samples.tpch.customer
LIMIT 10;

Expected output

A result grid with sample customer rows.

Why this matters

This proves the Starter Warehouse is working and shows students that Databricks already includes sample datasets through the samples catalog. (Databricks Documentation)


Step A2 — Create a schema for the lab

Run:

CREATE SCHEMA IF NOT EXISTS main.ai_ml_starter;

Expected output

A success message showing the schema was created, or no error if it already existed.

Instructor note If this fails because main is not writable, replace main with another writable catalog and keep the rest of the tutorial identical.


Step A3 — Confirm the schema exists

Run:

SHOW SCHEMAS IN main;

Expected output

A list of schemas that includes ai_ml_starter.


5. Lab section B — Create the notebook and attach serverless notebook compute

Step B1 — Create the notebook

  1. Click Workspace
  2. Create a new notebook
  3. Name it:

01_ai_ml_end_to_end

  4. In the compute dropdown, select Serverless

Databricks documents that if serverless interactive compute is enabled, all users in the workspace can attach notebooks to Serverless, and new notebooks can default to serverless when no other compute is selected. (Databricks Documentation)

Expected output

The notebook opens and shows Serverless as the attached compute.


Step B2 — Install the libraries

Notebook cell 1 — Python

%pip install -U mlflow scikit-learn pandas matplotlib hyperopt
dbutils.library.restartPython()

Expected output

Package installation logs, followed by Python restart.

Instructor note Have students wait until Python fully restarts before moving on.


6. Lab section C — Load Databricks sample data into your own tables

Step C1 — Add setup variables

Notebook cell 2 — Python

import mlflow
import pandas as pd
import sklearn.metrics
import sklearn.model_selection
import sklearn.ensemble
import matplotlib.pyplot as plt

from hyperopt import fmin, tpe, hp, SparkTrials, STATUS_OK
from hyperopt.pyll import scope

mlflow.set_registry_uri("databricks-uc")

CATALOG_NAME = "main"
SCHEMA_NAME = "ai_ml_starter"
MODEL_NAME = f"{CATALOG_NAME}.{SCHEMA_NAME}.wine_quality_model"

spark.sql(f"CREATE SCHEMA IF NOT EXISTS {CATALOG_NAME}.{SCHEMA_NAME}")

print("Registry URI:", "databricks-uc")
print("Target model:", MODEL_NAME)

Expected output

The notebook prints the registry URI and full model name.

Why this matters

Databricks uses Unity Catalog as the hosted MLflow model registry for governed model management across workspaces. (Databricks Documentation)
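The MODEL_NAME built above follows the Unity Catalog three-level naming convention, catalog.schema.object. As a quick illustration of that structure (a hypothetical helper for teaching, not a Databricks API):

```python
# Hypothetical helper illustrating Unity Catalog three-level naming
# (catalog.schema.object). Not a Databricks API.
def split_uc_name(full_name: str) -> dict:
    parts = full_name.split(".")
    if len(parts) != 3:
        raise ValueError("Unity Catalog names use catalog.schema.object")
    catalog, schema, obj = parts
    return {"catalog": catalog, "schema": schema, "object": obj}

MODEL_NAME = "main.ai_ml_starter.wine_quality_model"
print(split_uc_name(MODEL_NAME))
# {'catalog': 'main', 'schema': 'ai_ml_starter', 'object': 'wine_quality_model'}
```

This is why replacing main with another writable catalog, as the instructor notes suggest, only changes the first segment of every name in the lab.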


Step C2 — Load the built-in wine-quality files

Notebook cell 3 — Python

white_wine = spark.read.csv(
    "/databricks-datasets/wine-quality/winequality-white.csv",
    sep=";",
    header=True,
    inferSchema=True
)

red_wine = spark.read.csv(
    "/databricks-datasets/wine-quality/winequality-red.csv",
    sep=";",
    header=True,
    inferSchema=True
)

for c in white_wine.columns:
    white_wine = white_wine.withColumnRenamed(c, c.replace(" ", "_"))

for c in red_wine.columns:
    red_wine = red_wine.withColumnRenamed(c, c.replace(" ", "_"))

display(white_wine.limit(5))
display(red_wine.limit(5))

Expected output

Two preview tables showing wine-quality data.

Why this matters

This uses sample data that already exists in Databricks, so the student does not need to create or upload their own dataset. Databricks’ official ML getting-started tutorial uses this same wine dataset. (Databricks Documentation)
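The rename loops in cell 3 exist because the CSV headers contain spaces (for example "fixed acidity"), which Delta table columns do not accept by default. The transformation itself is plain string replacement, shown here on the column names alone:

```python
# The wine-quality CSV headers contain spaces; Delta table columns
# cannot contain spaces by default, so each space becomes an underscore.
raw_columns = ["fixed acidity", "volatile acidity", "citric acid", "quality"]
clean_columns = [c.replace(" ", "_") for c in raw_columns]
print(clean_columns)
# ['fixed_acidity', 'volatile_acidity', 'citric_acid', 'quality']
```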


Step C3 — Save the data as Unity Catalog tables

Notebook cell 4 — Python

white_wine.write.mode("overwrite").saveAsTable(f"{CATALOG_NAME}.{SCHEMA_NAME}.white_wine")
red_wine.write.mode("overwrite").saveAsTable(f"{CATALOG_NAME}.{SCHEMA_NAME}.red_wine")

print("Created tables:")
print(f"{CATALOG_NAME}.{SCHEMA_NAME}.white_wine")
print(f"{CATALOG_NAME}.{SCHEMA_NAME}.red_wine")

Expected output

Printed table names confirming creation.


Step C4 — Validate with SQL inside the notebook

Notebook cell 5 — SQL

SELECT quality, COUNT(*) AS cnt
FROM main.ai_ml_starter.red_wine
GROUP BY quality
ORDER BY quality;

Expected output

A distribution table showing counts by wine quality.

Instructor note This is the first moment students see notebook SQL and Unity Catalog tables together. Pause here and explain the flow: built-in files → own tables → reusable governed assets.


7. Lab section D — Go back to the Starter Warehouse and query your own tables

Step D1 — Open SQL Editor again

Attach Serverless Starter Warehouse and run:

SELECT COUNT(*) AS rows_red
FROM main.ai_ml_starter.red_wine;

SELECT COUNT(*) AS rows_white
FROM main.ai_ml_starter.white_wine;

Expected output

Two row counts, one for red wine and one for white wine.


Step D2 — Run a slightly richer query

SELECT quality, COUNT(*) AS cnt
FROM main.ai_ml_starter.red_wine
GROUP BY quality
ORDER BY quality;

Expected output

A quality distribution result table.

Teaching point

This shows the student that:

  • the notebook created the tables
  • the Starter Warehouse can query them immediately
  • the same governed data supports both notebook and SQL use cases

8. Lab section E — Train the first model

Step E1 — Prepare the training dataset

Notebook cell 6 — Python

white_pdf = spark.read.table(f"{CATALOG_NAME}.{SCHEMA_NAME}.white_wine").toPandas()
red_pdf = spark.read.table(f"{CATALOG_NAME}.{SCHEMA_NAME}.red_wine").toPandas()

white_pdf["is_red"] = 0.0
red_pdf["is_red"] = 1.0

data_df = pd.concat([white_pdf, red_pdf], axis=0)

labels = data_df["quality"].astype("int") >= 7
features = data_df.drop(["quality"], axis=1)

X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
    features,
    labels,
    test_size=0.2,
    random_state=1
)

print("Train shape:", X_train.shape)
print("Test shape:", X_test.shape)
print("High-quality rate:", float(labels.mean()))

Expected output

Printed shapes for training and test sets, plus the overall positive-label rate.

Teaching point

The model goal is to predict whether a wine is high quality. Databricks’ official tutorial uses this same framing for the wine-quality example. (Databricks Documentation)
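The quality >= 7 threshold turns the 0-10 quality score into a binary "high quality" label. A minimal pandas sketch of the same binarization and feature split, on a toy frame:

```python
import pandas as pd

# Toy frame mimicking the wine data's 'quality' column (scores 0-10).
df = pd.DataFrame({
    "quality": [4, 5, 6, 7, 8],
    "alcohol": [9.0, 9.5, 10.0, 10.5, 11.0],
})

# Same steps as the lab: "high quality" means quality >= 7,
# and everything except 'quality' becomes a feature.
labels = df["quality"].astype("int") >= 7
features = df.drop(["quality"], axis=1)

print(labels.tolist())          # [False, False, False, True, True]
print(list(features.columns))   # ['alcohol']
```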


Step E2 — Train a baseline model and log it to MLflow

Notebook cell 7 — Python

mlflow.autolog()

with mlflow.start_run(run_name="gradient_boost_baseline"):
    model = sklearn.ensemble.GradientBoostingClassifier(random_state=0)
    model.fit(X_train, y_train)

    predicted_probs = model.predict_proba(X_test)
    roc_auc = sklearn.metrics.roc_auc_score(y_test, predicted_probs[:, 1])

    roc_display = sklearn.metrics.RocCurveDisplay.from_estimator(model, X_test, y_test)
    roc_display.figure_.savefig("/tmp/roc_curve.png")

    mlflow.log_metric("test_auc", roc_auc)
    mlflow.log_artifact("/tmp/roc_curve.png")

    print("Baseline test AUC:", roc_auc)

Expected output

A printed AUC value and an MLflow run link in the notebook output.

Instructor note

Tell students to click the run link and inspect:

  • parameters
  • metrics
  • artifacts
  • the logged model

This is the “experiment and compare” part of the lifecycle.
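The test_auc metric logged above is ROC AUC, which equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. A pure-Python sketch of that rank interpretation, useful when explaining the metric to students:

```python
def auc_by_ranks(scores, labels):
    """ROC AUC via pairwise comparison: the fraction of (positive, negative)
    pairs where the positive example scores higher (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [0.9, 0.8, 0.4, 0.3]   # model scores
labels = [1, 0, 1, 0]           # true classes
print(auc_by_ranks(scores, labels))  # 0.75
```

sklearn.metrics.roc_auc_score computes the same quantity far more efficiently; this version is only for intuition.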


9. Lab section F — Tune the model and compare runs

Step F1 — Run a small hyperparameter search

Notebook cell 8 — Python

search_space = {
    "n_estimators": scope.int(hp.quniform("n_estimators", 20, 300, 1)),
    "learning_rate": hp.loguniform("learning_rate", -3, 0),
    "max_depth": scope.int(hp.quniform("max_depth", 2, 6, 1)),
}

def train_model(params):
    mlflow.autolog()
    with mlflow.start_run(nested=True):
        model_hp = sklearn.ensemble.GradientBoostingClassifier(
            random_state=0,
            **params
        )
        model_hp.fit(X_train, y_train)

        predicted_probs = model_hp.predict_proba(X_test)
        auc = sklearn.metrics.roc_auc_score(y_test, predicted_probs[:, 1])

        mlflow.log_metric("test_auc", auc)
        return {"loss": -auc, "status": STATUS_OK}

spark_trials = SparkTrials(parallelism=4)

with mlflow.start_run(run_name="gb_hyperopt"):
    best_params = fmin(
        fn=train_model,
        space=search_space,
        algo=tpe.suggest,
        max_evals=8,
        trials=spark_trials
    )

print("Best parameters:", best_params)

Expected output

A printed dictionary of best parameters and multiple child runs in MLflow.

Teaching point

Databricks’ getting-started ML tutorial highlights MLflow tracking together with Hyperopt-driven tuning, so this is a natural next step after the baseline model. (Databricks Documentation)
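Hyperopt only minimizes, which is why train_model returns loss = -auc: minimizing the negative of a metric maximizes the metric. A tiny pure-Python analogue of that convention:

```python
# Hyperopt minimizes a loss, so to maximize a metric we minimize its negative.
# Toy metric with a known maximum at x = 3.
def metric(x):
    return -(x - 3) ** 2 + 10

candidates = [1, 2, 3, 4, 5]
# "Search": pick the candidate with the smallest loss (= -metric).
best_x = min(candidates, key=lambda x: -metric(x))
print(best_x, metric(best_x))  # 3 10
```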


Step F2 — Identify the best run

Notebook cell 9 — Python

best_run = mlflow.search_runs(
    order_by=["metrics.test_auc DESC", "start_time DESC"],
    max_results=1
).iloc[0]

print("Best run_id:", best_run.run_id)
print("Best test_auc:", best_run["metrics.test_auc"])

Expected output

A printed best run_id and best AUC.


10. Lab section G — Register the best model in Unity Catalog

Step G1 — Register the model

Notebook cell 10 — Python

model_uri = f"runs:/{best_run.run_id}/model"
registered_model = mlflow.register_model(model_uri, MODEL_NAME)

print("Registered model:", registered_model.name)
print("Version:", registered_model.version)

Expected output

A printed model name such as main.ai_ml_starter.wine_quality_model and version number 1.

Why this matters

This is the point where a trained experiment becomes a governed, reusable model artifact in Unity Catalog. Databricks documents that Models in Unity Catalog extend centralized governance, discovery, auditing, and lineage to ML models. (Databricks Documentation)


Step G2 — Verify in the UI

  1. Open Catalog
  2. Go to main
  3. Open schema ai_ml_starter
  4. Confirm the model appears there

Expected output

A registered model object visible in the schema.

Instructor note Pause here and reinforce the concept: data assets and model assets are both governed under Unity Catalog.


11. Lab section H — Deploy the model with Mosaic AI Model Serving

Step H1 — Create the serving endpoint

  1. Click Serving
  2. Click Create serving endpoint
  3. Name it:

wine-quality-endpoint

  4. Add a served model
  5. Choose the registered model:

main.ai_ml_starter.wine_quality_model

  6. Select the latest version
  7. Create the endpoint

Databricks documents Mosaic AI Model Serving as its managed solution for deploying AI and ML models for real-time serving and batch inference. (Databricks Documentation)

Expected output

An endpoint that moves through provisioning and eventually becomes Ready.


Step H2 — Test the endpoint in the UI

Use this sample request body in the endpoint testing panel:

{
  "dataframe_records": [
    {
      "fixed_acidity": 7.4,
      "volatile_acidity": 0.70,
      "citric_acid": 0.00,
      "residual_sugar": 1.9,
      "chlorides": 0.076,
      "free_sulfur_dioxide": 11.0,
      "total_sulfur_dioxide": 34.0,
      "density": 0.9978,
      "pH": 3.51,
      "sulphates": 0.56,
      "alcohol": 9.4,
      "is_red": 1.0
    }
  ]
}

Expected output

A prediction response with a classification result.
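The same request can also be sent programmatically against the endpoint's REST path (/serving-endpoints/&lt;name&gt;/invocations). The sketch below only builds the request; the workspace URL and token are placeholders you must supply, and the actual POST is left as a comment:

```python
import json

# Sketch of a programmatic scoring call to a Databricks serving endpoint.
# WORKSPACE_URL and TOKEN are placeholders, not real values.
WORKSPACE_URL = "https://<your-workspace>.cloud.databricks.com"
TOKEN = "<personal-access-token>"
ENDPOINT = "wine-quality-endpoint"

def build_invocation(workspace_url, token, endpoint, records):
    """Build the URL, headers, and JSON body for a scoring request."""
    url = f"{workspace_url}/serving-endpoints/{endpoint}/invocations"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"dataframe_records": records})
    return url, headers, body

url, headers, body = build_invocation(
    WORKSPACE_URL, TOKEN, ENDPOINT,
    [{"fixed_acidity": 7.4, "volatile_acidity": 0.70, "citric_acid": 0.00,
      "residual_sugar": 1.9, "chlorides": 0.076, "free_sulfur_dioxide": 11.0,
      "total_sulfur_dioxide": 34.0, "density": 0.9978, "pH": 3.51,
      "sulphates": 0.56, "alcohol": 9.4, "is_red": 1.0}],
)
print(url)
# Send with: requests.post(url, headers=headers, data=body)
```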

Instructor note This is the “release and serve” moment. Make sure students see that the model is no longer only a notebook object or experiment artifact.


12. Lab section I — Prototype AI behavior in AI Playground

Step I1 — Open Playground and compare two models

  1. Click Playground
  2. Choose one hosted model
  3. Add another model with the + button
  4. Ask:

Explain in simple English what a classifier does.

Databricks’ current no-code GenAI tutorial uses AI Playground to compare models side by side and prototype agents. (Databricks Documentation)

Expected output

Two parallel model responses for the same prompt.


Step I2 — Prototype a tool-calling agent

  1. In Playground, choose a model labeled Tools enabled
  2. Open Tools
  3. Add the built-in tool:

system.ai.python_exec

  4. Ask:

Use Python to calculate the average of 9.4, 9.8, 10.0, and 10.2.

Databricks documents system.ai.python_exec as a built-in tool that can extend agents with the ability to run Python code in a sandboxed environment. (Databricks Documentation)

Expected output

A response showing that the model used the tool and returned the calculated average.

Teaching point

This helps students separate two ideas:

  • classic ML model serving
  • LLM + tools agent behavior

Both live in Databricks, but they are not the same object.


Step I3 — Export the prototype to code

In Playground, use the Get code option and create the generated notebook.

Expected output

A notebook created from the Playground prototype.

Databricks’ current no-code getting-started flow explicitly includes exporting the agent to code after prototyping it in Playground. (Databricks Documentation)


13. Lab section J — Run the official retrieval-agent tutorial notebook

This is the cleanest way to show a real Databricks agent flow without inventing custom RAG data.

Step J1 — Import the official notebook

Go to the official Databricks tutorial page for Build, evaluate, and deploy a retrieval agent and import the notebook into your workspace from that page. Databricks describes this notebook as standalone, built on a sample document corpus, and ready to run with no setup or data required. (Databricks Documentation)

Expected output

A notebook imported into your workspace.


Step J2 — Run the notebook from top to bottom

Follow the notebook sections in order.

Expected output

The student will see a more complete agent workflow, including:

  • retrieval
  • tool usage
  • evaluation
  • deployment

Databricks positions this notebook as the official guided path for building, evaluating, and deploying a retrieval agent. (Databricks Documentation)

Instructor note Use this notebook as the “advanced capstone” after the student already understands notebooks, data, models, serving, and Playground.


14. Final recap for the student

At the end of this lab, the student has completed the full flow:

  1. SQL exploration on Starter Warehouse
  2. Own schema and own tables in Unity Catalog
  3. Notebook development on serverless compute
  4. MLflow experiments and model comparison
  5. Model registration in Unity Catalog
  6. Model Serving deployment
  7. AI Playground LLM and tool-calling prototype
  8. Official retrieval agent notebook for end-to-end agent flow

That is a real Databricks platform journey, not just a single notebook exercise. It uses the same current platform areas Databricks documents today: serverless notebooks, SQL warehouses, MLflow with Unity Catalog, Model Serving, AI Playground, and the Agent Framework tutorial path. (Databricks Documentation)


15. Cleanup steps

To avoid unnecessary charges:

  1. Leave auto-stop enabled on the Starter Warehouse
  2. Delete the model serving endpoint when finished
  3. Delete the schema objects if they are no longer needed

Optional cleanup SQL

DROP MODEL IF EXISTS main.ai_ml_starter.wine_quality_model;
DROP TABLE IF EXISTS main.ai_ml_starter.red_wine;
DROP TABLE IF EXISTS main.ai_ml_starter.white_wine;
DROP SCHEMA IF EXISTS main.ai_ml_starter;

Instructor note Run the DROP SCHEMA only after all dependent objects are removed, or use DROP SCHEMA main.ai_ml_starter CASCADE to drop the schema and its contents in one statement.


16. One-page instructor summary

Use this wording when you teach it:

“We start with a Serverless Starter Warehouse to explore data with SQL. Then we switch to serverless notebook compute because Python-based ML training does not run on a SQL warehouse. We load built-in Databricks sample data into our own Unity Catalog schema, train and compare models with MLflow, register the best model in Unity Catalog, deploy it with Model Serving, and then show the GenAI side of Databricks through Playground and the official retrieval-agent notebook.” (Databricks Documentation)

The clean mental model for students is:

  • Starter Warehouse = SQL analytics
  • Serverless notebook = Python and ML development
  • Unity Catalog = governed data and model assets
  • MLflow = experiments and tracking
  • Model Serving = deploy the model
  • Playground / Agent tutorial = LLM and agent experience

