Below is a high‑level, end‑to‑end guide to the four pieces you asked about – how they fit together in Azure Databricks, what they’re used for, and the key steps needed to get them working in production.


1. Azure Databricks SSO (Single Sign‑On)

What it is

  • SSO lets users authenticate once (e.g., via Azure AD, Okta, ADFS) and then automatically get access to the Databricks workspace, notebooks, jobs, and REST APIs without re‑entering credentials.
  • In Azure Databricks the SSO flow is built on SAML 2.0 (or OpenID Connect for Azure AD).

Why you care

  • Security – eliminates password sprawl, enforces MFA, and centralises policy (conditional access, expiration, revocation).
  • Compliance – audit logs are emitted to Azure AD sign‑in logs and can be forwarded to Microsoft Sentinel, Splunk, etc.
  • Productivity – users jump straight from the Azure portal or Power BI to a notebook; CI/CD pipelines can use a service‑principal token instead of a static PAT.

Core components

  • Azure AD / Identity Provider (IdP) – stores users/groups, enforces MFA, issues SAML assertions.
  • Databricks workspace – trusted Service Provider (SP) that consumes the SAML response.
  • SCIM API (optional) – syncs groups → workspace automatically, keeping access control in lock‑step with Azure AD.
  • OAuth 2.0 / PAT – for programmatic access when a service principal or a job needs a token; usually generated once via the UI or via the Azure CLI (az account get-access-token --resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d).

Typical flow (user‑centric)

  1. User clicks “Login to Databricks” from Azure portal or a bookmarked URL.
  2. Azure AD authenticates the user (password + MFA, conditional access).
  3. Azure AD sends a SAML assertion to the Databricks SP endpoint.
  4. Databricks validates the assertion, creates a session token, and redirects the user to the UI.
  5. The Databricks UI then exchanges that token for a short‑lived bearer token used for the REST API and Spark driver processes.
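
For the programmatic path mentioned earlier (service principals, CI/CD), the usual pattern is to request an Azure AD token scoped to the well‑known Databricks resource ID and present it as a bearer token. A minimal sketch using MSAL; the tenant, client ID, secret, and workspace URL are placeholders you would supply yourself:

    # Sketch: acquire an Azure AD token for a service principal and call the Databricks REST API.
    import msal
    import requests

    TENANT_ID = "<tenant-id>"                 # assumption: your Azure AD tenant
    CLIENT_ID = "<sp-application-id>"         # assumption: service principal application ID
    CLIENT_SECRET = "<sp-secret>"             # assumption: service principal secret
    WORKSPACE_URL = "https://<workspace>.azuredatabricks.net"   # assumption: your workspace URL

    app = msal.ConfidentialClientApplication(
        CLIENT_ID,
        authority=f"https://login.microsoftonline.com/{TENANT_ID}",
        client_credential=CLIENT_SECRET,
    )

    # 2ff814a6-... is the well-known Azure AD resource ID for Azure Databricks.
    result = app.acquire_token_for_client(
        scopes=["2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default"]
    )

    resp = requests.get(
        f"{WORKSPACE_URL}/api/2.0/clusters/list",
        headers={"Authorization": f"Bearer {result['access_token']}"},
    )
    print(resp.status_code)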

How to set it up (step‑by‑step)

  1. Create the workspace (or edit an existing one) and choose “Azure AD” as the identity provider (AuthenticationEnabled: Azure AD).
  2. In Azure AD → Enterprise applications → New application → Non‑gallery → SAML, give it a name like Databricks‑<workspace>.
  3. Set the SAML configuration:
     • Identifier (Entity ID): https://<region>.azuredatabricks.net/saml/metadata
     • Reply URL (ACS): https://<workspace>.azuredatabricks.net/saml/acs
     Copy the SSO URL and Entity ID for later.
  4. Upload the Databricks metadata (SP metadata XML) – download it from the workspace User Settings → SSO page.
  5. Assign users / groups – ensure the same groups exist in Azure AD that you want to map to Databricks roles (admin, user, compute‑admin, etc.).
  6. (Optional) SCIM provisioning – in Azure AD → Enterprise apps → Databricks → Provisioning, set it to ‘On’ and provide the Databricks SCIM token (generated in the workspace UI).
  7. Test – click the Azure portal “Launch Workspace” button. You should be signed in without a password prompt.
  8. Audit – enable Azure AD sign‑in logs and forward them to Log Analytics for compliance reporting.
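
After step 6, a quick way to verify that SCIM provisioning actually synced your users and groups is to query the workspace SCIM API. A small sketch, assuming an admin token in the DATABRICKS_TOKEN environment variable and a placeholder workspace URL:

    # Sketch: list SCIM-provisioned users to verify the Azure AD -> Databricks sync.
    import os
    import requests

    workspace_url = "https://<workspace>.azuredatabricks.net"   # assumption: your workspace URL
    token = os.environ["DATABRICKS_TOKEN"]                      # assumption: admin PAT or AAD token

    resp = requests.get(
        f"{workspace_url}/api/2.0/preview/scim/v2/Users",
        headers={"Authorization": f"Bearer {token}"},
    )
    resp.raise_for_status()
    for user in resp.json().get("Resources", []):
        print(user["userName"], [g["display"] for g in user.get("groups", [])])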

Common pitfalls & fixes

  • “Invalid SAML response – certificate not trusted” – the IdP is using a self‑signed cert or a cert that isn’t uploaded to Databricks. Fix: upload the IdP signing certificate in the workspace SSO configuration.
  • Users get “403 – Insufficient permissions” even after SSO – group‑to‑role mapping not synced (SCIM off or mismatched). Fix: enable SCIM or manually assign users to the admin role via UI → Admin Console → Users.
  • Service principal token expires after 24 h – a PAT is being used instead of an OAuth token. Fix: create an Azure AD service principal, grant it the WorkspaceAdmin role, and generate an OAuth access token (az account get-access-token --resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d).
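
The same trick works from Python via the azure-identity library, which reuses the local az login session instead of shelling out to the CLI. A small sketch, assuming azure-identity is installed and az login has already been run:

    # Sketch: reuse the local `az login` session to mint a Databricks-scoped AAD token.
    from azure.identity import AzureCliCredential

    credential = AzureCliCredential()
    token = credential.get_token("2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default")

    # Use token.token exactly like a PAT: "Authorization: Bearer <token.token>"
    print(token.expires_on)   # Unix timestamp; AAD tokens are short-lived, so refresh as needed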

2. Model Serving in Azure Databricks

What it is

Model serving = a managed, production‑grade endpoint that hosts a trained model (MLflow‑registered, TensorFlow, PyTorch, XGBoost, etc.) and automatically scales to handle inference requests over HTTP/HTTPS.

  • Databricks offers two flavors:
    1. Serverless Model Serving – a fully managed, auto‑scaled service (no cluster management).
    2. Cluster‑backed Model Serving – uses an existing job cluster, giving you more control over GPU, Spark configs, or custom libraries.

Why it matters

  • Zero‑Ops – no need to spin up a Flask/FastAPI container, set up a load balancer, or manage TLS.
  • Native MLflow integration – deploy a model directly from the MLflow Model Registry with a single click / API call.
  • Auto‑scaling & request throttling – handles burst traffic; you can set concurrency limits and traffic splitting (A/B testing).
  • Observability – integrated metrics (latency, error rates) and request logging to Unity Catalog audit tables.
  • Security – endpoints can be placed behind Azure Private Link and accessed only by workspaces in the same VNet, plus Azure AD tokens for client authentication.

Architecture (high‑level)

[Client] --> HTTPS (OAuth token) --> Azure Load Balancer (private link) --> 
   Databricks Model Serving Endpoint (Serverless) 
      |--- inference request --> (auto‑scaled containers) --> Model (MLflow) 
      |--- logging --> Unity Catalog Tables / ADLS Gen2
  • When you enable Private Link, the workspace DNS name (e.g. <workspace>.azuredatabricks.net) resolves to a private IP in your VNet, guaranteeing that inference traffic never traverses the public internet.

How to provision a served model (serverless)

  1. Register the model (if not already)

    import mlflow
    mlflow.sklearn.log_model(model, "model")
    mlflow.register_model("runs:/<run-id>/model", "my-model")
  2. Promote the version to ‘Staging’ or ‘Production’ (via UI or API).

  3. Serve the model (via UI → Model Registry → “Serve”) or programmatically:

    # Create the endpoint via the Serving Endpoints REST API
    # (assumes a workspace token in $DATABRICKS_TOKEN)
    curl -X POST "https://<workspace>.azuredatabricks.net/api/2.0/serving-endpoints" \
      -H "Authorization: Bearer $DATABRICKS_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"name": "my-endpoint",
           "config": {"served_models": [{"model_name": "my-model", "model_version": "3",
                                         "workload_size": "Small", "scale_to_zero_enabled": true}]}}'
  4. Get the endpoint URL & token

    # Serving endpoints are invoked at https://<workspace-url>/serving-endpoints/<name>/invocations
    workspace_url = spark.conf.get("spark.databricks.workspaceUrl")
    endpoint_url = f"https://{workspace_url}/serving-endpoints/my-endpoint"
    token = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()
  5. Call the endpoint (sample Python request):

    import json, requests
    
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }
    payload = {"inputs": [[1.0, 2.0, 3.0]]}  # format depends on model type
    r = requests.post(endpoint_url + "/invocations", headers=headers, data=json.dumps(payload))
    print(r.json())
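
Equivalently, the MLflow deployments client can send the same request without building the HTTP call by hand. A short sketch, assuming a recent MLflow version and the my-endpoint name used above:

    # Sketch: call the same serving endpoint through the MLflow deployments client.
    from mlflow.deployments import get_deploy_client

    client = get_deploy_client("databricks")
    prediction = client.predict(
        endpoint="my-endpoint",
        inputs={"inputs": [[1.0, 2.0, 3.0]]},   # same payload shape as the requests example
    )
    print(prediction)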

Advanced options

  • GPU acceleration – for deep‑learning models (TensorRT, large transformer inference). Use a GPU workload type on the endpoint, or a cluster‑backed endpoint on a GPU node type (e.g., node_type_id = "Standard_NC6").
  • Canary / A/B testing – for rolling out a new model version. In the endpoint config, set traffic_percentage for each served version.
  • Custom pre/post‑processing – when you need to reshape input or enrich it with external data. Deploy a custom container (Docker) as the model runtime (requires Private Link and an Azure Container Registry).
  • Batch scoring – for large offline datasets. Use MLflow batch inference in a Spark job instead of serving; you can reuse the same model artifact (see the sketch below).
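
For the batch‑scoring option above, the registered model can be wrapped as a Spark UDF and applied directly to a Delta table. A sketch, where the features table and its column names are assumptions for illustration:

    # Sketch: offline batch scoring with the same registered model (no serving endpoint needed).
    import mlflow.pyfunc

    model_udf = mlflow.pyfunc.spark_udf(
        spark,
        model_uri="models:/my-model/3",   # same model and version as the serving example
        result_type="double",
    )

    scored = spark.table("features").withColumn("prediction", model_udf("f1", "f2", "f3"))
    scored.write.mode("overwrite").saveAsTable("predictions")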

Monitoring & troubleshooting

  • Latency (p95) – from the endpoint dashboards (Databricks Observability); alert if > 500 ms or whatever your SLA requires.
  • Error rate (5xx) – from the endpoint logs; alert if > 1 % of requests fail.
  • CPU / GPU utilisation – from the autoscaling logs; investigate if scaling isn’t kicking in (e.g., max_concurrency too low).
  • Request payload size – from the access logs; reject payloads > 2 MB to avoid DoS‑style abuse.
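
Endpoint health can also be polled over the Serving Endpoints REST API, which is handy for wiring these alerts into an external monitoring system. A sketch, assuming a workspace token in DATABRICKS_TOKEN:

    # Sketch: poll a serving endpoint's status for external monitoring/alerting.
    import os
    import requests

    workspace_url = "https://<workspace>.azuredatabricks.net"   # assumption: your workspace URL
    token = os.environ["DATABRICKS_TOKEN"]

    resp = requests.get(
        f"{workspace_url}/api/2.0/serving-endpoints/my-endpoint",
        headers={"Authorization": f"Bearer {token}"},
    )
    resp.raise_for_status()
    print(resp.json()["state"])   # e.g. {'ready': 'READY', 'config_update': 'NOT_UPDATING'}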

3. AgentBricks (Databricks AI Agents)

What it is

AgentBricks is a framework (released in 2024) for building multi‑modal, stateful AI agents that run inside Databricks notebooks and jobs. It is built on top of LangChain‑style components but integrates natively with Unity Catalog, Delta Lake, and the workspace security model.

Key concepts:

  • Agent – an orchestrator that decides which tool to invoke (SQL, Python function, external API).
  • Brick – a reusable, versioned tool packaged as a Unity Catalog SQL function, Python UDF, or Delta Lake table.
  • State Store – persistent context stored in a Delta table (or a vector DB) that the agent can read/write across calls.
  • Prompt Library – centralised, version‑controlled prompts stored as SQL views or MLflow artifacts.
  • Execution Guardrails – a policy engine (built on Databricks Lakehouse governance) that enforces rules such as “no writes to production tables” or “max token budget”.

Why you’d use AgentBricks

  • Self‑service analytics – a user asks, “What were sales in EMEA last quarter?”; the agent runs a SQL Brick, formats the result, and returns a PDF.
  • Data‑pipeline assistants – an agent can detect data drift, trigger a Delta Live Tables job, and update a model registry entry.
  • Enterprise‑grade RAG – combine a vector‑search Brick (FAISS / Databricks Vector Search) with an LLM Brick (Databricks foundation model) to write natural‑language summaries of a Delta table.
  • Compliance‑aware bots – guardrails automatically block a write to a regulated table and log the attempt to an audit table.

How AgentBricks works (simplified flow)

User request (text) --> Agent (LLM + prompt) 
  --> parses intent --> selects Brick(s) (SQL, Python, external API)
      --> Brick runs (reads/writes Delta, calls external service)
      --> returns output --> Agent aggregates, may call another Brick (loop)
  --> final response returned to user

All bricks are immutable (registered in Unity Catalog), so you can version‑control them like code and roll back if a bug appears.

Building an AgentBricks pipeline (example)

  1. Create a Brick – a Python function that queries Delta and returns a formatted summary string.

    # In a notebook: define the brick as a plain, driver-side Python function.
    # It runs a Spark SQL aggregation over the Delta `sales` table and returns a summary string.
    from pyspark.sql import SparkSession
    
    def sales_summary_brick(region: str, quarter: str) -> str:
        spark = SparkSession.getActiveSession()
        df = spark.sql(f"""
            SELECT sum(amount) AS total, avg(amount) AS avg_amount
            FROM sales
            WHERE region = '{region}' AND quarter = '{quarter}'
        """)  # note: parameterise the query in production to avoid SQL injection
        total, avg_amount = df.first()
        return f"Region {region}, Q{quarter}: total ${total:,.0f}, avg ${avg_amount:,.2f}"
    
    # The function is registered with the agent's BrickRegistry in step 3.
  2. Define the Prompt (stored as an MLflow artifact)

    {
      "system": "You are a data assistant. Use only the provided bricks. Do NOT fabricate data.",
      "user_template": "Give me a sales summary for {region} in Q{quarter}."
    }
  3. Instantiate the Agent (via the databricks-agent library)

    from databricks_agent import Agent, BrickRegistry, PromptStore
    
    # Register bricks and prompts
    registry = BrickRegistry()
    registry.register_sql("sales_summary_brick", "CALL sales_summary_brick('{region}', '{quarter}')")
    
    prompts = PromptStore()
    prompts.register("sales_summary", "prompt.json")   # path to the JSON above
    
    # Create the agent
    sales_agent = Agent(
        model="databricks-dbrx-instruct",   # a foundation model hosted on Azure
        bricks=registry,
        prompts=prompts,
        guardrails={"max_tokens": 500, "no_write_to_prod": True}
    )
  4. Run the agent

    response = sales_agent.run(
        prompt_name="sales_summary",
        variables={"region": "EMEA", "quarter": "3"}
    )
    print(response)   # → "Region EMEA, Q3: total $12,345,000, avg $123,450"
  5. Persist state (optional) – The agent can write to a Delta table agent_state for future calls (e.g., to remember the last queried period).

    spark.sql("""
    INSERT INTO agent_state (session_id, last_region, last_quarter, timestamp)
    VALUES ('sess-123', 'EMEA', '3', current_timestamp())
    """)

Deployment & Ops

  • Execution environment – use Jobs with a cluster‑backed agent (GPU if LLM inference is done locally), or serverless compute if you only call a hosted foundation model.
  • Versioning – store bricks in a Unity Catalog schema (e.g., bricks.sales) and tag each version so updates can be staged and rolled back.
  • Security – guardrails evaluate the data access controls of the caller; make sure the agent runs as a service principal with the minimum required permissions.
  • Observability – enable change data capture on the agent_state table (e.g., via Delta Live Tables) and use Databricks SQL dashboards to monitor usage.
  • Testing – unit‑test bricks with pytest against a dev Delta lake (see the sketch below); CI/CD can run databricks jobs run-now against a staging workspace.
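
For the testing point above, bricks written as plain Python functions can be exercised with pytest against a throwaway local SparkSession. A sketch, where the bricks module path is a hypothetical location for the sales_summary_brick defined earlier:

    # Sketch: unit-test sales_summary_brick against a tiny local SparkSession.
    import pytest
    from pyspark.sql import SparkSession

    from bricks import sales_summary_brick   # hypothetical module containing the brick


    @pytest.fixture(scope="session")
    def spark():
        return SparkSession.builder.master("local[1]").appName("brick-tests").getOrCreate()


    def test_sales_summary_brick(spark):
        # Build a miniature `sales` view for the brick to query.
        spark.createDataFrame(
            [("EMEA", "3", 100.0), ("EMEA", "3", 200.0)],
            ["region", "quarter", "amount"],
        ).createOrReplaceTempView("sales")

        summary = sales_summary_brick("EMEA", "3")
        assert "EMEA" in summary
        assert "300" in summary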

4. Databricks Apps (Custom UI + Compute)

What they are

Databricks Apps are first‑class, low‑code web applications that run inside the Databricks UI and can call notebooks, jobs, or model‑serving endpoints behind the scenes.

Think of them as mini‑portals (e.g., “Sales Forecast Dashboard”, “Data Quality Submit Form”, “Customer‑Churn Explorer”) that:

  • Present a UI built with React (or Streamlit/Gradio) that is hosted on the Databricks App Service.
  • Interact with backend compute (Jobs, Model Serve, Delta Live Tables) via Databricks REST APIs using the workspace token (no external infra).
  • Leverage Unity Catalog for fine‑grained access to tables or models.
  • Can be published to the workspace catalog and discovered via the Apps Launcher.

Why you’d use them

  • Zero‑ops frontend – no need for Azure App Service / AKS. Example: a finance team creates a “Budget Request” form that writes to a Delta table.
  • Integrated auth – uses the same Azure AD SSO token as the rest of Databricks. Example: users never need to enter another password to call an internal model.
  • Rapid prototyping – deploy from a notebook in seconds (databricks apps deploy). Example: data scientists ship a “What‑If analysis” UI for a model they just trained.
  • Governance – apps are catalog objects; you can audit who created/modified them and apply tagging. Example: compliance can require review before an app that writes to PII tables is published.

Architecture snapshot

[User’s browser] --(HTTPS + Azure AD token)--> Databricks Apps Service (React SPA) 
   |
   |-- Calls Databricks REST (Jobs, Model Serve, SQL) using the same token 
   |-- Reads/Writes Delta via Unity Catalog (backend Spark)
   |
[Databricks Workspace] <---> [Delta Lake / Unity Catalog] <---> [Model Serving]

Building a simple Databricks App (step‑by‑step)

  1. Create a new App project (via CLI)

    databricks apps init my-sales-app
    cd my-sales-app

    This scaffolds a frontend/ folder with a React app and a backend/ folder for optional Python Lambda‑style functions.

  2. Define the UI (frontend/src/App.tsx)

    import React, { useState } from "react";
    import { runJob } from "./api";
    
    function App() {
      const [region, setRegion] = useState("EMEA");
      const [quarter, setQuarter] = useState("3");
      const [result, setResult] = useState("");
    
      const handleSubmit = async () => {
        const payload = { region, quarter };
        const res = await runJob("sales_summary_job", payload);
        setResult(res.output);
      };
    
      return (
        <div>
          <h1>Sales Summary</h1>
          <label>Region: <input value={region} onChange={e=>setRegion(e.target.value)} /></label>
          <label>Quarter: <input value={quarter} onChange={e=>setQuarter(e.target.value)} /></label>
          <button onClick={handleSubmit}>Run</button>
          <pre>{result}</pre>
        </div>
      );
    }
    export default App;
  3. Create the backend job (a notebook that will be invoked)

    # Notebook: sales_summary_job
    dbutils.widgets.text("region", "")
    dbutils.widgets.text("quarter", "")
    
    region = dbutils.widgets.get("region")
    quarter = dbutils.widgets.get("quarter")
    
    df = spark.sql(f"""
      SELECT sum(amount) as total, avg(amount) as avg
      FROM sales
      WHERE region = '{region}' AND quarter = '{quarter}'
    """)
    total, avg = df.first()
    summary = f"Region {region}, Q{quarter}: total ${total:,.0f}, avg ${avg:,.2f}"
    print(summary)
    
    # Return the summary to the caller (retrievable via the Jobs runs/get-output API).
    dbutils.notebook.exit(summary)

    Create a Job that runs this notebook and give the job a meaningful name (sales_summary_job).

  4. Expose a thin wrapper in frontend/src/api.ts

    // Resolve the job by name, trigger it, then poll for the notebook's exit output.
    export async function runJob(jobName: string, params: Record<string, string>) {
      // The Apps runtime adds the Authorization header automatically (uses the user's token).
      const list = await (await fetch(`/api/2.1/jobs/list?name=${encodeURIComponent(jobName)}`)).json();
      if (!list.jobs?.length) throw new Error(`Job not found: ${jobName}`);

      const run = await (await fetch(`/api/2.1/jobs/run-now`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ job_id: list.jobs[0].job_id, notebook_params: params })
      })).json();

      // Poll until the run finishes, then return the value passed to dbutils.notebook.exit().
      for (;;) {
        const out = await (await fetch(`/api/2.1/jobs/runs/get-output?run_id=${run.run_id}`)).json();
        const state = out.metadata?.state?.life_cycle_state;
        if (state === "TERMINATED" || state === "INTERNAL_ERROR" || state === "SKIPPED") {
          return { output: out.notebook_output?.result };
        }
        await new Promise((resolve) => setTimeout(resolve, 3000));
      }
    }
  5. Deploy the app

    # Build React bundle
    cd frontend && npm run build && cd ..
    
    # Deploy to the workspace catalog
    databricks apps deploy --name sales-summary --path ./frontend/build

    After deployment you’ll see the app appear under Workspace → Apps → Sales‑Summary. Users can pin it to their workspace launcher.

  6. Add a guardrail (optional) – In the app manifest you can declare required scopes, e.g.:

    {
      "name": "sales-summary",
      "required_scopes": ["catalog:read:sales", "jobs:run"]
    }

    Users lacking those scopes will see an “Access denied” message.

Advanced features

  • Embedded Model Serving call – inside api.ts, call the model endpoint (/invocations) with the same token; no extra auth required.
  • Private Link networking – deploy the app to a VNet‑isolated workspace; the Apps service automatically respects the VNet’s Private Link settings.
  • User‑specific state – store user preferences in a Delta table keyed by user_id (obtainable via GET /api/2.0/preview/scim/v2/Me).
  • CI/CD – keep the app repo in Azure DevOps/GitHub and run databricks apps deploy --manifest manifest.yml as a pipeline step.
  • Multi‑tenant SaaS‑style app – register the app as a catalog object, publish it to other workspaces, and use a cross‑workspace service principal for shared data.

Typical use‑cases in enterprise

  • Data request portal – users fill a form; the app writes a row to a data_requests Delta table; a “review” job picks it up, runs a transformation, and writes the results to a personal folder.
  • Model‑driven customer support – a chat UI invokes a RAG AgentBrick; answers are fetched from vector search and returned to the support rep.
  • Compliance dashboard – the app shows all “write” jobs that ran in the last 24 h, using the Jobs API and the audit log table.
  • Self‑service ML pipelines – drop a CSV; the app triggers a Delta Live Tables pipeline, then calls Model Serving to register a new model version.

TL;DR Cheat Sheet

  • SSO – centralised Azure AD login for the Databricks UI and APIs. Integration: Azure AD ↔ SAML 2.0 ↔ Databricks SP, with optional SCIM sync. Typical command/API: az account get-access-token --resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d; Azure portal “Enterprise Application”.
  • Model Serving – managed, auto‑scaling HTTP endpoint for ML inference. Integration: MLflow Registry → serverless or cluster‑backed endpoint; Private Link + Azure AD token auth. Typical command/API: POST /api/2.0/serving-endpoints.
  • AgentBricks – stateful AI agents built from versioned “bricks” (SQL, Python, LLM). Integration: Unity Catalog bricks + prompt store + guardrails; works with Delta, Vector Search, LLMs. Typical call: sales_agent.run(prompt_name="sales_summary", variables={...}).
  • Databricks Apps – low‑code web apps hosted inside the workspace that call jobs, models, and SQL. Integration: React front‑end, Apps service, same Azure AD token, catalog‑level governance. Typical commands: databricks apps init …; databricks apps deploy ….

All four pieces are designed to be used together:

  • SSO gives you a security‑first identity layer.
  • Model Serving lets you expose the trained models that your AgentBricks may call.
  • AgentBricks can be the “brain” behind a Databricks App, providing a conversational or decision‑making UI.

If you ship an analytics app that needs an LLM to explain a chart, you would:

  1. Authenticate via Azure AD SSO.
  2. The app calls a Model Serving endpoint for the LLM.
  3. The LLM result is passed to an AgentBrick (e.g., summarise a Delta table).
  4. The final answer is rendered in a Databricks App UI.

That stack gives you a secure, managed, and fully auditable end‑to‑end solution for modern data‑intelligence workloads on Azure Databricks.
