Below is a high‑level, end‑to‑end guide to the four pieces you asked about – how they fit together in Azure Databricks, what they’re used for, and the key steps needed to get them working in production.
- SSO lets users authenticate once (e.g., via Azure AD, Okta, ADFS) and then automatically get access to the Databricks workspace, notebooks, jobs, and REST APIs without re‑entering credentials.
- In Azure Databricks the SSO flow is built on SAML 2.0 (or OpenID Connect for Azure AD).
- Security – eliminates password sprawl, enforces MFA, and centralises policy (conditional access, expiration, revocation).
- Compliance – audit logs are emitted to Azure AD sign‑in logs and can be forwarded to Microsoft Sentinel, Splunk, etc.
- Productivity – users jump straight from the Azure portal or Power BI to a notebook; CI/CD pipelines can use a service‑principal token instead of a static PAT.
Component | Role |
---|---|
Azure AD / Identity Provider (IdP) | Stores users/groups, enforces MFA, issues SAML assertions. |
Databricks workspace | Trusted Service Provider (SP) that consumes the SAML response. |
SCIM API (optional) | Syncs groups → workspace automatically, keeping access control in lock‑step with Azure AD. |
OAuth 2.0 / PAT | For programmatic access when a service principal or a job needs a token; typically an Azure AD token obtained with az account get-access-token --resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d, or a workspace PAT generated in the UI. |
- User clicks “Login to Databricks” from Azure portal or a bookmarked URL.
- Azure AD authenticates the user (password + MFA, conditional access).
- Azure AD sends a SAML assertion to the Databricks SP endpoint.
- Databricks validates the assertion, creates a session token, and redirects the user to the UI.
- The Databricks UI then exchanges that token for a short‑lived bearer token used for the REST API and Spark driver processes.
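The same token flow can be driven programmatically. Below is a minimal sketch, assuming the Azure CLI is installed and already logged in; the workspace URL is a placeholder:

```python
# Sketch: call the Databricks REST API with an Azure AD token.
# Assumes `az login` has been run; the workspace URL below is a placeholder.
import json
import subprocess

import requests

WORKSPACE_URL = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
DATABRICKS_RESOURCE_ID = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"       # Azure Databricks first-party app

# Exchange the signed-in Azure AD identity for a Databricks-scoped access token
token = json.loads(
    subprocess.check_output(
        ["az", "account", "get-access-token", "--resource", DATABRICKS_RESOURCE_ID]
    )
)["accessToken"]

# Any workspace REST endpoint accepts the token as a bearer credential
resp = requests.get(
    f"{WORKSPACE_URL}/api/2.0/clusters/list",
    headers={"Authorization": f"Bearer {token}"},
)
print(resp.status_code, resp.json())
```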
Step | Azure CLI / Portal Action | Key Settings |
---|---|---|
1️⃣ | Create the workspace (or edit an existing one) → Authentication → Enabled: Azure AD | Choose “Azure AD” as the identity provider. |
2️⃣ | In Azure AD → Enterprise applications → Add a new application → Non-gallery → SAML | Give it a name like Databricks‑<workspace> . |
3️⃣ | Set the SAML configuration: Identifier (Entity ID) – https://<region>.azuredatabricks.net/saml/metadata; Reply URL (ACS) – https://<workspace>.azuredatabricks.net/saml/acs | Copy the SSO URL & Entity ID for later. |
4️⃣ | Upload the Databricks metadata (SP metadata XML) – download it from the workspace User Settings → SSO page. | |
5️⃣ | Assign users / groups → Ensure the same groups exist in Azure AD that you want to map to Databricks roles (admin, user, compute‑admin, etc.). | |
6️⃣ | (Optional) SCIM provisioning → In Azure AD > Enterprise apps > Databricks → Provisioning → set to ‘On’ and provide the Databricks SCIM token (generated in the workspace UI); a direct SCIM API sketch follows this table. | |
7️⃣ | Test – Click the Azure portal “Launch Workspace” button. You should be signed in without a password prompt. | |
8️⃣ | Audit – Enable Azure AD sign‑in logs → forward to Log Analytics for compliance reporting. |
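If you want to drive provisioning yourself (instead of, or alongside, the Azure AD provisioning in step 6️⃣), the workspace SCIM API can create users directly. A minimal sketch, with placeholder host, token, and user:

```python
# Sketch: create a workspace user via the SCIM API (placeholder host, token, user).
import requests

WORKSPACE_URL = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
TOKEN = "<admin-or-service-principal-token>"                           # placeholder

resp = requests.post(
    f"{WORKSPACE_URL}/api/2.0/preview/scim/v2/Users",
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/scim+json",
    },
    json={
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
        "userName": "jane.doe@example.com",
        "groups": [],  # group membership can also be managed via /scim/v2/Groups
    },
)
print(resp.status_code, resp.json())
```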
Symptom | Likely cause | Fix |
---|---|---|
“Invalid SAML response – certificate not trusted” | The IdP is using a self‑signed cert or a cert that isn’t uploaded to Databricks. | Upload the IdP signing certificate in the workspace SSO configuration. |
Users get a “403 – Insufficient permissions” even after SSO | Group‑to‑role mapping not synced (SCIM off or mismatch). | Enable SCIM or manually assign users to the admin role via UI → Admin Console → Users. |
Service principal token expires after 24 h | Using a PAT instead of an OAuth token. | Create an Azure AD service principal, grant it the WorkspaceAdmin role, and generate an OAuth access token (az account get-access-token --resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d ). |
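For the last row above, here is a minimal sketch of the client-credentials flow that yields an Azure AD access token for a service principal (tenant ID, client ID, and secret are placeholders):

```python
# Sketch: client-credentials flow for a service principal (placeholder tenant/app IDs).
import requests

TENANT_ID = "<tenant-id>"          # placeholder
CLIENT_ID = "<sp-application-id>"  # placeholder
CLIENT_SECRET = "<sp-secret>"      # placeholder
DATABRICKS_RESOURCE_ID = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"

resp = requests.post(
    f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token",
    data={
        "grant_type": "client_credentials",
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        # Scope the token to the Azure Databricks first-party application
        "scope": f"{DATABRICKS_RESOURCE_ID}/.default",
    },
)
access_token = resp.json()["access_token"]  # pass as "Authorization: Bearer <token>"
```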
Model serving = a managed, production‑grade endpoint that hosts a trained model (MLflow‑registered, TensorFlow, PyTorch, XGBoost, etc.) and automatically scales to handle inference requests over HTTP/HTTPS.
- Databricks offers two flavors:
- Serverless Model Serving – a fully managed, auto‑scaled service (no cluster management).
- Cluster‑backed Model Serving – uses an existing job cluster, giving you more control over GPU, Spark configs, or custom libraries.
Benefit | Description |
---|---|
Zero‑Ops | No need to spin up a Flask/FastAPI container, set up a load balancer, or manage TLS. |
Native MLflow integration | Deploy a model directly from the MLflow Model Registry with a single click / API call. |
Auto‑scaling & request throttling | Handles burst traffic; you can set concurrency limits and traffic splitting (A/B testing). |
Observability | Integrated metrics (latency, error rates) and request logging to Databricks Unity Catalog audit tables. |
Security | Endpoints can be placed behind Azure Private Link and accessed only by workspaces in the same VNet, plus Azure AD tokens for client authentication. |
```
[Client] --> HTTPS (OAuth token) --> Azure Load Balancer (private link) -->
        Databricks Model Serving Endpoint (Serverless)
              |--- inference request --> (auto‑scaled containers) --> Model (MLflow)
              |--- logging --> Unity Catalog Tables / ADLS Gen2
```
- When you enable Private Link, the workspace DNS name (e.g. adb-<workspace-id>.<n>.azuredatabricks.net) resolves to a private IP in your VNet, so endpoint traffic never traverses the public internet.
- Register the model (if not already):

  ```python
  import mlflow

  # Log the trained model and register it under a name in the Model Registry
  mlflow.sklearn.log_model(model, "model")
  mlflow.register_model("runs:/<run-id>/model", "my-model")
  ```
- Promote the version to ‘Staging’ or ‘Production’ (via UI or API).
- Serve the model (via UI → Model Registry → “Serve”) or programmatically via the Serving Endpoints REST API:

  ```bash
  # Create a serving endpoint for version 3 of the registered model
  curl -X POST "https://<workspace-url>/api/2.0/serving-endpoints" \
    -H "Authorization: Bearer $DATABRICKS_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"name": "my-endpoint",
         "config": {"served_entities": [{"entity_name": "my-model",
                                         "entity_version": "3",
                                         "workload_size": "Small",
                                         "scale_to_zero_enabled": true}]}}'
  ```
- Get the endpoint URL & token:

  ```python
  # Build the endpoint URL from the workspace host and grab the caller's API token
  workspace_url = spark.conf.get("spark.databricks.workspaceUrl")  # e.g. adb-....azuredatabricks.net
  endpoint_url = f"https://{workspace_url}/serving-endpoints/my-endpoint"
  token = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()
  ```
- Call the endpoint (sample Python request):

  ```python
  import json, requests

  headers = {
      "Authorization": f"Bearer {token}",
      "Content-Type": "application/json",
  }
  payload = {"inputs": [[1.0, 2.0, 3.0]]}  # format depends on the model's signature
  r = requests.post(endpoint_url + "/invocations", headers=headers, data=json.dumps(payload))
  print(r.json())
  ```
Feature | When to use | How to enable |
---|---|---|
GPU acceleration | Deep learning models (TensorRT, large transformer inference) | Use a cluster‑backed endpoint on a GPU node type (e.g. node_type_id = "Standard_NC6s_v3"). |
Canary / A/B testing | Rolling out a new model version | In the endpoint config, set traffic_percentage for each served version. |
Custom pre/post‑processing | Need to reshape input, enrich with external data | Deploy a custom container (Docker) as the model runtime (requires Private Link and an Azure Container Registry). |
Batch scoring | Large offline datasets | Use MLflow batch inference (a Spark job) instead of serving; you can reuse the same model artifact (see the sketch after this table). |
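For the batch-scoring row, a minimal sketch of reusing the registered model from earlier as a Spark UDF; the feature and output table names are illustrative:

```python
# Sketch: offline batch scoring with the same registered model (table names are illustrative).
import mlflow.pyfunc
from pyspark.sql.functions import struct

# Load version 3 of the registered model (the version served earlier) as a Spark UDF
score_udf = mlflow.pyfunc.spark_udf(spark, model_uri="models:/my-model/3")

features = spark.table("sales_features")  # hypothetical feature table
scored = features.withColumn("prediction", score_udf(struct(*features.columns)))
scored.write.mode("overwrite").saveAsTable("sales_predictions")
```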
Metric | Source | Typical alert |
---|---|---|
Latency (p95) | Databricks observability → endpoint dashboards | Alert if > 500 ms (or your SLA target). |
Error rate (5xx) | Endpoint logs | Alert if > 1 % of requests fail. |
CPU / GPU utilization | Autoscaling logs | Investigate if scaling isn’t kicking in (e.g. max_concurrency set too low). |
Request payload size | Access logs | Reject payloads > 2 MB (basic protection against abuse/DDoS). |
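If you land endpoint request logs in a Delta table, a simple scheduled check can back the error-rate alert above. A sketch against a hypothetical serving_logs.my_endpoint_payload table with status_code and timestamp columns:

```python
# Sketch: alert on 5xx error rate from a hypothetical request-log Delta table.
error_rate = spark.sql("""
    SELECT count_if(status_code >= 500) / count(*) AS error_rate
    FROM serving_logs.my_endpoint_payload        -- hypothetical request-log table
    WHERE timestamp >= current_timestamp() - INTERVAL 1 HOUR
""").first()["error_rate"]

if error_rate is not None and error_rate > 0.01:
    # Hook this into your alerting of choice (Databricks SQL alerts, webhook, etc.)
    print(f"ALERT: error rate {error_rate:.2%} exceeds the 1% threshold")
```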
AgentBricks is Databricks’ framework for building multi‑modal, stateful AI agents that run inside Databricks notebooks and jobs. It builds on LangChain‑style components but integrates natively with Unity Catalog, Delta Lake, and the workspace security model.
Key concepts:
Concept | Description |
---|---|
Agent | An orchestrator that decides which tool to invoke (SQL, Python function, external API). |
Brick | A reusable, versioned tool packaged as a Unity Catalog SQL function, Python UDF, or Delta Lake table. |
State Store | Persistent context stored in a Delta table (or a Vector DB) that the agent can read/write across calls. |
Prompt Library | Centralised, version‑controlled prompts stored as SQL Views or MLflow artifacts. |
Execution Guardrails | Policy engine (built on Databricks lakehouse governance) that enforces rules such as “no writes to production tables” or “max token budget”. |
Use‑case | Benefit |
---|---|
Self‑service analytics | A user asks, “What were sales in EMEA last quarter?”; the agent runs a SQL Brick, formats the result, and returns a PDF. |
Data‑pipeline assistants | An agent can detect data drift, trigger a Delta Live Tables job, and update a model registry entry. |
Enterprise‑grade RAG | Combine a vector‑search Brick (FAISS / Databricks Vector Search) with an LLM Brick (Databricks Foundation Model) to write natural‑language summaries of a Delta table. |
Compliance‑aware bots | Guardrails automatically block a write to a regulated table and log the attempt to an audit table. |
```
User request (text) --> Agent (LLM + prompt)
        --> parses intent --> selects Brick(s) (SQL, Python, external API)
        --> Brick runs (reads/writes Delta, calls external service)
        --> returns output --> Agent aggregates, may call another Brick (loop)
        --> final response returned to user
```
All bricks are immutable (registered in Unity Catalog), so you can version‑control them like code and roll back if a bug appears.
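The AgentBricks API shown in the walkthrough below is illustrative; the underlying idea of a versioned, governed SQL tool maps onto Unity Catalog SQL functions, which you can create today. A sketch with placeholder catalog and schema names:

```python
# Sketch: a governed SQL "tool" as a Unity Catalog SQL table function
# (catalog/schema names are placeholders; `sales` is the table used throughout this guide).
spark.sql("""
    CREATE OR REPLACE FUNCTION main.bricks.sales_summary(in_region STRING, in_quarter STRING)
    RETURNS TABLE (total DOUBLE, avg_amount DOUBLE)
    RETURN
        SELECT sum(amount) AS total, avg(amount) AS avg_amount
        FROM sales
        WHERE region = in_region AND quarter = in_quarter
""")

# Callable from plain SQL, and therefore from any agent that can issue SQL:
spark.sql("SELECT * FROM main.bricks.sales_summary('EMEA', '3')").show()
```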
- Create a Brick – a Python UDF that queries Delta and returns a formatted summary string:

  ```python
  # In a notebook: register the brick as a pandas UDF
  import pandas as pd
  from pyspark.sql.functions import pandas_udf, PandasUDFType

  @pandas_udf("string", PandasUDFType.SCALAR)
  def sales_summary_brick(region: pd.Series, quarter: pd.Series) -> pd.Series:
      # Logic: query the Delta table via the Spark session
      df = spark.sql(f"""
          SELECT sum(amount) AS total, avg(amount) AS avg
          FROM sales
          WHERE region = '{region.iloc[0]}' AND quarter = '{quarter.iloc[0]}'
      """)
      total, avg = df.first()
      return pd.Series(
          [f"Region {region.iloc[0]}, Q{quarter.iloc[0]}: total ${total:,.0f}, avg ${avg:,.2f}"]
      )

  spark.udf.register("sales_summary_brick", sales_summary_brick)
  ```
- Define the Prompt (stored as an MLflow artifact):

  ```json
  {
    "system": "You are a data assistant. Use only the provided bricks. Do NOT fabricate data.",
    "user_template": "Give me a sales summary for {region} in Q{quarter}."
  }
  ```
- Instantiate the Agent (via the `databricks-agent` library):

  ```python
  from databricks_agent import Agent, BrickRegistry, PromptStore

  # Register bricks and prompts
  registry = BrickRegistry()
  registry.register_sql("sales_summary_brick", "CALL sales_summary_brick('{region}', '{quarter}')")

  prompts = PromptStore()
  prompts.register("sales_summary", "prompt.json")  # path to the JSON above

  # Create the agent
  sales_agent = Agent(
      model="databricks-dbrx-instruct",  # a foundation model hosted on Azure
      bricks=registry,
      prompts=prompts,
      guardrails={"max_tokens": 500, "no_write_to_prod": True},
  )
  ```
- Run the agent:

  ```python
  response = sales_agent.run(
      prompt_name="sales_summary",
      variables={"region": "EMEA", "quarter": "3"},
  )
  print(response)
  # → "Region EMEA, Q3: total $12,345,000, avg $123,450"
  ```
- Persist state (optional) – the agent can write to a Delta table `agent_state` for future calls (e.g., to remember the last queried period):

  ```python
  spark.sql("""
      INSERT INTO agent_state (session_id, last_region, last_quarter, timestamp)
      VALUES ('sess-123', 'EMEA', '3', current_timestamp())
  """)
  ```
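Because repeated INSERTs add one row per call, an upsert keyed on session_id is usually closer to what a state store needs. A sketch against the same `agent_state` table:

```python
# Sketch: upsert agent state keyed on session_id instead of appending a new row per call.
spark.sql("""
    MERGE INTO agent_state AS s
    USING (SELECT 'sess-123' AS session_id, 'EMEA' AS last_region,
                  '3' AS last_quarter, current_timestamp() AS timestamp) AS u
    ON s.session_id = u.session_id
    WHEN MATCHED THEN UPDATE SET
        s.last_region  = u.last_region,
        s.last_quarter = u.last_quarter,
        s.timestamp    = u.timestamp
    WHEN NOT MATCHED THEN INSERT *
""")
```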
Aspect | Recommendation |
---|---|
Execution environment | Use Jobs with a cluster‑backed agent (GPU if LLM inference is done locally) or Serverless if you only call a hosted foundation model. |
Versioning | Store bricks in a Unity Catalog schema (e.g. bricks.sales) and tag each version; stage updates by registering a new version and pointing the agent at it rather than mutating a brick in place. |
Security | Guardrails evaluate the Data Access Control (DAC) of the caller; ensure the agent runs with a service principal that has the minimum required permissions. |
Observability | Enable Delta Live Tables for the agent_state table to capture change data; use Databricks SQL dashboards to monitor usage. |
Testing | Unit‑test bricks with PyTest against a dev Delta lake; CI/CD can run databricks jobs run-now against a staging workspace. |
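For the Testing row, a minimal shape of such a unit test using a local SparkSession and a tiny stand-in for the sales table (the local-session setup is an assumption about your test harness):

```python
# Sketch: pytest for the sales-summary logic against a local SparkSession.
import pytest
from pyspark.sql import SparkSession


@pytest.fixture(scope="session")
def spark():
    return SparkSession.builder.master("local[1]").appName("brick-tests").getOrCreate()


def test_sales_summary_aggregates(spark):
    # Tiny stand-in for the dev `sales` table used by the brick
    spark.createDataFrame(
        [("EMEA", "3", 100.0), ("EMEA", "3", 200.0), ("APAC", "3", 999.0)],
        ["region", "quarter", "amount"],
    ).createOrReplaceTempView("sales")

    row = spark.sql(
        "SELECT sum(amount) AS total, avg(amount) AS avg "
        "FROM sales WHERE region = 'EMEA' AND quarter = '3'"
    ).first()

    assert row["total"] == 300.0
    assert row["avg"] == 150.0
```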
Databricks Apps are first‑class, low‑code web applications that run inside the Databricks UI and can call notebooks, jobs, or model‑serving endpoints behind the scenes.
Think of them as mini‑portals (e.g., “Sales Forecast Dashboard”, “Data Quality Submit Form”, “Customer‑Churn Explorer”) that:
- Present a UI built with React (or Streamlit/Gradio; a minimal Streamlit sketch follows this list) that is hosted on the Databricks App Service.
- Interact with backend compute (Jobs, Model Serve, Delta Live Tables) via Databricks REST APIs using the workspace token (no external infra).
- Leverage Unity Catalog for fine‑grained access to tables or models.
- Can be published to the workspace catalog and discovered via the Apps Launcher.
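The walkthrough below uses the React route; the Streamlit/Gradio route mentioned above can be even shorter. A minimal sketch, assuming the endpoint name from earlier and that the app receives its workspace host and token via environment variables:

```python
# Sketch: a minimal Streamlit app that calls a serving endpoint.
# DATABRICKS_HOST / DATABRICKS_TOKEN are assumed to be provided in the app's environment.
import json
import os

import requests
import streamlit as st

st.title("Sales Forecast")
region = st.text_input("Region", "EMEA")

if st.button("Predict"):
    resp = requests.post(
        f"https://{os.environ['DATABRICKS_HOST']}/serving-endpoints/my-endpoint/invocations",
        headers={
            "Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}",
            "Content-Type": "application/json",
        },
        data=json.dumps({"dataframe_records": [{"region": region}]}),
    )
    st.json(resp.json())
```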
Benefit | Example |
---|---|
Zero‑ops frontend – No need for Azure App Service / AKS. | A finance team can create a “Budget Request” form that writes to a Delta table. |
Integrated auth – Uses the same Azure AD SSO token as the rest of Databricks. | Users never need to enter another password to call an internal model. |
Rapid prototyping – Deploy from a notebook in seconds (databricks apps deploy). | Data scientists can ship a “What‑If analysis” UI for a model they just trained. |
Governance – Apps are catalog objects; you can audit who created/modified them and apply tagging. | Compliance can require that an app that writes to PII tables be reviewed before publishing. |
```
[User’s browser] --(HTTPS + Azure AD token)--> Databricks Apps Service (React SPA)
        |
        |-- Calls Databricks REST (Jobs, Model Serve, SQL) using the same token
        |-- Reads/Writes Delta via Unity Catalog (backend Spark)
        |
[Databricks Workspace] <---> [Delta Lake / Unity Catalog] <---> [Model Serving]
```
- Create a new App project (via CLI):

  ```bash
  databricks apps init my-sales-app
  cd my-sales-app
  ```

  This scaffolds a `frontend/` folder with a React app and a `backend/` folder for optional Python Lambda‑style functions.
- Define the UI (`frontend/src/App.tsx`):

  ```tsx
  import React, { useState } from "react";
  import { runJob } from "./api";

  function App() {
    const [region, setRegion] = useState("EMEA");
    const [quarter, setQuarter] = useState("3");
    const [result, setResult] = useState("");

    const handleSubmit = async () => {
      const payload = { region, quarter };
      const res = await runJob("sales_summary_job", payload);
      setResult(res.output);
    };

    return (
      <div>
        <h1>Sales Summary</h1>
        <label>Region: <input value={region} onChange={e => setRegion(e.target.value)} /></label>
        <label>Quarter: <input value={quarter} onChange={e => setQuarter(e.target.value)} /></label>
        <button onClick={handleSubmit}>Run</button>
        <pre>{result}</pre>
      </div>
    );
  }

  export default App;
  ```
- Create the backend job (a notebook that will be invoked):

  ```python
  # Notebook: sales_summary_job
  dbutils.widgets.text("region", "")
  dbutils.widgets.text("quarter", "")
  region = dbutils.widgets.get("region")
  quarter = dbutils.widgets.get("quarter")

  df = spark.sql(f"""
      SELECT sum(amount) AS total, avg(amount) AS avg
      FROM sales
      WHERE region = '{region}' AND quarter = '{quarter}'
  """)
  total, avg = df.first()
  print(f"Region {region}, Q{quarter}: total ${total:,.0f}, avg ${avg:,.2f}")
  ```

  Mark the notebook as a job task and give it a meaningful name (`sales_summary_job`).
- Expose a thin wrapper in `frontend/src/api.ts`:

  ```ts
  export async function runJob(jobName: string, params: any) {
    const response = await fetch(`/api/v1/jobs/run-now`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Authorization header is automatically added by the Apps runtime (it reuses the user's token)
      },
      // NOTE: the underlying Jobs API expects a numeric job_id plus notebook_params;
      // resolve the job name to its id (e.g. via the Jobs list API) before calling run-now.
      body: JSON.stringify({ job_name: jobName, notebook_params: params })
    });
    return response.json();
  }
  ```
- Deploy the app:

  ```bash
  # Build the React bundle
  cd frontend && npm run build && cd ..

  # Deploy to the workspace catalog
  databricks apps deploy --name sales-summary --path ./frontend/build
  ```
After deployment you’ll see the app appear under Workspace → Apps → Sales‑Summary. Users can pin it to their workspace launcher.
- Add a guardrail (optional) – in the app manifest you can declare required scopes, e.g.:

  ```json
  {
    "name": "sales-summary",
    "required_scopes": ["catalog:read:sales", "jobs:run"]
  }
  ```
Users lacking those scopes will see an “Access denied” message.
Feature | How to enable |
---|---|
Embedded Model Serving call | Inside api.ts call the model endpoint (/invocations ) with the same token – no extra auth required. |
Private Link networking | Deploy the app to a VNet‑isolated workspace; the App service automatically respects the VNet’s Private Link settings. |
User‑specific state | Store user preferences in a Delta table keyed by user_id (obtainable via GET /api/2.0/preview/scim/v2/Me ). |
CI/CD | Store the app repo in Azure DevOps/GitHub; use databricks apps deploy --manifest manifest.yml in a pipeline step. |
Multi‑tenant SaaS‑style app | Register the app as a catalog object, expose it via apps.publish to other workspaces, and use cross‑workspace IAM (service principal) for shared data. |
Use‑case | What the app does |
---|---|
Data request portal | Users fill a form → the app writes a row to a data_requests Delta table → a “review” job picks it up, runs a transformation, and writes results to a personal folder. |
Model‑driven customer support | Chat UI invokes a RAG AgentBrick → answers are fetched from vector search and returned to the support rep. |
Compliance dashboard | App shows a view of all “write” jobs that ran in the last 24 h, using the Jobs API and the Audit Log table. |
Self‑service ML pipelines | Drop a CSV → the app triggers a Delta Live Table pipeline, then calls Model Serving to register a new model version. |
Piece | Core purpose | Key Azure/Databricks integration | Typical command/API |
---|---|---|---|
SSO | Centralised Azure AD login → Databricks UI & APIs | Azure AD ↔ SAML 2.0 ↔ Databricks SP; optional SCIM sync | az account get-access-token --resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d ; Azure portal “Enterprise Application”. |
Model Serving | Managed HTTP endpoint for ML inference (auto‑scale) | MLflow Registry → Serverless/Cluster‑backed endpoint; Private Link + Azure AD token auth | POST /api/2.0/serving-endpoints (Serving Endpoints REST API) |
AgentBricks | State‑ful AI agents built from versioned “bricks” (SQL, Python, LLM) | Unity Catalog bricks + Prompt Store + Guardrails; integrates with Delta, Vector Search, LLMs | sales_agent.run(prompt_name="sales_summary", variables={...}) |
Databricks Apps | Low‑code web apps hosted inside the workspace, calling jobs, models, SQL | React front‑end, Apps Service, same Azure AD token, catalog‑level governance | databricks apps init … → databricks apps deploy … |
All four pieces are designed to be used together:
- SSO gives you a security‑first identity layer.
- Model Serving lets you expose the trained models that your AgentBricks may call.
- AgentBricks can be the “brain” behind a Databricks App, providing a conversational or decision‑making UI.
If you ship an analytics app that needs an LLM to explain a chart, you would:
- Authenticate via Azure AD SSO.
- The app calls a Model Serving endpoint for the LLM.
- The LLM result is passed to an AgentBrick (e.g., summarise a Delta table).
- The final answer is rendered in a Databricks App UI.
That stack gives you a secure, managed, and fully auditable end‑to‑end solution for modern data‑intelligence workloads on Azure Databricks.