@mtanco
Created July 13, 2022 04:09
Running all unsupervised model types on a dataset with Driverless AI.
{
"cells": [
{
"cell_type": "markdown",
"id": "5b16e7cb",
"metadata": {},
"source": [
"# Driverless AI for Unsupervised Models"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "56657e4b-5af1-4a4f-8e47-8d797bd86b88",
"metadata": {},
"outputs": [],
"source": [
"from getpass import getpass\n",
"\n",
"from h2o_ai_cloud import token_provider, steam_client\n",
"from h2osteam.clients import DriverlessClient"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "8dca2fde",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Visit https://cloud-internal.h2o.ai/auth/get-platform-token to get your platform token\n"
]
},
{
"name": "stdin",
"output_type": "stream",
"text": [
"Enter your platform token: ···········································································································································································································································································································································································································································································································································································································································································································································································································\n"
]
}
],
"source": [
"steam = steam_client(token_provider())"
]
},
{
"cell_type": "markdown",
"id": "2069f13a",
"metadata": {},
"source": [
"### Connect to Driverless AI\n",
"We'll start a Driverless AI instance through Steam and create a connection object called `dai` to interact with the platform."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "cfbc936e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"622 \t default-driverless-kubernetes \t stopped \t dai-quickstart\n"
]
}
],
"source": [
"# List all instances I own\n",
"for instance in steam.get_driverless_instances():\n",
"    print(instance[\"id\"], \"\\t\", instance[\"profile_name\"], \"\\t\", instance[\"status\"], \"\\t\", instance[\"name\"])"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "5c2eb80b-ff2b-4d40-85fa-be5cd861ba36",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Driverless AI instance is starting, please wait...\n",
"Driverless AI instance is running\n"
]
}
],
"source": [
"# Turn a machine on\n",
"dai_machine = DriverlessClient(steam).get_instance(name=\"dai-quickstart\")\n",
"dai_machine.start()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "86a54d4a-6a85-4fb4-8544-ea57bc5adb18",
"metadata": {},
"outputs": [],
"source": [
"# Connect to Driverless AI\n",
"dai = dai_machine.connect()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "50b37ee6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'https://steam.cloud-internal.h2o.ai:443/oidc-login-start?forward=/proxy/driverless/622/openid/callback'"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Log in to the UI as needed\n",
"dai_machine.openid_login_url()"
]
},
{
"cell_type": "markdown",
"id": "26d4bbca",
"metadata": {},
"source": [
"### List Existing Datasets"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "120f71ab",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" | Type | Key | Name\n",
"----+---------+--------------------------------------+-------------------\n",
" 0 | Dataset | 4c4a1534-fcc6-11ec-aab1-36f210f733cb | Telco_Churn\n",
" 1 | Dataset | 48a52cdc-f8db-11ec-8618-9e7cf9db9673 | telco_churn_test\n",
" 2 | Dataset | 48a50cd4-f8db-11ec-8618-9e7cf9db9673 | telco_churn_train\n",
" 3 | Dataset | 472e4bcc-f8db-11ec-8618-9e7cf9db9673 | Fancy New Name\n"
]
}
],
"source": [
"print(dai.datasets.list(start_index=0, count=4))"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "79c5de39-70a9-40eb-9c4d-8f654b6527f6",
"metadata": {},
"outputs": [],
"source": [
"# Use the Telco_Churn dataset listed above\n",
"dataset = dai.datasets.get(key=\"4c4a1534-fcc6-11ec-aab1-36f210f733cb\")"
]
},
{
"cell_type": "markdown",
"id": "fd347d51",
"metadata": {},
"source": [
"## Unsupervised Models"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "ba42073b-31d2-41cb-80cb-c5c9d6af550f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['IsolationForestAnomaly',\n",
" 'KMeans',\n",
" 'KMeansFreq',\n",
" 'KMeansOHE',\n",
" 'TruncSVD',\n",
" 'Unsupervised']"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# All unsupervised model types\n",
"unsupervised_models = [m.name for m in dai.recipes.models.list() if m.is_unsupervised]\n",
"unsupervised_models"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "21db3828",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Experiment launched at: https://steam.cloud-internal.h2o.ai:443/proxy/driverless/622/#/experiment?key=1092955c-0261-11ed-9bbc-2abf4ebade94\n",
"Experiment launched at: https://steam.cloud-internal.h2o.ai:443/proxy/driverless/622/#/experiment?key=11305436-0261-11ed-9bbc-2abf4ebade94\n",
"Experiment launched at: https://steam.cloud-internal.h2o.ai:443/proxy/driverless/622/#/experiment?key=11ce36ba-0261-11ed-9bbc-2abf4ebade94\n",
"Experiment launched at: https://steam.cloud-internal.h2o.ai:443/proxy/driverless/622/#/experiment?key=127c0eac-0261-11ed-9bbc-2abf4ebade94\n",
"Experiment launched at: https://steam.cloud-internal.h2o.ai:443/proxy/driverless/622/#/experiment?key=13245ee0-0261-11ed-9bbc-2abf4ebade94\n",
"Experiment launched at: https://steam.cloud-internal.h2o.ai:443/proxy/driverless/622/#/experiment?key=13cd0e8c-0261-11ed-9bbc-2abf4ebade94\n"
]
}
],
"source": [
"# Launch one experiment per unsupervised model type\n",
"for algo in unsupervised_models:\n",
"    dai.experiments.create_async(\n",
"        train_dataset=dataset,\n",
"        name=f'Telco Churn - {algo}',\n",
"        task='unsupervised',\n",
"        models=[algo],\n",
"        target_column='',\n",
"        force=True\n",
"    )"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5facf437",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Driverless AI instance is stopping, please wait...\n"
]
}
],
"source": [
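"# Stop the instance when you are done; stopping it may interrupt experiments that are still running\n",
"dai_machine.stop()"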
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python3 (H2O AI Cloud)",
"language": "python",
"name": "haic"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {
"height": "calc(100% - 180px)",
"left": "10px",
"top": "150px",
"width": "336px"
},
"toc_section_display": true,
"toc_window_display": true
}
},
"nbformat": 4,
"nbformat_minor": 5
}