Created
April 20, 2018 13:15
{ | |
"nbformat_minor": 1, | |
"cells": [ | |
{ | |
"source": "## Introduction to Machine Learning with PostgreSQL\n### Predicting NBA Winners", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"source": "<a id=\"model\"></a>\n## 1. Load the data and create the model", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"source": "### 1.1: Load data from PostgreSQL", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 31, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "labelCol = 'homeTeamWin'\ntraining_table = 'nba_training_data'" | |
}, | |
{ | |
"execution_count": 32, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
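"source": "import pixiedust\nfrom pyspark.sql import SparkSession, SQLContext\nimport pandas as pd\nspark = SparkSession.builder.getOrCreate()\n# Reuse the session's SparkContext so this cell does not depend on a predefined sc\nsc = spark.sparkContext\nsqlCtx = SQLContext(sc)" | |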
}, | |
{ | |
"source": "**Tip:** All required fields can be found on the Service Credentials tab of the Compose Postgres service instance created in Bluemix. If you do not have credentials, you can create them by clicking \"New credentials\".", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 33, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "pg_host = 'sl-us-south-1-portal.3.dblayer.com:18447/compose'\npg_uname = \"xxxxxxxx\"\npg_pword = \"xxxxxxxx\"" | |
}, | |
{ | |
"execution_count": 34, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "# The code was removed by DSX for sharing." | |
}, | |
{ | |
"execution_count": 35, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "from sqlalchemy import create_engine\nengine = create_engine(\"postgresql://\"+pg_uname+\":\"+pg_pword+\"@\"+pg_host)" | |
}, | |
{ | |
"execution_count": 36, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "df = pd.read_sql_query('select * from '+training_table,con=engine)\ndf = sqlCtx.createDataFrame(df)" | |
}, | |
{ | |
"execution_count": 37, | |
"cell_type": "code", | |
"metadata": { | |
"pixiedust": { | |
"displayParams": { | |
"tableFields": "awayTeam,awayTeamDaysOff,awayTeamLastFive,awayTeamLastTen,awayTeamTotalWinPercent,date,homeTeam,homeTeamDaysOff,homeTeamLastFive,homeTeamLastTen,homeTeamTotalWinPercent,homeTeamWin", | |
"filter": "{}", | |
"handlerId": "tableView", | |
"no_margin": "true", | |
"table_noschema": "true", | |
"table_nocount": "true", | |
"table_nosearch": "true" | |
} | |
} | |
}, | |
"outputs": [ | |
{ | |
"output_type": "display_data", | |
"data": { | |
"text/html": "<style type=\"text/css\">.pd_warning{display:none;}</style><div class=\"pd_warning\"><em>Hey, there's something awesome here! To see it, open this notebook outside GitHub, in a viewer like Jupyter</em></div>\n <div class=\"pd_save is-viewer-good\" style=\"padding-right:10px;text-align: center;line-height:initial !important;font-size: xx-large;font-weight: 500;color: coral;\">\n \n </div>\n <div id=\"chartFigure98230c54\" class=\"pd_save is-viewer-good\" style=\"overflow-x:auto\">\n <style type=\"text/css\" class=\"pd_save\">\n .df-table-wrapper .panel-heading {\n border-radius: 0;\n padding: 0px;\n }\n .df-table-wrapper .panel-heading:hover {\n border-color: #008571;\n }\n .df-table-wrapper .panel-title a {\n background-color: #f9f9fb;\n color: #333333;\n display: block;\n outline: none;\n padding: 10px 15px;\n text-decoration: none;\n }\n .df-table-wrapper .panel-title a:hover {\n background-color: #337ab7;\n border-color: #2e6da4;\n color: #ffffff;\n display: block;\n padding: 10px 15px;\n text-decoration: none;\n }\n .df-table-wrapper {\n font-size: small;\n font-weight: 300;\n letter-spacing: 0.5px;\n line-height: normal;\n height: inherit;\n overflow: auto;\n }\n .df-table-search {\n margin: 0 0 20px 0;\n }\n .df-table-search-count {\n display: inline-block;\n margin: 0 0 20px 0;\n }\n .df-table-container {\n max-height: 50vh;\n max-width: 100%;\n overflow-x: auto;\n position: relative;\n }\n .df-table-wrapper table {\n border: 0 none #ffffff;\n border-collapse: collapse;\n margin: 0;\n min-width: 100%;\n padding: 0;\n table-layout: fixed;\n height: inherit;\n overflow: auto;\n }\n .df-table-wrapper tr.hidden {\n display: none;\n }\n .df-table-wrapper tr:nth-child(even) {\n background-color: #f9f9fb;\n }\n .df-table-wrapper tr.even {\n background-color: #f9f9fb;\n }\n .df-table-wrapper tr.odd {\n background-color: #ffffff;\n }\n .df-table-wrapper td + td {\n border-left: 1px solid #e0e0e0;\n }\n \n .df-table-wrapper thead,\n .fixed-header {\n 
font-weight: 600;\n }\n .df-table-wrapper tr,\n .fixed-row {\n border: 0 none #ffffff;\n margin: 0;\n padding: 0;\n }\n .df-table-wrapper th,\n .df-table-wrapper td,\n .fixed-cell {\n border: 0 none #ffffff;\n margin: 0;\n min-width: 50px;\n padding: 5px 20px 5px 10px;\n text-align: left;\n word-wrap: break-word;\n }\n .df-table-wrapper th {\n padding-bottom: 0;\n padding-top: 0;\n }\n .df-table-wrapper th div {\n max-height: 1px;\n visibility: hidden;\n }\n \n .df-schema-field {\n margin-left: 10px;\n }\n \n .fixed-header-container {\n overflow: hidden;\n position: relative;\n }\n .fixed-header {\n border-bottom: 2px solid #000;\n display: table;\n position: relative;\n }\n .fixed-row {\n display: table-row;\n }\n .fixed-cell {\n display: table-cell;\n }\n </style>\n \n \n <div class=\"df-table-wrapper df-table-wrapper-98230c54 panel-group pd_save\">\n <!-- dataframe schema -->\n \n <!-- dataframe table -->\n <div class=\"panel panel-default\" style=\"border:none;\">\n \n <div id=\"df-table-98230c54\" class=\"panel-collapse collapse in\">\n <div class=\"panel-body\">\n \n <div>\n \n </div>\n <!-- fixed header for when dataframe table scrolls -->\n <div class=\"fixed-header-container\">\n <div class=\"fixed-header\" style=\"width: 1681px;\">\n <div class=\"fixed-row\">\n \n <div class=\"fixed-cell\" style=\"width: 102px;\">date</div>\n \n <div class=\"fixed-cell\" style=\"width: 120px;\">homeTeamWin</div>\n \n <div class=\"fixed-cell\" style=\"width: 97px;\">homeTeam</div>\n \n <div class=\"fixed-cell\" style=\"width: 145px;\">homeTeamDaysOff</div>\n \n <div class=\"fixed-cell\" style=\"width: 198px;\">homeTeamTotalWinPercent</div>\n \n <div class=\"fixed-cell\" style=\"width: 149px;\">homeTeamLastFive</div>\n \n <div class=\"fixed-cell\" style=\"width: 145px;\">homeTeamLastTen</div>\n \n <div class=\"fixed-cell\" style=\"width: 95px;\">awayTeam</div>\n \n <div class=\"fixed-cell\" style=\"width: 143px;\">awayTeamDaysOff</div>\n \n <div class=\"fixed-cell\" 
style=\"width: 196px;\">awayTeamTotalWinPercent</div>\n \n <div class=\"fixed-cell\" style=\"width: 147px;\">awayTeamLastFive</div>\n \n <div class=\"fixed-cell\" style=\"width: 143px;\">awayTeamLastTen</div>\n \n </div>\n </div>\n </div>\n <div class=\"df-table-container\">\n <table class=\"df-table\">\n <thead>\n <tr>\n \n <th><div>date</div></th>\n \n <th><div>homeTeamWin</div></th>\n \n <th><div>homeTeam</div></th>\n \n <th><div>homeTeamDaysOff</div></th>\n \n <th><div>homeTeamTotalWinPercent</div></th>\n \n <th><div>homeTeamLastFive</div></th>\n \n <th><div>homeTeamLastTen</div></th>\n \n <th><div>awayTeam</div></th>\n \n <th><div>awayTeamDaysOff</div></th>\n \n <th><div>awayTeamTotalWinPercent</div></th>\n \n <th><div>awayTeamLastFive</div></th>\n \n <th><div>awayTeamLastTen</div></th>\n \n </tr>\n </thead>\n <tbody>\n \n <tr>\n \n <td>1484114400</td>\n \n <td>1</td>\n \n <td>MIN</td>\n \n <td>2</td>\n \n <td>315</td>\n \n <td>1</td>\n \n <td>3</td>\n \n <td>HOU</td>\n \n <td>1</td>\n \n <td>775</td>\n \n <td>5</td>\n \n <td>9</td>\n \n </tr>\n \n <tr>\n \n <td>1482300000</td>\n \n <td>0</td>\n \n <td>DET</td>\n \n <td>1</td>\n \n <td>466</td>\n \n <td>1</td>\n \n <td>4</td>\n \n <td>MEM</td>\n \n <td>0</td>\n \n <td>600</td>\n \n <td>1</td>\n \n <td>6</td>\n \n </tr>\n \n <tr>\n \n <td>1484287200</td>\n \n <td>1</td>\n \n <td>MIN</td>\n \n <td>2</td>\n \n <td>333</td>\n \n <td>2</td>\n \n <td>4</td>\n \n <td>OKC</td>\n \n <td>2</td>\n \n <td>600</td>\n \n <td>3</td>\n \n <td>6</td>\n \n </tr>\n \n <tr>\n \n <td>1483423200</td>\n \n <td>0</td>\n \n <td>DET</td>\n \n <td>2</td>\n \n <td>444</td>\n \n <td>2</td>\n \n <td>3</td>\n \n <td>IND</td>\n \n <td>2</td>\n \n <td>485</td>\n \n <td>2</td>\n \n <td>4</td>\n \n </tr>\n \n <tr>\n \n <td>1490331600</td>\n \n <td>0</td>\n \n <td>CHI</td>\n \n <td>2</td>\n \n <td>472</td>\n \n <td>2</td>\n \n <td>3</td>\n \n <td>PHI</td>\n \n <td>2</td>\n \n <td>366</td>\n \n <td>2</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n 
<td>1479276000</td>\n \n <td>1</td>\n \n <td>ORL</td>\n \n <td>2</td>\n \n <td>363</td>\n \n <td>1</td>\n \n <td>4</td>\n \n <td>NO</td>\n \n <td>1</td>\n \n <td>181</td>\n \n <td>2</td>\n \n <td>2</td>\n \n </tr>\n \n <tr>\n \n <td>1484719200</td>\n \n <td>1</td>\n \n <td>NO</td>\n \n <td>2</td>\n \n <td>380</td>\n \n <td>2</td>\n \n <td>5</td>\n \n <td>ORL</td>\n \n <td>2</td>\n \n <td>395</td>\n \n <td>1</td>\n \n <td>2</td>\n \n </tr>\n \n <tr>\n \n <td>1479621600</td>\n \n <td>1</td>\n \n <td>DEN</td>\n \n <td>2</td>\n \n <td>333</td>\n \n <td>1</td>\n \n <td>3</td>\n \n <td>UTA</td>\n \n <td>1</td>\n \n <td>500</td>\n \n <td>2</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1486706400</td>\n \n <td>0</td>\n \n <td>BKN</td>\n \n <td>2</td>\n \n <td>169</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>MIA</td>\n \n <td>1</td>\n \n <td>433</td>\n \n <td>5</td>\n \n <td>10</td>\n \n </tr>\n \n <tr>\n \n <td>1486792800</td>\n \n <td>1</td>\n \n <td>HOU</td>\n \n <td>2</td>\n \n <td>696</td>\n \n <td>4</td>\n \n <td>6</td>\n \n <td>PHO</td>\n \n <td>0</td>\n \n <td>314</td>\n \n <td>2</td>\n \n <td>2</td>\n \n </tr>\n \n <tr>\n \n <td>1478498400</td>\n \n <td>0</td>\n \n <td>PHI</td>\n \n <td>2</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>UTA</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n </tr>\n \n <tr>\n \n <td>1483855200</td>\n \n <td>0</td>\n \n <td>BKN</td>\n \n <td>2</td>\n \n <td>228</td>\n \n <td>0</td>\n \n <td>1</td>\n \n <td>PHI</td>\n \n <td>2</td>\n \n <td>264</td>\n \n <td>2</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1490331600</td>\n \n <td>1</td>\n \n <td>WAS</td>\n \n <td>1</td>\n \n <td>605</td>\n \n <td>2</td>\n \n <td>6</td>\n \n <td>BKN</td>\n \n <td>0</td>\n \n <td>211</td>\n \n <td>3</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1486188000</td>\n \n <td>0</td>\n \n <td>PHO</td>\n \n <td>0</td>\n \n <td>320</td>\n \n <td>1</td>\n \n <td>3</td>\n \n <td>MIL</td>\n \n <td>1</td>\n \n 
<td>428</td>\n \n <td>0</td>\n \n <td>1</td>\n \n </tr>\n \n <tr>\n \n <td>1489125600</td>\n \n <td>1</td>\n \n <td>DAL</td>\n \n <td>3</td>\n \n <td>428</td>\n \n <td>4</td>\n \n <td>6</td>\n \n <td>BKN</td>\n \n <td>2</td>\n \n <td>174</td>\n \n <td>2</td>\n \n <td>2</td>\n \n </tr>\n \n <tr>\n \n <td>1486101600</td>\n \n <td>1</td>\n \n <td>ORL</td>\n \n <td>2</td>\n \n <td>372</td>\n \n <td>1</td>\n \n <td>2</td>\n \n <td>TOR</td>\n \n <td>1</td>\n \n <td>600</td>\n \n <td>2</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1488088800</td>\n \n <td>1</td>\n \n <td>LAC</td>\n \n <td>1</td>\n \n <td>603</td>\n \n <td>3</td>\n \n <td>5</td>\n \n <td>CHA</td>\n \n <td>1</td>\n \n <td>431</td>\n \n <td>1</td>\n \n <td>2</td>\n \n </tr>\n \n <tr>\n \n <td>1483596000</td>\n \n <td>1</td>\n \n <td>IND</td>\n \n <td>1</td>\n \n <td>500</td>\n \n <td>3</td>\n \n <td>5</td>\n \n <td>BKN</td>\n \n <td>2</td>\n \n <td>242</td>\n \n <td>1</td>\n \n <td>2</td>\n \n </tr>\n \n <tr>\n \n <td>1491714000</td>\n \n <td>0</td>\n \n <td>DEN</td>\n \n <td>1</td>\n \n <td>481</td>\n \n <td>3</td>\n \n <td>5</td>\n \n <td>OKC</td>\n \n <td>1</td>\n \n <td>569</td>\n \n <td>2</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1488434400</td>\n \n <td>1</td>\n \n <td>POR</td>\n \n <td>2</td>\n \n <td>406</td>\n \n <td>1</td>\n \n <td>3</td>\n \n <td>OKC</td>\n \n <td>2</td>\n \n <td>583</td>\n \n <td>4</td>\n \n <td>7</td>\n \n </tr>\n \n <tr>\n \n <td>1488780000</td>\n \n <td>1</td>\n \n <td>DET</td>\n \n <td>2</td>\n \n <td>483</td>\n \n <td>3</td>\n \n <td>6</td>\n \n <td>CHI</td>\n \n <td>1</td>\n \n <td>500</td>\n \n <td>3</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1488952800</td>\n \n <td>0</td>\n \n <td>NO</td>\n \n <td>1</td>\n \n <td>390</td>\n \n <td>2</td>\n \n <td>4</td>\n \n <td>TOR</td>\n \n <td>4</td>\n \n <td>587</td>\n \n <td>3</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1487916000</td>\n \n <td>1</td>\n \n <td>CHI</td>\n \n <td>8</td>\n \n <td>491</td>\n 
\n <td>2</td>\n \n <td>5</td>\n \n <td>PHO</td>\n \n <td>8</td>\n \n <td>315</td>\n \n <td>2</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1488952800</td>\n \n <td>1</td>\n \n <td>MIL</td>\n \n <td>2</td>\n \n <td>467</td>\n \n <td>3</td>\n \n <td>7</td>\n \n <td>NY</td>\n \n <td>2</td>\n \n <td>406</td>\n \n <td>2</td>\n \n <td>4</td>\n \n </tr>\n \n <tr>\n \n <td>1482904800</td>\n \n <td>1</td>\n \n <td>ATL</td>\n \n <td>1</td>\n \n <td>483</td>\n \n <td>2</td>\n \n <td>5</td>\n \n <td>NY</td>\n \n <td>2</td>\n \n <td>533</td>\n \n <td>2</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1484892000</td>\n \n <td>1</td>\n \n <td>LAL</td>\n \n <td>3</td>\n \n <td>326</td>\n \n <td>0</td>\n \n <td>3</td>\n \n <td>IND</td>\n \n <td>2</td>\n \n <td>536</td>\n \n <td>4</td>\n \n <td>7</td>\n \n </tr>\n \n <tr>\n \n <td>1479276000</td>\n \n <td>1</td>\n \n <td>OKC</td>\n \n <td>2</td>\n \n <td>545</td>\n \n <td>1</td>\n \n <td>5</td>\n \n <td>HOU</td>\n \n <td>2</td>\n \n <td>600</td>\n \n <td>3</td>\n \n <td>6</td>\n \n </tr>\n \n <tr>\n \n <td>1484978400</td>\n \n <td>1</td>\n \n <td>DET</td>\n \n <td>2</td>\n \n <td>454</td>\n \n <td>2</td>\n \n <td>5</td>\n \n <td>WAS</td>\n \n <td>1</td>\n \n <td>547</td>\n \n <td>4</td>\n \n <td>7</td>\n \n </tr>\n \n <tr>\n \n <td>1483855200</td>\n \n <td>0</td>\n \n <td>POR</td>\n \n <td>3</td>\n \n <td>421</td>\n \n <td>3</td>\n \n <td>3</td>\n \n <td>DET</td>\n \n <td>3</td>\n \n <td>447</td>\n \n <td>2</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1491022800</td>\n \n <td>1</td>\n \n <td>BKN</td>\n \n <td>1</td>\n \n <td>213</td>\n \n <td>2</td>\n \n <td>4</td>\n \n <td>ORL</td>\n \n <td>0</td>\n \n <td>355</td>\n \n <td>1</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1490331600</td>\n \n <td>1</td>\n \n <td>HOU</td>\n \n <td>4</td>\n \n <td>690</td>\n \n <td>4</td>\n \n <td>7</td>\n \n <td>NO</td>\n \n <td>3</td>\n \n <td>422</td>\n \n <td>4</td>\n \n <td>6</td>\n \n </tr>\n \n <tr>\n \n <td>1486188000</td>\n 
\n <td>1</td>\n \n <td>UTA</td>\n \n <td>3</td>\n \n <td>620</td>\n \n <td>2</td>\n \n <td>7</td>\n \n <td>CHA</td>\n \n <td>2</td>\n \n <td>460</td>\n \n <td>0</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1478584800</td>\n \n <td>1</td>\n \n <td>MEM</td>\n \n <td>2</td>\n \n <td>428</td>\n \n <td>2</td>\n \n <td>3</td>\n \n <td>DEN</td>\n \n <td>2</td>\n \n <td>500</td>\n \n <td>2</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1480917600</td>\n \n <td>0</td>\n \n <td>LAL</td>\n \n <td>2</td>\n \n <td>454</td>\n \n <td>2</td>\n \n <td>3</td>\n \n <td>UTA</td>\n \n <td>2</td>\n \n <td>571</td>\n \n <td>4</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1485842400</td>\n \n <td>1</td>\n \n <td>LAL</td>\n \n <td>5</td>\n \n <td>320</td>\n \n <td>1</td>\n \n <td>2</td>\n \n <td>DEN</td>\n \n <td>3</td>\n \n <td>456</td>\n \n <td>4</td>\n \n <td>7</td>\n \n </tr>\n \n <tr>\n \n <td>1478498400</td>\n \n <td>1</td>\n \n <td>LAC</td>\n \n <td>2</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>DET</td>\n \n <td>2</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n </tr>\n \n <tr>\n \n <td>1490072400</td>\n \n <td>0</td>\n \n <td>POR</td>\n \n <td>2</td>\n \n <td>463</td>\n \n <td>4</td>\n \n <td>8</td>\n \n <td>MIL</td>\n \n <td>2</td>\n \n <td>492</td>\n \n <td>3</td>\n \n <td>8</td>\n \n </tr>\n \n <tr>\n \n <td>1490677200</td>\n \n <td>0</td>\n \n <td>LAL</td>\n \n <td>2</td>\n \n <td>287</td>\n \n <td>1</td>\n \n <td>2</td>\n \n <td>WAS</td>\n \n <td>3</td>\n \n <td>616</td>\n \n <td>3</td>\n \n <td>6</td>\n \n </tr>\n \n <tr>\n \n <td>1491627600</td>\n \n <td>1</td>\n \n <td>POR</td>\n \n <td>1</td>\n \n <td>493</td>\n \n <td>3</td>\n \n <td>7</td>\n \n <td>UTA</td>\n \n <td>1</td>\n \n <td>620</td>\n \n <td>4</td>\n \n <td>6</td>\n \n </tr>\n \n <tr>\n \n <td>1486879200</td>\n \n <td>1</td>\n \n <td>SAC</td>\n \n <td>1</td>\n \n <td>407</td>\n \n <td>3</td>\n \n <td>5</td>\n \n <td>NO</td>\n \n <td>2</td>\n \n <td>388</td>\n \n 
<td>2</td>\n \n <td>4</td>\n \n </tr>\n \n <tr>\n \n <td>1479880800</td>\n \n <td>0</td>\n \n <td>IND</td>\n \n <td>2</td>\n \n <td>466</td>\n \n <td>3</td>\n \n <td>5</td>\n \n <td>ATL</td>\n \n <td>0</td>\n \n <td>642</td>\n \n <td>2</td>\n \n <td>6</td>\n \n </tr>\n \n <tr>\n \n <td>1484028000</td>\n \n <td>1</td>\n \n <td>GS</td>\n \n <td>2</td>\n \n <td>842</td>\n \n <td>4</td>\n \n <td>8</td>\n \n <td>MIA</td>\n \n <td>2</td>\n \n <td>282</td>\n \n <td>1</td>\n \n <td>2</td>\n \n </tr>\n \n <tr>\n \n <td>1483336800</td>\n \n <td>1</td>\n \n <td>CHI</td>\n \n <td>2</td>\n \n <td>470</td>\n \n <td>2</td>\n \n <td>3</td>\n \n <td>CHA</td>\n \n <td>2</td>\n \n <td>558</td>\n \n <td>3</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1485669600</td>\n \n <td>0</td>\n \n <td>SA</td>\n \n <td>1</td>\n \n <td>782</td>\n \n <td>4</td>\n \n <td>7</td>\n \n <td>DAL</td>\n \n <td>2</td>\n \n <td>347</td>\n \n <td>2</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1486101600</td>\n \n <td>0</td>\n \n <td>POR</td>\n \n <td>3</td>\n \n <td>440</td>\n \n <td>4</td>\n \n <td>5</td>\n \n <td>DAL</td>\n \n <td>2</td>\n \n <td>387</td>\n \n <td>4</td>\n \n <td>7</td>\n \n </tr>\n \n <tr>\n \n <td>1484978400</td>\n \n <td>1</td>\n \n <td>ATL</td>\n \n <td>0</td>\n \n <td>581</td>\n \n <td>3</td>\n \n <td>8</td>\n \n <td>PHI</td>\n \n <td>1</td>\n \n <td>365</td>\n \n <td>4</td>\n \n <td>8</td>\n \n </tr>\n \n <tr>\n \n <td>1485324000</td>\n \n <td>1</td>\n \n <td>MEM</td>\n \n <td>4</td>\n \n <td>565</td>\n \n <td>2</td>\n \n <td>4</td>\n \n <td>TOR</td>\n \n <td>1</td>\n \n <td>622</td>\n \n <td>1</td>\n \n <td>4</td>\n \n </tr>\n \n <tr>\n \n <td>1488088800</td>\n \n <td>1</td>\n \n <td>TOR</td>\n \n <td>1</td>\n \n <td>586</td>\n \n <td>2</td>\n \n <td>5</td>\n \n <td>POR</td>\n \n <td>2</td>\n \n <td>421</td>\n \n <td>2</td>\n \n <td>4</td>\n \n </tr>\n \n <tr>\n \n <td>1487138400</td>\n \n <td>1</td>\n \n <td>OKC</td>\n \n <td>2</td>\n \n <td>553</td>\n \n <td>2</td>\n \n 
<td>4</td>\n \n <td>NY</td>\n \n <td>3</td>\n \n <td>410</td>\n \n <td>1</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1490072400</td>\n \n <td>1</td>\n \n <td>NO</td>\n \n <td>2</td>\n \n <td>414</td>\n \n <td>4</td>\n \n <td>6</td>\n \n <td>MEM</td>\n \n <td>2</td>\n \n <td>571</td>\n \n <td>4</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1479708000</td>\n \n <td>0</td>\n \n <td>MIN</td>\n \n <td>2</td>\n \n <td>333</td>\n \n <td>2</td>\n \n <td>4</td>\n \n <td>BOS</td>\n \n <td>2</td>\n \n <td>538</td>\n \n <td>3</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1486360800</td>\n \n <td>0</td>\n \n <td>NY</td>\n \n <td>1</td>\n \n <td>423</td>\n \n <td>2</td>\n \n <td>4</td>\n \n <td>LAL</td>\n \n <td>2</td>\n \n <td>320</td>\n \n <td>1</td>\n \n <td>2</td>\n \n </tr>\n \n <tr>\n \n <td>1479016800</td>\n \n <td>1</td>\n \n <td>GS</td>\n \n <td>2</td>\n \n <td>777</td>\n \n <td>4</td>\n \n <td>7</td>\n \n <td>PHO</td>\n \n <td>0</td>\n \n <td>300</td>\n \n <td>2</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1480658400</td>\n \n <td>1</td>\n \n <td>NY</td>\n \n <td>1</td>\n \n <td>500</td>\n \n <td>3</td>\n \n <td>6</td>\n \n <td>MIN</td>\n \n <td>1</td>\n \n <td>277</td>\n \n <td>1</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1483336800</td>\n \n <td>0</td>\n \n <td>BKN</td>\n \n <td>3</td>\n \n <td>250</td>\n \n <td>1</td>\n \n <td>2</td>\n \n <td>UTA</td>\n \n <td>1</td>\n \n <td>617</td>\n \n <td>3</td>\n \n <td>7</td>\n \n </tr>\n \n <tr>\n \n <td>1484719200</td>\n \n <td>1</td>\n \n <td>WAS</td>\n \n <td>2</td>\n \n <td>525</td>\n \n <td>4</td>\n \n <td>7</td>\n \n <td>MEM</td>\n \n <td>2</td>\n \n <td>581</td>\n \n <td>3</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1480053600</td>\n \n <td>1</td>\n \n <td>CLE</td>\n \n <td>2</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>DAL</td>\n \n <td>1</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n </tr>\n \n <tr>\n \n <td>1483250400</td>\n \n <td>1</td>\n \n 
<td>ATL</td>\n \n <td>1</td>\n \n <td>515</td>\n \n <td>3</td>\n \n <td>6</td>\n \n <td>SA</td>\n \n <td>1</td>\n \n <td>818</td>\n \n <td>4</td>\n \n <td>9</td>\n \n </tr>\n \n <tr>\n \n <td>1488261600</td>\n \n <td>1</td>\n \n <td>OKC</td>\n \n <td>2</td>\n \n <td>576</td>\n \n <td>3</td>\n \n <td>6</td>\n \n <td>UTA</td>\n \n <td>2</td>\n \n <td>627</td>\n \n <td>3</td>\n \n <td>7</td>\n \n </tr>\n \n <tr>\n \n <td>1489381200</td>\n \n <td>1</td>\n \n <td>MEM</td>\n \n <td>1</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>MIL</td>\n \n <td>2</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n </tr>\n \n <tr>\n \n <td>1484200800</td>\n \n <td>0</td>\n \n <td>PHO</td>\n \n <td>4</td>\n \n <td>315</td>\n \n <td>2</td>\n \n <td>4</td>\n \n <td>DAL</td>\n \n <td>3</td>\n \n <td>289</td>\n \n <td>1</td>\n \n <td>4</td>\n \n </tr>\n \n <tr>\n \n <td>1478844000</td>\n \n <td>1</td>\n \n <td>POR</td>\n \n <td>1</td>\n \n <td>555</td>\n \n <td>3</td>\n \n <td>5</td>\n \n <td>SAC</td>\n \n <td>0</td>\n \n <td>400</td>\n \n <td>2</td>\n \n <td>4</td>\n \n </tr>\n \n <tr>\n \n <td>1480053600</td>\n \n <td>1</td>\n \n <td>POR</td>\n \n <td>2</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>NO</td>\n \n <td>2</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n </tr>\n \n <tr>\n \n <td>1479880800</td>\n \n <td>1</td>\n \n <td>DET</td>\n \n <td>2</td>\n \n <td>400</td>\n \n <td>1</td>\n \n <td>3</td>\n \n <td>MIA</td>\n \n <td>2</td>\n \n <td>307</td>\n \n <td>2</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1484460000</td>\n \n <td>1</td>\n \n <td>TOR</td>\n \n <td>1</td>\n \n <td>666</td>\n \n <td>3</td>\n \n <td>5</td>\n \n <td>NY</td>\n \n <td>2</td>\n \n <td>450</td>\n \n <td>2</td>\n \n <td>2</td>\n \n </tr>\n \n <tr>\n \n <td>1480140000</td>\n \n <td>1</td>\n \n <td>GS</td>\n \n <td>1</td>\n \n <td>875</td>\n \n <td>5</td>\n \n <td>10</td>\n \n <td>MIN</td>\n \n <td>1</td>\n \n <td>333</td>\n \n <td>2</td>\n \n <td>4</td>\n \n 
</tr>\n \n <tr>\n \n <td>1477890000</td>\n \n <td>0</td>\n \n <td>BKN</td>\n \n <td>1</td>\n \n <td>333</td>\n \n <td>1</td>\n \n <td>1</td>\n \n <td>CHI</td>\n \n <td>1</td>\n \n <td>1000</td>\n \n <td>2</td>\n \n <td>2</td>\n \n </tr>\n \n <tr>\n \n <td>1478671200</td>\n \n <td>0</td>\n \n <td>ORL</td>\n \n <td>1</td>\n \n <td>428</td>\n \n <td>3</td>\n \n <td>3</td>\n \n <td>MIN</td>\n \n <td>0</td>\n \n <td>166</td>\n \n <td>1</td>\n \n <td>1</td>\n \n </tr>\n \n <tr>\n \n <td>1484978400</td>\n \n <td>1</td>\n \n <td>CHI</td>\n \n <td>1</td>\n \n <td>477</td>\n \n <td>2</td>\n \n <td>5</td>\n \n <td>SAC</td>\n \n <td>1</td>\n \n <td>380</td>\n \n <td>1</td>\n \n <td>2</td>\n \n </tr>\n \n <tr>\n \n <td>1490331600</td>\n \n <td>1</td>\n \n <td>LAL</td>\n \n <td>3</td>\n \n <td>281</td>\n \n <td>0</td>\n \n <td>1</td>\n \n <td>MIN</td>\n \n <td>3</td>\n \n <td>400</td>\n \n <td>1</td>\n \n <td>4</td>\n \n </tr>\n \n <tr>\n \n <td>1479103200</td>\n \n <td>1</td>\n \n <td>IND</td>\n \n <td>2</td>\n \n <td>400</td>\n \n <td>2</td>\n \n <td>4</td>\n \n <td>ORL</td>\n \n <td>1</td>\n \n <td>400</td>\n \n <td>2</td>\n \n <td>4</td>\n \n </tr>\n \n <tr>\n \n <td>1483682400</td>\n \n <td>0</td>\n \n <td>ORL</td>\n \n <td>2</td>\n \n <td>432</td>\n \n <td>2</td>\n \n <td>5</td>\n \n <td>HOU</td>\n \n <td>0</td>\n \n <td>756</td>\n \n <td>5</td>\n \n <td>8</td>\n \n </tr>\n \n <tr>\n \n <td>1477371600</td>\n \n <td>0</td>\n \n <td>GS</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>SA</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n </tr>\n \n <tr>\n \n <td>1486360800</td>\n \n <td>1</td>\n \n <td>TOR</td>\n \n <td>0</td>\n \n <td>596</td>\n \n <td>2</td>\n \n <td>3</td>\n \n <td>LAC</td>\n \n <td>1</td>\n \n <td>607</td>\n \n <td>1</td>\n \n <td>4</td>\n \n </tr>\n \n <tr>\n \n <td>1481868000</td>\n \n <td>1</td>\n \n <td>WAS</td>\n \n <td>2</td>\n \n <td>416</td>\n \n <td>3</td>\n \n <td>5</td>\n \n <td>DET</td>\n \n 
<td>1</td>\n \n <td>518</td>\n \n <td>3</td>\n \n <td>6</td>\n \n </tr>\n \n <tr>\n \n <td>1485324000</td>\n \n <td>1</td>\n \n <td>POR</td>\n \n <td>4</td>\n \n <td>413</td>\n \n <td>1</td>\n \n <td>4</td>\n \n <td>LAL</td>\n \n <td>3</td>\n \n <td>333</td>\n \n <td>1</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1491714000</td>\n \n <td>0</td>\n \n <td>NY</td>\n \n <td>1</td>\n \n <td>375</td>\n \n <td>2</td>\n \n <td>3</td>\n \n <td>TOR</td>\n \n <td>1</td>\n \n <td>612</td>\n \n <td>4</td>\n \n <td>8</td>\n \n </tr>\n \n <tr>\n \n <td>1482732000</td>\n \n <td>1</td>\n \n <td>HOU</td>\n \n <td>3</td>\n \n <td>709</td>\n \n <td>3</td>\n \n <td>8</td>\n \n <td>PHO</td>\n \n <td>2</td>\n \n <td>300</td>\n \n <td>1</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1483250400</td>\n \n <td>0</td>\n \n <td>MIA</td>\n \n <td>1</td>\n \n <td>294</td>\n \n <td>1</td>\n \n <td>3</td>\n \n <td>DET</td>\n \n <td>1</td>\n \n <td>428</td>\n \n <td>1</td>\n \n <td>2</td>\n \n </tr>\n \n <tr>\n \n <td>1485928800</td>\n \n <td>0</td>\n \n <td>OKC</td>\n \n <td>1</td>\n \n <td>571</td>\n \n <td>3</td>\n \n <td>5</td>\n \n <td>CHI</td>\n \n <td>3</td>\n \n <td>489</td>\n \n <td>3</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1491973200</td>\n \n <td>0</td>\n \n <td>OKC</td>\n \n <td>1</td>\n \n <td>580</td>\n \n <td>4</td>\n \n <td>6</td>\n \n <td>DEN</td>\n \n <td>0</td>\n \n <td>481</td>\n \n <td>3</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1481868000</td>\n \n <td>0</td>\n \n <td>CHI</td>\n \n <td>1</td>\n \n <td>520</td>\n \n <td>2</td>\n \n <td>4</td>\n \n <td>MIL</td>\n \n <td>1</td>\n \n <td>500</td>\n \n <td>2</td>\n \n <td>6</td>\n \n </tr>\n \n <tr>\n \n <td>1478322000</td>\n \n <td>0</td>\n \n <td>SA</td>\n \n <td>0</td>\n \n <td>833</td>\n \n <td>4</td>\n \n <td>5</td>\n \n <td>LAC</td>\n \n <td>1</td>\n \n <td>800</td>\n \n <td>4</td>\n \n <td>4</td>\n \n </tr>\n \n <tr>\n \n <td>1490331600</td>\n \n <td>1</td>\n \n <td>BOS</td>\n \n <td>2</td>\n \n 
<td>638</td>\n \n <td>4</td>\n \n <td>6</td>\n \n <td>PHO</td>\n \n <td>1</td>\n \n <td>305</td>\n \n <td>0</td>\n \n <td>2</td>\n \n </tr>\n \n <tr>\n \n <td>1478844000</td>\n \n <td>1</td>\n \n <td>BOS</td>\n \n <td>2</td>\n \n <td>428</td>\n \n <td>2</td>\n \n <td>3</td>\n \n <td>NY</td>\n \n <td>2</td>\n \n <td>428</td>\n \n <td>2</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1482472800</td>\n \n <td>1</td>\n \n <td>NO</td>\n \n <td>2</td>\n \n <td>322</td>\n \n <td>2</td>\n \n <td>3</td>\n \n <td>MIA</td>\n \n <td>1</td>\n \n <td>333</td>\n \n <td>2</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1483423200</td>\n \n <td>1</td>\n \n <td>PHO</td>\n \n <td>0</td>\n \n <td>285</td>\n \n <td>1</td>\n \n <td>2</td>\n \n <td>MIA</td>\n \n <td>2</td>\n \n <td>285</td>\n \n <td>0</td>\n \n <td>2</td>\n \n </tr>\n \n <tr>\n \n <td>1481522400</td>\n \n <td>1</td>\n \n <td>DAL</td>\n \n <td>2</td>\n \n <td>217</td>\n \n <td>2</td>\n \n <td>3</td>\n \n <td>DEN</td>\n \n <td>2</td>\n \n <td>375</td>\n \n <td>2</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1481090400</td>\n \n <td>0</td>\n \n <td>LAC</td>\n \n <td>3</td>\n \n <td>727</td>\n \n <td>2</td>\n \n <td>6</td>\n \n <td>GS</td>\n \n <td>2</td>\n \n <td>857</td>\n \n <td>4</td>\n \n <td>9</td>\n \n </tr>\n \n <tr>\n \n <td>1491022800</td>\n \n <td>1</td>\n \n <td>CHI</td>\n \n <td>1</td>\n \n <td>480</td>\n \n <td>3</td>\n \n <td>5</td>\n \n <td>ATL</td>\n \n <td>2</td>\n \n <td>520</td>\n \n <td>2</td>\n \n <td>3</td>\n \n </tr>\n \n <tr>\n \n <td>1485583200</td>\n \n <td>0</td>\n \n <td>UTA</td>\n \n <td>1</td>\n \n <td>625</td>\n \n <td>3</td>\n \n <td>7</td>\n \n <td>MEM</td>\n \n <td>0</td>\n \n <td>562</td>\n \n <td>2</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1479362400</td>\n \n <td>1</td>\n \n <td>HOU</td>\n \n <td>1</td>\n \n <td>545</td>\n \n <td>3</td>\n \n <td>6</td>\n \n <td>POR</td>\n \n <td>1</td>\n \n <td>583</td>\n \n <td>3</td>\n \n <td>6</td>\n \n </tr>\n \n <tr>\n \n 
<td>1480053600</td>\n \n <td>0</td>\n \n <td>ORL</td>\n \n <td>2</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>WAS</td>\n \n <td>4</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n </tr>\n \n <tr>\n \n <td>1484460000</td>\n \n <td>0</td>\n \n <td>SAC</td>\n \n <td>1</td>\n \n <td>410</td>\n \n <td>1</td>\n \n <td>4</td>\n \n <td>OKC</td>\n \n <td>2</td>\n \n <td>585</td>\n \n <td>3</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1485928800</td>\n \n <td>0</td>\n \n <td>PHO</td>\n \n <td>4</td>\n \n <td>312</td>\n \n <td>1</td>\n \n <td>3</td>\n \n <td>LAC</td>\n \n <td>4</td>\n \n <td>625</td>\n \n <td>1</td>\n \n <td>6</td>\n \n </tr>\n \n <tr>\n \n <td>1479362400</td>\n \n <td>1</td>\n \n <td>MIA</td>\n \n <td>2</td>\n \n <td>200</td>\n \n <td>0</td>\n \n <td>2</td>\n \n <td>MIL</td>\n \n <td>1</td>\n \n <td>500</td>\n \n <td>2</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1488607200</td>\n \n <td>1</td>\n \n <td>SA</td>\n \n <td>0</td>\n \n <td>783</td>\n \n <td>5</td>\n \n <td>8</td>\n \n <td>MIN</td>\n \n <td>3</td>\n \n <td>409</td>\n \n <td>4</td>\n \n <td>6</td>\n \n </tr>\n \n <tr>\n \n <td>1487829600</td>\n \n <td>1</td>\n \n <td>GS</td>\n \n <td>8</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>LAC</td>\n \n <td>8</td>\n \n <td>0</td>\n \n <td>0</td>\n \n <td>0</td>\n \n </tr>\n \n <tr>\n \n <td>1489298400</td>\n \n <td>1</td>\n \n <td>HOU</td>\n \n <td>2</td>\n \n <td>681</td>\n \n <td>3</td>\n \n <td>6</td>\n \n <td>CLE</td>\n \n <td>3</td>\n \n <td>671</td>\n \n <td>2</td>\n \n <td>5</td>\n \n </tr>\n \n <tr>\n \n <td>1489899600</td>\n \n <td>0</td>\n \n <td>BKN</td>\n \n <td>2</td>\n \n <td>191</td>\n \n <td>2</td>\n \n <td>4</td>\n \n <td>DAL</td>\n \n <td>2</td>\n \n <td>426</td>\n \n <td>2</td>\n \n <td>6</td>\n \n </tr>\n \n </tbody>\n </table>\n </div>\n </div>\n </div>\n </div>\n </div>\n \n <script class=\"pd_save\">\n $(function() {\n var tableWrapper = $('.df-table-wrapper-98230c54');\n var 
fixedHeader = $('.fixed-header', tableWrapper);\n var tableContainer = $('.df-table-container', tableWrapper);\n var table = $('.df-table', tableContainer);\n var rows = $('tbody > tr', table);\n var total = 100;\n \n fixedHeader\n .css('width', table.width())\n .find('.fixed-cell')\n .each(function(i, e) {\n $(this).css('width', $('.df-table-wrapper-98230c54 th:nth-child(' + (i+1) + ')').css('width'));\n });\n \n tableContainer.scroll(function() {\n fixedHeader.css({ left: table.position().left });\n });\n \n rows.on(\"click\", function(e){\n var txt = e.delegateTarget.innerText;\n var splits = txt.split(\"\\t\");\n var len = splits.length;\n var hdrs = $(fixedHeader).find(\".fixed-cell\");\n // Add all cells in the selected row as a map to be consumed by the target as needed\n var payload = {type:\"select\", targetDivId: \"\" };\n for (var i = 0; i < len; i++) {\n payload[hdrs[i].innerHTML] = splits[i];\n }\n \n //simple selection highlighting, client adds \"selected\" class\n $(this).addClass(\"selected\").siblings().removeClass(\"selected\");\n $(document).trigger('pd_event', payload);\n });\n \n $('.df-table-search', tableWrapper).keyup(function() {\n var val = '^(?=.*\\\\b' + $.trim($(this).val()).split(/\\s+/).join('\\\\b)(?=.*\\\\b') + ').*$';\n var reg = RegExp(val, 'i');\n var index = 0;\n \n rows.each(function(i, e) {\n if (!reg.test($(this).text().replace(/\\s+/g, ' '))) {\n $(this).attr('class', 'hidden');\n }\n else {\n $(this).attr('class', (++index % 2 == 0 ? 'even' : 'odd'));\n }\n });\n $('.df-table-search-count', tableWrapper).html('Showing ' + index + ' of ' + total + ' rows');\n });\n });\n \n $(\".df-table-wrapper td:contains('http://')\").each(function(){var tc = this.textContent; $(this).wrapInner(\"<a target='_blank' href='\" + tc + \"'></a>\");});\n $(\".df-table-wrapper td:contains('https://')\").each(function(){var tc = this.textContent; $(this).wrapInner(\"<a target='_blank' href='\" + tc + \"'></a>\");});\n </script>\n \n </div>", | |
"text/plain": "<IPython.core.display.HTML object>" | |
}, | |
"metadata": {} | |
} | |
], | |
"source": "display(df)" | |
}, | |
{ | |
"source": "### 1.2: Prepare data", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"source": "In this subsection you will split your data into: train and test datasets.", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 38, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": "Number of records for training: 988\nNumber of records for evaluation: 242\n" | |
} | |
], | |
"source": "(train_data, test_data) = df.randomSplit([0.8, 0.2], 24)\nprint(\"Number of records for training: \" + str(train_data.count()))\nprint(\"Number of records for evaluation: \" + str(test_data.count()))" | |
}, | |
{ | |
"source": "As you can see our data has been successfully split into two datasets:\n - The train data set, which is the largest group, is used for training.\n - The test data set will be used for model evaluation.", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"source": "### 2.3: Create pipeline and train a model", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"source": "In this section you will create an Apache\u00ae Spark machine learning pipeline and then train the model.", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"source": "In the first step you need to import the Apache\u00ae Spark machine learning packages that will be needed in the subsequent steps.", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 39, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "from pyspark.ml import Pipeline, Model\nfrom pyspark.ml.feature import VectorAssembler\nfrom pyspark.ml.classification import LogisticRegression, RandomForestClassifier\nfrom pyspark.ml.evaluation import BinaryClassificationEvaluator, MulticlassClassificationEvaluator\n" | |
}, | |
{ | |
"source": "In the following step, create a feature vector by combining all features together.", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 40, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "vectorAssembler_features = VectorAssembler(\n inputCols=[\n 'homeTeamHomeWinPercent',\n 'homeTeamLastFive',\n 'awayTeamAwayWinPercent',\n 'awayTeamLastFive'\n ], outputCol='features'\n)" | |
}, | |
{ | |
"source": "Next, define estimators you want to use for classification. Logistic Regression is used in the following example.", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 41, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "lr = LogisticRegression(labelCol=labelCol, featuresCol='features', maxIter=10, regParam=0.01, family=\"binomial\")" | |
}, | |
{ | |
"source": "Let's build the pipeline now. A pipeline consists of transformers and an estimator.", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 42, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "pipeline_lr = Pipeline(stages=[vectorAssembler_features,lr])" | |
}, | |
{ | |
"source": "Now, you can train your Logistic Regression model by using the previously defined pipeline and train data.", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 43, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "model = pipeline_lr.fit(train_data)" | |
}, | |
{ | |
"source": "You can check your model accuracy now. To evaluate the model, use test data.", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 44, | |
"cell_type": "code", | |
"metadata": { | |
"pixiedust": { | |
"displayParams": { | |
"tableFields": "awayTeam,awayTeamLastFive,homeTeam,homeTeamLastFive,homeTeamWin,prediction", | |
"handlerId": "tableView" | |
} | |
} | |
}, | |
"outputs": [ | |
{ | |
"output_type": "display_data", | |
"data": { | |
"text/html": "<style type=\"text/css\">.pd_warning{display:none;}</style><div class=\"pd_warning\"><em>Hey, there's something awesome here! To see it, open this notebook outside GitHub, in a viewer like Jupyter</em></div>\n <div class=\"pd_save is-viewer-good\" style=\"padding-right:10px;text-align: center;line-height:initial !important;font-size: xx-large;font-weight: 500;color: coral;\">\n \n </div>\n <div id=\"chartFigure2d655c46\" class=\"pd_save is-viewer-good\" style=\"overflow-x:auto\">\n <style type=\"text/css\" class=\"pd_save\">\n .df-table-wrapper .panel-heading {\n border-radius: 0;\n padding: 0px;\n }\n .df-table-wrapper .panel-heading:hover {\n border-color: #008571;\n }\n .df-table-wrapper .panel-title a {\n background-color: #f9f9fb;\n color: #333333;\n display: block;\n outline: none;\n padding: 10px 15px;\n text-decoration: none;\n }\n .df-table-wrapper .panel-title a:hover {\n background-color: #337ab7;\n border-color: #2e6da4;\n color: #ffffff;\n display: block;\n padding: 10px 15px;\n text-decoration: none;\n }\n .df-table-wrapper {\n font-size: small;\n font-weight: 300;\n letter-spacing: 0.5px;\n line-height: normal;\n height: inherit;\n overflow: auto;\n }\n .df-table-search {\n margin: 0 0 20px 0;\n }\n .df-table-search-count {\n display: inline-block;\n margin: 0 0 20px 0;\n }\n .df-table-container {\n max-height: 50vh;\n max-width: 100%;\n overflow-x: auto;\n position: relative;\n }\n .df-table-wrapper table {\n border: 0 none #ffffff;\n border-collapse: collapse;\n margin: 0;\n min-width: 100%;\n padding: 0;\n table-layout: fixed;\n height: inherit;\n overflow: auto;\n }\n .df-table-wrapper tr.hidden {\n display: none;\n }\n .df-table-wrapper tr:nth-child(even) {\n background-color: #f9f9fb;\n }\n .df-table-wrapper tr.even {\n background-color: #f9f9fb;\n }\n .df-table-wrapper tr.odd {\n background-color: #ffffff;\n }\n .df-table-wrapper td + td {\n border-left: 1px solid #e0e0e0;\n }\n \n .df-table-wrapper thead,\n .fixed-header {\n 
font-weight: 600;\n }\n .df-table-wrapper tr,\n .fixed-row {\n border: 0 none #ffffff;\n margin: 0;\n padding: 0;\n }\n .df-table-wrapper th,\n .df-table-wrapper td,\n .fixed-cell {\n border: 0 none #ffffff;\n margin: 0;\n min-width: 50px;\n padding: 5px 20px 5px 10px;\n text-align: left;\n word-wrap: break-word;\n }\n .df-table-wrapper th {\n padding-bottom: 0;\n padding-top: 0;\n }\n .df-table-wrapper th div {\n max-height: 1px;\n visibility: hidden;\n }\n \n .df-schema-field {\n margin-left: 10px;\n }\n \n .fixed-header-container {\n overflow: hidden;\n position: relative;\n }\n .fixed-header {\n border-bottom: 2px solid #000;\n display: table;\n position: relative;\n }\n .fixed-row {\n display: table-row;\n }\n .fixed-cell {\n display: table-cell;\n }\n </style>\n \n \n <div class=\"df-table-wrapper df-table-wrapper-2d655c46 panel-group pd_save\">\n <!-- dataframe schema -->\n \n <div class=\"panel panel-default\">\n <div class=\"panel-heading\">\n <h4 class=\"panel-title\" style=\"margin: 0px;\">\n <a data-toggle=\"collapse\" href=\"#df-schema-2d655c46\" data-parent=\"#df-table-wrapper-2d655c46\">Schema</a>\n </h4>\n </div>\n <div id=\"df-schema-2d655c46\" class=\"panel-collapse collapse\">\n <div class=\"panel-body\" style=\"font-family: monospace;\">\n <div class=\"df-schema-fields\">\n <div>Field types:</div>\n \n <div class=\"df-schema-field\"><strong>homeTeamWin: </strong> float64</div>\n \n <div class=\"df-schema-field\"><strong>homeTeam: </strong> object</div>\n \n <div class=\"df-schema-field\"><strong>homeTeamLastFive: </strong> int64</div>\n \n <div class=\"df-schema-field\"><strong>awayTeam: </strong> object</div>\n \n <div class=\"df-schema-field\"><strong>awayTeamLastFive: </strong> int64</div>\n \n <div class=\"df-schema-field\"><strong>prediction: </strong> float64</div>\n \n </div>\n </div>\n </div>\n </div>\n \n <!-- dataframe table -->\n <div class=\"panel panel-default\">\n \n <div class=\"panel-heading\">\n <h4 class=\"panel-title\" 
style=\"margin: 0px;\">\n <a data-toggle=\"collapse\" href=\"#df-table-2d655c46\" data-parent=\"#df-table-wrapper-2d655c46\"> Table</a>\n </h4>\n </div>\n \n <div id=\"df-table-2d655c46\" class=\"panel-collapse collapse in\">\n <div class=\"panel-body\">\n \n <input type=\"text\" class=\"df-table-search form-control input-sm\" placeholder=\"Search table\">\n \n <div>\n \n <span class=\"df-table-search-count\">Showing 100 of 988 rows</span>\n \n </div>\n <!-- fixed header for when dataframe table scrolls -->\n <div class=\"fixed-header-container\">\n <div class=\"fixed-header\" style=\"width: 873px;\">\n <div class=\"fixed-row\">\n \n <div class=\"fixed-cell\" style=\"width: 149px;\">homeTeamWin</div>\n \n <div class=\"fixed-cell\" style=\"width: 121px;\">homeTeam</div>\n \n <div class=\"fixed-cell\" style=\"width: 186px;\">homeTeamLastFive</div>\n \n <div class=\"fixed-cell\" style=\"width: 118px;\">awayTeam</div>\n \n <div class=\"fixed-cell\" style=\"width: 183px;\">awayTeamLastFive</div>\n \n <div class=\"fixed-cell\" style=\"width: 117px;\">prediction</div>\n \n </div>\n </div>\n </div>\n <div class=\"df-table-container\">\n <table class=\"df-table\">\n <thead>\n <tr>\n \n <th><div>homeTeamWin</div></th>\n \n <th><div>homeTeam</div></th>\n \n <th><div>homeTeamLastFive</div></th>\n \n <th><div>awayTeam</div></th>\n \n <th><div>awayTeamLastFive</div></th>\n \n <th><div>prediction</div></th>\n \n </tr>\n </thead>\n <tbody>\n \n <tr>\n \n <td>1.0</td>\n \n <td>MEM</td>\n \n <td>3</td>\n \n <td>PHI</td>\n \n <td>0</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>MEM</td>\n \n <td>1</td>\n \n <td>DAL</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>LAC</td>\n \n <td>5</td>\n \n <td>SAC</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>PHO</td>\n \n <td>1</td>\n \n <td>TOR</td>\n \n <td>4</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>BOS</td>\n \n 
<td>2</td>\n \n <td>CHI</td>\n \n <td>3</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>PHI</td>\n \n <td>0</td>\n \n <td>IND</td>\n \n <td>3</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>DEN</td>\n \n <td>3</td>\n \n <td>MIN</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>MIL</td>\n \n <td>2</td>\n \n <td>CHA</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>NO</td>\n \n <td>3</td>\n \n <td>CHI</td>\n \n <td>4</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>MIA</td>\n \n <td>2</td>\n \n <td>NY</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>MIN</td>\n \n <td>1</td>\n \n <td>DET</td>\n \n <td>3</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>LAL</td>\n \n <td>2</td>\n \n <td>SAC</td>\n \n <td>4</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>CHI</td>\n \n <td>3</td>\n \n <td>ATL</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>SAC</td>\n \n <td>2</td>\n \n <td>CLE</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>DAL</td>\n \n <td>2</td>\n \n <td>PHO</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>BKN</td>\n \n <td>1</td>\n \n <td>WAS</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>TOR</td>\n \n <td>5</td>\n \n <td>CLE</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>NY</td>\n \n <td>1</td>\n \n <td>CHI</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>NY</td>\n \n <td>0</td>\n \n <td>BOS</td>\n \n <td>0</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>MIL</td>\n \n <td>3</td>\n \n <td>WAS</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>UTA</td>\n \n 
<td>2</td>\n \n <td>MIL</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>DEN</td>\n \n <td>2</td>\n \n <td>DAL</td>\n \n <td>4</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>PHO</td>\n \n <td>2</td>\n \n <td>SAC</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>PHO</td>\n \n <td>1</td>\n \n <td>DAL</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>CLE</td>\n \n <td>4</td>\n \n <td>ATL</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>MIA</td>\n \n <td>3</td>\n \n <td>NY</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>BOS</td>\n \n <td>5</td>\n \n <td>LAL</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>CLE</td>\n \n <td>2</td>\n \n <td>UTA</td>\n \n <td>4</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>ATL</td>\n \n <td>3</td>\n \n <td>CHA</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>POR</td>\n \n <td>1</td>\n \n <td>OKC</td>\n \n <td>4</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>BKN</td>\n \n <td>1</td>\n \n <td>CLE</td>\n \n <td>3</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>HOU</td>\n \n <td>2</td>\n \n <td>ATL</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>TOR</td>\n \n <td>2</td>\n \n <td>IND</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>CLE</td>\n \n <td>4</td>\n \n <td>CHI</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>UTA</td>\n \n <td>2</td>\n \n <td>DAL</td>\n \n <td>0</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>GS</td>\n \n <td>4</td>\n \n <td>CHA</td>\n \n <td>0</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>POR</td>\n \n 
<td>4</td>\n \n <td>MIN</td>\n \n <td>0</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>BKN</td>\n \n <td>2</td>\n \n <td>ORL</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>LAL</td>\n \n <td>1</td>\n \n <td>DEN</td>\n \n <td>4</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>MEM</td>\n \n <td>3</td>\n \n <td>SA</td>\n \n <td>3</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>POR</td>\n \n <td>3</td>\n \n <td>GS</td>\n \n <td>4</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>PHO</td>\n \n <td>2</td>\n \n <td>DEN</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>DAL</td>\n \n <td>3</td>\n \n <td>ORL</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>POR</td>\n \n <td>2</td>\n \n <td>ATL</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>BKN</td>\n \n <td>1</td>\n \n <td>LAL</td>\n \n <td>0</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>DEN</td>\n \n <td>4</td>\n \n <td>LAC</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>LAC</td>\n \n <td>2</td>\n \n <td>NY</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>NO</td>\n \n <td>0</td>\n \n <td>DEN</td>\n \n <td>0</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>ATL</td>\n \n <td>3</td>\n \n <td>UTA</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>MIA</td>\n \n <td>1</td>\n \n <td>HOU</td>\n \n <td>3</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>BOS</td>\n \n <td>4</td>\n \n <td>MIL</td>\n \n <td>4</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>ORL</td>\n \n <td>2</td>\n \n <td>MIA</td>\n \n <td>4</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>TOR</td>\n \n 
<td>4</td>\n \n <td>PHI</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>NY</td>\n \n <td>3</td>\n \n <td>OKC</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>ATL</td>\n \n <td>4</td>\n \n <td>LAC</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>BOS</td>\n \n <td>4</td>\n \n <td>PHI</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>SA</td>\n \n <td>4</td>\n \n <td>GS</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>NO</td>\n \n <td>4</td>\n \n <td>MEM</td>\n \n <td>4</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>IND</td>\n \n <td>2</td>\n \n <td>BOS</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>ATL</td>\n \n <td>3</td>\n \n <td>CHI</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>CLE</td>\n \n <td>2</td>\n \n <td>SAC</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>MEM</td>\n \n <td>1</td>\n \n <td>WAS</td>\n \n <td>0</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>ATL</td>\n \n <td>3</td>\n \n <td>PHI</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>NY</td>\n \n <td>3</td>\n \n <td>MIN</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>MIN</td>\n \n <td>2</td>\n \n <td>MIL</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>NO</td>\n \n <td>0</td>\n \n <td>GS</td>\n \n <td>0</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>BOS</td>\n \n <td>3</td>\n \n <td>GS</td>\n \n <td>5</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>MIL</td>\n \n <td>2</td>\n \n <td>CLE</td>\n \n <td>4</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>MIN</td>\n \n 
<td>4</td>\n \n <td>ORL</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>WAS</td>\n \n <td>3</td>\n \n <td>TOR</td>\n \n <td>4</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>MEM</td>\n \n <td>2</td>\n \n <td>HOU</td>\n \n <td>4</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>MEM</td>\n \n <td>0</td>\n \n <td>ORL</td>\n \n <td>0</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>MEM</td>\n \n <td>4</td>\n \n <td>CLE</td>\n \n <td>5</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>NY</td>\n \n <td>0</td>\n \n <td>MIL</td>\n \n <td>3</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>MIL</td>\n \n <td>3</td>\n \n <td>MIA</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>PHO</td>\n \n <td>2</td>\n \n <td>DAL</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>OKC</td>\n \n <td>2</td>\n \n <td>CLE</td>\n \n <td>4</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>BKN</td>\n \n <td>1</td>\n \n <td>BOS</td>\n \n <td>3</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>NY</td>\n \n <td>1</td>\n \n <td>NO</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>TOR</td>\n \n <td>1</td>\n \n <td>ORL</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>SA</td>\n \n <td>4</td>\n \n <td>IND</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>BKN</td>\n \n <td>0</td>\n \n <td>MEM</td>\n \n <td>3</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>MIA</td>\n \n <td>2</td>\n \n <td>DEN</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>LAL</td>\n \n <td>0</td>\n \n <td>MIN</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>CLE</td>\n \n 
<td>4</td>\n \n <td>LAL</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>DAL</td>\n \n <td>3</td>\n \n <td>HOU</td>\n \n <td>3</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>ATL</td>\n \n <td>5</td>\n \n <td>BOS</td>\n \n <td>4</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>DEN</td>\n \n <td>2</td>\n \n <td>POR</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>MIL</td>\n \n <td>4</td>\n \n <td>DAL</td>\n \n <td>1</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>POR</td>\n \n <td>4</td>\n \n <td>CHI</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>UTA</td>\n \n <td>4</td>\n \n <td>OKC</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>MEM</td>\n \n <td>2</td>\n \n <td>BOS</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>CHA</td>\n \n <td>1</td>\n \n <td>PHI</td>\n \n <td>2</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>TOR</td>\n \n <td>2</td>\n \n <td>BOS</td>\n \n <td>4</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>PHO</td>\n \n <td>1</td>\n \n <td>HOU</td>\n \n <td>4</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>MIN</td>\n \n <td>2</td>\n \n <td>MIA</td>\n \n <td>5</td>\n \n <td>0.0</td>\n \n </tr>\n \n <tr>\n \n <td>0.0</td>\n \n <td>DAL</td>\n \n <td>4</td>\n \n <td>PHO</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>HOU</td>\n \n <td>3</td>\n \n <td>POR</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>WAS</td>\n \n <td>3</td>\n \n <td>CHA</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n <tr>\n \n <td>1.0</td>\n \n <td>POR</td>\n \n <td>3</td>\n \n <td>MIN</td>\n \n <td>3</td>\n \n <td>1.0</td>\n \n </tr>\n \n </tbody>\n </table>\n </div>\n </div>\n 
</div>\n </div>\n </div>\n \n <script class=\"pd_save\">\n $(function() {\n var tableWrapper = $('.df-table-wrapper-2d655c46');\n var fixedHeader = $('.fixed-header', tableWrapper);\n var tableContainer = $('.df-table-container', tableWrapper);\n var table = $('.df-table', tableContainer);\n var rows = $('tbody > tr', table);\n var total = 100;\n \n fixedHeader\n .css('width', table.width())\n .find('.fixed-cell')\n .each(function(i, e) {\n $(this).css('width', $('.df-table-wrapper-2d655c46 th:nth-child(' + (i+1) + ')').css('width'));\n });\n \n tableContainer.scroll(function() {\n fixedHeader.css({ left: table.position().left });\n });\n \n rows.on(\"click\", function(e){\n var txt = e.delegateTarget.innerText;\n var splits = txt.split(\"\\t\");\n var len = splits.length;\n var hdrs = $(fixedHeader).find(\".fixed-cell\");\n // Add all cells in the selected row as a map to be consumed by the target as needed\n var payload = {type:\"select\", targetDivId: \"\" };\n for (var i = 0; i < len; i++) {\n payload[hdrs[i].innerHTML] = splits[i];\n }\n \n //simple selection highlighting, client adds \"selected\" class\n $(this).addClass(\"selected\").siblings().removeClass(\"selected\");\n $(document).trigger('pd_event', payload);\n });\n \n $('.df-table-search', tableWrapper).keyup(function() {\n var val = '^(?=.*\\\\b' + $.trim($(this).val()).split(/\\s+/).join('\\\\b)(?=.*\\\\b') + ').*$';\n var reg = RegExp(val, 'i');\n var index = 0;\n \n rows.each(function(i, e) {\n if (!reg.test($(this).text().replace(/\\s+/g, ' '))) {\n $(this).attr('class', 'hidden');\n }\n else {\n $(this).attr('class', (++index % 2 == 0 ? 
'even' : 'odd'));\n }\n });\n $('.df-table-search-count', tableWrapper).html('Showing ' + index + ' of ' + total + ' rows');\n });\n });\n \n $(\".df-table-wrapper td:contains('http://')\").each(function(){var tc = this.textContent; $(this).wrapInner(\"<a target='_blank' href='\" + tc + \"'></a>\");});\n $(\".df-table-wrapper td:contains('https://')\").each(function(){var tc = this.textContent; $(this).wrapInner(\"<a target='_blank' href='\" + tc + \"'></a>\");});\n </script>\n \n </div>", | |
"text/plain": "<IPython.core.display.HTML object>" | |
}, | |
"metadata": {} | |
} | |
], | |
"source": "rawpredictionsdf = model.stages[1].summary.predictions\ndisplay(rawpredictionsdf)" | |
}, | |
{ | |
"execution_count": 45, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": "Train Area Under ROC = 0.627433\n" | |
} | |
], | |
"source": "train_area_under_roc = model.stages[1].summary.areaUnderROC\nprint(\"Train Area Under ROC = %g\" % train_area_under_roc)" | |
}, | |
{ | |
"execution_count": 46, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": "Test Area Under ROC = 0.590336\n" | |
} | |
], | |
"source": "predictions = model.transform(test_data)\nevaluator = BinaryClassificationEvaluator(labelCol=labelCol,rawPredictionCol=\"prediction\", metricName=\"areaUnderROC\")\ntest_area_under_roc = evaluator.evaluate(predictions)\nprint(\"Test Area Under ROC = %g\" % test_area_under_roc)" | |
}, | |
{ | |
"source": "<a id=\"load\"></a>\n## 3. Deploy model to Watson Machine Learning", | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": true | |
} | |
}, | |
{ | |
"source": "In this section you will learn how to store the model in Watson Machine Learning repository using a client library.\n\nFirst, you must import client libraries.\n\n**Note**: Apache\u00ae Spark 2.0 is required.", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 47, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "from repository_v3.mlrepositoryclient import MLRepositoryClient\nfrom repository_v3.mlrepositoryartifact import MLRepositoryArtifact\nfrom repository_v3.mlrepository import MetaProps, MetaNames\nimport json\nimport requests\nimport urllib3" | |
}, | |
{ | |
"source": "Authenticate to Watson Machine Learning service on Bluemix.", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"source": "**Action**: Put authentication information from your instance of Watson Machine Learning service here.</div>", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 48, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "wml_credentials={\n \"url\": \"https://ibm-watson-ml.mybluemix.net\",\n \"access_key\": \"xxx\",\n \"username\": \"xxx\",\n \"password\": \"xxx\",\n \"instance_id\": \"xxx\"\n}" | |
}, | |
{ | |
"execution_count": 49, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "# The code was removed by DSX for sharing." | |
}, | |
{ | |
"source": "**Tip**: `url`, `instance_id`, `username` and `password` can be found on **Service Credentials** tab of service instance created in Bluemix. If you cannot see **instance_id** field in **Serice Credentials** generate new credentials by pressing **New credential (+)** button.", | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": true | |
} | |
}, | |
{ | |
"source": "Authenticate to the Watson Machine Learning service on IBM Cloud.", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 50, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "ml_repository_client = MLRepositoryClient(wml_credentials['url'])\nml_repository_client.authorize(wml_credentials['username'], wml_credentials['password'])" | |
}, | |
{ | |
"source": "Create model artifact (abstraction layer).", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 51, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "pipeline_artifact = MLRepositoryArtifact(pipeline_lr, name=\"pipeline\")" | |
}, | |
{ | |
"source": "**Tip**: The MLRepositoryArtifact method expects a trained model object, training data, and a model name. (It is this model name that is displayed by the Watson Machine Learning service).", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 52, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "deployment_name = 'NBA Game Predictor POSTGRESQL'\nmodel_artifact = MLRepositoryArtifact(model, pipeline_artifact=pipeline_artifact, training_data=train_data, name=deployment_name)" | |
}, | |
{ | |
"source": "## <font color=\"red\">Save the model</font>", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 53, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "saved_model = ml_repository_client.models.save(model_artifact)" | |
}, | |
{ | |
"source": "Get saved model metadata from Watson Machine Learning.", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"source": "**Tip**: Use meta.available_props() to get the list of available props.", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 54, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"execution_count": 54, | |
"metadata": {}, | |
"data": { | |
"text/plain": "dict_keys(['trainingDataReference', 'frameworkVersion', 'inputDataSchema', 'tags', 'hyperParameters', 'version', 'frameworkName', 'label_column', 'trainingDataSchema', 'framework_runtimes', 'contentStatus', 'trainingDefinitionVersionUrl', 'creationTime', 'modelVersionUrl', 'runtimes'])" | |
}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": "saved_model.meta.available_props()" | |
}, | |
{ | |
"execution_count": 55, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": "creationTime: 2018-04-19 19:20:46.221000+00:00\nmodelVersionUrl: https://ibm-watson-ml.mybluemix.net/v3/ml_assets/models/6bfc2ad3-73af-4a23-a033-9fbb5a0e7f78/versions/4eef4da5-acfa-4c52-a211-550452ec957e\nlabel: homeTeamWin\nmodelID: 6bfc2ad3-73af-4a23-a033-9fbb5a0e7f78\n" | |
} | |
], | |
"source": "print(\"creationTime: \" + str(saved_model.meta.prop(\"creationTime\")))\nprint(\"modelVersionUrl: \" + saved_model.meta.prop(\"modelVersionUrl\"))\nprint(\"label: \" + saved_model.meta.prop(\"label_column\"))\nprint(\"modelID: \" + str(saved_model.uid))" | |
}, | |
{ | |
"source": "**Tip**: `modelID` is our model unique indentifier in the Watson Machine Learning repository.", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"source": "<a id=\"deployment\"></a>\n## 4. Create a deployment", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 56, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"execution_count": 56, | |
"metadata": {}, | |
"data": { | |
"text/plain": "'BearereyJhbGciOiJSUzUxMiIsInR5cCI6IkpXVCJ9.eyJ0ZW5hbnRJZCI6ImVmY2VkNzExLTY1YjctNDI1OS04OTkxLTBiZDJmN2YyOWQwZSIsImluc3RhbmNlSWQiOiJlZmNlZDcxMS02NWI3LTQyNTktODk5MS0wYmQyZjdmMjlkMGUiLCJwbGFuSWQiOiIzZjZhY2Y0My1lZGU4LTQxM2EtYWM2OS1mOGFmM2JiMGNiZmUiLCJyZWdpb24iOiJ1cy1zb3V0aCIsInVzZXJJZCI6IjA2MDBkZDk3LTlhYmMtNGY3MC1iMjQ5LTNhOGM5ODA2MTA1YyIsImlzcyI6Imh0dHBzOi8vaWJtLXdhdHNvbi1tbC5teWJsdWVtaXgubmV0L3YzL2lkZW50aXR5IiwiaWF0IjoxNTI0MTY1NjQ5LCJleHAiOjE1MjQxOTQ0NDl9.i3ffO5Pq-xYaKkug6zJi2SshpHFK186k_goBp5Fo3UPSlXxQ8NflIAcXD8pkpwmwcFNO2xnE9RUsUYwEZlewl-4aOftZOSWIgWR6u7h1wVCjwoVGmAU6JZL4vdYc5UnXfRbzLTXz0Py8q-2edZ8EAhuU0XbhFWhOoQS7ZoHTdRiVcK1-R8uT8YDr-q2RePDJDIzVTeAne0PQK6zV8MrkaQm3jh6grmsD4vDYFr510OSDJJoib03UOSpuzGP-njLW3T1wP819cbIqsYkhf0yAlbiWGb7PYJDY4BjOby_roOUDpaeTpkXa9mDP6-Ng31ctidrYYrBavSAvKxX_HkZVcA'" | |
}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": "headers = urllib3.util.make_headers(basic_auth='{}:{}'.format(wml_credentials['username'], wml_credentials['password']))\nurl = '{}/v3/identity/token'.format(wml_credentials['url'])\nresponse = requests.get(url, headers=headers)\nml_token = 'Bearer' + json.loads(response.text).get('token')\nml_token" | |
}, | |
{ | |
"execution_count": 57, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"execution_count": 57, | |
"metadata": {}, | |
"data": { | |
"text/plain": "'https://ibm-watson-ml.mybluemix.net/v3/wml_instances/efced711-65b7-4259-8991-0bd2f7f29d0e/published_models/6bfc2ad3-73af-4a23-a033-9fbb5a0e7f78/deployments/b4b09d68-8f3e-4524-9857-208b0f8ac5b2/online'" | |
}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": "deployment_url = wml_credentials['url'] + \"/v3/wml_instances/\" + wml_credentials['instance_id'] + \"/published_models/\" + saved_model.uid + \"/deployments/\"\ndeployment_header = {'Content-Type': 'application/json', 'Authorization': ml_token}\ndeployment_payload = {\"type\": \"online\", \"name\": deployment_name}\ndeployment_response = requests.post(deployment_url, json=deployment_payload, headers=deployment_header)\ndeployment_response.text\nscoring_url = json.loads(deployment_response.text).get('entity').get('scoring_url')\nscoring_url" | |
}, | |
{ | |
"source": "<a id=\"test\"></a>\n## 5. Test the deployment\n\nFor April 19, 2018 matchup of Philadelphia Sixers vs. Miami Heat in Miami\n\nPHL\n\n- season win %: 634\n- wins last 5: 4\n\nMIA\n\n- season win %: 537\n- wins last 5: 2\n", | |
"cell_type": "markdown", | |
"metadata": {} | |
}, | |
{ | |
"execution_count": 58, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": "{\n \"fields\": [\"homeTeamHomeWinPercent\", \"homeTeamLastFive\", \"awayTeamAwayWinPercent\", \"awayTeamLastFive\", \"features\", \"rawPrediction\", \"probability\", \"prediction\"],\n \"values\": [[537, 2, 634, 4, [537.0, 2.0, 634.0, 4.0], [0.19970850501247311, -0.19970850501247311], [0.5497618464246237, 0.4502381535753764], 0.0]]\n}\n" | |
} | |
], | |
"source": "def get_prediction_from_watson_ml(homeTeamHomeWinPercent, homeTeamLastFive, awayTeamAwayWinPercent, awayTeamLastFive):\n scoring_header = {'Content-Type': 'application/json', 'Authorization': ml_token}\n scoring_payload = {'fields': ['homeTeamHomeWinPercent','homeTeamLastFive','awayTeamAwayWinPercent','awayTeamLastFive'], 'values': [[homeTeamHomeWinPercent, homeTeamLastFive, awayTeamAwayWinPercent, awayTeamLastFive]]}\n scoring_response = requests.post(scoring_url, json=scoring_payload, headers=scoring_header)\n return scoring_response.text\n\nresponse = get_prediction_from_watson_ml(537, 2, 634, 4)\nprint(response)\n" | |
}, | |
{ | |
"execution_count": null, | |
"cell_type": "code", | |
"metadata": {}, | |
"outputs": [], | |
"source": "" | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3.5 with Spark 2.1", | |
"name": "python3-spark21", | |
"language": "python" | |
}, | |
"language_info": { | |
"mimetype": "text/x-python", | |
"nbconvert_exporter": "python", | |
"version": "3.5.4", | |
"name": "python", | |
"file_extension": ".py", | |
"pygments_lexer": "ipython3", | |
"codemirror_mode": { | |
"version": 3, | |
"name": "ipython" | |
} | |
} | |
}, | |
"nbformat": 4 | |
} |