Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save luisquintanilla/5c387542240dea57dde3c51521659305 to your computer and use it in GitHub Desktop.
Save luisquintanilla/5c387542240dea57dde3c51521659305 to your computer and use it in GitHub Desktop.
Time Series Forecast Model using ML .NET & Notebooks
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Weather forecast model\r\n",
"\r\n",
"Data is from NOAA. \r\n",
"\r\n",
"The stations were found using https://www.ncdc.noaa.gov/cdo-web/datatools/findstation\r\n",
"\r\n",
"The dataset used was the Daily Summaries Dataset -> Air Temperature\r\n",
"\r\n",
"## Getting the data\r\n",
"\r\n",
"1. Find a station https://www.ncdc.noaa.gov/cdo-web/datatools/findstation\r\n",
"\r\n",
"![image](https://user-images.githubusercontent.com/46974588/116326383-50f17f80-a792-11eb-9c66-3dabef398889.png)\r\n",
"\r\n",
"1. View cart and export **Custom GHCN-Daily CSV** format\r\n",
"\r\n",
"![image](https://user-images.githubusercontent.com/46974588/116326449-78484c80-a792-11eb-8061-9c87bb6fc856.png)\r\n",
"\r\n",
"1. Submit request for data with the following options:\r\n",
" - [x] Station Name\r\n",
" - Units: Standard\r\n",
" - [x] Air Temperature\r\n",
" - [ ] Average Temperature\r\n",
" - [x] Maximum Temperature (TMAX)\r\n",
" - [x] Minimum Temperature (TMIN)\r\n",
"\r\n",
"![image](https://user-images.githubusercontent.com/46974588/116326560-bd6c7e80-a792-11eb-8050-18f7f85193a5.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Install NuGet packages"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
}
},
"outputs": [
{
"data": {
"text/html": "\r\n<div>\r\n <div id='dotnet-interactive-this-cell-40348.Microsoft.DotNet.Interactive.Http.HttpPort' style='display: none'>\r\n The below script needs to be able to find the current output cell; this is an easy method to get it.\r\n </div>\r\n <script type='text/javascript'>\r\nasync function probeAddresses(probingAddresses) {\r\n function timeout(ms, promise) {\r\n return new Promise(function (resolve, reject) {\r\n setTimeout(function () {\r\n reject(new Error('timeout'))\r\n }, ms)\r\n promise.then(resolve, reject)\r\n })\r\n }\r\n\r\n if (Array.isArray(probingAddresses)) {\r\n for (let i = 0; i < probingAddresses.length; i++) {\r\n\r\n let rootUrl = probingAddresses[i];\r\n\r\n if (!rootUrl.endsWith('/')) {\r\n rootUrl = `${rootUrl}/`;\r\n }\r\n\r\n try {\r\n let response = await timeout(1000, fetch(`${rootUrl}discovery`, {\r\n method: 'POST',\r\n cache: 'no-cache',\r\n mode: 'cors',\r\n timeout: 1000,\r\n headers: {\r\n 'Content-Type': 'text/plain'\r\n },\r\n body: probingAddresses[i]\r\n }));\r\n\r\n if (response.status == 200) {\r\n return rootUrl;\r\n }\r\n }\r\n catch (e) { }\r\n }\r\n }\r\n}\r\n\r\nfunction loadDotnetInteractiveApi() {\r\n probeAddresses([\"http://172.17.112.1:1000/\", \"http://172.30.64.1:1000/\", \"http://192.168.1.223:1000/\", \"http://100.64.11.55:1000/\", \"http://192.168.1.207:1000/\", \"http://127.0.0.1:1000/\"])\r\n .then((root) => {\r\n // use probing to find host url and api resources\r\n // load interactive helpers and language services\r\n let dotnetInteractiveRequire = require.config({\r\n context: '40348.Microsoft.DotNet.Interactive.Http.HttpPort',\r\n paths:\r\n {\r\n 'dotnet-interactive': `${root}resources`\r\n }\r\n }) || require;\r\n\r\n window.dotnetInteractiveRequire = dotnetInteractiveRequire;\r\n\r\n window.configureRequireFromExtension = function(extensionName, extensionCacheBuster) {\r\n let paths = {};\r\n paths[extensionName] = `${root}extensions/${extensionName}/resources/`;\r\n \r\n let internalRequire = require.config({\r\n context: extensionCacheBuster,\r\n paths: paths,\r\n urlArgs: `cacheBuster=${extensionCacheBuster}`\r\n }) || require;\r\n\r\n return internalRequire\r\n };\r\n \r\n dotnetInteractiveRequire([\r\n 'dotnet-interactive/dotnet-interactive'\r\n ],\r\n function (dotnet) {\r\n dotnet.init(window);\r\n },\r\n function (error) {\r\n console.log(error);\r\n }\r\n );\r\n })\r\n .catch(error => {console.log(error);});\r\n }\r\n\r\n// ensure `require` is available globally\r\nif ((typeof(require) !== typeof(Function)) || (typeof(require.config) !== typeof(Function))) {\r\n let require_script = document.createElement('script');\r\n require_script.setAttribute('src', 'https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js');\r\n require_script.setAttribute('type', 'text/javascript');\r\n \r\n \r\n require_script.onload = function() {\r\n loadDotnetInteractiveApi();\r\n };\r\n\r\n document.getElementsByTagName('head')[0].appendChild(require_script);\r\n}\r\nelse {\r\n loadDotnetInteractiveApi();\r\n}\r\n\r\n </script>\r\n</div>"
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": "Installed package Microsoft.ML.TimeSeries version 1.5.5"
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": "Installed package Microsoft.ML version 1.5.5"
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"#r \"nuget:Microsoft.ML,1.5.5\"\r\n",
"#r \"nuget:Microsoft.ML.TimeSeries,1.5.5\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Import NuGet packages"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"using Microsoft.ML;\r\n",
"using Microsoft.ML.Data;\r\n",
"using Microsoft.ML.Transforms.TimeSeries;\r\n",
"using System.Linq;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define model input and output schemas"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
}
},
"outputs": [],
"source": [
"public class ModelInput\r\n",
"{\r\n",
" [LoadColumn(6)]\r\n",
" public DateTime Date { get; set; }\r\n",
" \r\n",
" [LoadColumn(7)]\r\n",
" public float MaxTemp { get; set; }\r\n",
"\r\n",
" [LoadColumn(8)]\r\n",
" public float MinTemp {get;set;}\r\n",
" \r\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Added original columns Max / Min Temp columns to compare with actual"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"public class ModelOutput\r\n",
"{\r\n",
" public DateTime Date { get; set; }\r\n",
"\r\n",
" public float MaxTemp {get;set;}\r\n",
"\r\n",
" public float MinTemp {get; set;}\r\n",
"\r\n",
" public float[] ForecastTemp { get; set; }\r\n",
"\r\n",
" public float[] LowerBoundTemp { get; set; }\r\n",
"\r\n",
" public float[] UpperBoundTemp { get; set; }\r\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize MLContext"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"var mlContext = new MLContext();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load data into IDataView\r\n",
"\r\n",
"5 year data starts 4/1/2015 \r\n",
"10 year data starts 4/2/2010"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"var seattle5yr = \"Data/seattle-5yr.csv\";\r\n",
"var seattle10yr = \"Data/seattle-10yr.csv\";\r\n",
"\r\n",
"var trainingDataView5yr = mlContext.Data.LoadFromTextFile<ModelInput>(seattle5yr, hasHeader: true, separatorChar:',');\r\n",
"var trainingDataView10yr = mlContext.Data.LoadFromTextFile<ModelInput>(seattle10yr, hasHeader: true, separatorChar:',');"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Models (5 year data)\r\n",
"\r\n",
"- Minimum Temperature\r\n",
"- Maximum Temperature"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Minimum Temperature"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"IEstimator<ITransformer> minTempEstimator5yr = mlContext.Forecasting.ForecastBySsa(\r\n",
" outputColumnName: \"ForecastTemp\",\r\n",
" inputColumnName: \"MinTemp\",\r\n",
" windowSize: 7,\r\n",
" seriesLength: 2202,\r\n",
" trainSize: 2202,\r\n",
" horizon: 7,\r\n",
" confidenceLevel: 0.85f,\r\n",
" confidenceLowerBoundColumn: \"LowerBoundTemp\",\r\n",
" confidenceUpperBoundColumn: \"UpperBoundTemp\");"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
"var minTempModel5yr = minTempEstimator5yr.Fit(trainingDataView5yr);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Maximum Temperature"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"IEstimator<ITransformer> maxTempEstimator5yr = mlContext.Forecasting.ForecastBySsa(\r\n",
" outputColumnName: \"ForecastTemp\",\r\n",
" inputColumnName: \"MaxTemp\",\r\n",
" windowSize: 7,\r\n",
" seriesLength: 2202,\r\n",
" trainSize: 2202,\r\n",
" horizon: 7,\r\n",
" confidenceLevel: 0.85f,\r\n",
" confidenceLowerBoundColumn: \"LowerBoundTemp\",\r\n",
" confidenceUpperBoundColumn: \"UpperBoundTemp\");"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"var maxTempModel5yr = maxTempEstimator5yr.Fit(trainingDataView5yr);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Test Models (5 year)"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"var testInput = new ModelInput { Date= new DateTime(2020,04,01), MaxTemp = 51, MinTemp = 38};"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"TimeSeriesPredictionEngine<ModelInput, ModelOutput> minForecastEngine5yr = minTempModel5yr.CreateTimeSeriesEngine<ModelInput, ModelOutput>(mlContext);\r\n",
"var minPrediction5yr = minForecastEngine5yr.Predict(testInput,horizon:7);"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
"TimeSeriesPredictionEngine<ModelInput, ModelOutput> maxForecastEngine5yr = maxTempModel5yr.CreateTimeSeriesEngine<ModelInput, ModelOutput>(mlContext);\r\n",
"var maxPrediction5yr = maxForecastEngine5yr.Predict(testInput, horizon:7);"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
}
},
"outputs": [
{
"data": {
"text/html": "<table><thead><tr><th><i>index</i></th><th>Date</th><th>MaxTemp</th><th>MinTemp</th><th>ForecastTemp</th><th>LowerBoundTemp</th><th>UpperBoundTemp</th></tr></thead><tbody><tr><td>0</td><td><span>2020-04-01 00:00:00Z</span></td><td><div class=\"dni-plaintext\">51</div></td><td><div class=\"dni-plaintext\">38</div></td><td><div class=\"dni-plaintext\">[ 45.64229, 45.505856, 45.42607, 45.48796, 45.574562, 45.579506, 45.473053 ]</div></td><td><div class=\"dni-plaintext\">[ 39.421925, 38.72968, 37.953438, 37.29155, 36.686584, 36.047966, 35.26731 ]</div></td><td><div class=\"dni-plaintext\">[ 51.86265, 52.282032, 52.898705, 53.684372, 54.46254, 55.111046, 55.678795 ]</div></td></tr><tr><td>1</td><td><span>2020-04-01 00:00:00Z</span></td><td><div class=\"dni-plaintext\">51</div></td><td><div class=\"dni-plaintext\">38</div></td><td><div class=\"dni-plaintext\">[ 73.780716, 73.05658, 72.64575, 72.69919, 72.88619, 72.93948, 72.638275 ]</div></td><td><div class=\"dni-plaintext\">[ 65.4993, 64.08209, 62.78885, 61.9087, 61.187195, 60.394077, 59.195263 ]</div></td><td><div class=\"dni-plaintext\">[ 82.06213, 82.03107, 82.502655, 83.48968, 84.58519, 85.484886, 86.08128 ]</div></td></tr></tbody></table>"
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"new [] {minPrediction5yr, maxPrediction5yr}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Models (10 year)\r\n",
"\r\n",
"- Minimum temperature\r\n",
"- Maximum temperature"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Minimum Temperature"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [],
"source": [
"IEstimator<ITransformer> minTempEstimator10yr = mlContext.Forecasting.ForecastBySsa(\r\n",
" outputColumnName: \"ForecastTemp\",\r\n",
" inputColumnName: \"MinTemp\",\r\n",
" windowSize: 7,\r\n",
" seriesLength: 4027,\r\n",
" trainSize: 4027,\r\n",
" horizon: 7,\r\n",
" confidenceLevel: 0.85f,\r\n",
" confidenceLowerBoundColumn: \"LowerBoundTemp\",\r\n",
" confidenceUpperBoundColumn: \"UpperBoundTemp\");"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [],
"source": [
"var minTempModel10yr = minTempEstimator10yr.Fit(trainingDataView10yr);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Maximum Temperature"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [],
"source": [
"IEstimator<ITransformer> maxTempEstimator10yr = mlContext.Forecasting.ForecastBySsa(\r\n",
" outputColumnName: \"ForecastTemp\",\r\n",
" inputColumnName: \"MaxTemp\",\r\n",
" windowSize: 7,\r\n",
" seriesLength: 4027,\r\n",
" trainSize: 4027,\r\n",
" horizon: 7,\r\n",
" confidenceLevel: 0.85f,\r\n",
" confidenceLowerBoundColumn: \"LowerBoundTemp\",\r\n",
" confidenceUpperBoundColumn: \"UpperBoundTemp\");"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [],
"source": [
"var maxTempModel10yr = maxTempEstimator10yr.Fit(trainingDataView10yr);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Test Models (10 year)"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [],
"source": [
"var testInput = new ModelInput { Date= new DateTime(2020,04,01), MaxTemp = 51, MinTemp = 38};"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [],
"source": [
"TimeSeriesPredictionEngine<ModelInput, ModelOutput> minForecastEngine10yr = minTempModel10yr.CreateTimeSeriesEngine<ModelInput, ModelOutput>(mlContext);\r\n",
"var minPrediction10yr = minForecastEngine10yr.Predict(testInput,horizon:7);"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [],
"source": [
"TimeSeriesPredictionEngine<ModelInput, ModelOutput> maxForecastEngine10yr = maxTempModel10yr.CreateTimeSeriesEngine<ModelInput, ModelOutput>(mlContext);\r\n",
"var maxPrediction10yr = maxForecastEngine10yr.Predict(testInput, horizon:7);"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"data": {
"text/html": "<table><thead><tr><th><i>index</i></th><th>Date</th><th>MaxTemp</th><th>MinTemp</th><th>ForecastTemp</th><th>LowerBoundTemp</th><th>UpperBoundTemp</th></tr></thead><tbody><tr><td>0</td><td><span>2020-04-01 00:00:00Z</span></td><td><div class=\"dni-plaintext\">51</div></td><td><div class=\"dni-plaintext\">38</div></td><td><div class=\"dni-plaintext\">[ 45.442406, 45.264603, 45.15645, 45.19384, 45.254196, 45.226856, 45.13159 ]</div></td><td><div class=\"dni-plaintext\">[ 39.717957, 39.014584, 38.26309, 37.649364, 37.10386, 36.508354, 35.83022 ]</div></td><td><div class=\"dni-plaintext\">[ 51.166855, 51.51462, 52.04981, 52.738316, 53.404533, 53.94536, 54.432964 ]</div></td></tr><tr><td>1</td><td><span>2020-04-01 00:00:00Z</span></td><td><div class=\"dni-plaintext\">51</div></td><td><div class=\"dni-plaintext\">38</div></td><td><div class=\"dni-plaintext\">[ 73.52904, 72.769394, 72.34601, 72.38786, 72.55228, 72.53551, 72.297585 ]</div></td><td><div class=\"dni-plaintext\">[ 66.13588, 64.76329, 63.576366, 62.831036, 62.2506, 61.534508, 60.570606 ]</div></td><td><div class=\"dni-plaintext\">[ 80.922195, 80.7755, 81.11565, 81.944695, 82.85395, 83.53651, 84.02457 ]</div></td></tr></tbody></table>"
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"new [] {minPrediction10yr, maxPrediction10yr}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion\r\n",
"\r\n",
"5 year models, because it has more recent data looks to make better predictions. I recommend playing around with hyperparameters"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".NET (C#)",
"language": "C#",
"name": ".net-csharp"
},
"language_info": {
"file_extension": ".cs",
"mimetype": "text/x-csharp",
"name": "C#",
"pygments_lexer": "csharp",
"version": "8.0"
},
"orig_nbformat": 2
},
"nbformat": 4,
"nbformat_minor": 2
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment