Last active
October 8, 2019 06:12
-
-
Save min2bro/b413d0b7f3c512a46140ecb8c0f8d5e6 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Comprehensive Guide on Pandas Datetime" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": true | |
}, | |
"source": [ | |
"##### In this post we will explore the Pandas datetime methods which can be used instantaneously to work with datetime in Pandas.\n", | |
"\n", | |
"##### I am sharing the table of content in case you are just interested to see a specific topic then this would help you to jump directly over there" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"Table of Content:\n", | |
" 1. Datetime Index using date_range - frequency\n", | |
" 2. Creat Dataframe using Datetime Index\n", | |
" 3. Import CSV file in a dataframe using parse_date\n", | |
" 4. Convert date column to datetime object\n", | |
" 5. Extract data using datetime accessor\n", | |
" 6. Timeseries Aggreggation using resample\n", | |
" 7. Indexing and Slicing of datetime Index\n", | |
" 8. Date Offsets: Adding a day, adding hours etc.\n", | |
" 9. Advanced Datetime Index functions\n", | |
" 10. Understanding CustomBusinessDays\n", | |
" 11. Timezones\n", | |
" 12. Business Hours and CustomBusinessHours\n", | |
" 13. Difference Between two dates column" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Import time-series data" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"###### This is the monthly electrical consumption data in csv which we will import in a dataframe and data can be downloaded using this link" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### parse_dates attributes in read_csv() function" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"###### We are using parse_date attribute to parse and convert the date columns in the csv files to numpy datetime64 type" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"<class 'pandas.core.frame.DataFrame'>\n", | |
"RangeIndex: 397 entries, 0 to 396\n", | |
"Data columns (total 2 columns):\n", | |
"DATE 397 non-null datetime64[ns]\n", | |
"IPG2211A2N 397 non-null float64\n", | |
"dtypes: datetime64[ns](1), float64(1)\n", | |
"memory usage: 6.3 KB\n" | |
] | |
} | |
], | |
"source": [ | |
"import pandas as pd\n", | |
"import numpy as np\n", | |
"\n", | |
"df=pd.read_csv('./Electric_Production.csv',parse_dates=['DATE'])\n", | |
"df.info()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Pandas to_datetime" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"###### Alternatively, you can use to_datetime to convert any column to datetime" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"<class 'pandas.core.frame.DataFrame'>\n", | |
"RangeIndex: 397 entries, 0 to 396\n", | |
"Data columns (total 2 columns):\n", | |
"DATE 397 non-null datetime64[ns]\n", | |
"IPG2211A2N 397 non-null float64\n", | |
"dtypes: datetime64[ns](1), float64(1)\n", | |
"memory usage: 6.3 KB\n" | |
] | |
} | |
], | |
"source": [ | |
"df['DATE']=pd.to_datetime(df['DATE'])\n", | |
"df.info()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Extract Month and Year from datetime using datetime accessor" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### We will create 3 new columns here for Year, Month and day after extracting it from the Date column" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 10, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>DATE</th>\n", | |
" <th>IPG2211A2N</th>\n", | |
" <th>Year</th>\n", | |
" <th>month</th>\n", | |
" <th>day</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <td>0</td>\n", | |
" <td>1985-01-01</td>\n", | |
" <td>72.5052</td>\n", | |
" <td>1985</td>\n", | |
" <td>1</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1</td>\n", | |
" <td>1985-02-01</td>\n", | |
" <td>70.6720</td>\n", | |
" <td>1985</td>\n", | |
" <td>2</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>2</td>\n", | |
" <td>1985-03-01</td>\n", | |
" <td>62.4502</td>\n", | |
" <td>1985</td>\n", | |
" <td>3</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>3</td>\n", | |
" <td>1985-04-01</td>\n", | |
" <td>57.4714</td>\n", | |
" <td>1985</td>\n", | |
" <td>4</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>4</td>\n", | |
" <td>1985-05-01</td>\n", | |
" <td>55.3151</td>\n", | |
" <td>1985</td>\n", | |
" <td>5</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" DATE IPG2211A2N Year month day\n", | |
"0 1985-01-01 72.5052 1985 1 1\n", | |
"1 1985-02-01 70.6720 1985 2 1\n", | |
"2 1985-03-01 62.4502 1985 3 1\n", | |
"3 1985-04-01 57.4714 1985 4 1\n", | |
"4 1985-05-01 55.3151 1985 5 1" | |
] | |
}, | |
"execution_count": 10, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df['Year']=df['DATE'].dt.year\n", | |
"df['month']=df['DATE'].dt.month\n", | |
"df['day']=df['DATE'].dt.day\n", | |
"df.head()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 11, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>DATE</th>\n", | |
" <th>IPG2211A2N</th>\n", | |
" <th>Year</th>\n", | |
" <th>month</th>\n", | |
" <th>day</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <td>0</td>\n", | |
" <td>1985-01-01</td>\n", | |
" <td>72.5052</td>\n", | |
" <td>1985</td>\n", | |
" <td>1</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1</td>\n", | |
" <td>1985-02-01</td>\n", | |
" <td>70.6720</td>\n", | |
" <td>1985</td>\n", | |
" <td>2</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>2</td>\n", | |
" <td>1985-03-01</td>\n", | |
" <td>62.4502</td>\n", | |
" <td>1985</td>\n", | |
" <td>3</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>3</td>\n", | |
" <td>1985-04-01</td>\n", | |
" <td>57.4714</td>\n", | |
" <td>1985</td>\n", | |
" <td>4</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>4</td>\n", | |
" <td>1985-05-01</td>\n", | |
" <td>55.3151</td>\n", | |
" <td>1985</td>\n", | |
" <td>5</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" DATE IPG2211A2N Year month day\n", | |
"0 1985-01-01 72.5052 1985 1 1\n", | |
"1 1985-02-01 70.6720 1985 2 1\n", | |
"2 1985-03-01 62.4502 1985 3 1\n", | |
"3 1985-04-01 57.4714 1985 4 1\n", | |
"4 1985-05-01 55.3151 1985 5 1" | |
] | |
}, | |
"execution_count": 11, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"# OR\n", | |
"\n", | |
"df['Year']=df['DATE'].apply(lambda x: x.year)\n", | |
"df['month']=df['DATE'].apply(lambda x: x.month)\n", | |
"df['day']=df['DATE'].apply(lambda x: x.day)\n", | |
"df.head()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Time Series- Aggregation\n", | |
"### resample to find sum on the date index date" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"###### resample() is a method in pandas that can be used to summarize data by date or time. \n", | |
"\n", | |
"###### Let's find the Yearly sum of Electricity Consumption" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 60, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>IPG2211A2N</th>\n", | |
" <th>Year</th>\n", | |
" <th>month</th>\n", | |
" <th>day</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>DATE</th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <td>1985-12-31</td>\n", | |
" <td>745.9880</td>\n", | |
" <td>23820</td>\n", | |
" <td>78</td>\n", | |
" <td>12</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1986-12-31</td>\n", | |
" <td>752.5187</td>\n", | |
" <td>23832</td>\n", | |
" <td>78</td>\n", | |
" <td>12</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1987-12-31</td>\n", | |
" <td>788.8833</td>\n", | |
" <td>23844</td>\n", | |
" <td>78</td>\n", | |
" <td>12</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1988-12-31</td>\n", | |
" <td>836.5963</td>\n", | |
" <td>23856</td>\n", | |
" <td>78</td>\n", | |
" <td>12</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1989-12-31</td>\n", | |
" <td>862.7420</td>\n", | |
" <td>23868</td>\n", | |
" <td>78</td>\n", | |
" <td>12</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" IPG2211A2N Year month day\n", | |
"DATE \n", | |
"1985-12-31 745.9880 23820 78 12\n", | |
"1986-12-31 752.5187 23832 78 12\n", | |
"1987-12-31 788.8833 23844 78 12\n", | |
"1988-12-31 836.5963 23856 78 12\n", | |
"1989-12-31 862.7420 23868 78 12" | |
] | |
}, | |
"execution_count": 60, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df.set_index('DATE').resample('1Y').sum().head()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### resample to find mean on the date index date" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"##### Lets find the Electricity consumption mean for each year" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 59, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>IPG2211A2N</th>\n", | |
" <th>Year</th>\n", | |
" <th>month</th>\n", | |
" <th>day</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>DATE</th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <td>1985-12-31</td>\n", | |
" <td>62.165667</td>\n", | |
" <td>1985.0</td>\n", | |
" <td>6.5</td>\n", | |
" <td>1.0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1986-12-31</td>\n", | |
" <td>62.709892</td>\n", | |
" <td>1986.0</td>\n", | |
" <td>6.5</td>\n", | |
" <td>1.0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1987-12-31</td>\n", | |
" <td>65.740275</td>\n", | |
" <td>1987.0</td>\n", | |
" <td>6.5</td>\n", | |
" <td>1.0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1988-12-31</td>\n", | |
" <td>69.716358</td>\n", | |
" <td>1988.0</td>\n", | |
" <td>6.5</td>\n", | |
" <td>1.0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1989-12-31</td>\n", | |
" <td>71.895167</td>\n", | |
" <td>1989.0</td>\n", | |
" <td>6.5</td>\n", | |
" <td>1.0</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" IPG2211A2N Year month day\n", | |
"DATE \n", | |
"1985-12-31 62.165667 1985.0 6.5 1.0\n", | |
"1986-12-31 62.709892 1986.0 6.5 1.0\n", | |
"1987-12-31 65.740275 1987.0 6.5 1.0\n", | |
"1988-12-31 69.716358 1988.0 6.5 1.0\n", | |
"1989-12-31 71.895167 1989.0 6.5 1.0" | |
] | |
}, | |
"execution_count": 59, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df.set_index('DATE').resample('1Y').mean().head()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Datetime index and slice" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Just ensure that the datetime column is set as index for the dataframe. I am using set_index() function to set that before index and slice" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Filter using the date" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 13, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>IPG2211A2N</th>\n", | |
" <th>Year</th>\n", | |
" <th>month</th>\n", | |
" <th>day</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>DATE</th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <td>1987-01-01</td>\n", | |
" <td>73.8152</td>\n", | |
" <td>1987</td>\n", | |
" <td>1</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1987-02-01</td>\n", | |
" <td>70.0620</td>\n", | |
" <td>1987</td>\n", | |
" <td>2</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" IPG2211A2N Year month day\n", | |
"DATE \n", | |
"1987-01-01 73.8152 1987 1 1\n", | |
"1987-02-01 70.0620 1987 2 1" | |
] | |
}, | |
"execution_count": 13, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df.set_index('DATE')['1987'].head(2)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Filter all rows between two dates i.e. 1989-JAN and 1995-Apr here" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 17, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>IPG2211A2N</th>\n", | |
" <th>Year</th>\n", | |
" <th>month</th>\n", | |
" <th>day</th>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>DATE</th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" <th></th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <td>1989-01-01</td>\n", | |
" <td>77.9188</td>\n", | |
" <td>1989</td>\n", | |
" <td>1</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1989-02-01</td>\n", | |
" <td>76.6822</td>\n", | |
" <td>1989</td>\n", | |
" <td>2</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1989-03-01</td>\n", | |
" <td>73.3523</td>\n", | |
" <td>1989</td>\n", | |
" <td>3</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1989-04-01</td>\n", | |
" <td>65.1081</td>\n", | |
" <td>1989</td>\n", | |
" <td>4</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1989-05-01</td>\n", | |
" <td>63.6892</td>\n", | |
" <td>1989</td>\n", | |
" <td>5</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" IPG2211A2N Year month day\n", | |
"DATE \n", | |
"1989-01-01 77.9188 1989 1 1\n", | |
"1989-02-01 76.6822 1989 2 1\n", | |
"1989-03-01 73.3523 1989 3 1\n", | |
"1989-04-01 65.1081 1989 4 1\n", | |
"1989-05-01 63.6892 1989 5 1" | |
] | |
}, | |
"execution_count": 17, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df.set_index('DATE')['1989-01':'1995-04'].head()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Date Offset\n", | |
"\n", | |
"#### Its a kind of date increment used for a date range. As per the documentation:\n", | |
"\n", | |
"#### DateOffset work as follows. Each offset specify a set of dates that conform to the DateOffset. For example, Bday defines this #### set to be the set of dates that are weekdays (M-F). To test if a date is in the set of a DateOffset dateOffset we can use the #### onOffset method: dateOffset.onOffset(date).\n", | |
"\n", | |
"#### If a date is not on a valid date, the rollback and rollforward methods can be used to roll the date to the nearest valid date #### before/after the date.\n", | |
"\n", | |
"#### DateOffsets can be created to move dates forward a given number of valid dates. For example, Bday(2) can be added to a #### date to move it two business days forward. If the date does not start on a valid date, first it is moved to a valid date\n", | |
"\n", | |
"#### Add one day\n", | |
"\n", | |
"#### Here we are adding a day(timedelta of 1 day) to the Date column in dataframe and creating a new column called as next_day" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 18, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>DATE</th>\n", | |
" <th>IPG2211A2N</th>\n", | |
" <th>Year</th>\n", | |
" <th>month</th>\n", | |
" <th>day</th>\n", | |
" <th>next_day</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <td>0</td>\n", | |
" <td>1985-01-01</td>\n", | |
" <td>72.5052</td>\n", | |
" <td>1985</td>\n", | |
" <td>1</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-01-02</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1</td>\n", | |
" <td>1985-02-01</td>\n", | |
" <td>70.6720</td>\n", | |
" <td>1985</td>\n", | |
" <td>2</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-02-02</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>2</td>\n", | |
" <td>1985-03-01</td>\n", | |
" <td>62.4502</td>\n", | |
" <td>1985</td>\n", | |
" <td>3</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-03-02</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>3</td>\n", | |
" <td>1985-04-01</td>\n", | |
" <td>57.4714</td>\n", | |
" <td>1985</td>\n", | |
" <td>4</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-04-02</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>4</td>\n", | |
" <td>1985-05-01</td>\n", | |
" <td>55.3151</td>\n", | |
" <td>1985</td>\n", | |
" <td>5</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-05-02</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" DATE IPG2211A2N Year month day next_day\n", | |
"0 1985-01-01 72.5052 1985 1 1 1985-01-02\n", | |
"1 1985-02-01 70.6720 1985 2 1 1985-02-02\n", | |
"2 1985-03-01 62.4502 1985 3 1 1985-03-02\n", | |
"3 1985-04-01 57.4714 1985 4 1 1985-04-02\n", | |
"4 1985-05-01 55.3151 1985 5 1 1985-05-02" | |
] | |
}, | |
"execution_count": 18, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df['next_day']=df['DATE']+pd.Timedelta('1 day')\n", | |
"df.head()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Adding a Business day" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Here we are adding a Business day using Bday param, it will add a day between Mon-Fri. if a date is Sat then add a bday will return the next Monday i.e. a Business day instead of a Saturday" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 67, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>DATE</th>\n", | |
" <th>IPG2211A2N</th>\n", | |
" <th>Year</th>\n", | |
" <th>month</th>\n", | |
" <th>day</th>\n", | |
" <th>next_day</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <td>0</td>\n", | |
" <td>1985-01-01</td>\n", | |
" <td>72.5052</td>\n", | |
" <td>1985</td>\n", | |
" <td>1</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-01-02</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1</td>\n", | |
" <td>1985-02-01</td>\n", | |
" <td>70.6720</td>\n", | |
" <td>1985</td>\n", | |
" <td>2</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-02-04</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>2</td>\n", | |
" <td>1985-03-01</td>\n", | |
" <td>62.4502</td>\n", | |
" <td>1985</td>\n", | |
" <td>3</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-03-04</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>3</td>\n", | |
" <td>1985-04-01</td>\n", | |
" <td>57.4714</td>\n", | |
" <td>1985</td>\n", | |
" <td>4</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-04-02</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>4</td>\n", | |
" <td>1985-05-01</td>\n", | |
" <td>55.3151</td>\n", | |
" <td>1985</td>\n", | |
" <td>5</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-05-02</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" DATE IPG2211A2N Year month day next_day\n", | |
"0 1985-01-01 72.5052 1985 1 1 1985-01-02\n", | |
"1 1985-02-01 70.6720 1985 2 1 1985-02-04\n", | |
"2 1985-03-01 62.4502 1985 3 1 1985-03-04\n", | |
"3 1985-04-01 57.4714 1985 4 1 1985-04-02\n", | |
"4 1985-05-01 55.3151 1985 5 1 1985-05-02" | |
] | |
}, | |
"execution_count": 67, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df['next_day']=df['DATE'].apply(lambda x: x+pd.offsets.BDay(1))\n", | |
"df.head()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Add 2 business days" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Addind two days to the current DATE column using days parameter and create a new column day_after" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 19, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>DATE</th>\n", | |
" <th>IPG2211A2N</th>\n", | |
" <th>Year</th>\n", | |
" <th>month</th>\n", | |
" <th>day</th>\n", | |
" <th>next_day</th>\n", | |
" <th>day_after</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <td>0</td>\n", | |
" <td>1985-01-01</td>\n", | |
" <td>72.5052</td>\n", | |
" <td>1985</td>\n", | |
" <td>1</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-01-02</td>\n", | |
" <td>1985-01-03</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1</td>\n", | |
" <td>1985-02-01</td>\n", | |
" <td>70.6720</td>\n", | |
" <td>1985</td>\n", | |
" <td>2</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-02-02</td>\n", | |
" <td>1985-02-03</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>2</td>\n", | |
" <td>1985-03-01</td>\n", | |
" <td>62.4502</td>\n", | |
" <td>1985</td>\n", | |
" <td>3</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-03-02</td>\n", | |
" <td>1985-03-03</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>3</td>\n", | |
" <td>1985-04-01</td>\n", | |
" <td>57.4714</td>\n", | |
" <td>1985</td>\n", | |
" <td>4</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-04-02</td>\n", | |
" <td>1985-04-03</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>4</td>\n", | |
" <td>1985-05-01</td>\n", | |
" <td>55.3151</td>\n", | |
" <td>1985</td>\n", | |
" <td>5</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-05-02</td>\n", | |
" <td>1985-05-03</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" DATE IPG2211A2N Year month day next_day day_after\n", | |
"0 1985-01-01 72.5052 1985 1 1 1985-01-02 1985-01-03\n", | |
"1 1985-02-01 70.6720 1985 2 1 1985-02-02 1985-02-03\n", | |
"2 1985-03-01 62.4502 1985 3 1 1985-03-02 1985-03-03\n", | |
"3 1985-04-01 57.4714 1985 4 1 1985-04-02 1985-04-03\n", | |
"4 1985-05-01 55.3151 1985 5 1 1985-05-02 1985-05-03" | |
] | |
}, | |
"execution_count": 19, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df['day_after']=df['DATE'].apply(lambda x: x+pd.DateOffset(days=2))\n", | |
"df.head()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Add next month date" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Adding a month to the DATE column using months parameter" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 70, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>DATE</th>\n", | |
" <th>IPG2211A2N</th>\n", | |
" <th>Year</th>\n", | |
" <th>month</th>\n", | |
" <th>day</th>\n", | |
" <th>next_day</th>\n", | |
" <th>day_after</th>\n", | |
" <th>next_month_day</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <td>0</td>\n", | |
" <td>1985-01-01</td>\n", | |
" <td>72.5052</td>\n", | |
" <td>1985</td>\n", | |
" <td>1</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-01-02</td>\n", | |
" <td>1985-01-03</td>\n", | |
" <td>1985-02-01</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1</td>\n", | |
" <td>1985-02-01</td>\n", | |
" <td>70.6720</td>\n", | |
" <td>1985</td>\n", | |
" <td>2</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-02-04</td>\n", | |
" <td>1985-02-03</td>\n", | |
" <td>1985-03-01</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>2</td>\n", | |
" <td>1985-03-01</td>\n", | |
" <td>62.4502</td>\n", | |
" <td>1985</td>\n", | |
" <td>3</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-03-04</td>\n", | |
" <td>1985-03-03</td>\n", | |
" <td>1985-04-01</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>3</td>\n", | |
" <td>1985-04-01</td>\n", | |
" <td>57.4714</td>\n", | |
" <td>1985</td>\n", | |
" <td>4</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-04-02</td>\n", | |
" <td>1985-04-03</td>\n", | |
" <td>1985-05-01</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>4</td>\n", | |
" <td>1985-05-01</td>\n", | |
" <td>55.3151</td>\n", | |
" <td>1985</td>\n", | |
" <td>5</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-05-02</td>\n", | |
" <td>1985-05-03</td>\n", | |
" <td>1985-06-01</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" DATE IPG2211A2N Year month day next_day day_after \\\n", | |
"0 1985-01-01 72.5052 1985 1 1 1985-01-02 1985-01-03 \n", | |
"1 1985-02-01 70.6720 1985 2 1 1985-02-04 1985-02-03 \n", | |
"2 1985-03-01 62.4502 1985 3 1 1985-03-04 1985-03-03 \n", | |
"3 1985-04-01 57.4714 1985 4 1 1985-04-02 1985-04-03 \n", | |
"4 1985-05-01 55.3151 1985 5 1 1985-05-02 1985-05-03 \n", | |
"\n", | |
" next_month_day \n", | |
"0 1985-02-01 \n", | |
"1 1985-03-01 \n", | |
"2 1985-04-01 \n", | |
"3 1985-05-01 \n", | |
"4 1985-06-01 " | |
] | |
}, | |
"execution_count": 70, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df['next_month_day']=df['DATE'].apply(lambda x: x+pd.DateOffset(months=1))\n", | |
"df.head()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### For the complete list of parameters check this link\n", | |
"#### https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.tseries.offsets.DateOffset.html" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Using date_range to create datetime index\n", | |
"\n", | |
"#### it is Immutable numpy ndarray of datetime64 data, We will see how to create datetime index and eventually create a dataframe using these datetime index arrays" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Datetime index with Hourly frequency" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### It gives the array of date and time starting from '2018-01-01' with a Hourly frequency and period=3 means total elements of 3" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 27, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"DatetimeIndex(['2018-01-01 00:00:00', '2018-01-01 01:00:00',\n", | |
" '2018-01-01 02:00:00'],\n", | |
" dtype='datetime64[ns]', freq='H')" | |
] | |
}, | |
"execution_count": 27, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"import pandas as pd\n", | |
"dti = pd.date_range('2018-01-01', periods=3, freq='H')\n", | |
"dti" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Monthly Frequency" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Now change the frequency to Monthly and create total 10 date array" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 22, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"DatetimeIndex(['2018-01-31', '2018-02-28', '2018-03-31', '2018-04-30',\n", | |
" '2018-05-31', '2018-06-30', '2018-07-31', '2018-08-31',\n", | |
" '2018-09-30', '2018-10-31'],\n", | |
" dtype='datetime64[ns]', freq='M')" | |
] | |
}, | |
"execution_count": 22, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"index = pd.date_range('2018-01-01',periods=10, freq='M')\n", | |
"index" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Weekly Frequency with start and end" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Change the frequency to Weekly and create dates between two dates using start and end dates" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 25, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"DatetimeIndex(['2019-01-06', '2019-01-13', '2019-01-20', '2019-01-27',\n", | |
" '2019-02-03', '2019-02-10', '2019-02-17', '2019-02-24',\n", | |
" '2019-03-03', '2019-03-10', '2019-03-17', '2019-03-24',\n", | |
" '2019-03-31', '2019-04-07', '2019-04-14', '2019-04-21',\n", | |
" '2019-04-28'],\n", | |
" dtype='datetime64[ns]', freq='W-SUN')" | |
] | |
}, | |
"execution_count": 25, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"pd.date_range(start='2019-01-01', end='2019-04-30', freq='W')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Datetime index with start and end" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 31, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"DatetimeIndex(['2011-01-01', '2011-01-02', '2011-01-03', '2011-01-04',\n", | |
" '2011-01-05', '2011-01-06', '2011-01-07', '2011-01-08',\n", | |
" '2011-01-09', '2011-01-10', '2011-01-11', '2011-01-12',\n", | |
" '2011-01-13', '2011-01-14', '2011-01-15', '2011-01-16',\n", | |
" '2011-01-17', '2011-01-18', '2011-01-19', '2011-01-20',\n", | |
" '2011-01-21', '2011-01-22', '2011-01-23', '2011-01-24',\n", | |
" '2011-01-25', '2011-01-26', '2011-01-27', '2011-01-28',\n", | |
" '2011-01-29', '2011-01-30', '2011-01-31', '2011-02-01'],\n", | |
" dtype='datetime64[ns]', freq='D')" | |
] | |
}, | |
"execution_count": 31, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"import datetime\n", | |
"start = datetime.datetime(2011, 1, 1)\n", | |
"\n", | |
"end = datetime.datetime(2011, 2, 1)\n", | |
"\n", | |
"index = pd.date_range(start, end)\n", | |
"index" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Create dataframe using date time index" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Create dataframe with datetime as index" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Here index: dti is the date_range created above with hourly frequency" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 29, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>price</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <td>2018-01-01 00:00:00</td>\n", | |
" <td>1.005757</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>2018-01-01 01:00:00</td>\n", | |
" <td>15.762388</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>2018-01-01 02:00:00</td>\n", | |
" <td>3.260808</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" price\n", | |
"2018-01-01 00:00:00 1.005757\n", | |
"2018-01-01 01:00:00 15.762388\n", | |
"2018-01-01 02:00:00 3.260808" | |
] | |
}, | |
"execution_count": 29, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"import numpy as np\n", | |
"df= pd.DataFrame({'price':np.random.uniform(0,20,size=3)},index=dti)\n", | |
"df" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Create datafrema with datetime as a column" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 30, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>price</th>\n", | |
" <th>date</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <td>0</td>\n", | |
" <td>17.763795</td>\n", | |
" <td>2018-01-01 00:00:00</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1</td>\n", | |
" <td>4.811412</td>\n", | |
" <td>2018-01-01 01:00:00</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>2</td>\n", | |
" <td>13.950828</td>\n", | |
" <td>2018-01-01 02:00:00</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" price date\n", | |
"0 17.763795 2018-01-01 00:00:00\n", | |
"1 4.811412 2018-01-01 01:00:00\n", | |
"2 13.950828 2018-01-01 02:00:00" | |
] | |
}, | |
"execution_count": 30, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"import numpy as np\n", | |
"df= pd.DataFrame({'price':np.random.uniform(0,20,size=3),'date':dti})\n", | |
"df" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Datetime Index Using Holiday Calendar" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### You can also use the Holiday calendars to provide the list of holidays. Here we are using freq as US holiday calendar, So the\n", | |
"#### final datetime index will skip all the dates available in that holiday calendar" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 32, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"DatetimeIndex(['2019-12-24', '2019-12-26', '2019-12-27', '2019-12-30',\n", | |
" '2019-12-31'],\n", | |
" dtype='datetime64[ns]', freq='C')\n" | |
] | |
} | |
], | |
"source": [ | |
"from pandas.tseries.holiday import USFederalHolidayCalendar\n", | |
"from pandas.tseries.offsets import CustomBusinessDay\n", | |
"\n", | |
"us_bd = CustomBusinessDay(calendar=USFederalHolidayCalendar())\n", | |
"print (pd.date_range(start='2019-12-24',end='2019-12-31', freq=us_bd))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Datetime Index using Origin Parameter" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### You can set the origin date and a list of days as a parameter and add that to origin date. Here the origin is 2019-10-25 \n", | |
"#### and adding 1 day to it gives 2019-10-26 and similarly adding 2 and 3 gives 2019-10-27 and 2019-10-28 resp" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 36, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"DatetimeIndex(['2019-10-26', '2019-10-27', '2019-10-28'], dtype='datetime64[ns]', freq=None)" | |
] | |
}, | |
"execution_count": 36, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"pd.to_datetime([1, 2, 3], unit='D', origin=pd.Timestamp('2019-10-25'))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Week masking and Holidays" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### One of the important feature is Week masking, In Middle eastern countries the working days in a week is thru Sun-Thu and \n", | |
"#### Fri,Sat is considered as Weekends. So here we are creating a dateindex using such working weeks from Sun-Thu and\n", | |
"#### list of Holidays" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 79, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"DatetimeIndex(['2011-01-02', '2011-01-03', '2011-01-04', '2011-01-06',\n", | |
" '2011-01-09', '2011-01-10', '2011-01-11', '2011-01-12',\n", | |
" '2011-01-13', '2011-01-16', '2011-01-17', '2011-01-18',\n", | |
" '2011-01-19', '2011-01-20', '2011-01-23', '2011-01-24',\n", | |
" '2011-01-25', '2011-01-26', '2011-01-27', '2011-01-30',\n", | |
" '2011-01-31', '2011-02-01'],\n", | |
" dtype='datetime64[ns]', freq='C')" | |
] | |
}, | |
"execution_count": 79, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"weekmask = 'Sun Mon Tue Wed Thu'\n", | |
"\n", | |
"holidays = [datetime.datetime(2011, 1, 5), datetime.datetime(2011, 3, 14)]\n", | |
"\n", | |
"pd.bdate_range(start, end, freq='C', weekmask=weekmask, holidays=holidays)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Understand Custom Business days" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Using CustomBusinessdays you can create the custom business day using the same example of Middle eastern countries\n", | |
"#### as shown above. This will work exactly the same way as Dateoffset Bday() explained above. \n", | |
"\n", | |
"#### As shown in the example here, if we add 2 business days after 2013-04-30(Tue) considering 2013-05-01(Wed) is a holiday as it in holiday #### list, If we add 2 Middle eastern business day to 2013-04-30 then it will return 2013-05-05 since Wed(2013-05-01) is a Holiday\n", | |
"#### and Fri and Sat is a weekend so the 2nd business day is Sunday i.e. 2013-05-05" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 40, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"Timestamp('2013-05-05 17:00:00')" | |
] | |
}, | |
"execution_count": 40, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"import datetime\n", | |
"import numpy as np\n", | |
"import pandas as pd\n", | |
"\n", | |
"weekmask_egypt = 'Sun Mon Tue Wed Thu'\n", | |
"\n", | |
"holidays = ['2012-05-01',datetime.datetime(2013, 5, 1),np.datetime64('2014-05-01')]\n", | |
"\n", | |
"bday_egypt = pd.offsets.CustomBusinessDay(holidays=holidays,weekmask=weekmask_egypt)\n", | |
"\n", | |
"# dt = datetime.datetime(2013, 4, 30)\n", | |
"dt = pd.Timestamp('2013-04-30 17:00')\n", | |
"dt+2*bday_egypt" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Using Truncate" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Two date attributes after and before is used to filter the records" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 81, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>DATE</th>\n", | |
" <th>IPG2211A2N</th>\n", | |
" <th>Year</th>\n", | |
" <th>month</th>\n", | |
" <th>day</th>\n", | |
" <th>next_day</th>\n", | |
" <th>day_after</th>\n", | |
" <th>next_month_day</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <td>0</td>\n", | |
" <td>1985-01-01</td>\n", | |
" <td>72.5052</td>\n", | |
" <td>1985</td>\n", | |
" <td>1</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-01-02</td>\n", | |
" <td>1985-01-03</td>\n", | |
" <td>1985-02-01</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1</td>\n", | |
" <td>1985-02-01</td>\n", | |
" <td>70.6720</td>\n", | |
" <td>1985</td>\n", | |
" <td>2</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-02-04</td>\n", | |
" <td>1985-02-03</td>\n", | |
" <td>1985-03-01</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>2</td>\n", | |
" <td>1985-03-01</td>\n", | |
" <td>62.4502</td>\n", | |
" <td>1985</td>\n", | |
" <td>3</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-03-04</td>\n", | |
" <td>1985-03-03</td>\n", | |
" <td>1985-04-01</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>3</td>\n", | |
" <td>1985-04-01</td>\n", | |
" <td>57.4714</td>\n", | |
" <td>1985</td>\n", | |
" <td>4</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-04-02</td>\n", | |
" <td>1985-04-03</td>\n", | |
" <td>1985-05-01</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>4</td>\n", | |
" <td>1985-05-01</td>\n", | |
" <td>55.3151</td>\n", | |
" <td>1985</td>\n", | |
" <td>5</td>\n", | |
" <td>1</td>\n", | |
" <td>1985-05-02</td>\n", | |
" <td>1985-05-03</td>\n", | |
" <td>1985-06-01</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" <td>...</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>197</td>\n", | |
" <td>2001-06-01</td>\n", | |
" <td>90.3955</td>\n", | |
" <td>2001</td>\n", | |
" <td>6</td>\n", | |
" <td>1</td>\n", | |
" <td>2001-06-04</td>\n", | |
" <td>2001-06-03</td>\n", | |
" <td>2001-07-01</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>198</td>\n", | |
" <td>2001-07-01</td>\n", | |
" <td>96.0740</td>\n", | |
" <td>2001</td>\n", | |
" <td>7</td>\n", | |
" <td>1</td>\n", | |
" <td>2001-07-02</td>\n", | |
" <td>2001-07-03</td>\n", | |
" <td>2001-08-01</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>199</td>\n", | |
" <td>2001-08-01</td>\n", | |
" <td>99.5534</td>\n", | |
" <td>2001</td>\n", | |
" <td>8</td>\n", | |
" <td>1</td>\n", | |
" <td>2001-08-02</td>\n", | |
" <td>2001-08-03</td>\n", | |
" <td>2001-09-01</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>200</td>\n", | |
" <td>2001-09-01</td>\n", | |
" <td>88.2810</td>\n", | |
" <td>2001</td>\n", | |
" <td>9</td>\n", | |
" <td>1</td>\n", | |
" <td>2001-09-03</td>\n", | |
" <td>2001-09-03</td>\n", | |
" <td>2001-10-01</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>201</td>\n", | |
" <td>2001-10-01</td>\n", | |
" <td>82.6860</td>\n", | |
" <td>2001</td>\n", | |
" <td>10</td>\n", | |
" <td>1</td>\n", | |
" <td>2001-10-02</td>\n", | |
" <td>2001-10-03</td>\n", | |
" <td>2001-11-01</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"<p>202 rows × 8 columns</p>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" DATE IPG2211A2N Year month day next_day day_after \\\n", | |
"0 1985-01-01 72.5052 1985 1 1 1985-01-02 1985-01-03 \n", | |
"1 1985-02-01 70.6720 1985 2 1 1985-02-04 1985-02-03 \n", | |
"2 1985-03-01 62.4502 1985 3 1 1985-03-04 1985-03-03 \n", | |
"3 1985-04-01 57.4714 1985 4 1 1985-04-02 1985-04-03 \n", | |
"4 1985-05-01 55.3151 1985 5 1 1985-05-02 1985-05-03 \n", | |
".. ... ... ... ... ... ... ... \n", | |
"197 2001-06-01 90.3955 2001 6 1 2001-06-04 2001-06-03 \n", | |
"198 2001-07-01 96.0740 2001 7 1 2001-07-02 2001-07-03 \n", | |
"199 2001-08-01 99.5534 2001 8 1 2001-08-02 2001-08-03 \n", | |
"200 2001-09-01 88.2810 2001 9 1 2001-09-03 2001-09-03 \n", | |
"201 2001-10-01 82.6860 2001 10 1 2001-10-02 2001-10-03 \n", | |
"\n", | |
" next_month_day \n", | |
"0 1985-02-01 \n", | |
"1 1985-03-01 \n", | |
"2 1985-04-01 \n", | |
"3 1985-05-01 \n", | |
"4 1985-06-01 \n", | |
".. ... \n", | |
"197 2001-07-01 \n", | |
"198 2001-08-01 \n", | |
"199 2001-09-01 \n", | |
"200 2001-10-01 \n", | |
"201 2001-11-01 \n", | |
"\n", | |
"[202 rows x 8 columns]" | |
] | |
}, | |
"execution_count": 81, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df.truncate(after='2019-10')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Timezones" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Using parameter tz you can set the timezone for the timestamp, You can check the list of pytz timezones" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 41, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"Timestamp('2016-10-30 00:00:00+0530', tz='Asia/Kolkata')" | |
] | |
}, | |
"execution_count": 41, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"ts = pd.Timestamp('2016-10-30 00:00:00', tz='Asia/Kolkata')\n", | |
"ts" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### convert the timezone of a timestamp" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Convert the timestamp to another timezone using tz_convert" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 43, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"Timestamp('2016-10-29 20:30:00+0200', tz='Europe/Amsterdam')" | |
] | |
}, | |
"execution_count": 43, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"\n", | |
"pd.Timestamp('2016-10-30 00:00:00', tz='Asia/Kolkata').tz_convert('Europe/Amsterdam')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Business Hour" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### default business hour is from 9:00 AM to 5:00PM for 7 hours. Adding 2 business hours returns 11:00AM and adding 8 business hours returns the next day" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 69, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"Timestamp('2016-10-31 11:00:00+0530', tz='Asia/Kolkata')" | |
] | |
}, | |
"execution_count": 69, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"bh = pd.offsets.BusinessHour()\n", | |
"pd.Timestamp('2016-10-30 00:00:00', tz='Asia/Kolkata')+2*bh" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### setting business hour" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### You can also set your own business hours with a start and end time" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 70, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"bh = pd.offsets.BusinessHour(start='11:00', end=datetime.time(20, 0))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Custom Business Hour" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### You can also set a CustomBusinessHours incorporating the Holiday Calendar list with a start and end business hours and\n", | |
"#### weekmask as explained above" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 71, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"ename": "NameError", | |
"evalue": "name 'CustomBusinessHour' is not defined", | |
"output_type": "error", | |
"traceback": [ | |
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[1;31mNameError\u001b[0m Traceback (most recent call last)", | |
"\u001b[1;32m<ipython-input-71-df4d1692dbac>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[1;31m# Makr this Italic\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 2\u001b[1;33m \u001b[0mCustomBusinessHour\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mn\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mnormalize\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;32mFalse\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mweekmask\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;34m'Mon Tue Wed Thu Fri'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mholidays\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;32mNone\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mcalendar\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;32mNone\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mstart\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;34m'09:00'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mend\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;34m'17:00'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0moffset\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mdatetime\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mtimedelta\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", | |
"\u001b[1;31mNameError\u001b[0m: name 'CustomBusinessHour' is not defined" | |
] | |
} | |
], | |
"source": [ | |
"# Makr this Italic\n", | |
"CustomBusinessHour(n=1, normalize=False, weekmask='Mon Tue Wed Thu Fri', holidays=None, calendar=None, start='09:00', end='17:00', offset=datetime.timedelta(0))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 73, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"Timestamp('2014-01-17 16:00:00')" | |
] | |
}, | |
"execution_count": 73, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"import datetime\n", | |
"from pandas.tseries.holiday import USFederalHolidayCalendar\n", | |
"dt = datetime.datetime(2014, 1, 17, 15)\n", | |
"\n", | |
"bhour_us = pd.offsets.CustomBusinessHour(calendar=USFederalHolidayCalendar(),start='11:00', end=datetime.time(20, 0),\n", | |
" weekmask='Mon Tue Wed Thu Fri')\n", | |
"dt+bhour_us" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Difference between two date columns" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Lets see how to find difference between two datetime columns in dataframe in terms of no of days, seconds etc" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 75, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Letter</th>\n", | |
" <th>First_Day</th>\n", | |
" <th>Last_Day</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <td>0</td>\n", | |
" <td>A</td>\n", | |
" <td>2019-10-06 12:25:53</td>\n", | |
" <td>2019-10-04 10:10:53</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1</td>\n", | |
" <td>A</td>\n", | |
" <td>2019-10-04 10:10:53</td>\n", | |
" <td>2019-10-01 08:10:53</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>2</td>\n", | |
" <td>B</td>\n", | |
" <td>2019-10-01 08:10:53</td>\n", | |
" <td>2019-09-23 01:24:53</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>3</td>\n", | |
" <td>B</td>\n", | |
" <td>2019-09-23 01:24:53</td>\n", | |
" <td>2019-09-23 15:58:17</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Letter First_Day Last_Day\n", | |
"0 A 2019-10-06 12:25:53 2019-10-04 10:10:53\n", | |
"1 A 2019-10-04 10:10:53 2019-10-01 08:10:53\n", | |
"2 B 2019-10-01 08:10:53 2019-09-23 01:24:53\n", | |
"3 B 2019-09-23 01:24:53 2019-09-23 15:58:17" | |
] | |
}, | |
"execution_count": 75, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"import pandas as pd\n", | |
"from datetime import datetime\n", | |
"import numpy as np\n", | |
"\n", | |
"# create dataframe\n", | |
"df = pd.DataFrame(data=[['A', '2019-10-06T12:25:53', '2019-10-04T10:10:53'],\n", | |
" ['A', '2019-10-04T10:10:53', '2019-10-01T08:10:53'],\n", | |
" ['B', '2019-10-01T08:10:53', '2019-09-23T01:24:53'],\n", | |
" ['B', '2019-09-23T01:24:53', '2019-09-23T15:58:17']],\n", | |
" columns=['Letter', 'First_Day', 'Last_Day'])\n", | |
"\n", | |
"df['First_Day']=pd.to_datetime(df['First_Day'])\n", | |
"df['Last_Day']=pd.to_datetime(df['Last_Day'])\n", | |
"df" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Difference between two dates in days and seconds" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 76, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"df['diff']=(pd.to_datetime(df['First_Day']) - pd.to_datetime(df['Last_Day'])).dt.days\n", | |
"df['diff_time_delta']=df['First_Day']-df['Last_Day']\n", | |
"df['diff-simple_subtract']=((df['First_Day']-df['Last_Day']).dt.total_seconds())//3600" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 77, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Letter</th>\n", | |
" <th>First_Day</th>\n", | |
" <th>Last_Day</th>\n", | |
" <th>diff</th>\n", | |
" <th>diff_time_delta</th>\n", | |
" <th>diff-simple_subtract</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <td>0</td>\n", | |
" <td>A</td>\n", | |
" <td>2019-10-06 12:25:53</td>\n", | |
" <td>2019-10-04 10:10:53</td>\n", | |
" <td>2</td>\n", | |
" <td>2 days 02:15:00</td>\n", | |
" <td>50.0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>1</td>\n", | |
" <td>A</td>\n", | |
" <td>2019-10-04 10:10:53</td>\n", | |
" <td>2019-10-01 08:10:53</td>\n", | |
" <td>3</td>\n", | |
" <td>3 days 02:00:00</td>\n", | |
" <td>74.0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>2</td>\n", | |
" <td>B</td>\n", | |
" <td>2019-10-01 08:10:53</td>\n", | |
" <td>2019-09-23 01:24:53</td>\n", | |
" <td>8</td>\n", | |
" <td>8 days 06:46:00</td>\n", | |
" <td>198.0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <td>3</td>\n", | |
" <td>B</td>\n", | |
" <td>2019-09-23 01:24:53</td>\n", | |
" <td>2019-09-23 15:58:17</td>\n", | |
" <td>-1</td>\n", | |
" <td>-1 days +09:26:36</td>\n", | |
" <td>-15.0</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Letter First_Day Last_Day diff diff_time_delta \\\n", | |
"0 A 2019-10-06 12:25:53 2019-10-04 10:10:53 2 2 days 02:15:00 \n", | |
"1 A 2019-10-04 10:10:53 2019-10-01 08:10:53 3 3 days 02:00:00 \n", | |
"2 B 2019-10-01 08:10:53 2019-09-23 01:24:53 8 8 days 06:46:00 \n", | |
"3 B 2019-09-23 01:24:53 2019-09-23 15:58:17 -1 -1 days +09:26:36 \n", | |
"\n", | |
" diff-simple_subtract \n", | |
"0 50.0 \n", | |
"1 74.0 \n", | |
"2 198.0 \n", | |
"3 -15.0 " | |
] | |
}, | |
"execution_count": 77, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"df.head()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.6.3" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 2 | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment