Last active August 29, 2015 14:22
Extract date components from Date column in pandas dataframe
# Extracting date components from a Date column in pandas, using IPython
# Converting to a DatetimeIndex is roughly 100x faster than Series.apply() in the timings below
import pandas as pd
dates = pd.DataFrame({"Date": pd.date_range(start="1970-01-01", end="2037-12-31")})
print(dates.head())
# Date
# 0 1970-01-01
# 1 1970-01-02
# 2 1970-01-03
# 3 1970-01-04
# 4 1970-01-05
print(dates.shape)
# (24837, 1)
print(dates.dtypes)
# Date datetime64[ns]
# dtype: object
%timeit -n10 -r10 dates.Date.apply(lambda x: x.year)
# 10 loops, best of 10: 111 ms per loop
%timeit -n10 -r10 pd.DatetimeIndex(dates.Date).year
# 10 loops, best of 10: 1.01 ms per loop
%timeit -n10 -r10 dates.Date.dt.year
# 10 loops, best of 10: 3.55 ms per loop
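The `%timeit` lines above only run inside IPython. As a rough sketch, the same comparison can be reproduced in a plain Python script with the standard `timeit` module, and the vectorized `.dt` accessor can fill several component columns at once. The column names (`Year`, `Month`, `Day`, `DayOfWeek`) and the `time_ms` helper are illustrative assumptions, not part of the original snippet, and the measured times will vary with the machine and pandas version.

```python
import timeit

import pandas as pd

dates = pd.DataFrame({"Date": pd.date_range(start="1970-01-01", end="2037-12-31")})

# Vectorized extraction through the .dt accessor (column names are illustrative)
dates["Year"] = dates["Date"].dt.year
dates["Month"] = dates["Date"].dt.month
dates["Day"] = dates["Date"].dt.day
dates["DayOfWeek"] = dates["Date"].dt.dayofweek  # Monday is 0


def time_ms(fn, number=10, repeat=3):
    """Hypothetical helper: best average time per call, in milliseconds."""
    return min(timeit.repeat(fn, number=number, repeat=repeat)) / number * 1000


# Same three approaches as the %timeit lines, runnable outside IPython
apply_ms = time_ms(lambda: dates["Date"].apply(lambda x: x.year))
index_ms = time_ms(lambda: pd.DatetimeIndex(dates["Date"]).year)
dt_ms = time_ms(lambda: dates["Date"].dt.year)

print(f"Series.apply:  {apply_ms:.2f} ms per loop")
print(f"DatetimeIndex: {index_ms:.2f} ms per loop")
print(f".dt accessor:  {dt_ms:.2f} ms per loop")
```

All three approaches return the same values, so the choice between them is about speed and readability rather than results.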