Created
October 31, 2019 00:34
import pandas as pd

# pyber_data_df is assumed to be loaded already, with a string column 'date'.
# Create a new field 'datetime' by converting the 'date' strings to datetime values.
# Datetime values let the plot's x-axis show appropriate labels such as "Feb 2018".
pyber_data_df['datetime'] = pd.to_datetime(pyber_data_df['date'])
# Datetime values *also* give access to the date part (2019-01-04), ignoring the time-of-day part (05:39:20).
# Create another new field 'datedate' to store it; flooring to the day resets the time to midnight.
pyber_data_df['datedate'] = pyber_data_df['datetime'].dt.floor('D')
avg_fare_by_type_date = (pyber_data_df
    # Group by multiple fields by passing a list of those fields;
    # each field becomes a "level" in the index key that identifies each group.
    .groupby(['type', 'datedate'])
    # Select 'fare' before aggregating so mean() is not applied to non-numeric columns,
    # then take the average fare per group.
    ['fare'].mean())
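To make the "levels" idea concrete, here is a minimal sketch of the same grouping on a few made-up rows. The `sample_df` frame and its values are hypothetical and stand in for the real `pyber_data_df`, which is loaded elsewhere.

```python
import pandas as pd

# Hypothetical sample rows; not the real pyber data.
sample_df = pd.DataFrame({
    'type': ['Urban', 'Urban', 'Rural', 'Urban'],
    'date': ['2019-01-04 05:39:20', '2019-01-04 18:00:00',
             '2019-01-04 09:15:00', '2019-01-05 12:30:00'],
    'fare': [10.0, 20.0, 40.0, 30.0],
})
# Same preparation as above: parse the strings, then floor to the day.
sample_df['datedate'] = pd.to_datetime(sample_df['date']).dt.floor('D')

# Group by city type and day, then average the fares in each group.
sample_avg = sample_df.groupby(['type', 'datedate'])['fare'].mean()

# The result is a Series whose index has one level per grouping field.
print(list(sample_avg.index.names))  # ['type', 'datedate']
# Two Urban rides on 2019-01-04 (10.0 and 20.0) average to 15.0.
print(sample_avg.loc[('Urban', pd.Timestamp('2019-01-04'))])  # 15.0
```

Level 0 of that index is `type` and level 1 is `datedate`, which is why `unstack(level=0)` moves the city types into columns.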
(avg_fare_by_type_date
    # Pivot city type from rows to columns; pandas plots one line per column.
    # level=0 is the first level of the grouped index, i.e. the `type` of city;
    # `unstack` moves it into new columns and leaves the dates as the row index.
    .unstack(level=0)
    # Some dates have no value for a city type; use pandas `ffill` or `interpolate` to decide how to fill those gaps.
    .ffill()
    # Day-by-day averages are very granular, so take a "rolling average" of each column with a window of 10 rows (dates):
    .rolling(10).mean()
    .plot())
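The choice between forward-filling and interpolating changes what the smoothed lines look like. The sketch below compares the two on a small made-up frame shaped like the unstacked result; the `wide` frame, its values, and the window of 3 are assumptions chosen only to keep the output readable.

```python
import numpy as np
import pandas as pd

# Hypothetical daily averages with gaps (NaN), shaped like the output of unstack(level=0).
dates = pd.date_range('2019-01-01', periods=5, freq='D')
wide = pd.DataFrame(
    {'Rural': [40.0, np.nan, np.nan, 34.0, 36.0],
     'Urban': [20.0, 22.0, np.nan, 26.0, 24.0]},
    index=dates,
)

filled_forward = wide.ffill()       # repeat the last known value
filled_linear = wide.interpolate()  # default 'linear': evenly spaced steps between known values

print(filled_forward['Rural'].tolist())  # [40.0, 40.0, 40.0, 34.0, 36.0]
print(filled_linear['Rural'].tolist())   # [40.0, 38.0, 36.0, 34.0, 36.0]

# rolling(3) averages each row with the two rows before it;
# the first two rows stay NaN because the window is not yet full.
smoothed = filled_linear.rolling(3).mean()
print(smoothed)
```

Forward-filling keeps a stale value flat until new data arrives, while linear interpolation assumes a steady change across the gap. Neither is the "right" answer in general; it depends on how the missing days should be read.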