@glamp
Last active December 13, 2015 20:48
import pandas as pd
import numpy as np
from datetime import datetime
import time
# generate some fake tick data with 1 million observations
n = 1000000
df = pd.DataFrame({
    "timestamp": [datetime.now() for t in range(n)],
    "value": np.random.uniform(-1, 1, n)
})
# DataFrame operations similar to those in R
df.head()
df.describe()
df.count()
# timing a basic operation on the data frame
s = time.time(); df.describe(); e = time.time() - s
print(e)
# 0.7173671722412109
# turn the values into a random walk and index the frame by timestamp
df['timestamp'] = pd.to_datetime(df['timestamp'])
df['value'] = df['value'].cumsum()
df.index = df['timestamp']
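# timing the same operation with the timestamp index set (faster in this single run;
# describe() does not look rows up by index, so part of the gap may be run-to-run variation)
s = time.time(); df.describe(); e = time.time() - s
print(e)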
# 0.5612149238586426
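# A single time.time() measurement is noisy, so the comparison above is hard to read much into.
# The block below is a minimal sketch, not part of the original gist: it repeats each timing
# with the standard-library timeit module and then shows what a datetime index is typically
# used for (selecting and resampling by time). The names time_describe, ticks and indexed,
# the 100,000-row size and the one-minute spacing are all assumptions made for illustration.

```python
import timeit

import numpy as np
import pandas as pd


def time_describe(frame, repeat=5):
    """Best wall-clock time in seconds of frame.describe() over `repeat` runs."""
    return min(timeit.repeat(frame.describe, number=1, repeat=repeat))


# assumption: 100,000 rows spaced one minute apart, so the sketch runs quickly
n = 100000
ticks = pd.DataFrame({
    "timestamp": pd.date_range("2013-01-01", periods=n, freq="min"),
    "value": np.random.uniform(-1, 1, n).cumsum(),
})

# same indexing step as in the gist: the timestamp column also becomes the index
indexed = ticks.copy()
indexed.index = indexed["timestamp"]

plain_time = time_describe(ticks)
indexed_time = time_describe(indexed)
print(plain_time, indexed_time)

# a datetime index allows selecting rows by a date string ...
day = indexed.loc["2013-01-02"]
# ... and aggregating into calendar buckets, here one mean value per day
daily = indexed["value"].resample("D").mean()
print(len(day), len(daily))
```

# Taking the minimum over several runs reduces the effect of other activity on the machine;
# whether the indexed frame comes out faster will depend on the pandas version and hardware.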