Created February 11, 2019 04:12
[前処理大全 Awesome Python] Python code marked as "Awesome" in 前処理大全 #Python
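# Thanks: https://github.com/ghmagazine/awesomebook
import pandas as pd

# df is assumed to be an existing pandas DataFrame with one row per reservation
# Filter: keep rows whose checkout_date falls within the range (inclusive)
df.query('"2016-10-13" <= checkout_date <= "2016-10-14"')
# Sampling
df.sample(frac=0.5)  # Random sample of 50% of the rows
df.sample(n=100)  # Sample a fixed number of rows (n)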
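# Sampling by aggregation ID
# ===
# When sampling, keep in mind that proportions can shift.
# If you sample 50% of reservation data in which each row is one hotel booking,
# the proportions at the reservation level can be assumed to stay the same,
# but other proportions (such as the share of customers) may change.
# Sample 50% of the unique customer IDs, then keep every row for those customers
target = pd.Series(df['customer_id'].unique()).sample(frac=0.5)
df[df['customer_id'].isin(target)]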
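
The snippets above can be tried end to end with a small sketch. The toy DataFrame below and the use of random_state=0 are assumptions added for illustration and are not part of the original gist; only the column names customer_id and checkout_date come from it. The sketch contrasts row-level sampling, which may keep only some of a customer's reservations, with customer-level sampling, which keeps all reservations for each selected customer.

```python
import pandas as pd

# Hypothetical toy data: one row per reservation (values are made up).
df = pd.DataFrame({
    "customer_id": ["c1", "c1", "c1", "c2", "c3", "c3", "c4", "c4"],
    "checkout_date": [
        "2016-10-12", "2016-10-13", "2016-10-14", "2016-10-13",
        "2016-10-14", "2016-10-15", "2016-10-13", "2016-10-16",
    ],
})

# Filter: ISO-formatted date strings sort correctly as plain strings,
# so a chained string comparison selects the inclusive date range.
in_range = df.query('"2016-10-13" <= checkout_date <= "2016-10-14"')

# Row-level sampling: keeps 50% of the reservations, but a customer's
# reservations may be only partially included.
# random_state is an assumption added here for reproducibility.
row_sample = df.sample(frac=0.5, random_state=0)

# Customer-level sampling: pick 50% of the unique customer IDs,
# then keep every reservation belonging to those customers.
target = pd.Series(df["customer_id"].unique()).sample(frac=0.5, random_state=0)
customer_sample = df[df["customer_id"].isin(target)]

# Compare how many reservations each sampled customer retains.
full_counts = df["customer_id"].value_counts()
print(row_sample["customer_id"].value_counts())
print(customer_sample["customer_id"].value_counts())
print(full_counts)
```

In the customer-level sample, each selected customer's reservation count matches the count in the full data, so per-customer statistics such as reservations per customer are not distorted by the sampling.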