@panicpotatoe
Created March 12, 2019 02:53
# STEP 1: GENERATE A RANDOM DATASET
import numpy as np
import pandas as pd

# Seed NumPy's random number generator so the sampled data are reproducible
# https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.seed.html
np.random.seed(10)

# Sample 1000 voter races at fixed probabilities (the values in p sum to 1)
voter_race = np.random.choice(a=["asian", "black", "hispanic", "other", "white"],
                              p=[0.05, 0.15, 0.25, 0.05, 0.5],
                              size=1000)

# Sample 1000 party affiliations at fixed probabilities, independently of race
voter_party = np.random.choice(a=["democrat", "independent", "republican"],
                               p=[0.4, 0.2, 0.4],
                               size=1000)

# Combine the two arrays (voter_race and voter_party) into a DataFrame
voters = pd.DataFrame({"race": voter_race,
                       "party": voter_party})

# In a notebook, evaluating the DataFrame as the last expression displays it
voters
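
As a quick sanity check on the sampling step above, the observed share of each category can be compared with the probabilities passed to `np.random.choice`. This is a minimal sketch and not part of the original gist; the names `race_share` and `party_share` are introduced here for illustration only. With 1000 draws, the shares should land close to the `p` values but will not match them exactly.

```python
import numpy as np
import pandas as pd

# Rebuild the dataset as in STEP 1 (same seed, same order of calls)
np.random.seed(10)
voter_race = np.random.choice(a=["asian", "black", "hispanic", "other", "white"],
                              p=[0.05, 0.15, 0.25, 0.05, 0.5],
                              size=1000)
voter_party = np.random.choice(a=["democrat", "independent", "republican"],
                               p=[0.4, 0.2, 0.4],
                               size=1000)
voters = pd.DataFrame({"race": voter_race,
                       "party": voter_party})

# Fraction of rows in each category (hypothetical helper names, not from the gist)
race_share = voters["race"].value_counts(normalize=True).sort_index()
party_share = voters["party"].value_counts(normalize=True).sort_index()

print(race_share)
print(party_share)
```

Because the seed is fixed, rerunning this block gives the same shares every time; changing or removing the seed would produce slightly different values.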