@kperry2215
Created October 23, 2019 01:34
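from collections import defaultdict

import pandas as pd
from sklearn.preprocessing import LabelEncoder

#Read in the cancer data set (the file has no header row)
df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer/breast-cancer.data', header=None)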
#Declare the column names of the cancer data set
df.columns=["Class", "Age", "Menopause",
"Tumor_Size", "Inv_Nodes",
"Node_Caps", "Deg_Malig",
"Breast", "Breast_quad",
"Irradiat"]
#Convert all of the categorical variables to numeric codes with LabelEncoder
#(the defaultdict keeps one fitted encoder per column, keyed by column name)
d = defaultdict(LabelEncoder)
df_label_encoded = df.apply(lambda x: d[x.name].fit_transform(x))
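
Because each column gets its own fitted encoder in the dictionary, the integer codes can be mapped back to the original labels later. The following is a minimal sketch of that round trip, assuming a small hand-made frame in place of the downloaded file; the rows and the names `sample`, `encoders`, `encoded`, and `decoded` are illustrative and not part of the original gist.

```python
from collections import defaultdict

import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical rows in the style of the data set (not real records)
sample = pd.DataFrame({
    "Class": ["no-recurrence-events", "recurrence-events", "no-recurrence-events"],
    "Age": ["30-39", "40-49", "30-39"],
    "Breast": ["left", "right", "left"],
})

# One LabelEncoder per column, created on first access by column name
encoders = defaultdict(LabelEncoder)

# Encode: each column is fitted and transformed by its own encoder
encoded = sample.apply(lambda col: encoders[col.name].fit_transform(col))

# Decode: reuse the stored encoders to recover the original labels
decoded = encoded.apply(lambda col: encoders[col.name].inverse_transform(col))
```

LabelEncoder assigns codes in sorted order of the labels, so the codes carry no meaning beyond identity; for ordered categories such as age ranges, an explicit mapping may be preferable.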