Ben-Epstein/0fb2ed1b9c643e59d25bfdec0f8eba3d
Created November 5, 2020 15:26
Load Iris Data into Spark
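from sklearn.datasets import load_iris
from pyspark.sql import SparkSession
import pandas as pd
import numpy as np

# Reuse the active session (e.g. in a notebook) or start a new one
spark = SparkSession.builder.getOrCreate()

data = load_iris()
# Column name cleanup: 'sepal length (cm)' -> 'sepal_length'
cols = [i.replace('(cm)', '').strip().replace(' ', '_') for i in data.feature_names] + ['label']
# np.c_ stacks features and target into one float array, so 'label' is stored as a double
pdf = pd.DataFrame(np.c_[data.data, data.target], columns=cols)
df = spark.createDataFrame(pdf)
df.show()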
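The column name cleanup can be checked on its own, without Spark or scikit-learn installed. The sketch below is only an illustration: the helper name `clean_feature_name` is hypothetical (not part of the gist), and the feature names are hard-coded to the four strings that `load_iris().feature_names` returns.

```python
def clean_feature_name(name: str) -> str:
    """Drop the '(cm)' unit suffix and replace spaces with underscores."""
    return name.replace('(cm)', '').strip().replace(' ', '_')


# Hard-coded copy of load_iris().feature_names, so this runs without scikit-learn
feature_names = [
    'sepal length (cm)',
    'sepal width (cm)',
    'petal length (cm)',
    'petal width (cm)',
]

# Same expression as in the snippet above, plus the target column
cols = [clean_feature_name(n) for n in feature_names] + ['label']
print(cols)
```

The resulting names contain no spaces or parentheses, so they can be referenced in Spark SQL expressions without backtick quoting.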