Created March 9, 2019 10:32
import findspark
# Locate the Spark installation before anything from pyspark is imported
findspark.init()

from pyspark.sql import SparkSession
from handyspark import *
from matplotlib import pyplot as plt
# Jupyter magic: render plots inline (notebook only)
%matplotlib inline

spark = SparkSession.builder.getOrCreate()
# Download the dataset here:
# https://raw.githubusercontent.com/dvgodoy/handyspark/master/tests/rawdata/train.csv
# Loads training data for Titanic dataset
sdf = spark.read.csv('train.csv', header=True, inferSchema=True)
# Makes Spark dataframe Handy :-)
hdf = sdf.toHandy()
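
The snippet expects `train.csv` to be in the working directory already. As a minimal sketch of that download step, the helper below fetches the file with the standard library only. The name `fetch_dataset` and the constant `TRAIN_CSV_URL` are hypothetical and are not part of HandySpark; the URL is the one given in the comment above.

```python
import os
import urllib.request

# URL copied from the comment in the snippet; the constant name is an assumption.
TRAIN_CSV_URL = (
    "https://raw.githubusercontent.com/dvgodoy/handyspark/"
    "master/tests/rawdata/train.csv"
)


def fetch_dataset(url=TRAIN_CSV_URL, dest="train.csv"):
    """Download `url` to `dest` unless `dest` already exists, then return `dest`."""
    if not os.path.exists(dest):
        # urlretrieve copies the remote resource to the local path
        urllib.request.urlretrieve(url, dest)
    return dest
```

Calling `fetch_dataset()` once before `spark.read.csv('train.csv', ...)` should be enough. Because the helper skips the download when the file exists, re-running the notebook cell does not fetch the file again.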