@welly87
Created October 1, 2020 03:17

welly87 commented Oct 1, 2020

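# Collect the whole Spark DataFrame to the driver as a pandas DataFrame.
# select("*") is redundant here; the data must fit in driver memory.
pdf = sdf.toPandas()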
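
Since pyarrow is installed in this notebook, the pandas conversion can probably be sped up by switching Arrow on first. A minimal sketch, not part of the original gist: `arrow_conf_key` and `to_pandas_with_arrow` are hypothetical helper names, and the assumption is that the config key is `spark.sql.execution.arrow.enabled` on Spark 2.x and `spark.sql.execution.arrow.pyspark.enabled` on Spark 3.x.

```python
def arrow_conf_key(spark_version):
    """Pick the Arrow switch for a Spark version string such as '2.4.7'."""
    major = int(spark_version.split(".")[0])
    if major >= 3:
        return "spark.sql.execution.arrow.pyspark.enabled"
    return "spark.sql.execution.arrow.enabled"


def to_pandas_with_arrow(spark, sdf):
    """Enable Arrow on the session, then collect sdf to the driver as pandas."""
    spark.conf.set(arrow_conf_key(spark.version), "true")
    return sdf.toPandas()
```

Usage would be `pdf = to_pandas_with_arrow(spark, sdf)`, with `spark` and `sdf` being the session and DataFrame used elsewhere in this thread.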

welly87 commented Oct 1, 2020

sdf.createOrReplaceTempView("california_housing")

sqlDF = spark.sql("SELECT sum(population) FROM california_housing WHERE total_rooms > 1000")
sqlDF.head()

welly87 commented Oct 1, 2020

!apt-get install openjdk-8-jdk-headless -qq > /dev/null
!wget -q https://downloads.apache.org/spark/spark-2.4.7/spark-2.4.7-bin-hadoop2.7.tgz
!tar xf spark-2.4.7-bin-hadoop2.7.tgz

!pip install -q findspark
!pip install -q pyarrow
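
The snippets in this thread use `spark` without showing how the session is created. A minimal sketch of that step, under assumptions: the JDK path `/usr/lib/jvm/java-8-openjdk-amd64` is the usual Colab location for the package installed above but is not stated in the gist, the Spark path matches the tarball unpacked in `/content`, and `spark_env` / `start_spark` are hypothetical helper names.

```python
import os


def spark_env(spark_home="/content/spark-2.4.7-bin-hadoop2.7",
              java_home="/usr/lib/jvm/java-8-openjdk-amd64"):
    """Return the environment variables Spark needs to locate Java and itself."""
    return {"JAVA_HOME": java_home, "SPARK_HOME": spark_home}


def start_spark(app_name="colab"):
    """Export the variables, then start a local SparkSession."""
    os.environ.update(spark_env())
    # Imported lazily so spark_env() can be used without Spark installed.
    import findspark
    findspark.init()  # puts $SPARK_HOME/python on sys.path
    from pyspark.sql import SparkSession
    return (SparkSession.builder
            .master("local[*]")
            .appName(app_name)
            .getOrCreate())
```

Calling `spark = start_spark()` in the next cell should then give the session that the other comments rely on.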

welly87 commented Oct 1, 2020

!rm -rf *.tgz
!rm -rf spark-3.0.1-bin-hadoop2.7/

welly87 commented Oct 1, 2020

!wget https://github.com/welly87/spark-load/raw/master/mysql-connector-java-8.0.14.jar
!mv /content/mysql-connector-java-8.0.14.jar /content/spark-2.4.7-bin-hadoop2.7/jars/mysql-connector-java-8.0.14.jar
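
With the connector jar in Spark's `jars/` directory, a MySQL table can presumably be read through Spark's JDBC data source. This is a sketch, not something shown in the gist: the host, database, table and credentials are placeholders, `mysql_jdbc_options` and `read_mysql_table` are hypothetical helpers, and `com.mysql.cj.jdbc.Driver` is assumed as the driver class for Connector/J 8.x. The jar has to be in place before the SparkSession (and its JVM) is started.

```python
def mysql_jdbc_options(host, database, table, user, password, port=3306):
    """Build the option dict for Spark's JDBC reader (all values are placeholders)."""
    return {
        "url": "jdbc:mysql://{}:{}/{}".format(host, port, database),
        "driver": "com.mysql.cj.jdbc.Driver",  # assumed driver class for Connector/J 8.x
        "dbtable": table,
        "user": user,
        "password": password,
    }


def read_mysql_table(spark, **kwargs):
    """Load one MySQL table as a Spark DataFrame using the options above."""
    return spark.read.format("jdbc").options(**mysql_jdbc_options(**kwargs)).load()
```

For example, `read_mysql_table(spark, host="localhost", database="test", table="housing", user="root", password="secret")`, where every argument value is made up for illustration.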
