@diloabininyeri
Created August 9, 2020 21:53
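from pyspark.sql import SparkSession
import numpy as np

# Start (or reuse) a Spark session.
spark = SparkSession.builder.getOrCreate()
# Read the CSV, treating the first line as column names.
df = spark.read.format('csv').option('header', 'true').load('/home/zeus/Desktop/deneme/file.csv')
# Count rows per county, sort by ascending count, and pull the result to the driver.
rows = df.groupBy('county').count().orderBy('count').collect()
# Each Row is a (county, count) tuple, so this is a 2-D array; NumPy coerces the mixed values to strings.
array = np.array(rows)
print(array)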
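
For a file small enough to fit in memory, the same group, count, and sort step can be sketched with the standard library alone, which is a handy way to check what the Spark pipeline above should return. This is a minimal sketch, not the gist author's method: `count_by_column` and the `SAMPLE` data are hypothetical names and values made up for illustration, since the contents of `file.csv` are not shown. Ties are broken by value here so the order is stable, which Spark's `orderBy('count')` does not promise.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_text, column):
    """Return (value, count) pairs for `column`, sorted by ascending count."""
    reader = csv.DictReader(io.StringIO(csv_text))  # first line is the header
    counts = Counter(row[column] for row in reader)
    # Sort by count, then by value, so that ties come out in a stable order.
    return sorted(counts.items(), key=lambda pair: (pair[1], pair[0]))


# Hypothetical sample data; the real file.csv is not shown in the gist.
SAMPLE = "county,name\nKent,a\nEssex,b\nKent,c\nKent,d\nEssex,e\nDevon,f\n"

pairs = count_by_column(SAMPLE, "county")
print(pairs)  # [('Devon', 1), ('Essex', 2), ('Kent', 3)]
```

Because the pairs stay as Python tuples, the counts remain integers rather than being turned into strings.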