Suvro Banerjee SuvroBaner

SuvroBaner / WordCounter.py
Last active June 21, 2018 08:34
This code counts the number of words in a text file using Spark.
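As a rough illustration of the logic, here is a minimal sketch in plain Python rather than Spark. The function name `count_words` and the sample lines are hypothetical, and it assumes that words are separated by whitespace, which the description does not state.

```python
def count_words(lines):
    """Count whitespace-separated words, skipping empty lines.

    Plain-Python stand-in for the Spark job; not the gist's own code.
    """
    # Same predicate as the Spark filter step: keep lines with at least one character.
    non_empty = (line for line in lines if len(line) > 0)
    # Assumption: a "word" is any run of non-whitespace characters.
    return sum(len(line.split()) for line in non_empty)


# Hypothetical sample input standing in for the lines of a text file.
sample = ["Apache Spark", "", "Spark is a unified analytics engine", ""]
print(count_words(sample))
```

In Spark the same steps would run as distributed transformations over an RDD instead of a generator over a local list.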
# Create the Spark configuration and the Spark context.
from pyspark import SparkContext, SparkConf
conf = SparkConf().setAppName("Word Counter")
sc = SparkContext(conf=conf)
# Read the file as an RDD with one element per line.
myTextFile = sc.textFile("/Users/bsuvro/spark-2.3.0-bin-hadoop2.7/README.md")
# Drop the empty lines (lines that contain only whitespace are kept).
non_emptyLines = myTextFile.filter(lambda line: len(line) > 0)