@parsa-asgari
Forked from koverholt/spark_numpy.py
Created November 17, 2018 08:23
Simple NumPy example in Spark
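import numpy as np
from pyspark import SparkConf, SparkContext

# Point the driver at a standalone Spark master; replace <HOSTNAME> with the master's host.
conf = SparkConf()
conf.setMaster("spark://<HOSTNAME>:7077")
conf.setAppName("NumpyMult")
sc = SparkContext(conf=conf)


def mult(x):
    # Multiply one element by a one-element NumPy array; the result has shape (1,).
    y = np.array([2])
    return x * y


# Distribute the integers 0..9999, double each one on the workers, and collect on the driver.
x = np.arange(10000)
distData = sc.parallelize(x)
results = distData.map(mult).collect()
print(results)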
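
Because `mult` multiplies each element by `np.array([2])`, the collected `results` should be a list of one-element arrays rather than a flat list of numbers. The sketch below shows that shape without a cluster. It assumes a plain list comprehension is a fair stand-in for `distData.map(mult).collect()`, and the names `local_results` and `flat` are illustrative only, not part of the original gist.

```python
import numpy as np


def mult(x):
    # Same function as in the gist: scalar times a one-element array.
    y = np.array([2])
    return x * y


# Local stand-in for distData.map(mult).collect() (assumption: no Spark involved here).
x = np.arange(10)
local_results = [mult(v) for v in x]

# Each entry is an array of shape (1,); concatenate to get one flat array.
flat = np.concatenate(local_results)
print(flat.tolist())
```

If a flat list of plain integers is wanted directly from Spark, one option is to have the mapped function return a scalar (for example `int(x) * 2`) instead of an array.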