I'm receiving a strange error on a new install of Spark. I have set up a small 3-node Spark cluster on top of an existing Hadoop instance. Every command I try to run in the pyspark shell fails with the same error.
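For example, even a trivial action fails (this is just an illustration; assuming sc is the SparkContext the shell creates, and any RDD action behaves the same way):

    # illustrative only -- every action I run hits the same error
    sc.parallelize(range(100)).count()

which produces: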
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/spark/python/pyspark/rdd.py", line 1041, in count
    return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum()
  File "/opt/spark/python/pyspark/rdd.py", line 1032, in sum
    return self.mapPartitions(lambda x: [sum(x)]).fold(0, operator.add)
  File "/opt/spark/python/pyspark/rdd.py", line 906, in fold