@argenisleon
Last active August 24, 2018 16:40
Basic Operations Pandas Vs Spark Vs Optimus
| Operation | Pandas | Spark | Optimus |
| --- | --- | --- | --- |
| Create DataFrame | `pd.DataFrame()` | `spark.createDataFrame()` | `op.create.df()` |
| Append Column | `df.join()`, `pd.concat()` | `df.withColumn()` | `df.cols.append()` |
| Append Row | `df.append()` | `df.union()` | `df.rows.append()` |
| Filter Column | `df.filter(axis=1)` | `df.select()` | `df.cols.select()` |
| Filter Row | `df.filter()` | `df.filter()` | `df.rows.select()` |
| Apply | `df.apply()` | `fn = F.udf(lambda x: x + 1, DoubleType())`; `df.withColumn('disp1', fn(df.disp))` | `df.cols.apply()` |
| Drop Column | `df.drop(axis=1)` | `df.drop()` | `df.cols.drop()` |
| Drop Row | `df.drop()` | `df.filter()` | `df.rows.drop()` |
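
As a rough illustration of the Pandas column, here is a minimal sketch that runs each operation on a small made-up frame. The column names (`name`, `disp`, `hp`) and the values are assumptions chosen for the example (only `disp` and `disp1` echo the Spark entry in the table). Two choices differ from the table on purpose: rows are appended with `pd.concat()` because `DataFrame.append()` was removed in pandas 2.0, and rows are filtered by value with a boolean mask because `df.filter()` in pandas matches index labels, not cell values.

```python
import pandas as pd

# Create DataFrame (illustrative data, not from the gist)
df = pd.DataFrame({"name": ["a", "b", "c"], "disp": [1.0, 2.0, 3.0]})

# Append column: concatenate side by side (df.join(extra) would also work)
extra = pd.DataFrame({"hp": [110, 93, 175]})
df = pd.concat([df, extra], axis=1)

# Append row: pd.concat, since DataFrame.append was removed in pandas 2.0
new_row = pd.DataFrame({"name": ["d"], "disp": [4.0], "hp": [66]})
df = pd.concat([df, new_row], ignore_index=True)

# Filter columns by label
cols = df.filter(items=["name", "disp"], axis=1)

# Filter rows by value with a boolean mask
big = df[df["disp"] > 2.0]

# Apply a function element-wise to one column
df["disp1"] = df["disp"].apply(lambda x: x + 1)

# Drop a column
no_hp = df.drop("hp", axis=1)

# Drop a row by its index label
no_first = df.drop(0)
```

The Spark and Optimus columns follow the same order of operations, but they need a running Spark session, so they are left out of this sketch.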