| Operation | Pandas | Spark | Optimus |
|---|---|---|---|
| Create DataFrame | pd.DataFrame() | spark.createDataFrame() | op.create.df() |
| Append Column | df.join(), pd.concat() | df.withColumn() | df.cols.append() |
| Append Row | pd.concat() (df.append() was removed in pandas 2.0) | df.union() | df.rows.append() |
| Filter Column | df.filter(axis=1) | df.select() | df.cols.select() |
| Filter Row | df.filter(axis=0), df.query() | df.filter() | df.rows.select() |
| Apply | df.apply() | fn = F.udf(lambda x: x + 1, DoubleType()); df.withColumn('disp1', fn(df.disp)) | df.cols.apply() |
| Drop Column | df.drop(axis=1) | df.drop() | df.cols.drop() |
| Drop Row | df.drop() | df.filter() | df.rows.drop() |
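
The Pandas column of the table can be sketched end to end as below. This is a minimal illustration, not part of the original gist: the data and the column names (`name`, `disp`, `hp`, `disp1`) are made up for the example, and row filtering uses a boolean mask rather than `df.filter()`, since `DataFrame.filter` selects by label, not by value.

```python
import pandas as pd

# Create DataFrame (illustrative data; column names are assumptions for this sketch)
df = pd.DataFrame({"name": ["a", "b", "c"], "disp": [1.0, 2.0, 3.0]})

# Append column: concatenate along axis=1 (rows are aligned on the index)
extra = pd.DataFrame({"hp": [110, 93, 175]})
df = pd.concat([df, extra], axis=1)

# Append row: concatenate along axis=0 (DataFrame.append was removed in pandas 2.0)
new_row = pd.DataFrame({"name": ["d"], "disp": [4.0], "hp": [105]})
df = pd.concat([df, new_row], ignore_index=True)

# Filter columns by label
cols = df.filter(items=["name", "disp"], axis=1)

# Filter rows by value with a boolean mask
big = df[df["disp"] > 2.0]

# Apply a function to one column and store the result in a new column
df["disp1"] = df["disp"].apply(lambda x: x + 1)

# Drop a column
no_hp = df.drop("hp", axis=1)

# Drop a row by its index label
no_first = df.drop(0)
```

The Spark and Optimus columns follow the same sequence of steps but need a running Spark session, so they are left out of this sketch.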
Last active
August 24, 2018 16:40
Basic Operations Pandas Vs Spark Vs Optimus