@aialenti
Created September 13, 2020 14:50
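from pyspark.sql import SparkSession
from pyspark.sql.functions import col

# Reuse the active session (or create one) so `spark` is defined outside a notebook
spark = SparkSession.builder.getOrCreate()

# Read the source table in Parquet format
sales_table = spark.read.parquet("./data/sales_parquet")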
# Equivalent SQL for the transformation below
'''
SELECT order_id,
product_id,
seller_id,
date,
num_pieces_sold,
bill_raw_text,
num_pieces_sold % 2 AS num_pieces_sold_is_even
FROM sales_table a
'''
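# Add the remainder of num_pieces_sold / 2 as a new column.
# Despite the column name, the value is 0 for even counts and 1 for odd counts.
sales_table_execution_plan = sales_table.withColumn(
    "num_pieces_sold_is_even", col("num_pieces_sold") % 2
)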
sales_table_execution_plan.show()