@NeerajBhadani
NeerajBhadani / win_rowsBetween_agg.scala
Created May 25, 2020 15:22
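Apply the custom window specification
import org.apache.spark.sql.functions._
// Assumes winSpec from win_rowsBetween_spec.scala is in scope and empsalary is an existing DataFrame.
val rows_between_df = empsalary.withColumn("max_salary", max("salary").over(winSpec))
rows_between_df.show()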
NeerajBhadani / win_rowsBetween_spec.scala
Created May 25, 2020 15:21
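Define the custom window specification
import org.apache.spark.sql.expressions.Window
// Frame: from one row before to one row after the current row, per department, ordered by salary.
val winSpec = Window.partitionBy("depName")
  .orderBy("salary").rowsBetween(-1, 1)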
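To make the `rowsBetween(-1, 1)` frame concrete, here is a minimal plain-Scala sketch (no Spark) of what `max("salary")` over that frame would compute for one partition. The object name `RowsFrameSketch`, the helper `maxOverRows`, and the sample salaries are hypothetical and only illustrate the frame arithmetic.

```scala
// Plain-Scala illustration of a row-based frame; not Spark code.
object RowsFrameSketch {
  // For row i, the frame covers rows (i + start) to (i + end), clipped to the partition.
  // Assumes start <= 0 <= end, so the frame always contains the current row.
  def maxOverRows(sorted: Vector[Long], start: Int, end: Int): Vector[Long] =
    sorted.indices.toVector.map { i =>
      val lo = math.max(0, i + start)
      val hi = math.min(sorted.length - 1, i + end)
      sorted.slice(lo, hi + 1).max
    }

  def main(args: Array[String]): Unit = {
    // Hypothetical salaries of one department, already sorted ascending.
    val salaries = Vector(3900L, 4200L, 4500L, 4800L, 5200L)
    println(maxOverRows(salaries, -1, 1))
  }
}
```

Because the partition is sorted ascending, the maximum within this frame is the next row's salary, or the row's own salary for the last row.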
NeerajBhadani / win_rangeBetween_agg_boundary.scala
Created May 25, 2020 15:19
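Apply a custom window specification with a custom boundary
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._
// Frame: rows whose salary is at least 300 above the current row's salary, with no upper bound.
val winSpec = Window.partitionBy("depName").orderBy("salary")
  .rangeBetween(300L, Window.unboundedFollowing)
val range_unbounded_df = empsalary.withColumn("max_salary", max("salary").over(winSpec))
range_unbounded_df.show()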
NeerajBhadani / win_rangeBetween_agg.scala
Created May 25, 2020 15:17
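Apply the custom window specification
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._
// Define the frame before using it: rows whose salary is 100 to 300 above the current row's salary.
val winSpec = Window.partitionBy("depName")
  .orderBy("salary")
  .rangeBetween(100L, 300L)
val range_between_df = empsalary.withColumn("max_salary", max("salary").over(winSpec))
range_between_df.show()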
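Unlike `rowsBetween`, a `rangeBetween` frame is defined by the value of the ordering column, not by row position. The following plain-Scala sketch (no Spark) mimics `max("salary")` over the two range frames used in these snippets, `(100L, 300L)` and `(300L, Window.unboundedFollowing)`. The names `RangeFrameSketch` and `maxOverRange`, the use of `None` for an unbounded upper edge, and the sample salaries are assumptions made for illustration. An empty frame is modeled as `None`, where Spark would return null.

```scala
// Plain-Scala illustration of a value-based frame; not Spark code.
object RangeFrameSketch {
  // For a row with value v, the frame holds every row whose value lies in
  // [v + start, v + end]. end = None stands in for an unbounded upper edge.
  def maxOverRange(values: Vector[Long], start: Long, end: Option[Long]): Vector[Option[Long]] =
    values.map { v =>
      val frame = values.filter(x => x >= v + start && end.forall(e => x <= v + e))
      if (frame.isEmpty) None else Some(frame.max)
    }

  def main(args: Array[String]): Unit = {
    // Hypothetical salaries of one department.
    val salaries = Vector(3900L, 4200L, 4500L, 4800L, 5200L)
    println(maxOverRange(salaries, 100L, Some(300L)))
    println(maxOverRange(salaries, 300L, None))
  }
}
```

With a positive lower offset the current row is never part of its own frame, which is why the highest salaries end up with an empty frame.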
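import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._
// lead("salary", 2): the salary two rows after the current row in the partition; null if there is no such row.
val winSpec = Window.partitionBy("depName").orderBy("salary")
val lead_df =
  empsalary.withColumn("lead", lead("salary", 2).over(winSpec))
lead_df.show()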
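import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._
// lag("salary", 2): the salary two rows before the current row in the partition; null if there is no such row.
val winSpec = Window.partitionBy("depName").orderBy("salary")
val lag_df =
  empsalary.withColumn("lag", lag("salary", 2).over(winSpec))
lag_df.show()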
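The offset behavior of `lead` and `lag` can be sketched in plain Scala (no Spark) for a single sorted partition. `LeadLagSketch` and the sample salaries are hypothetical; `None` plays the role of the null that Spark returns when the offset falls outside the partition.

```scala
// Plain-Scala illustration of lead/lag offsets; not Spark code.
object LeadLagSketch {
  // Value `offset` rows after the current row, or None past the end of the partition.
  def lead[A](sorted: Vector[A], offset: Int): Vector[Option[A]] =
    sorted.indices.toVector.map(i => sorted.lift(i + offset))

  // Value `offset` rows before the current row, or None before the start of the partition.
  def lag[A](sorted: Vector[A], offset: Int): Vector[Option[A]] =
    sorted.indices.toVector.map(i => sorted.lift(i - offset))

  def main(args: Array[String]): Unit = {
    // Hypothetical salaries of one department, already sorted ascending.
    val salaries = Vector(3900L, 4200L, 4500L, 4800L, 5200L)
    println(lead(salaries, 2))
    println(lag(salaries, 2))
  }
}
```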
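import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._
// cume_dist(): fraction of rows in the partition whose salary is less than or equal to the current row's salary.
val winSpec = Window.partitionBy("depName").orderBy("salary")
val cume_dist_df =
  empsalary.withColumn("cume_dist", cume_dist().over(winSpec))
cume_dist_df.show()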
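As a rough check on what `cume_dist` reports, this plain-Scala sketch (no Spark) computes the same ratio for one partition ordered ascending: the number of rows with a value less than or equal to the current one, divided by the partition size. `CumeDistSketch` and the sample salaries, including the tie, are hypothetical.

```scala
// Plain-Scala illustration of cumulative distribution; not Spark code.
object CumeDistSketch {
  // Share of rows whose value is <= the current row's value (ties share the same result).
  def cumeDist(values: Vector[Long]): Vector[Double] = {
    val n = values.length.toDouble
    values.map(v => values.count(_ <= v) / n)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical salaries of one department, with one tie.
    println(cumeDist(Vector(3900L, 4200L, 4200L, 4800L)))
  }
}
```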
val ntile_df = empsalary.withColumn("ntile", ntile(3).over(winSpec))
ntile_df.show()
val percent_rank_df = empsalary.withColumn("percent_rank", percent_rank().over(winSpec))
percent_rank_df.show()
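The bucket assignment of `ntile` and the formula behind `percent_rank` can be sketched in plain Scala (no Spark) for one partition ordered ascending. `RankingSketch`, its helpers, and the sample values are hypothetical. The sketch assumes the usual definitions: `ntile` gives the first `rowCount % buckets` buckets one extra row, and `percent_rank` is `(rank - 1) / (rows - 1)`, with a single-row partition mapped to 0.0.

```scala
// Plain-Scala illustration of ntile and percent_rank; not Spark code.
object RankingSketch {
  // Bucket number (1-based) for each row position of a sorted partition.
  def ntile(rowCount: Int, buckets: Int): Vector[Int] = {
    val base = rowCount / buckets   // minimum rows per bucket
    val extra = rowCount % buckets  // the first `extra` buckets get one more row
    (1 to buckets).toVector.flatMap { b =>
      Vector.fill(if (b <= extra) base + 1 else base)(b)
    }
  }

  // (rank - 1) / (rows - 1), where rank - 1 is the count of strictly smaller values.
  def percentRank(values: Vector[Long]): Vector[Double] =
    if (values.length <= 1) values.map(_ => 0.0)
    else values.map(v => values.count(_ < v).toDouble / (values.length - 1))

  def main(args: Array[String]): Unit = {
    println(ntile(5, 3))
    println(percentRank(Vector(3900L, 4200L, 4200L, 4800L)))
  }
}
```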