@sadikovi
Created July 28, 2017 00:03
Spark SQL UDF for StructType
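import org.apache.spark.sql._
import org.apache.spark.sql.types._
import org.apache.spark.sql.expressions._
// udf and struct live in functions; spark-shell imports this automatically
import org.apache.spark.sql.functions._

// Outside spark-shell, `import spark.implicits._` is also needed for toDF and $"..."
val df = Seq(
  ("str", 1, 0.2)
).toDF("a", "b", "c").
  withColumn("struct", struct($"a", $"b", $"c"))

// UDF for struct: the struct column reaches the function as a Row
val func = udf((x: Any) => {
  x match {
    case Row(a: String, b: Int, c: Double) =>
      s"$a-$b-$c"
    case other =>
      // Reached for null structs or fields whose types do not match
      sys.error(s"something else: $other")
  }
}, StringType)

df.withColumn("d", func($"struct")).show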
/*
+---+---+---+-----------+---------+
|  a|  b|  c|     struct|        d|
+---+---+---+-----------+---------+
|str|  1|0.2|[str,1,0.2]|str-1-0.2|
+---+---+---+-----------+---------+
*/
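
The UDF above uses the untyped `udf(f, dataType)` overload. As far as I know, Spark 3.x deprecates that overload and rejects it by default unless `spark.sql.legacy.allowUntypedScalaUDF` is enabled. A possible alternative, sketched here and not part of the original gist, is to declare the argument as `Row` so the return type is inferred. The local `SparkSession`, the app name, and the names `typedFunc` and `result` are assumptions for illustration only.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.{struct, udf}

// Assumption: a local session for illustration; spark-shell already provides `spark`
val spark = SparkSession.builder().master("local[1]").appName("struct-udf-sketch").getOrCreate()
import spark.implicits._

val df = Seq(
  ("str", 1, 0.2)
).toDF("a", "b", "c").
  withColumn("struct", struct($"a", $"b", $"c"))

// Typed variant: the struct arrives as a Row, fields are read by name,
// and the String return type is inferred instead of passed as StringType
val typedFunc = udf((r: Row) =>
  s"${r.getAs[String]("a")}-${r.getAs[Int]("b")}-${r.getAs[Double]("c")}"
)

// Collect the single computed value so it can be inspected
val result = df.withColumn("d", typedFunc($"struct")).select("d").as[String].head
```

Reading fields by name avoids depending on field order inside the struct. Unlike the pattern match, though, it has no fallback branch, so a null struct or null field would need explicit handling.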