@cherniag
Created June 11, 2018 15:35
Spark UDF that checks whether a map column contains a given key
// given: a DataFrame column of type Map<String, String>
import org.apache.spark.sql.Column;
import org.apache.spark.sql.expressions.UserDefinedFunction;
import org.apache.spark.sql.functions;
import org.apache.spark.sql.types.DataTypes;
import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.lit;
// define the UDF: Spark passes MapType values to Java UDFs as scala.collection.Map
private UserDefinedFunction keyContains = functions.udf(
    (scala.collection.Map<String, String> m, String key) -> m.contains(key),
    DataTypes.BooleanType
);
// register the UDF under a name callable via callUDF or SQL
sparkSession.udf().register("keyContains", keyContains);
// use: build a filter Column that is true when the map contains the key
Column filter = callUDF("keyContains", col("mapColumnName"), lit("keyToCheck"));
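The resulting `Column` can be passed to `Dataset.filter` to keep only the rows whose map contains the key. A minimal self-contained sketch (the class name, the column name `attrs`, the key `color`, and the sample rows are hypothetical; assumes a local SparkSession and Spark on the classpath):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.expressions.UserDefinedFunction;
import org.apache.spark.sql.functions;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;
import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.lit;

public class KeyContainsExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local[1]").appName("keyContains").getOrCreate();

        // schema with a single Map<String, String> column named "attrs" (hypothetical)
        StructType schema = new StructType()
                .add("attrs", DataTypes.createMapType(DataTypes.StringType, DataTypes.StringType));

        // one row whose map has the key we look for, one without it
        Map<String, String> withColor = new HashMap<>();
        withColor.put("color", "red");
        Map<String, String> withoutColor = new HashMap<>();
        withoutColor.put("size", "XL");

        Dataset<Row> df = spark.createDataFrame(
                Arrays.asList(RowFactory.create(withColor), RowFactory.create(withoutColor)),
                schema);

        // define and register the UDF from the snippet above
        UserDefinedFunction keyContains = functions.udf(
                (scala.collection.Map<String, String> m, String key) -> m.contains(key),
                DataTypes.BooleanType);
        spark.udf().register("keyContains", keyContains);

        // keep only rows whose map contains the key "color"
        df.filter(callUDF("keyContains", col("attrs"), lit("color"))).show();

        spark.stop();
    }
}
```

Note that `keyContains` could also be used without registration by calling `keyContains.apply(col("attrs"), lit("color"))`; registering it additionally makes it available to SQL queries.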