@zoltanctoth
Last active July 15, 2023 13:23
Writing a UDF for withColumn in PySpark
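from pyspark.sql import SparkSession
from pyspark.sql.types import StringType
from pyspark.sql.functions import udf

spark = SparkSession.builder.getOrCreate()  # reuses the active session if one exists
# Wrap a Python lambda as a UDF that returns a string
maturity_udf = udf(lambda age: "adult" if age >= 18 else "child", StringType())
df = spark.createDataFrame([{'name': 'Alice', 'age': 1}])
# withColumn returns a new DataFrame, so the result must be reassigned
df = df.withColumn("maturity", maturity_udf(df.age))
df.show()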
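
One thing the lambda above does not cover is a missing age: Spark passes nulls to a Python UDF as `None`, and `None >= 18` raises a `TypeError` in Python 3. A minimal sketch of a null-safe version of the same logic, written as a plain function so it can be checked without a Spark session (the name `maturity` is an illustrative assumption, not part of the original gist):

```python
from typing import Optional


def maturity(age: Optional[int]) -> Optional[str]:
    """Classify an age; return None for a missing value instead of raising."""
    if age is None:
        return None
    return "adult" if age >= 18 else "child"


# Plain-Python checks, no Spark session needed
print(maturity(1))     # child
print(maturity(18))    # adult
print(maturity(None))  # None
```

If this fits your data, the function can be wrapped the same way as the lambda, e.g. `udf(maturity, StringType())`.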
@Weiyu-Luo

I encountered this problem too. Have you solved it? Thank you!
