@anjijava16
Created December 11, 2020 00:45
####################################################################################
UDF VS UDAF VS UDTF
1. UDF: A user-defined function works on a single row of a table and produces a single value for that row, so there is a one-to-one relationship between the function's input and output, e.g. Hive's built-in TRIM() function.
The class extends UDF.
We have to define one or more evaluate() methods in our class; evaluate() can be overloaded for different argument types.
2. UDAF: A user-defined aggregate function works on more than one row and gives a single row as output, e.g. Hive's built-in MAX() or COUNT() functions.
The class extends UDAF.
We need to implement five methods, init(), iterate(), terminatePartial(), merge() and terminate(), in an evaluator class that implements UDAFEvaluator.
3. UDTF: A user-defined table-generating function works on one row as input and returns multiple rows as output, so here the relation is one-to-many, e.g. Hive's built-in EXPLODE() function.
The class extends GenericUDTF.
We need to override three methods, namely initialize(), process() and close(), in our class.
####################################################################################
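The three contracts above can be sketched in plain Java. This is a dependency-free illustration, not real Hive code: the class names TrimSketch, MaxSketch and ExplodeSketch are hypothetical, the method signatures are simplified, and the Consumer passed to ExplodeSketch is an assumed stand-in for the forward() call a real GenericUDTF uses to emit rows. In an actual Hive project the classes would extend UDF, UDAF (with a UDAFEvaluator) and GenericUDTF, and work with Hive's own argument types.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical, dependency-free sketches that only mirror the method names
// from the notes. They do NOT extend the real Hive base classes.

/** UDF shape: one input row -> one output value (like TRIM()). */
class TrimSketch {
    // Hive locates evaluate() by name; it may be overloaded per argument type.
    String evaluate(String s) {
        return s == null ? null : s.trim();
    }
}

/** UDAF shape: many input rows -> one output value (like MAX()). */
class MaxSketch {
    private Integer max; // null means "no rows seen yet"

    // Reset the aggregation state.
    void init() {
        max = null;
    }

    // Called once per input row.
    boolean iterate(Integer value) {
        if (value != null && (max == null || value > max)) {
            max = value;
        }
        return true;
    }

    // Partial result handed from one task to the next stage.
    Integer terminatePartial() {
        return max;
    }

    // Fold another task's partial result into this state.
    boolean merge(Integer partial) {
        return iterate(partial);
    }

    // Final result of the whole aggregation.
    Integer terminate() {
        return max;
    }
}

/** UDTF shape: one input row -> zero or more output rows (like EXPLODE()). */
class ExplodeSketch {
    // Assumed stand-in for the row-emitting forward() of a real UDTF.
    private final Consumer<String> collector;

    ExplodeSketch(Consumer<String> collector) {
        this.collector = collector;
    }

    // A real UDTF declares its output columns here; nothing to set up in the sketch.
    void initialize() {
    }

    // Called once per input row; emits one output row per array element.
    void process(List<String> array) {
        if (array == null) {
            return;
        }
        for (String element : array) {
            collector.accept(element);
        }
    }

    // Called after the last input row; nothing to flush in the sketch.
    void close() {
    }
}
```

The call order is what matters: a UDF gets evaluate() once per row; a UDAF gets init(), then iterate() per row, with terminatePartial() and merge() combining the results of separate tasks before terminate(); a UDTF gets initialize() once, process() per row, and close() at the end.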