@mgamba
Last active December 15, 2015 20:59
How to write a simple UDF in Hive

Create a simple Java class that adds two ints

// Plusser.java
package org.apache.hadoop.hive.contrib.udf.example;

import org.apache.hadoop.hive.ql.exec.UDF;

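public class Plusser extends UDF {

  // Hive passes SQL NULL as a Java null; return NULL instead of throwing
  public Integer evaluate(Integer a, Integer b) {
    if (a == null || b == null) {
      return null;
    }
    return a + b;
  }

}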
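
The addition logic can be sanity-checked in plain Java before Hive is involved. This is only a sketch: `PlusserCheck` is a hypothetical stand-in that mirrors the `evaluate` signature but does not extend `UDF`, so it compiles without hive-exec on the classpath. The null check is an assumption about the behaviour you want when a column value is NULL (unboxing a null `Integer` would otherwise throw a `NullPointerException`).

```java
// PlusserCheck.java (hypothetical stand-in, not part of Hive)
final class PlusserCheck {

    // Same shape as Plusser.evaluate, with an explicit null guard
    public Integer evaluate(Integer a, Integer b) {
        if (a == null || b == null) {
            return null; // treat a missing input as a NULL result
        }
        return a + b;
    }

    public static void main(String[] args) {
        PlusserCheck check = new PlusserCheck();
        System.out.println(check.evaluate(1, 4));    // prints 5
        System.out.println(check.evaluate(null, 4)); // prints null
    }
}
```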

compile (adjust the hive-exec path to match your install)

bash$ javac -classpath /usr/local/Cellar/hive/0.9.0/libexec/lib/hive-exec-0.9.0.jar Plusser.java

namespacing! make sure the directory structure matches the package name

bash$ mkdir -p org/apache/hadoop/hive/contrib/udf/example
bash$ mv Plusser.class org/apache/hadoop/hive/contrib/udf/example/Plusser.class
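
The mapping from package name to directory is mechanical: each dot in the fully qualified class name becomes a path separator. A minimal sketch, where `PackagePath` and `classFileFor` are hypothetical names used only for illustration:

```java
// PackagePath.java (hypothetical helper, for illustration only)
final class PackagePath {

    // Turns a fully qualified class name into the relative path
    // the .class file must have inside the jar
    static String classFileFor(String fullyQualifiedName) {
        return fullyQualifiedName.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        // prints org/apache/hadoop/hive/contrib/udf/example/Plusser.class
        System.out.println(
            classFileFor("org.apache.hadoop.hive.contrib.udf.example.Plusser"));
    }
}
```

If the path inside the jar does not match this, the class cannot be found under its package-qualified name.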

jar it up

bash$ jar -cfv plusser.jar org

start hive

bash$ hive

add the jar file

hive> ADD JAR /hive-testing/plusser.jar;
hive> list jars;
file:/usr/local/Cellar/hive/0.9.0/libexec/lib/hive-builtins-0.9.0.jar
/hive-testing/plusser.jar

create a temporary function

hive> CREATE TEMPORARY FUNCTION plusser AS 'org.apache.hadoop.hive.contrib.udf.example.Plusser';
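
The string after AS is a fully qualified class name, which is why the package and the jar layout have to agree. Conceptually, the class is loaded by that name and a method called `evaluate` matching the argument types is looked up on it. The sketch below imitates that with plain Java reflection; it is not Hive's actual resolution code, and `ReflectiveCall`, `Adder` and `call` are hypothetical names standing in for the real pieces.

```java
import java.lang.reflect.Method;

// ReflectiveCall.java (illustration only, not Hive internals)
final class ReflectiveCall {

    // Hypothetical stand-in for the compiled UDF class
    public static class Adder {
        public Integer evaluate(Integer a, Integer b) {
            if (a == null || b == null) {
                return null;
            }
            return a + b;
        }
    }

    // Loads a class by name and invokes its evaluate method.
    // Arguments must be non-null here, since their runtime
    // types are used to pick the method.
    static Object call(String className, Object... args) throws Exception {
        Class<?> cls = Class.forName(className);
        Object instance = cls.getDeclaredConstructor().newInstance();
        Class<?>[] types = new Class<?>[args.length];
        for (int i = 0; i < args.length; i++) {
            types[i] = args[i].getClass();
        }
        Method evaluate = cls.getMethod("evaluate", types);
        return evaluate.invoke(instance, args);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(call(Adder.class.getName(), 1, 4)); // prints 5
    }
}
```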

use it:

hive> SELECT plusser(1, 4) FROM some_table LIMIT 1;