@ericzhong
Last active November 24, 2017 10:03
Oozie Installation and Usage

Installation

Environment:

CentOS 7.4
java-1.8.0-openjdk
maven-3.0.5

Download the source:

wget http://apache.mirror.anlx.net/oozie/4.3.0/oozie-4.3.0.tar.gz
tar xvf oozie-4.3.0.tar.gz
cd oozie-4.3.0/

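Build the binary distribution:

# Overriding the versions of other dependencies tends to break the build; see the official documentation for the default versions.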
bin/mkdistro.sh -DskipTests -Puber -Phadoop-2 -Dhadoop.version=2.8.2
cp distro/target/oozie-4.3.0-distro.tar.gz ..
cd ..
rm -rf oozie-4.3.0          # Careful: the source and binary tarballs unpack to the same directory name
tar xvf oozie-4.3.0-distro.tar.gz
cd oozie-4.3.0

Set the environment variable:

echo "export PATH=`pwd`/bin:\$PATH" | sudo tee /etc/profile.d/oozie.sh
source /etc/profile.d/oozie.sh
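
The quoting in the command above matters: the backticks around pwd expand when the command runs, which fixes the install directory, while \$PATH is written literally so that it expands at each login. A minimal sketch that writes the same line to a scratch file for inspection; the /opt/oozie-4.3.0 path and the variable names are hypothetical stand-ins, not part of the original setup.

```shell
# Hypothetical install directory standing in for the output of `pwd`.
oozie_home="/opt/oozie-4.3.0"

# Write the snippet to a scratch file instead of /etc/profile.d/oozie.sh.
snippet="$(mktemp)"
echo "export PATH=${oozie_home}/bin:\$PATH" > "$snippet"

# The directory is expanded now; $PATH stays literal until login time.
cat "$snippet"
```

If the printed line looks right, the same echo can be piped to sudo tee as shown above.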

Edit Hadoop's core-site.xml, replacing the bracketed placeholders (example values are shown in the comments):

  <!-- OOZIE -->
  <property>
    <name>hadoop.proxyuser.[OOZIE_SERVER_USER].hosts</name>     <!-- vagrant -->
    <value>[OOZIE_SERVER_HOSTNAME]</value>                      <!-- localhost -->
  </property>
  <property>
    <name>hadoop.proxyuser.[OOZIE_SERVER_USER].groups</name>   <!-- vagrant -->
    <value>[USER_GROUPS_THAT_ALLOW_IMPERSONATION]</value>      <!-- vagrant -->
  </property>
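
To make the placeholder substitution concrete, here is a small sketch that renders the two properties with the example values from the comments above (vagrant, localhost, vagrant). The shell variable names are illustrative assumptions; the output still has to be pasted into core-site.xml by hand.

```shell
# Example values taken from the inline comments; adjust for your cluster.
oozie_user="vagrant"
oozie_host="localhost"
oozie_groups="vagrant"

# Render the proxyuser properties with the placeholders filled in.
fragment="$(cat <<EOF
  <property>
    <name>hadoop.proxyuser.${oozie_user}.hosts</name>
    <value>${oozie_host}</value>
  </property>
  <property>
    <name>hadoop.proxyuser.${oozie_user}.groups</name>
    <value>${oozie_groups}</value>
  </property>
EOF
)"
printf '%s\n' "$fragment"
```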

Create Oozie's configuration file conf/hadoop-conf/core-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Keep this consistent with the Hadoop configuration -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

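Put the dependency package in the following directory:

mkdir libext/
# Download ext-2.2.zip (the ExtJS library used by the Oozie web console) into libext/:
# http://archive.cloudera.com/gplextras/misc/ext-2.2.zip

Start Hadoop, then run: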

oozie-setup.sh sharelib create -fs hdfs://localhost:9000

Create the database:

ooziedb.sh create -sqlfile oozie.sql -run

Build the WAR that bundles the packages in libext:

oozie-setup.sh prepare-war

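Start the service (use one of the two forms):

oozied.sh start   # start in the background
oozied.sh run     # or: run in the foreground
# oozied.sh stop

Log path: logs/oozie.log

Check the service status: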

$ oozie admin -oozie http://localhost:11000/oozie -status
...
System mode: NORMAL
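
Right after startup the server may not answer yet, so a single status call can fail. The sketch below polls a status command until its output contains "System mode: NORMAL". The function name wait_for_normal and its argument order are hypothetical; only the oozie admin invocation in the trailing comment comes from this guide.

```shell
# Poll a status command until it reports "System mode: NORMAL".
# Usage: wait_for_normal ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for_normal() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Run the status command; errors are ignored while the server is down.
    if "$@" 2>/dev/null | grep -q 'System mode: NORMAL'; then
      return 0
    fi
    # Sleep only when another attempt will follow.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Typical call once Oozie is installed (not executed here):
# wait_for_normal 30 2 oozie admin -oozie http://localhost:11000/oozie -status
```

The function returns 0 as soon as the mode line appears and 1 after the last failed attempt, so it can gate later steps in a setup script.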

Running the Example

tar xvf oozie-examples.tar.gz
hdfs dfs -put examples examples

Edit examples/apps/map-reduce/job.properties:

nameNode=hdfs://localhost:9000
# ResourceManager application-manager port
jobTracker=localhost:8032
queueName=default
examplesRoot=examples
user.name=vagrant

oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/map-reduce/workflow.xml
outputDir=map-reduce
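
It helps to see what oozie.wf.application.path resolves to, because it has to match where hdfs dfs -put placed the examples (a relative HDFS path lands under the user's home directory, /user/vagrant here). A minimal sketch that mirrors the variable substitution in shell, assuming the values above; user_name is a stand-in because shell variable names cannot contain dots.

```shell
# Values copied from job.properties; user.name becomes user_name in shell.
nameNode="hdfs://localhost:9000"
examplesRoot="examples"
user_name="vagrant"

# Mirror the ${...} substitution applied to oozie.wf.application.path.
app_path="${nameNode}/user/${user_name}/${examplesRoot}/apps/map-reduce/workflow.xml"
echo "$app_path"

# Once HDFS is running, the file can be checked with (not executed here):
# hdfs dfs -test -e "$app_path" && echo "workflow found"
```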

Run the example:

$ export OOZIE_URL="http://localhost:11000/oozie"    # can be used instead of the "-oozie URL" option
$ oozie job -config examples/apps/map-reduce/job.properties -run
...
job: 0000000-171105131511434-oozie-vagr-W

Check the status (you can also open the URL above in a browser):

oozie job -info 0000000-171105131511434-oozie-vagr-W
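
Copying the job ID by hand gets tedious when a job is resubmitted repeatedly. The sketch below pulls the ID out of the "job: ..." line so it can be passed to -info. The helper name extract_job_id is hypothetical, and the sample line is the one printed above.

```shell
# Print the ID from the first line of the form "job: <id>" on standard input.
extract_job_id() {
  sed -n 's/^job: *//p' | head -n 1
}

# Sample output taken from the run above.
job_id="$(printf '%s\n' '...' 'job: 0000000-171105131511434-oozie-vagr-W' | extract_job_id)"
echo "$job_id"

# With a live server this becomes (not executed here):
# job_id="$(oozie job -config examples/apps/map-reduce/job.properties -run | extract_job_id)"
# oozie job -info "$job_id"
```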

Issues

HBase versions supported by Oozie

https://github.com/apache/oozie/blob/master/release-log.txt

-- Oozie 5.0.0 release (trunk - unreleased)
...
OOZIE-2670 Upgrade Hbase to 1.2 (gezapeti via asasvari)

HBase 1.2 is supported only from Oozie 5.0 onward, which has not been released yet.

Troubleshooting

Could not find artifact org.apache.hbase:hbase:jar:1.2.6 in central

HBase was later split into multiple artifacts, so the repository has no JAR named hbase for version 1.2.6. In pom.xml and core/pom.xml, change the dependency to hbase-common:

                <groupId>org.apache.hbase</groupId>
                <!-- <artifactId>hbase</artifactId> -->
                <artifactId>hbase-common</artifactId>
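
The same edit can be scripted rather than made by hand in both files. This is a sketch under the assumption that the dependency appears exactly as <artifactId>hbase</artifactId>; unlike the snippet above it replaces the line instead of commenting out the old one, and the helper name use_hbase_common is hypothetical.

```shell
# Rewrite the old monolithic artifact ID to hbase-common in one POM file.
# A .bak copy of the original file is kept next to it.
use_hbase_common() {
  sed -i.bak 's#<artifactId>hbase</artifactId>#<artifactId>hbase-common</artifactId>#' "$1"
}

# Demonstration on a scratch file rather than the real pom.xml.
demo_pom="$(mktemp)"
cat > "$demo_pom" <<'EOF'
                <groupId>org.apache.hbase</groupId>
                <artifactId>hbase</artifactId>
EOF
use_hbase_common "$demo_pom"
cat "$demo_pom"

# In the source tree this would be (not executed here):
# use_hbase_common pom.xml && use_hbase_common core/pom.xml
```

Reviewing the change with diff against the .bak copy before building is a cheap safety check.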

mv: cannot stat ‘.../oozie-4.3.0/lib/WEB-INF/lib/*.jar’: No such file or directory

yum install unzip -y
rm -rf lib

Error: A JNI error has occurred, please check your installation and try again

After a failed sharelib creation, delete the lib directory before running it again:

rm -rf lib

The method getJspApplicationContext(ServletContext) is undefined for the type JspFactory

This error appears when opening http://localhost:11000/oozie/. The fix is:

cd oozie-server/webapps/oozie/WEB-INF/lib/
mv servlet-api-2.5.jar servlet-api-2.5.jar.bak
mv jsp-api-2.0.jar jsp-api-2.0.jar.bak
cd -
oozied.sh stop
oozied.sh start

The cause is possibly a conflict with the JARs in the embedded Tomcat:

$ yum provides servlet-*.jar     # returns tomcat-*
$ grep tomcat * -R
release-log.txt:GH-0131 add an embedded tomcat in Oozie distribution
$ find . | grep "servlet.*.jar"
./oozie-server/lib/servlet-api.jar
./oozie-server/webapps/oozie/WEB-INF/lib/guice-servlet-3.0.jar
./oozie-server/webapps/oozie/WEB-INF/lib/servlet-api-2.5.jar
./lib/guice-servlet-3.0.jar
./lib/servlet-api-2.5.jar

It may instead be a conflict with JARs inside the WAR; this is not confirmed yet ...

File /user/vagrant/share/lib does not exist

The job fails, and this error message is shown on the "Job Error Log" page.

$ hdfs dfs -ls /user/vagrant/share/lib
Found 1 items
drwxr-xr-x   - vagrant supergroup          0 2017-11-05 11:39 /user/vagrant/share/lib/lib_20171105113935

$ oozie admin -shareliblist 
...
[Available ShareLib]

The directory exists, so the problem is in the configuration. Copying Hadoop's core-site.xml into conf/hadoop-conf/ fixes it.
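
A minimal sketch of that copy step, with a check that the two files really are identical afterwards. The helper name sync_core_site and the $HADOOP_HOME/etc/hadoop location in the trailing comment are assumptions; use whatever directory holds your Hadoop configuration.

```shell
# Copy core-site.xml from a Hadoop config directory into Oozie's
# hadoop-conf directory and confirm both files match afterwards.
# Usage: sync_core_site HADOOP_CONF_DIR OOZIE_HADOOP_CONF_DIR
sync_core_site() {
  src_dir="$1"; dst_dir="$2"
  mkdir -p "$dst_dir" &&
    cp "$src_dir/core-site.xml" "$dst_dir/core-site.xml" &&
    cmp -s "$src_dir/core-site.xml" "$dst_dir/core-site.xml"
}

# Typical call from the Oozie directory (not executed here):
# sync_core_site "$HADOOP_HOME/etc/hadoop" conf/hadoop-conf
```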

Could not lookup launched hadoop Job ID [...] which was associated with action ...

First, start the Job History Server:

mr-jobhistory-daemon.sh start historyserver

Then check Oozie's configuration file conf/hadoop-conf/core-site.xml:

<!-- Keep this consistent with the Hadoop configuration -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Running the example requires editing examples/apps/map-reduce/job.properties:

nameNode=hdfs://localhost:9000
jobTracker=localhost:8032
queueName=default
examplesRoot=examples
user.name=vagrant

oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/map-reduce/workflow.xml
outputDir=map-reduce

Restart the Oozie service after changing the configuration.
