Forked from airawat/00-OozieWorkflowHdfsAndEmailActions
Created
November 8, 2017 20:23
Oozie workflow application with FS and email actions;
Includes sample data, workflow components, commands.
This gist includes the components of a simple workflow application that creates a directory and moves files within
HDFS to this directory.
Emails are sent out to notify designated users of the success or failure of the workflow. There is a prepare section
to allow re-runs of the action; the prepare step essentially negates the move done by a potential prior run
of the action. Sample data is also included.

The sample application includes:
--------------------------------
1. Oozie actions: hdfs action and email action
2. Oozie workflow controls: start, end, and kill
3. Workflow components: job.properties and workflow.xml
4. Sample data
5. Commands to deploy, submit and run the workflow
6. Oozie web console - screenshots from sample program execution

Pictorial overview of workflow:
-------------------------------
Available at:
http://hadooped.blogspot.com/2013/06/apache-oozie-part-1-workflow-with-hdfs.html
Workflow Components:
--------------------
1. job.properties
   File containing:
   a) parameter and value declarations that are referenced in the workflows, and
   b) environment information referenced by Oozie to run the workflow, including the name node, job tracker, workflow application path, etc.
2. workflow.xml
   Workflow definition file
Download location:
------------------
GitHub - https://github.com/airawat/OozieSamples
Email me at [email protected] if you have access issues.

Directory structure applicable for this post/gist/blog:
-------------------------------------------------------
oozieProject
    logs
        airawat-syslog
            <<node>>
                <<year>>
                    <<month>>
                        messages
    workflowHdfsAndEmailActions
        job.properties
        workflow.xml
Oozie SMTP configuration
------------------------
Add the following to oozie-site.xml and restart Oozie.
Replace the values with those specific to your environment.

<!-- SMTP params-->
<property>
    <name>oozie.email.smtp.host</name>
    <value>cdh-dev01</value>
</property>
<property>
    <name>oozie.email.smtp.port</name>
    <value>25</value>
</property>
<property>
    <name>oozie.email.from.address</name>
    <value>oozie@cdh-dev01</value>
</property>
<property>
    <name>oozie.email.smtp.auth</name>
    <value>false</value>
</property>
<property>
    <name>oozie.email.smtp.username</name>
    <value></value>
</property>
<property>
    <name>oozie.email.smtp.password</name>
    <value></value>
</property>
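
Before relying on the email action, it can help to confirm that the SMTP relay accepts unauthenticated mail from the configured sender. The following is a minimal sketch in Python, not part of the original gist; the function names and the message text are hypothetical, and the host, port and sender are simply copied from the properties above.

```python
import smtplib
from email.message import EmailMessage

# Values copied from the oozie-site.xml properties above; adjust for your cluster.
SMTP_HOST = "cdh-dev01"
SMTP_PORT = 25
FROM_ADDRESS = "oozie@cdh-dev01"


def build_test_message(to_address):
    """Build a plain-text message resembling what the email action would send."""
    msg = EmailMessage()
    msg["From"] = FROM_ADDRESS
    msg["To"] = to_address
    msg["Subject"] = "Oozie SMTP smoke test"
    msg.set_content("If you can read this, the SMTP relay accepts mail from the Oozie sender.")
    return msg


def send_test_message(to_address):
    """Send the message without logging in (mirrors oozie.email.smtp.auth=false)."""
    with smtplib.SMTP(SMTP_HOST, SMTP_PORT, timeout=10) as server:
        server.send_message(build_test_message(to_address))
```

Calling send_test_message("akhanolk@cdh-dev01") from the Oozie server host would exercise the same relay path; if it raises a connection or relay error, the email action is likely to fail in the same way.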
#*****************************
# job.properties
#*****************************
nameNode=hdfs://cdh-nn01.hadoop.com:8020
jobTracker=cdh-jt01:8021
queueName=default
oozie.libpath=${nameNode}/user/oozie/share/lib
oozie.use.system.libpath=true
oozie.wf.rerun.failnodes=true
oozieProjectRoot=${nameNode}/user/${user.name}/oozieProject
oozie.wf.application.path=${oozieProjectRoot}/workflowHdfsAndEmailActions
dataInputDirectoryAbsPath=${oozieProjectRoot}/logs/airawat-syslog
makeDirectoryAbsPath=${oozieProjectRoot}/dataDump
dataDestinationDirectoryRelativePath=oozieProject/dataDump
emailToAddress=akhanolk@cdh-dev01
#*******End************************

Note: The line "oozie.wf.rerun.failnodes=true" is needed if you want to re-run the workflow; it re-runs only the failed nodes. Alternatively, "oozie.wf.rerun.skip.nodes" can be set to list the nodes to skip on re-run. Review the Apache Oozie documentation for details.
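
The ${...} references in job.properties are expanded before the workflow runs, so it is worth checking what the paths resolve to. The following Python sketch only illustrates that expansion on a subset of the properties above; it is not how Oozie itself is implemented, and the user name "akhanolk" is an assumption taken from the email address in the file.

```python
import re

# A subset of the job.properties entries shown above.
JOB_PROPERTIES = """\
nameNode=hdfs://cdh-nn01.hadoop.com:8020
oozieProjectRoot=${nameNode}/user/${user.name}/oozieProject
oozie.wf.application.path=${oozieProjectRoot}/workflowHdfsAndEmailActions
dataInputDirectoryAbsPath=${oozieProjectRoot}/logs/airawat-syslog
makeDirectoryAbsPath=${oozieProjectRoot}/dataDump
"""

VAR = re.compile(r"\$\{([^}]+)\}")


def parse_properties(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props


def resolve(props, extra):
    """Substitute ${name} references repeatedly until nothing changes."""
    known = {**extra, **props}
    resolved = {}
    for key, value in props.items():
        for _ in range(10):  # bounded passes; guards against cyclic references
            new_value = VAR.sub(lambda m: known.get(m.group(1), m.group(0)), value)
            if new_value == value:
                break
            value = new_value
        resolved[key] = value
    return resolved


# "akhanolk" is an assumed value for the user.name system property.
resolved = resolve(parse_properties(JOB_PROPERTIES), {"user.name": "akhanolk"})
for key, value in resolved.items():
    print(f"{key}={value}")
```

Under that assumption, makeDirectoryAbsPath resolves to hdfs://cdh-nn01.hadoop.com:8020/user/akhanolk/oozieProject/dataDump.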
<!--******************************************-->
<!--workflow.xml -->
<!--******************************************-->
<workflow-app name="WorkFlowForHDFSAndEmailActions" xmlns="uri:oozie:workflow:0.1">
    <start to="hdfsCommands"/>
    <action name="hdfsCommands">
        <fs>
            <mkdir path='${makeDirectoryAbsPath}'/>
            <move source='${dataInputDirectoryAbsPath}' target='${dataDestinationDirectoryRelativePath}'/>
        </fs>
        <ok to="sendEmailSuccess"/>
        <error to="sendEmailKill"/>
    </action>
    <action name="sendEmailSuccess">
        <email xmlns="uri:oozie:email-action:0.1">
            <to>${emailToAddress}</to>
            <subject>Status of workflow ${wf:id()}</subject>
            <body>The workflow ${wf:id()} completed successfully</body>
        </email>
        <ok to="end"/>
        <error to="end"/>
    </action>
    <action name="sendEmailKill">
        <email xmlns="uri:oozie:email-action:0.1">
            <to>${emailToAddress}</to>
            <subject>Status of workflow ${wf:id()}</subject>
            <body>The workflow ${wf:id()} had issues and was killed. The error message is: ${wf:errorMessage(wf:lastErrorNode())}</body>
        </email>
        <ok to="killJobFSAction"/>
        <error to="killJobFSAction"/>
    </action>
    <kill name="killJobFSAction">
        <message>"Killed job due to error in FS Action"</message>
    </kill>
    <end name="end"/>
</workflow-app>
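
Every "to" attribute in workflow.xml must name a node that exists, otherwise the workflow cannot follow that transition. The Python sketch below is a hypothetical pre-deployment check, not an Oozie tool: it parses a trimmed copy of the workflow above (action bodies shortened) and reports transitions whose target is not a named node.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the workflow above; action bodies are shortened for brevity.
WORKFLOW_XML = """\
<workflow-app name="WorkFlowForHDFSAndEmailActions" xmlns="uri:oozie:workflow:0.1">
    <start to="hdfsCommands"/>
    <action name="hdfsCommands">
        <fs><mkdir path='${makeDirectoryAbsPath}'/></fs>
        <ok to="sendEmailSuccess"/>
        <error to="sendEmailKill"/>
    </action>
    <action name="sendEmailSuccess">
        <email xmlns="uri:oozie:email-action:0.1"><to>${emailToAddress}</to></email>
        <ok to="end"/>
        <error to="end"/>
    </action>
    <action name="sendEmailKill">
        <email xmlns="uri:oozie:email-action:0.1"><to>${emailToAddress}</to></email>
        <ok to="killJobFSAction"/>
        <error to="killJobFSAction"/>
    </action>
    <kill name="killJobFSAction"><message>killed</message></kill>
    <end name="end"/>
</workflow-app>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def find_dangling_transitions(xml_text):
    """Return (element, target) pairs whose 'to' target is not a named node."""
    root = ET.fromstring(xml_text)
    # Named nodes are the direct children of workflow-app carrying a 'name'.
    node_names = {child.get("name") for child in root if child.get("name")}
    dangling = []
    for element in root.iter():
        target = element.get("to")
        if target is not None and target not in node_names:
            dangling.append((local_name(element.tag), target))
    return dangling


print(find_dangling_transitions(WORKFLOW_XML))
```

For the workflow as written the list is empty; renaming a node without updating the transitions that point at it would make the affected transitions show up in the result.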
Commands to load data
----------------------
a) Load data
$ hadoop fs -mkdir oozieProject
$ hadoop fs -put oozieProject/* oozieProject/

b) Validate load
$ hadoop fs -ls -R oozieProject | awk '{print $8}'
You should see...
oozieProject/logs/airawat-syslog/<<node>>/<<year>>/<<month>>/messages
oozieProject/workflowHdfsAndEmailActions/job.properties
oozieProject/workflowHdfsAndEmailActions/workflow.xml

$ hadoop fs -rm -R oozieProject/data
Oozie commands
--------------
Note: Replace the Oozie server and port with your cluster-specific values.

1) Submit job:
$ oozie job -oozie http://cdh-dev01:11000/oozie -config oozieProject/workflowHdfsAndEmailActions/job.properties -submit
job: 0000001-130712212133144-oozie-oozi-W

2) Run job:
$ oozie job -oozie http://cdh-dev01:11000/oozie -start 0000001-130712212133144-oozie-oozi-W

3) Check the status:
$ oozie job -oozie http://cdh-dev01:11000/oozie -info 0000001-130712212133144-oozie-oozi-W

4) Suspend workflow:
$ oozie job -oozie http://cdh-dev01:11000/oozie -suspend 0000001-130712212133144-oozie-oozi-W

5) Resume workflow:
$ oozie job -oozie http://cdh-dev01:11000/oozie -resume 0000001-130712212133144-oozie-oozi-W

6) Re-run workflow:
$ oozie job -oozie http://cdh-dev01:11000/oozie -config oozieProject/workflowHdfsAndEmailActions/job.properties -rerun 0000001-130712212133144-oozie-oozi-W

7) Should you need to kill the job:
$ oozie job -oozie http://cdh-dev01:11000/oozie -kill 0000001-130712212133144-oozie-oozi-W

8) View server logs:
$ oozie job -oozie http://cdh-dev01:11000/oozie -log 0000001-130712212133144-oozie-oozi-W

Logs are also available under /var/log/oozie on the Oozie server.
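
Most of the commands above differ only in the action flag and reuse the job id printed by -submit. A minimal Python sketch of that pattern follows; the helper names are hypothetical, and the block only builds the argument lists and does not call the Oozie server.

```python
import re

# Oozie URL as used in the commands above; replace with your own server and port.
OOZIE_URL = "http://cdh-dev01:11000/oozie"

JOB_ID_PATTERN = re.compile(r"^job:\s*(\S+)\s*$", re.MULTILINE)


def parse_job_id(submit_output):
    """Extract the workflow id from the 'job: <id>' line printed by -submit."""
    match = JOB_ID_PATTERN.search(submit_output)
    if match is None:
        raise ValueError("no 'job: <id>' line found in submit output")
    return match.group(1)


def job_command(action, job_id, oozie_url=OOZIE_URL):
    """Build argv for actions that take only a job id (start, info, suspend, resume, kill)."""
    return ["oozie", "job", "-oozie", oozie_url, f"-{action}", job_id]


job_id = parse_job_id("job: 0000001-130712212133144-oozie-oozi-W\n")
print(" ".join(job_command("start", job_id)))
```

The returned list can be passed to subprocess.run(..., check=True) on a host where the oozie client is installed.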
Program output:
---------------
Expected result:
1) The data in the logs directory should be in the directory named dataDump under the oozieProject directory.
2) The directory 'logs' should be deleted.
3) An email indicating the success or failure of the application should be sent.

1)
$ hadoop fs -ls -R oozieProject | awk '{print $8}'
oozieProject/dataDump/airawat-syslog
oozieProject/dataDump/airawat-syslog/cdh-dev01
oozieProject/dataDump/airawat-syslog/cdh-dev01/2013
oozieProject/dataDump/airawat-syslog/cdh-dev01/2013/04
oozieProject/dataDump/airawat-syslog/cdh-dev01/2013/04/messages
oozieProject/dataDump/airawat-syslog/cdh-dev01/2013/05
oozieProject/dataDump/airawat-syslog/cdh-dev01/2013/05/messages
oozieProject/dataDump/airawat-syslog/cdh-dn01
oozieProject/dataDump/airawat-syslog/cdh-dn01/2013
oozieProject/dataDump/airawat-syslog/cdh-dn01/2013/05
oozieProject/dataDump/airawat-syslog/cdh-dn01/2013/05/messages
oozieProject/dataDump/airawat-syslog/cdh-dn02
oozieProject/dataDump/airawat-syslog/cdh-dn02/2013
oozieProject/dataDump/airawat-syslog/cdh-dn02/2013/04
oozieProject/dataDump/airawat-syslog/cdh-dn02/2013/04/messages
oozieProject/dataDump/airawat-syslog/cdh-dn02/2013/05
oozieProject/dataDump/airawat-syslog/cdh-dn02/2013/05/messages
oozieProject/dataDump/airawat-syslog/cdh-dn03
oozieProject/dataDump/airawat-syslog/cdh-dn03/2013
oozieProject/dataDump/airawat-syslog/cdh-dn03/2013/04
oozieProject/dataDump/airawat-syslog/cdh-dn03/2013/04/messages
oozieProject/dataDump/airawat-syslog/cdh-dn03/2013/05
oozieProject/dataDump/airawat-syslog/cdh-dn03/2013/05/messages
oozieProject/dataDump/airawat-syslog/cdh-jt01
oozieProject/dataDump/airawat-syslog/cdh-jt01/2013
oozieProject/dataDump/airawat-syslog/cdh-jt01/2013/04
oozieProject/dataDump/airawat-syslog/cdh-jt01/2013/04/messages
oozieProject/dataDump/airawat-syslog/cdh-jt01/2013/05
oozieProject/dataDump/airawat-syslog/cdh-jt01/2013/05/messages
oozieProject/dataDump/airawat-syslog/cdh-nn01
oozieProject/dataDump/airawat-syslog/cdh-nn01/2013
oozieProject/dataDump/airawat-syslog/cdh-nn01/2013/05
oozieProject/dataDump/airawat-syslog/cdh-nn01/2013/05/messages
oozieProject/dataDump/airawat-syslog/cdh-vms
oozieProject/dataDump/airawat-syslog/cdh-vms/2013
oozieProject/dataDump/airawat-syslog/cdh-vms/2013/05
oozieProject/dataDump/airawat-syslog/cdh-vms/2013/05/messages
oozieProject/workflowHdfsAndEmailActions/job.properties
oozieProject/workflowHdfsAndEmailActions/workflow.xml
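
The first two expected results can be checked mechanically against a listing like the one above. The Python sketch below is a hypothetical helper that takes the path column of the listing and reports any deviation; the sample paths are a small excerpt of the output shown.

```python
def check_listing(paths):
    """Return a list of problems; an empty list means the layout matches the expected result."""
    problems = []
    # Expected result 2: nothing should remain under oozieProject/logs.
    if any(p == "oozieProject/logs" or p.startswith("oozieProject/logs/") for p in paths):
        problems.append("logs directory is still present")
    # Expected result 1: the syslog data should now sit under dataDump.
    if not any(p.startswith("oozieProject/dataDump/airawat-syslog") for p in paths):
        problems.append("no data found under dataDump")
    return problems


# Excerpt of the listing shown above.
sample = [
    "oozieProject/dataDump/airawat-syslog",
    "oozieProject/dataDump/airawat-syslog/cdh-dev01/2013/04/messages",
    "oozieProject/workflowHdfsAndEmailActions/job.properties",
    "oozieProject/workflowHdfsAndEmailActions/workflow.xml",
]
print(check_listing(sample))
```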
Email from the program
-----------------------
From [email protected] Sun Jul 14 23:08:46 2013
Return-Path: <[email protected]>
X-Original-To: akhanolk@cdh-dev01
Delivered-To: [email protected]
From: [email protected]
To: [email protected]
Subject: Status of workflow 0000006-130712212133144-oozie-oozi-W
Content-Type: text/plain; charset=us-ascii
Date: Sun, 14 Jul 2013 23:08:46 -0500 (CDT)
Status: R

The workflow 0000006-130712212133144-oozie-oozi-W completed successfully
Screenshots of the Oozie web console are available at:
------------------------------------------------------
http://hadooped.blogspot.com/2013/06/apache-oozie-part-1-workflow-with-hdfs.html