@PankajWorks
Last active December 6, 2022 13:52
Cleanup ambari and HDP

Get log directory locations in case you want to clean those directories

Log directories can be retrieved from the configs stored in the Ambari database. See https://cwiki.apache.org/confluence/display/AMBARI/Modify+configurations for how to read a config. You can run the configs.sh script from the ambari-server host.

Example (arguments are the action, the Ambari host, the cluster name, here "my", and the config type):

./configs.sh get localhost my hadoop-env | grep -i log_dir

  • Config type and corresponding log directory property for each service
    HDFS : "hadoop-env","hdfs_log_dir_prefix"
    YARN : "yarn-env","yarn_log_dir_prefix"
    MAPREDUCE2 : "mapred-env","mapred_log_dir_prefix"
    HIVE : "hive-env","hive_log_dir"
    HBASE : "hbase-env","hbase_log_dir"
    OOZIE : "oozie-env","oozie_log_dir"
    ZOOKEEPER : "zookeeper-env","zk_log_dir"
    STORM : "storm-env","storm_log_dir"
    FLUME : "flume-env","flume_log_dir"
    KAFKA : "kafka-env","kafka_log_dir"
    SPARK : "spark-env","spark_log_dir"
    FALCON : "falcon-env","falcon_log_dir"
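
The lookup above can be repeated for every service in the list. A minimal sketch, assuming configs.sh sits in the current directory on the ambari-server host and the cluster is named "my" as in the example. CONFIGS_SH, AMBARI_HOST, CLUSTER_NAME and print_log_dirs are names made up for this illustration, not part of Ambari:

```shell
#!/bin/sh
# Sketch only: query each config type from the list above and show its
# log directory property. All variable and function names are illustrative.
CONFIGS_SH="${CONFIGS_SH:-./configs.sh}"
AMBARI_HOST="${AMBARI_HOST:-localhost}"
CLUSTER_NAME="${CLUSTER_NAME:-my}"

# config-type:property pairs, copied from the list above
LOG_DIR_PROPS="
hadoop-env:hdfs_log_dir_prefix
yarn-env:yarn_log_dir_prefix
mapred-env:mapred_log_dir_prefix
hive-env:hive_log_dir
hbase-env:hbase_log_dir
oozie-env:oozie_log_dir
zookeeper-env:zk_log_dir
storm-env:storm_log_dir
flume-env:flume_log_dir
kafka-env:kafka_log_dir
spark-env:spark_log_dir
falcon-env:falcon_log_dir
"

print_log_dirs() {
  for pair in $LOG_DIR_PROPS; do
    config_type="${pair%%:*}"   # text before the colon
    property="${pair##*:}"      # text after the colon
    echo "== $config_type ($property)"
    # Same call as the example above, filtered to the one property
    "$CONFIGS_SH" get "$AMBARI_HOST" "$CLUSTER_NAME" "$config_type" 2>/dev/null \
      | grep -i "$property" || echo "   (not found; the service may not be installed)"
  done
}
```

Calling print_log_dirs prints one section per config type. Services that are not installed show up as "not found" rather than stopping the loop.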

Another, trickier way to get log directories

Find the files and directories owned by the different service owners.
Example 1 : find . -group service_owner -name "*.log"
Example 2 : find . -type d -user hdfs
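
The find approach can be looped over several service accounts. A rough sketch, assuming the accounts are named after the services (hdfs, yarn, hive and so on, which may differ on your cluster). find_owned_dirs is a made-up helper name:

```shell
#!/bin/sh
# Sketch: list directories under a root path owned by each given user.
# find_owned_dirs is an illustrative helper, not part of Ambari or HDP.
find_owned_dirs() {
  root="$1"
  shift
  for user in "$@"; do
    echo "# directories owned by $user under $root"
    # -xdev keeps find on one filesystem; errors (unknown user,
    # unreadable directories) are discarded
    find "$root" -xdev -type d -user "$user" 2>/dev/null
  done
}
```

For example, find_owned_dirs /var/log hdfs yarn hive prints one group of directories per account. Review the output before deleting anything, since ownership alone does not prove a directory holds only logs.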

Know the services installed

The request below stops a service by setting its state to INSTALLED (installed but not running):

curl -u ambari_USERNAME:ambari_PASSWORD -H "X-Requested-By: ambari" -X PUT -d '{"RequestInfo":{"context":"Stop Service"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' http://AMBARI_SERVER_HOST:8080/api/v1/clusters/CLUSTER_NAME/services/SERVICE_NAME
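
To list the installed services rather than stop one, the same endpoint can be queried with a GET and no service name. A minimal sketch, assuming the API pretty-prints its JSON with one "service_name" field per line. AMBARI_USER, AMBARI_PASSWORD, AMBARI_HOST and CLUSTER_NAME are placeholder variables, and parse_service_names and list_services are illustrative helper names:

```shell
#!/bin/sh
# Sketch: list the services Ambari knows about for one cluster.
AMBARI_HOST="${AMBARI_HOST:-localhost}"

# Print the value of every "service_name" field found in the JSON on stdin.
parse_service_names() {
  sed -n 's/.*"service_name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Fetch the service collection and reduce it to bare names.
list_services() {
  curl -s -u "$AMBARI_USER:$AMBARI_PASSWORD" -H "X-Requested-By: ambari" \
    "http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER_NAME/services" \
    | parse_service_names
}
```

The names printed by list_services can be substituted for SERVICE_NAME in the stop request. The sed-based parsing is a shortcut; a JSON-aware tool would be more robust if one is available on the host.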

Commands you need to run for deleting and cleaning out an ambari cluster

ambari-server stop
ambari-server reset
salt '*' cmd.run 'yum clean all'

You can also run the yum cleanup on all hosts using pssh

pssh -i -h /tmp/hosts.txt -x "-oStrictHostKeyChecking=no -i /tmp/hw-qe-keypair.pem" yum clean all

Run the host cleanup script on all the hosts

python /usr/lib/python2.6/site-packages/ambari_agent/HostCleanup.py --silent
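
To run the cleanup script on every host in one step, it can be wrapped in pssh. A sketch under stated assumptions: the hosts file is /tmp/hosts.txt, SSH keys are already set up for every host, and the script path is the same everywhere. HOSTS_FILE, DRY_RUN, run and cleanup_all_hosts are illustrative names:

```shell
#!/bin/sh
# Sketch: run HostCleanup.py on every host listed in a pssh hosts file.
HOSTS_FILE="${HOSTS_FILE:-/tmp/hosts.txt}"
CLEANUP_SCRIPT="/usr/lib/python2.6/site-packages/ambari_agent/HostCleanup.py"

# With DRY_RUN=1 the command is printed instead of executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

cleanup_all_hosts() {
  if [ ! -r "$HOSTS_FILE" ]; then
    echo "hosts file $HOSTS_FILE is not readable" >&2
    return 1
  fi
  # -i shows each host's output inline, -h names the hosts file,
  # -x passes extra options through to ssh
  run pssh -i -h "$HOSTS_FILE" -x "-oStrictHostKeyChecking=no" \
    python "$CLEANUP_SCRIPT" --silent
}
```

Running DRY_RUN=1 cleanup_all_hosts first shows the exact pssh command without touching any host, which is worth doing because the cleanup is destructive.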

Remove Hadoop package completely

yum remove 'hive*'
yum remove 'oozie*'
yum remove 'pig*'
yum remove 'zookeeper*'
yum remove 'tez*'
yum remove 'hbase*'
yum remove 'ranger*'
yum remove 'knox*'
yum remove 'storm*'
yum remove 'hadoop*'
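
Before removing packages by prefix, it can help to see which installed packages actually match. A minimal sketch, assuming an RPM-based host. HDP_PREFIXES, CONFIRM, filter_hdp_packages and remove_hdp_packages are illustrative names:

```shell
#!/bin/sh
# Sketch: preview (and optionally remove) installed packages whose names
# start with one of the prefixes used in the yum commands above.
HDP_PREFIXES="hive oozie pig zookeeper tez hbase ranger knox storm hadoop"

# Read package names on stdin, print those starting with a known prefix.
filter_hdp_packages() {
  pattern="^($(echo $HDP_PREFIXES | tr ' ' '|'))"
  grep -E "$pattern"
}

remove_hdp_packages() {
  # rpm -qa with a query format prints bare package names, one per line
  pkgs="$(rpm -qa --qf '%{NAME}\n' | filter_hdp_packages)"
  if [ -z "$pkgs" ]; then
    echo "no matching packages installed"
    return 0
  fi
  echo "$pkgs"
  # Only remove when explicitly confirmed; otherwise this is a preview
  if [ "${CONFIRM:-no}" = "yes" ]; then
    yum -y remove $pkgs
  fi
}
```

Run remove_hdp_packages once to review the list, then CONFIRM=yes remove_hdp_packages to remove. The preview matters because a bare prefix such as pig also matches unrelated packages like pigz.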

In case you want to remove ambari-server and ambari-agent

ambari-server stop
yum erase ambari-server

ambari-agent stop
yum erase ambari-agent

Remove the repos

rm -rf /etc/yum.repos.d/ambari.repo /etc/yum.repos.d/HDP*

Clean all log folders

You can also use the API below to find the registered hosts (the example counts them)

Example - curl -s -H "X-Requested-By: ambari" -u ambari_USERNAME:ambari_PASSWORD -i http://localhost:8080/api/v1/hosts | grep host_name | wc -l
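
The same hosts call can produce a hosts file for pssh. A rough sketch, assuming the response is pretty-printed with one "host_name" field per line. AMBARI_USER, AMBARI_PASSWORD and AMBARI_HOST are placeholder variables, and parse_host_names and write_hosts_file are illustrative helper names:

```shell
#!/bin/sh
# Sketch: write the host names registered with Ambari to a file, one per line.
AMBARI_HOST="${AMBARI_HOST:-localhost}"

# Print the value of every "host_name" field found in the JSON on stdin.
parse_host_names() {
  sed -n 's/.*"host_name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# $1 is the output file, for example /tmp/hosts.txt
write_hosts_file() {
  curl -s -u "$AMBARI_USER:$AMBARI_PASSWORD" -H "X-Requested-By: ambari" \
    "http://$AMBARI_HOST:8080/api/v1/hosts" | parse_host_names > "$1"
}
```

For example, write_hosts_file /tmp/hosts.txt creates the file that the pssh command reads with -h. This has to happen before ambari-server reset, since the reset discards the registered hosts.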
