@vinodkc
Last active August 18, 2020 10:31
Spark on Docker - HDP3 YARN

  1. Kerberize the cluster

  2. Enable CGroups in YARN and restart

To enable cgroups on an Ambari cluster, select YARN > Configs on the Ambari dashboard, then enable CPU Isolation under CPU. Click Save, then restart all cluster components that require a restart.

I hit a mount failure error on /sys/fs/cgroup/cpu/yarn. To fix it, run the commands below on all NodeManager hosts:

sudo mkdir /sys/fs/cgroup/cpu/yarn
sudo chown -R yarn:yarn /sys/fs/cgroup/cpu/yarn
  3. Install Docker on all NodeManager hosts
yum install docker
systemctl start docker
systemctl status docker
systemctl enable docker
sudo systemctl edit --full docker.service

This opens the full unit file for editing. Replace the cgroup driver systemd with cgroupfs (in the --exec-opt native.cgroupdriver= option of the ExecStart line). Save the changes, then reload systemd and restart the Docker daemon:

sudo systemctl daemon-reload
sudo systemctl restart docker.service

(Otherwise, submitting a Spark application fails with --> error output: /usr/bin/docker-current: Error response from daemon: cgroup-parent for systemd cgroup should be a valid slice named as "xxx.slice". Ref: https://issues.apache.org/jira/browse/YARN-9660)
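For context, the change amounts to flipping one flag in the ExecStart line. On a RHEL/CentOS docker-current install it typically looks like the sketch below; the exact ExecStart contents vary by Docker version, so treat the path and surrounding flags as assumptions:

```
# Before (assumed RHEL docker package default, in docker.service ExecStart):
#   /usr/bin/dockerd-current ... --exec-opt native.cgroupdriver=systemd ...
# After:
#   /usr/bin/dockerd-current ... --exec-opt native.cgroupdriver=cgroupfs ...
```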

docker ps
sudo groupadd docker
sudo usermod -aG docker spark
su spark 
docker ps

3.1. Enable Docker in Ambari: YARN -> Advanced container-executor

  docker.allowed.ro-mounts=/sys/fs/cgroup,/etc/passwd,/etc/krb5.conf,{{nm_local_dirs}},{{docker_allowed_ro_mounts}}
  min_user_id=50

In Ambari: YARN -> Custom yarn-site

yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user=nobody
  4. Whitelist the antaladam registry

In Ambari: YARN -> Advanced container-executor, add local,centos,hortonworks,antaladam to Docker Trusted Registries, e.g.: Docker Trusted Registries = local,centos,hortonworks,antaladam
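Pieced together, the container-executor.cfg that Ambari renders on the NodeManagers might look roughly like this. This is a sketch only: key names follow the Hadoop 3.1 docs (Ambari's min_user_id maps to min.user.id), the group and docker.binary values are assumptions, and the rest comes from the settings above:

```
yarn.nodemanager.linux-container-executor.group=hadoop
min.user.id=50

[docker]
  module.enabled=true
  docker.binary=/usr/bin/docker
  docker.trusted.registries=local,centos,hortonworks,antaladam
  docker.allowed.ro-mounts=/sys/fs/cgroup,/etc/passwd,/etc/krb5.conf
```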

Note: antaladam/python2:v1 is a Docker image with Python and NumPy installed

  5. kinit as the spark user and run:
pyspark --master yarn \
  --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
  --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=antaladam/python2:v1 \
  --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=/etc/krb5.conf:/etc/krb5.conf:ro,/etc/passwd:/etc/passwd:ro
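To keep the long command manageable, the three executor settings can also be kept in code. A small hypothetical helper (the names docker_confs and to_conf_flags are mine, not Spark's) that renders a dict of Spark properties into --conf flags for pyspark or spark-submit:

```python
# Hypothetical helper, not part of Spark: hold the Docker runtime settings
# in one place and render them as `--conf key=value` command-line flags.
docker_confs = {
    "spark.executorEnv.YARN_CONTAINER_RUNTIME_TYPE": "docker",
    "spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE": "antaladam/python2:v1",
    "spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS":
        "/etc/krb5.conf:/etc/krb5.conf:ro,/etc/passwd:/etc/passwd:ro",
}

def to_conf_flags(confs):
    """Render a dict of Spark properties as command-line --conf flags."""
    return " ".join(f"--conf {k}={v}" for k, v in confs.items())

print("pyspark --master yarn " + to_conf_flags(docker_confs))
```

The same dict works for the Livy interpreter settings further down, with the keys prefixed by livy.spark. instead.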

Run the code snippet below:

def inside(p):
    import numpy as np
    return np.cos(np.pi * p / 2) > 0.5

inside(0)  # verify that numpy is not on the driver (it is only available in the Docker image on the executors), so this line raises ImportError

num_samples = 100000

count = sc.parallelize(range(0, num_samples)).filter(inside).count()

# Run the job; numpy is used inside Docker on the executors

print(count)
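Since numpy lives only in the image, the expected result can still be sanity-checked locally. Below is a driver-local sketch of the same predicate using the stdlib math module instead of numpy (my rewrite, not part of the gist): cos(pi*p/2) exceeds 0.5 only when p is a multiple of 4, so exactly a quarter of the samples pass.

```python
import math

def inside(p):
    # Same predicate as the Spark job, but using the stdlib instead of
    # numpy so it runs anywhere: cos(pi*p/2) > 0.5 holds only for p % 4 == 0.
    return math.cos(math.pi * p / 2) > 0.5

num_samples = 100000
count = sum(1 for p in range(num_samples) if inside(p))
print(count)  # 25000: every fourth sample passes the filter
```

If the Spark job returns the same 25000, the executors are resolving numpy from the Docker image as intended.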

Livy configuration in Zeppelin

  1. First, ensure that the Livy interpreter runs fine without Docker containers (default YARN containers).

  2. Then add the configurations below in the Livy interpreter settings to run the AM and executors in Docker containers:

livy.spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE=docker 
livy.spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=antaladam/python2:v1
livy.spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=/etc/krb5.conf:/etc/krb5.conf:ro,/etc/passwd:/etc/passwd:ro
livy.spark.executorEnv.YARN_CONTAINER_RUNTIME_TYPE=docker 
livy.spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=antaladam/python2:v1 
livy.spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=/etc/krb5.conf:/etc/krb5.conf:ro,/etc/passwd:/etc/passwd:ro
  3. Add a new Zeppelin notebook:
%livy2.pyspark
def inside(p):
    import numpy as np
    return np.cos(np.pi * p / 2) > 0.5

inside(0)  # Test the method

num_samples = 100000

count = sc.parallelize(range(0, num_samples)).filter(inside).count()

# Run the job; numpy is used inside Docker on the executors

print(count)

Ref:

https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.0.0/running-spark-applications/content/running_spark_in_docker_containers_on_yarn.html

https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/data-operating-system/content/configure_yarn_for_running_docker_containers.html

https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.4/data-operating-system/content/enabling_cgroups.html

https://docs.cloudera.com/runtime/7.2.0/yarn-troubleshooting/topics/yarn-troubleshooting-docker.html

https://blog.cloudera.com/introducing-apache-spark-on-docker-on-top-of-apache-yarn-with-cdp-datacenter-release/

https://hadoop.apache.org/docs/r3.1.1/hadoop-yarn/hadoop-yarn-site/DockerContainers.html

TODO: change yarn.nodemanager.linux-container-executor.group to the custom group, save, and restart.
