@seanorama
Last active March 1, 2020 15:12
Zeppelin: LDAP, Kerberos & Livy

Configure Zeppelin against Active Directory.

Requirements

  • HDP 2.6
  • LDAP bind details (this guide targets Active Directory but can be adapted to other LDAP servers).

Configuration from command-line

Securely store LDAP credentials on Zeppelin host(s):

## Run this as a Hadoop superuser, such as the `hdfs` user.
##   - The example authenticates as `hdfs` and then runs the command as `hdfs`. Change to your user as needed.
## Cluster name as used in the headless principal (example value; change to yours):
cluster=sroberts100
sudo -u hdfs kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-"${cluster,,}"
## Prompts for the LDAP bind password and stores it under the given alias:
sudo -u hdfs hadoop credential create activeDirectoryRealm.systemPassword -provider jceks:///etc/zeppelin/conf/credentials.jceks
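
After storing the password, it can help to confirm that the alias is actually present in the provider. The following is a minimal sketch built on `hadoop credential list`; the helper name `credential_exists` is hypothetical, and the case-insensitive match is an assumption made because the keystore may report the alias in lower case.

```shell
#!/usr/bin/env bash
# Sketch: succeed if an alias appears in a Hadoop credential provider.
# credential_exists is a hypothetical helper name, not a Hadoop command.
credential_exists() {
  local alias_name="$1" provider="$2"
  # `hadoop credential list` prints the stored aliases; match case-insensitively
  # because the keystore may report the alias in lower case (assumption).
  hadoop credential list -provider "${provider}" | grep -qiF -- "${alias_name}"
}

# Example call (run as the same user that created the credential):
# credential_exists activeDirectoryRealm.systemPassword \
#   jceks:///etc/zeppelin/conf/credentials.jceks && echo "alias found"
```

The function only reports whether the alias exists. It never prints or reads the stored password.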

Configuration

  1. Configure 'proxyuser' rights for Zeppelin:
    • DO NOT leave the values as *. Change them to the appropriate hosts and groups!
    • Ambari: HDFS -> Configs -> Custom core-site:
hadoop.proxyuser.zeppelin.hosts=*
hadoop.proxyuser.zeppelin.groups=*
  2. Update the Zeppelin config with the configuration below.
    • From Ambari: Zeppelin -> Advanced zeppelin-env -> "shiro_ini_content"
  3. Restart affected services.
  4. Test: Log in to Zeppelin as an AD user.
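
Because the proxyuser values should not stay as `*`, a quick check for leftover wildcards can be useful before restarting services. This is a minimal sketch, assuming the properties are available as `key=value` lines on standard input; `check_proxyuser` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: flag wildcard values for hadoop.proxyuser.zeppelin.* properties.
# Reads "key=value" lines from stdin; returns non-zero if any value is "*".
# check_proxyuser is a hypothetical helper name, not part of Hadoop or Ambari.
check_proxyuser() {
  local status=0 key value
  while IFS='=' read -r key value || [ -n "$key" ]; do
    # Only inspect the Zeppelin proxyuser properties.
    case "$key" in
      hadoop.proxyuser.zeppelin.*) ;;
      *) continue ;;
    esac
    # Remove whitespace so " * " is treated the same as "*".
    value="$(printf '%s' "$value" | tr -d '[:space:]')"
    if [ "$value" = "*" ]; then
      echo "WILDCARD: $key"
      status=1
    fi
  done
  return "$status"
}

# Example call against the live configuration:
# printf 'hadoop.proxyuser.zeppelin.hosts=%s\n' \
#   "$(hdfs getconf -confKey hadoop.proxyuser.zeppelin.hosts)" | check_proxyuser
```

The example call uses `hdfs getconf -confKey`, which prints only the value, so the key is added back with `printf` before the check.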

Configurations

Review notes in each config and update where appropriate, such as LDAP details.


[users]
# List of users with their password allowed to access Zeppelin.
# To use a different strategy (LDAP / Database / ...) check the shiro doc at http://shiro.apache.org/configuration.html#Configuration-INISections
#admin = admin, admin
#user1 = user1, role1, role2
#user2 = user2, role3
#user3 = user3, role2

# Sample LDAP configuration, for user Authentication, currently tested for single Realm
[main]
### A sample for configuring Active Directory Realm
activeDirectoryRealm = org.apache.zeppelin.realm.ActiveDirectoryGroupRealm
activeDirectoryRealm.systemUsername = [email protected]

#use either systemPassword or hadoopSecurityCredentialPath, more details in http://zeppelin.apache.org/docs/latest/security/shiroauthentication.html
#activeDirectoryRealm.systemPassword = ""
activeDirectoryRealm.hadoopSecurityCredentialPath = jceks:///etc/zeppelin/conf/credentials.jceks
activeDirectoryRealm.searchBase = DC=company,DC=com
activeDirectoryRealm.url = ldaps://company.com
activeDirectoryRealm.principalSuffix = @company.COM
activeDirectoryRealm.groupRolesMap = "CN=hadoop_admins,OU=hadoop,DC=company,DC=com":"admin"
activeDirectoryRealm.authorizationCachingEnabled = false

### A sample for configuring LDAP Directory Realm
#ldapRealm = org.apache.zeppelin.realm.LdapGroupRealm
## search base for ldap groups (only relevant for LdapGroupRealm):
#ldapRealm.contextFactory.environment[ldap.searchBase] = dc=COMPANY,dc=COM
#ldapRealm.contextFactory.url = ldap://ldap.test.com:389
#ldapRealm.userDnTemplate = uid={0},ou=Users,dc=COMPANY,dc=COM
#ldapRealm.contextFactory.authenticationMechanism = SIMPLE

### A sample PAM configuration
#pamRealm=org.apache.zeppelin.realm.PamRealm
#pamRealm.service=sshd


sessionManager = org.apache.shiro.web.session.mgt.DefaultWebSessionManager
### If caching of user is required then uncomment below lines
#cacheManager = org.apache.shiro.cache.MemoryConstrainedCacheManager
#securityManager.cacheManager = $cacheManager

securityManager.sessionManager = $sessionManager
# 86,400,000 milliseconds = 24 hours
securityManager.sessionManager.globalSessionTimeout = 86400000

shiro.loginUrl = /api/login

#[roles]
#role1 = *
#role2 = *
#role3 = *
#admin = *

[urls]
# This section is used for url-based security.
# You can secure interpreter, configuration and credential information by urls. Comment or uncomment the below urls that you want to hide.
# anon means the access is anonymous.
# authcBasic means Basic Auth Security
/api/version = anon
#/api/interpreter/** = authc, anyofroles[admin]
#/api/configurations/** = authc, anyofroles[admin]
#/api/credential/** = authc, anyofroles[admin]
#/api/notebookRepos/** = authc, anyofroles[admin]
#/api/helium/** = authc, anyofroles[admin]
# To enforce security, comment the line below and uncomment the next one
#/** = anon
/** = authc
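
Once this configuration is active, the login can also be exercised from the command line through Zeppelin's REST login endpoint (`/api/login`, the same path as `shiro.loginUrl`). The sketch below is an illustration under assumptions: the helper name `zeppelin_login_status`, the variable `ZEPPELIN_PASSWORD`, and the host and port in the example call are placeholders, not values from this guide.

```shell
#!/usr/bin/env bash
# Sketch: POST credentials to Zeppelin's /api/login and print the HTTP status code.
# zeppelin_login_status and ZEPPELIN_PASSWORD are hypothetical names.
zeppelin_login_status() {
  local base_url="$1" user="$2"
  # -s: quiet, -o /dev/null: discard the body, -w: print only the status code.
  curl -s -o /dev/null -w '%{http_code}' \
    --data-urlencode "userName=${user}" \
    --data-urlencode "password=${ZEPPELIN_PASSWORD}" \
    "${base_url}/api/login"
}

# Example call (host and port are placeholders):
# read -r -s -p "Password: " ZEPPELIN_PASSWORD; echo
# zeppelin_login_status http://zeppelin.example.com:9995 myuser
# unset ZEPPELIN_PASSWORD
```

A status of 200 indicates the credentials were accepted; any other status means the login did not succeed. Note that the password is passed to `curl` as an argument, so it may be visible in the process list while the command runs.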

Zeppelin [users] block with hashed password

Get shasum of the password:

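read -r -s -p "Password: " password; echo

# printf avoids the trailing newline that echo would add to the hashed input.
# Use only the hex digest (the first field of the output).
printf '%s' "${password}" | shasum -a 256
unset password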
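
The stored hash has to match the hash of the password bytes alone, so a trailing newline in the hashed input produces a value that will not match at login. A minimal sketch of a helper that hashes exactly the given string and prints only the hex digest follows; `sha256_hex` is a hypothetical name, and the fallback between `sha256sum` and `shasum` is an assumption about which tool is installed.

```shell
#!/usr/bin/env bash
# Sketch: print the hex SHA-256 digest of a string, hashing no trailing newline.
# sha256_hex is a hypothetical helper name.
sha256_hex() {
  if command -v sha256sum >/dev/null 2>&1; then
    # GNU coreutils variant; the first field is the digest.
    printf '%s' "$1" | sha256sum | awk '{print $1}'
  else
    # Perl-based shasum, as used elsewhere in this guide.
    printf '%s' "$1" | shasum -a 256 | awk '{print $1}'
  fi
}

# Example call:
# read -r -s -p "Password: " password; echo
# sha256_hex "${password}"
# unset password
```

The output is the bare digest, which can be pasted directly into the [users] block.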

Update Advanced-shiro-ini with the hash from above:

[users]
ambari-qa = TheHashFromAbove, admin

[roles]
admin = *

[main]
sha256Matcher = org.apache.shiro.authc.credential.Sha256CredentialsMatcher
iniRealm.credentialsMatcher = $sha256Matcher

Zeppelin: Kerberos & Livy

For jobs to run as the logged-in user, Zeppelin must use %livy instead of %spark. This is called impersonation.

The steps below configure Livy and point Zeppelin at that Livy server.

Requirements

  • HDP 2.6 with Spark Livy Server, Spark Thrift, Zeppelin.
    • Zeppelin configured for LDAP (see the Active Directory configuration above: #file-zeppelin-active-directory).
    • HDFS keytab or HDFS Superuser access (for securely storing LDAP credentials).
    • End users must exist in the OS and have a home directory in HDFS at hdfs:///user/${USERNAME}.
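
The HDFS home directory requirement can be met with two `hdfs dfs` commands per user. This is a minimal sketch, assuming it runs as an HDFS superuser with a valid Kerberos ticket; `make_hdfs_home` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: create an HDFS home directory for one end user and hand it over.
# make_hdfs_home is a hypothetical helper; run it as an HDFS superuser.
make_hdfs_home() {
  local user="$1"
  # Create /user/<name> (no error if it already exists), then set the owner.
  hdfs dfs -mkdir -p "/user/${user}" &&
  hdfs dfs -chown "${user}" "/user/${user}"
}

# Example call:
# make_hdfs_home myuser
```

Only the owner is changed here. Group ownership and permissions are left at their defaults and may need adjusting for your cluster.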

Known issues:

  1. In Zeppelin, when using %hive, this error occurs: "Failed to validate proxy privilege of zeppelin"

Configuration from Ambari

  • HDFS: Confirm 'proxyuser' rights:
    • HDFS -> Configs -> Custom core-site:
hadoop.proxyuser.zeppelin.groups= the groups to grant access to
hadoop.proxyuser.zeppelin.hosts= the zeppelin hosts
hadoop.proxyuser.zeppelin.users= (optional. Best to use groups instead)
  • Ranger KMS: Confirm 'proxyuser' rights:
    • Ambari: Ranger KMS -> Configs -> Custom kms-site:
hadoop.kms.proxyuser.livy.groups= the groups to grant access to
hadoop.kms.proxyuser.livy.hosts= the livy hosts
hadoop.kms.proxyuser.livy.users= (optional. Best to use groups instead)
  • (OPTIONAL) Spark: Confirm Livy configuration
    • Ambari: Spark -> Custom livy-conf, add:

NOTE: This is to limit who can submit to livy. It is not required.

livy.server.access_control.enabled=true
livy.server.access_control.users=livy,zeppelin-clustername,ambari-qa-clustername
   ## Replace zeppelin-clustername and ambari-qa-clustername with the actual principal names of those service users.

Configuration from Zeppelin

  • Interpreters -> %livy Interpreter:

    • Set zeppelin.livy.url to the full hostname of the Livy server.
  • Test %livy from a Notebook:

%livy.spark

sc.version

%livy.sql
show databases
  • Open the YARN Resource Manager UI and confirm the job executed as your user (e.g. myuser).
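
The same confirmation can be made from the command line by filtering the output of `yarn application -list`. The sketch below is an illustration under an assumption about the output layout: it treats the listing as tab-separated with the application ID in the first column and the user in the fourth. `apps_for_user` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: print the IDs of YARN applications that ran as a given user.
# Assumes tab-separated input with the application ID in column 1 and the
# user in column 4 (an assumption about `yarn application -list` output).
# apps_for_user is a hypothetical helper name.
apps_for_user() {
  local user="$1"
  awk -F'\t' -v u="$user" '
    # Trim the padding around the user column before comparing.
    { f = $4; gsub(/^[ ]+|[ ]+$/, "", f) }
    f == u { id = $1; gsub(/^[ ]+|[ ]+$/, "", id); print id }
  '
}

# Example call:
# yarn application -list -appStates ALL | apps_for_user myuser
```

If the command prints nothing for your user after running the %livy paragraphs, the job likely ran as a different user and impersonation is not in effect.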

Zeppelin using PAM instead of LDAP

Instead of using LDAP, Zeppelin can use the local system's PAM for authentication.

  1. Create /etc/pam.d/cloudera:
    sudo tee /etc/pam.d/cloudera > /dev/null <<'EOF'
    #%PAM-1.0
    auth    sufficient        pam_unix.so
    auth    sufficient        pam_sss.so
    account sufficient        pam_unix.so
    account sufficient        pam_sss.so
    EOF
    
  2. Zeppelin -> Advanced zeppelin-env -> "shiro_ini_content": update with the config below.
  3. Restart affected services.
  4. Test: Log in to Zeppelin as any system user.

[main]
pamRealm=org.apache.zeppelin.realm.PamRealm
pamRealm.service=cloudera

anyofrolesuser = org.apache.zeppelin.utils.AnyOfRolesUserAuthorizationFilter 

sessionManager = org.apache.shiro.web.session.mgt.DefaultWebSessionManager

cookie = org.apache.shiro.web.servlet.SimpleCookie
cookie.name = JSESSIONID
#Comment out the line below unless Zeppelin-Server runs in HTTPS mode
cookie.secure = true
cookie.httpOnly = true
sessionManager.sessionIdCookie = $cookie

securityManager.sessionManager = $sessionManager
# 86,400,000 milliseconds = 24 hours
securityManager.sessionManager.globalSessionTimeout = 86400000
shiro.loginUrl = /api/login

[urls]
# This section is used for url-based security.
# You can secure interpreter, configuration and credential information by urls. Comment or uncomment the below urls that you want to hide.
# anon means the access is anonymous.
# authc means Form based Auth Security

## change 'admins' and 'users' to your group name(s), separated by commas
/api/version = anon
/api/interpreter/setting/restart/** = authc, anyofrolesuser[admins]
/api/interpreter/** = authc, anyofrolesuser[admins]
/api/notebook-repositories/** = authc, anyofrolesuser[admins,users]
/api/configurations/** = authc, anyofrolesuser[admins]
/api/credential/** = authc, anyofrolesuser[admins]
/api/admin/** = authc, anyofrolesuser[admins]
/** = authc