The following instructions for enabling Azure SSO for Apache Airflow take you nearly all the way, but fall short on a couple of details around the configuration of Airflow itself:
https://objectpartners.com/2021/12/24/enterprise-auth-for-airflow-azure-ad
All the "Azure" instructions there can be safely followed. The resulting webserver_config.py (which can be injected into a dockerised Airflow at /opt/airflow/webserver_config.py) can be built from the following:
from __future__ import annotations

import os

from airflow.www.fab_security.manager import AUTH_OAUTH
from airflow.www.security import AirflowSecurityManager
from airflow.utils.log.logging_mixin import LoggingMixin

basedir = os.path.abspath(os.path.dirname(__file__))

# Flask-WTF flag for CSRF
WTF_CSRF_ENABLED = True
WTF_CSRF_TIME_LIMIT = None

AUTH_TYPE = AUTH_OAUTH

OAUTH_PROVIDERS = [{
    'name': 'Microsoft Azure AD',
    'token_key': 'access_token',
    'icon': 'fa-windows',
    'remote_app': {
        'api_base_url': "https://login.microsoftonline.com/{}".format(os.getenv("AAD_TENANT_ID")),
        'request_token_url': None,
        'request_token_params': {
            'scope': 'openid email profile'
        },
        'access_token_url': "https://login.microsoftonline.com/{}/oauth2/v2.0/token".format(os.getenv("AAD_TENANT_ID")),
        "access_token_params": {
            'scope': 'openid email profile'
        },
        'authorize_url': "https://login.microsoftonline.com/{}/oauth2/v2.0/authorize".format(os.getenv("AAD_TENANT_ID")),
        "authorize_params": {
            'scope': 'openid email profile'
        },
        'client_id': os.getenv("AAD_CLIENT_ID"),
        'client_secret': os.getenv("AAD_CLIENT_SECRET"),
        'jwks_uri': 'https://login.microsoftonline.com/common/discovery/v2.0/keys'
    }
}]

AUTH_USER_REGISTRATION_ROLE = "Public"
AUTH_USER_REGISTRATION = True
AUTH_ROLES_SYNC_AT_LOGIN = True
AUTH_ROLES_MAPPING = {
    "airflow_prod_admin": ["Admin"],
    "airflow_prod_user": ["Op"],
    "airflow_prod_viewer": ["Viewer"]
}


class AzureCustomSecurity(AirflowSecurityManager, LoggingMixin):
    def get_oauth_user_info(self, provider, response=None):
        me = self._azure_jwt_token_parse(response["id_token"])
        return {
            "name": me["name"],
            "email": me["email"],
            "first_name": me["given_name"],
            "last_name": me["family_name"],
            "id": me["oid"],
            "username": me["preferred_username"],
            "role_keys": me["roles"]
        }


# the first of these two appears to work with older Airflow versions, the latter newer.
FAB_SECURITY_MANAGER_CLASS = 'webserver_config.AzureCustomSecurity'
SECURITY_MANAGER_CLASS = AzureCustomSecurity
The above assumes the environment variables AAD_TENANT_ID, AAD_CLIENT_ID and AAD_CLIENT_SECRET are set, and has been tested thoroughly and confirmed working.
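If one of those variables is unset, os.getenv returns None and the URLs end up formatted with the literal text "None", which is easy to miss. A minimal sketch of a check that could be run before starting the webserver is shown here; the helper name missing_aad_settings is hypothetical and not part of Airflow.

```python
import os

# Variable names taken from the webserver_config.py above.
REQUIRED_AAD_VARS = ("AAD_TENANT_ID", "AAD_CLIENT_ID", "AAD_CLIENT_SECRET")


def missing_aad_settings(environ=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration only; pass a dict to check
    something other than the real process environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_AAD_VARS if not env.get(name)]
```

Calling missing_aad_settings() with no arguments inspects the real environment; an empty list means all three values are present.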
Note that the role names need to match what you configured in Azure (the example above uses airflow_prod_user etc., unlike the linked article).
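To illustrate why the names must match: the role_keys returned by get_oauth_user_info are looked up in AUTH_ROLES_MAPPING. The sketch below imitates that lookup in plain Python. It is a simplification for illustration, not Flask-AppBuilder's actual code, and the function name roles_for is hypothetical.

```python
# Same mapping as in the webserver_config.py above.
AUTH_ROLES_MAPPING = {
    "airflow_prod_admin": ["Admin"],
    "airflow_prod_user": ["Op"],
    "airflow_prod_viewer": ["Viewer"],
}


def roles_for(role_keys, mapping=AUTH_ROLES_MAPPING):
    """Collect the Airflow role names mapped from the given Azure role keys.

    Hypothetical helper: keys with no entry in the mapping contribute nothing.
    """
    roles = set()
    for key in role_keys:
        roles.update(mapping.get(key, []))
    return sorted(roles)
```

In this sketch a role value from Azure that is spelled differently from the mapping key simply yields no Airflow role, which is the kind of mismatch worth checking for first when a login succeeds but the user lacks the expected permissions.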
@vdozal Look at the variable AIRFLOW__LOGGING__FAB_LOGGING_LEVEL and set it to DEBUG. It will then write all the access tokens and JWTs to the logs. That output is sensitive, so make sure to turn it off afterwards.
Another means of debugging is to use the official docker-compose file to start an Airflow instance locally, apply the same settings as your deployment (including the webserver_config.py with the above code), and debug the code with print or logging statements.
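While debugging, it can help to look at the claims inside the id_token, since get_oauth_user_info above reads name, email, given_name, family_name, oid, preferred_username and roles from it. Here is a minimal sketch using only the standard library. It decodes the payload without verifying the signature, so it is for local inspection only, and both function names are made up for illustration.

```python
import base64
import json

# Claims that get_oauth_user_info in the config above reads from the token.
EXPECTED_CLAIMS = (
    "name", "email", "given_name", "family_name",
    "oid", "preferred_username", "roles",
)


def decode_jwt_payload(token):
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def missing_claims(claims):
    """List the expected claims that are absent from a decoded payload."""
    return [name for name in EXPECTED_CLAIMS if name not in claims]
```

A claim reported as missing here would raise a KeyError in get_oauth_user_info as written above, so this is a quick way to tell a token problem apart from a configuration problem.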
One last thing is to make sure that your settings include the proper config for the webserver to pick up the OAuth config. I am using Kubernetes with a configmap named webserver_config.py, which is referenced in the values.yaml at webserver.webserverConfigConfigMapName: webserver-config-custom. This is where the code should be included in order to be executed.
I have checked once more, and the version of the code running in my cluster is exactly what I pasted above.
Hope it helps.
Best