There are two ways to invoke JupyterLab with Spark capabilities. The ad hoc way is to tell PySpark on the command line to launch Jupyter as its driver. For instance, starting JupyterLab with Python 3.6 (the Python version must match the one used by your Spark distribution) and 20 executors with 5 cores each might look like this:
PYSPARK_PYTHON=python3.6 PYSPARK_DRIVER_PYTHON="jupyter" PYSPARK_DRIVER_PYTHON_OPTS="lab --no-browser --port=8899" /usr/bin/pyspark2 --master yarn --deploy-mode client --num-executors 20 --executor-memory 10g --executor-cores 5 --conf spark.dynamicAllocation.enabled=false
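When started this way, PySpark has already created the SparkContext and SparkSession for you, so there is no need to construct one in the notebook. A minimal smoke test (assuming the default variable names spark and sc that the PySpark launcher injects) could look like this:

# `spark` (SparkSession) and `sc` (SparkContext) are injected by the pyspark launcher.
# Summing the numbers 0..999 on the cluster verifies that the executors respond:
spark.range(1000).selectExpr("sum(id) AS total").show()
# Report the Spark version and default parallelism seen by the driver:
print(sc.version, sc.defaultParallelism)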
To create notebooks with a dedicated PySpark kernel directly from JupyterLab, create a kernel specification at ~/.local/share/jupyter/kernels/pyspark/kernel.json containing:
{
  "display_name": "PySpark",
  "language": "python",