How to start a Jupyter Notebook with PySpark Kernel
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Starts a Jupyter Notebook server with a PySpark kernel.
# To run, SPARK_HOME must be set and point to a Spark installation,
# or this script must reside in (and be run from) the Spark installation directory.
# Alternatively, as a single command instead of the script:
# $ PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS='notebook' $SPARK_HOME/bin/pyspark
# Fall back to this script's directory if SPARK_HOME is not set
if [ -z "${SPARK_HOME}" ]; then
  export SPARK_HOME="$(dirname "$0")"
fi
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'
exec "${SPARK_HOME}"/bin/pyspark "$@"