@widdowquinn
Created December 9, 2017 16:33
Set up JupyterHub on AWS

JupyterHub on AWS

EC2 Setup

  • Log in to AWS
  • Go to a sensible region
  • Start a new instance with Ubuntu Trusty (14.04) - compute-optimised instances have a high vCPU:memory ratio, and the lowest-cost CPU time. c4.2xlarge is a decent choice.
  • Set security group (firewall) to have ports 22, 80, and 443 open (SSH, HTTP, HTTPS)
  • If you want a static IP address (for long-running instances) then select Elastic IP for this VM
  • If you want to use HTTPS, you'll probably need a paid certificate, or to use Amazon's Route 53 to get a non-Amazon domain (to avoid region blocking).
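The firewall step above can also be written down as data. The following is a minimal sketch, not part of the original notes: it builds one TCP ingress rule per port (22, 80, 443) in the shape that boto3's `authorize_security_group_ingress` call accepts. The helper names, the `0.0.0.0/0` source range, and the use of boto3 at all are assumptions made for illustration.

```python
"""Sketch: the SSH/HTTP/HTTPS firewall rules as data (boto3 use is an assumption)."""

# Ports named in the guide: SSH, HTTP, HTTPS.
OPEN_PORTS = {22: "SSH", 80: "HTTP", 443: "HTTPS"}


def build_ingress_rules(ports=OPEN_PORTS, cidr="0.0.0.0/0"):
    """Return one TCP ingress rule per port, open to `cidr` (assumed: anywhere)."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": cidr, "Description": label}],
        }
        for port, label in sorted(ports.items())
    ]


def apply_rules(group_id, region):
    """Hypothetical helper: push the rules to an existing security group."""
    import boto3  # assumption: boto3 is installed and AWS credentials are configured

    ec2 = boto3.client("ec2", region_name=region)
    ec2.authorize_security_group_ingress(
        GroupId=group_id, IpPermissions=build_ingress_rules()
    )


if __name__ == "__main__":
    # Print a short summary of the rules without touching AWS.
    for rule in build_ingress_rules():
        print(rule["FromPort"], rule["IpRanges"][0]["Description"])
```

Keeping the rule builder separate from the API call means the rules can be inspected without AWS credentials. Restricting `cidr` for port 22 to your own address range is a reasonable tightening.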

Route 53

  • Open Route 53 on Amazon
  • If you don't already have a domain name registered, register/transfer a domain name.
  • This will create a new Hosted Zone and Record Set for you, for the 'parent' domain

To use the parent domain

  • Create a new record set
  • Enter the subdomain name, and choose no for Alias. Enter the IP address for the EC2 instance you set up above.
  • Accept the other defaults (TTL, Routing Policy) and click Create.
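For anyone scripting this step instead of using the console, the record set described above can be sketched as a Route 53 change batch. This is an illustration under stated assumptions: the function name, the example host name, the documentation-range IP address, and the 300-second TTL are all hypothetical, and the batch would be passed to boto3's `change_resource_record_sets` together with your own hosted zone ID.

```python
"""Sketch: an A record for the hub, expressed as a Route 53 change batch."""
import ipaddress


def build_a_record_change(name, ip_address, ttl=300):
    """Return a change batch that points `name` at `ip_address`.

    The TTL default is an assumption; adjust it to taste.
    """
    # Raises ValueError (AddressValueError) if the address is not valid IPv4.
    ipaddress.IPv4Address(ip_address)
    return {
        "Comment": "A record for JupyterHub host",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": name,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": ip_address}],
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical values; 192.0.2.0/24 is reserved for documentation.
    print(build_a_record_change("hub.example.org", "192.0.2.10"))
```

With boto3 installed, the result could be submitted as `boto3.client("route53").change_resource_record_sets(HostedZoneId="<ZONE_ID>", ChangeBatch=batch)`.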

To use a new subdomain

  • Create a new Hosted Zone, and give it the full subdomain name (subdomain.parent.domain)
  • Copy the nameservers from the NS nameserver Record Set in the subdomain.
  • Create a new nameserver Record Set in the parent domain Hosted Zone, with the full subdomain name (subdomain.parent.domain), and paste in the subdomain nameservers you copied above.
  • Return to the subdomain Hosted Zone, and create a new Record Set of type A.
  • Enter the subdomain name, and choose no for Alias. Enter the IP address for the EC2 instance you set up above.
  • Accept the other defaults (TTL, Routing Policy) and click Create.
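The delegation step is the easiest one to get wrong, because the nameservers copied from the subdomain zone must end up in an NS record in the parent zone. A minimal sketch of that record as a Route 53 change batch is given here; the function name, example names, and TTL are assumptions, and the batch is meant for boto3's `change_resource_record_sets` with the parent zone's ID.

```python
"""Sketch: delegate a subdomain to its own hosted zone via an NS record."""


def build_ns_delegation(subdomain, nameservers, ttl=300):
    """Return a change batch creating the NS record in the PARENT zone.

    `nameservers` are the values copied from the subdomain zone's NS record.
    The TTL default is an assumption.
    """
    cleaned = []
    for ns in nameservers:
        ns = ns.strip()
        # Skip blanks and duplicates that creep in when pasting.
        if ns and ns not in cleaned:
            cleaned.append(ns)
    if not cleaned:
        raise ValueError("at least one nameserver is required")
    return {
        "Comment": "Delegate subdomain to its own hosted zone",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": subdomain,
                    "Type": "NS",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": ns} for ns in cleaned],
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical names for illustration only.
    print(build_ns_delegation("hub.example.org", ["ns-1.example.net", "ns-2.example.net"]))
```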

Set up server

  • SSH into your new server
  • Create server directory, and perform some updates
sudo mkdir /srv/jupyterhub
sudo chown -R ubuntu:ubuntu /srv/jupyterhub
sudo apt-get update
sudo apt-get install git
  • Get SSL keys using LetsEncrypt (this requires a registered domain name)
git clone https://github.com/letsencrypt/letsencrypt
cd letsencrypt
./letsencrypt-auto certonly --standalone -v -d <host.domain.name>
# You'll need to enter an email, and I'd recommend sharing info with EFF
  • Store the keys in the server directory
mkdir /srv/jupyterhub/ssl
sudo cp /etc/letsencrypt/live/<host.domain.name>/fullchain.pem /etc/letsencrypt/live/<host.domain.name>/privkey.pem /srv/jupyterhub/ssl
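Let's Encrypt certificates are valid for 90 days, so the copied keys eventually go stale. The sketch below is one possible way to keep an eye on that, not something from the original notes: it takes the `notAfter=...` line printed by `openssl x509 -enddate -noout -in fullchain.pem` and reports the whole days left. The function names are hypothetical.

```python
"""Sketch: how many days remain before a certificate expires."""
import ssl
import time

SECONDS_PER_DAY = 86400


def parse_enddate(line):
    """Extract the timestamp from an openssl 'notAfter=...' output line."""
    return line.strip().split("=", 1)[1]


def days_remaining(not_after, now=None):
    """Return whole days between `now` (epoch seconds) and the expiry time.

    `not_after` uses the certificate time format, e.g. 'Jan  1 00:00:00 2018 GMT'.
    """
    if now is None:
        now = time.time()
    expires = ssl.cert_time_to_seconds(not_after)
    return int((expires - now) // SECONDS_PER_DAY)


if __name__ == "__main__":
    # Example line in the format openssl prints; the date is made up.
    example = "notAfter=Jan  1 00:00:00 2018 GMT"
    print(days_remaining(parse_enddate(example)))
```

A negative result means the certificate has already expired and needs renewing and copying into the server directory again.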
  • Install and start Docker
sudo apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
echo "deb https://apt.dockerproject.org/repo ubuntu-trusty main" | sudo tee /etc/apt/sources.list.d/docker.list >/dev/null
sudo apt-get update
sudo apt-get install docker-engine
sudo usermod -aG docker ubuntu
sudo service docker start
  • Log out and then back in, and test docker:
docker run hello-world
  • Install pip for Python 3
sudo apt-get install python3-pip
  • Install npm and its dependencies
sudo apt-get install npm nodejs-legacy
sudo npm install -g configurable-http-proxy
  • Install jupyterhub
sudo pip3 install jupyterhub
sudo pip3 install --upgrade notebook
  • Install and configure OAuth
sudo pip3 install oauthenticator
  • Visit https://github.com/settings/applications/new and enter the following:

  • Application name: something to identify your site

  • Homepage URL: https://<your_host>

  • Application description: some text describing your site

  • Callback URL: https://<your_host>/hub/oauth_callback

  • Click on Register application

  • Create a new jupyterhub_config.py file:

jupyterhub --generate-config
  • The settings required are:
# jupyterhub_config.py
c = get_config()
        
import os
pjoin = os.path.join

runtime_dir = pjoin('/srv/jupyterhub')
ssl_dir = pjoin(runtime_dir, 'ssl')
if not os.path.exists(ssl_dir):
    os.makedirs(ssl_dir)

# https on :8443
c.JupyterHub.port = 8443
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/<HOSTNAME>/privkey.pem'
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/<HOSTNAME>/fullchain.pem'

# put the JupyterHub cookie secret and state db
# in /srv/jupyterhub (runtime_dir)
c.JupyterHub.cookie_secret_file = pjoin(runtime_dir, 'cookie_secret')
c.JupyterHub.db_url = pjoin(runtime_dir, 'jupyterhub.sqlite')
# or `--db=/path/to/jupyterhub.sqlite` on the command-line

# use GitHub OAuthenticator for local users

c.JupyterHub.authenticator_class = 'oauthenticator.LocalGitHubOAuthenticator'
c.GitHubOAuthenticator.oauth_callback_url = 'https://<HOSTNAME>/hub/oauth_callback'
c.GitHubOAuthenticator.client_id = '<GITHUB_CLIENT_ID>'
c.GitHubOAuthenticator.client_secret = '<GITHUB_CLIENT_SECRET>'

# specify users and admin
c.Authenticator.whitelist = {'<USERNAME>', }
c.Authenticator.admin_users = {'<USERNAME>', }

# start single-user notebook servers in ~/assignments,
# with ~/assignments/Welcome.ipynb as the default landing page
c.Spawner.notebook_dir = '~/assignments'
c.Spawner.args = ['--NotebookApp.default_url=/notebooks/Welcome.ipynb']

c.JupyterHub.extra_log_file = '/var/log/jupyterhub.log'
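Hardcoding the GitHub client ID and secret in the config file means they end up in any copy of it. One alternative, sketched here as an assumption rather than the author's method, is to read them from environment variables and fail early if they are missing. The variable names and the helper function are hypothetical choices for this example.

```python
"""Sketch: load GitHub OAuth credentials from the environment."""
import os

# Assumed variable names for this example.
REQUIRED = ("GITHUB_CLIENT_ID", "GITHUB_CLIENT_SECRET")


def load_oauth_settings(env=os.environ):
    """Return the client ID and secret, or raise if either is unset or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "client_id": env["GITHUB_CLIENT_ID"],
        "client_secret": env["GITHUB_CLIENT_SECRET"],
    }


# Inside jupyterhub_config.py (where get_config() is available) this could be used as:
#   settings = load_oauth_settings()
#   c.GitHubOAuthenticator.client_id = settings["client_id"]
#   c.GitHubOAuthenticator.client_secret = settings["client_secret"]
```

The variables then need to be set in the environment of whatever starts JupyterHub, for example in the service script.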
  • Redirect HTTPS traffic (port 443) to port 8443:
sudo iptables -t nat -A PREROUTING -p tcp --dport 443 -j REDIRECT --to 8443
  • Configure JupyterHub as a service
wget https://gist.githubusercontent.com/lambdalisue/f01c5a65e81100356379/raw/ecf427429f07a6c2d6c5c42198cc58d4e332b425/jupyterhub
sudo mv jupyterhub /etc/init.d/jupyterhub
# the downloaded script is not executable by default
sudo chmod +x /etc/init.d/jupyterhub
sudo mkdir /etc/jupyterhub
sudo jupyterhub --generate-config -f /etc/jupyterhub/jupyterhub_config.py
  • Allow notebook widget extensions
sudo pip3 install ipywidgets
sudo jupyter nbextension enable --py --sys-prefix widgetsnbextension
  • Start JupyterHub
sudo service jupyterhub start
  • Logging in
$ ssh -i ".ssh/<YOUR_PEM>.pem" <YOUR_SERVER>
$ ssh <USERNAME>@<HOSTNAME>
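Once the service is running, a quick way to confirm the hub is answering over HTTPS is to query its REST API root, which returns the JupyterHub version as JSON. The sketch below assumes the hub is reachable at `https://<host>/hub/api` on the default port 443 via the redirect set up earlier; the function names and the example version string are hypothetical.

```python
"""Sketch: check that JupyterHub answers over HTTPS by asking for its version."""
import json
import urllib.request


def hub_api_url(host):
    """Build the URL of the hub's REST API root for `host`."""
    return "https://{}/hub/api".format(host)


def parse_hub_version(body):
    """Extract the version string from the API root's JSON response."""
    return json.loads(body)["version"]


def check_hub(host, timeout=10):
    """Fetch the API root and return the reported JupyterHub version.

    Raises urllib.error.URLError if the hub is unreachable or the
    certificate does not validate.
    """
    with urllib.request.urlopen(hub_api_url(host), timeout=timeout) as response:
        return parse_hub_version(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Parse a sample response body without making a network call.
    print(parse_hub_version('{"version": "0.8.1"}'))
```

Calling `check_hub("<HOSTNAME>")` from any machine exercises DNS, the security group, the iptables redirect, and the certificate in one go.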
@btomtom5

wow! Thank you so much. This helped a ton!

@itzceekay

Hey thanks, you saved my capstone project. But I get a "400 : Bad Request" error with the message "OAuth state missing from cookies" when I log in using my GitHub ID. Have you experienced this before?

@maneeshdisodia

Great work, sir.
I'm stuck on a "500 : Internal Server Error", even though I have replicated all the steps in the gist.
The service setup also needs some more clarification.
Thanks

@yuyueugene84

yuyueugene84 commented Mar 24, 2019

Thanks for writing this guide!

Make sure the nodejs you installed supports ES6 syntax, otherwise you will get an error like this: winstonjs/winston#1256

I suggest you add the following into this guide:

curl -sL https://deb.nodesource.com/setup_10.x | sudo -E bash -
sudo apt-get install -y nodejs

Cheers!

@JeremyMcCormick

My god, what a configuration nightmare. :(

@phaustin

Note the official docs now include AWS: https://zero-to-jupyterhub.readthedocs.io/en/latest/

@anujonthemove

Where did you make use of docker in the whole process?

@widdowquinn
Author

Fair question, @anujonthemove. I don't remember. These were notes to myself for setting up JupyterHub for students to use in a training course. It's possible that the Docker instructions are there so I didn't forget them, but that I installed it for a completely different reason.