Last active January 26, 2022 01:39
Easily connect to an AWS Glue Dev endpoint

This is how I quickly got an Apache Zeppelin notebook running against an AWS Glue Dev endpoint. None of the guides out there seemed concise, and the custom Docker containers I found only do what you can easily do yourself. This approach keeps you in control: it sets up SSH port forwarding and runs the official Docker image.

  1. Create your Glue Dev endpoint (this requires a keypair; I just used ssh-keygen to generate one).
  2. Once the endpoint is READY, select it and copy the "SSH tunnel to remote interpreter" command.
  3. Connect to the endpoint in a terminal session, modifying the copied command to match: ssh -i ~/.ssh/glue-dev -vnNT -L :9007:**:9007 glue@<ec2-endpoint>.<region>
  4. Run the Apache Zeppelin Docker container: docker run -p 8080:8080 --rm -v $PWD/logs:/logs -v $PWD/notebook:/notebook -e ZEPPELIN_LOG_DIR='/logs' -e ZEPPELIN_NOTEBOOK_DIR='/notebook' --name zeppelin apache/zeppelin:0.7.3
  5. Update your interpreters to use the existing process (the AWS Glue endpoint):
     • Find the interpreter of choice
     • Hit edit in the top right
     • Check "Connect to existing process"
     • Set Host to: host.docker.internal
     • Set Port to: 9007
  6. You should now be able to create a notebook and get started!
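
The keypair and tunnel steps above could be wrapped in a couple of small helpers. This is only a rough sketch: the function names, the KEY variable, the argument order, and the RSA/4096 key options are my own assumptions, not part of the original steps. The helpers print the commands instead of running them, so you can check them before pasting. Both tunnel arguments should come from the "SSH tunnel to remote interpreter" command copied from the console.

```shell
#!/bin/sh
# Sketch: print (not run) the commands for the keypair and tunnel steps.
# KEY and both function names are illustrative, not from the original gist.
KEY="${KEY:-$HOME/.ssh/glue-dev}"

# Keypair step: prints an ssh-keygen command. The key type and size are an
# assumption; upload "$KEY.pub" when creating the Dev endpoint.
keygen_cmd() {
  echo "ssh-keygen -t rsa -b 4096 -f $KEY"
}

# Tunnel step: forwards local port 9007 to the endpoint.
#   $1 = remote interpreter address from the copied console command
#   $2 = endpoint hostname from the copied console command
tunnel_cmd() {
  echo "ssh -i $KEY -vnNT -L :9007:$1:9007 glue@$2"
}
```

For example, `tunnel_cmd REMOTE_ADDR ENDPOINT_HOST` prints the full ssh line with your key path filled in; REMOTE_ADDR and ENDPOINT_HOST are placeholders for the values in your own console.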

Note that for Glue 1.0 and later, use Zeppelin v0.8.1, not 0.7.3 as stated in the script.
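
One way to avoid hard-coding the tag is a tiny helper that picks the image from the Glue version. This is a sketch under assumptions: the helper names are made up, and it treats any version starting with "0." as pre-1.0 (paired with 0.7.3, the tag used in the gist) and everything else as 1.0 or later (paired with 0.8.1).

```shell
#!/bin/sh
# Hypothetical helpers: map a Glue version to a Zeppelin image tag.
zeppelin_tag() {
  case "$1" in
    0.*) echo "0.7.3" ;;   # pre-1.0 Glue: the tag used in the gist
    *)   echo "0.8.1" ;;   # Glue 1.0 and later, per the note above
  esac
}

# Full image name for use as the last argument of docker run.
zeppelin_image() {
  echo "apache/zeppelin:$(zeppelin_tag "$1")"
}
```

In the docker run command, the trailing apache/zeppelin:0.7.3 could then be replaced with "$(zeppelin_image 1.0)".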
