This is how I quickly got an Apache Zeppelin notebook running against an AWS Glue Dev endpoint. None of the guides out there seemed concise, and I found some custom Docker containers doing what you can easily do yourself. This approach keeps you in control: it sets up port forwarding and runs the official Docker image.
- Create your Glue Dev endpoint (this involves creating a keypair; I just used ssh-keygen)
- Once READY, select it and copy the "SSH tunnel to remote interpreter"
- eg: ssh -i <private-key.pem> -vnNT -L :9007:169.254.76.1:9007 glue@<ec2-endpoint>.<region>.compute.amazonaws.com
- Connect to the endpoint in a terminal session, modifying the above to match:
ssh -i ~/.ssh/glue-dev -vnNT -L :9007:127.0.0.1:9007 glue@<ec2-endpoint>.<region>.compute.amazonaws.com
- Run the Apache Zeppelin Docker container
docker run -p 8080:8080 --rm -v $PWD/logs:/logs -v $PWD/notebook:/notebook -e ZEPPELIN_LOG_DIR='/logs' -e ZEPPELIN_NOTEBOOK_DIR='/notebook' --name zeppelin apache/zeppelin:0.7.3
- Update your interpreters to use the existing process (the AWS Glue endpoint).
- Find the interpreter of choice
- Click edit in the top right
- Check "Connect to existing process"
- Set Host to:
host.docker.internal
- Set Port to:
9007
- You should now be able to create a notebook and get started!
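The tunnel step above can be sketched as a small shell helper, which makes it easier to swap in your own key and endpoint. This is only an illustration: `glue_tunnel_cmd` and its three arguments are hypothetical names, not part of any AWS tooling, and the function just prints the command so you can review it before running it.

```shell
#!/bin/sh
# Sketch only: glue_tunnel_cmd is a hypothetical helper, not an AWS tool.
# Usage: glue_tunnel_cmd <private-key-path> <remote-address> <endpoint-host>
glue_tunnel_cmd() {
  key=$1          # path to the private key created with ssh-keygen
  remote_addr=$2  # address forwarded to on the endpoint side
  endpoint=$3     # public DNS name of the Glue Dev endpoint

  # The empty bind address before the first 9007 makes ssh listen on all
  # local interfaces, which is what lets the Docker container reach the
  # tunnel through host.docker.internal.
  printf 'ssh -i %s -vnNT -L :9007:%s:9007 glue@%s\n' \
    "$key" "$remote_addr" "$endpoint"
}
```

For example, `glue_tunnel_cmd ~/.ssh/glue-dev 127.0.0.1 "<ec2-endpoint>.<region>.compute.amazonaws.com"` prints the same command as in the steps above; paste it into a terminal and leave it running while you use the notebook.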
Note that for Glue 1.0 and later, use Zeppelin v0.8.1, not 0.7.3 as used in the docker command above.
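One way to avoid editing the image tag by hand is to pick it from the Glue version. The following is a minimal sketch: `zeppelin_tag` and `run_zeppelin` are hypothetical helper names, and the mapping assumes that any Glue version starting with `0.` (such as 0.9) pairs with Zeppelin 0.7.3 while everything else pairs with 0.8.1.

```shell
#!/bin/sh
# Sketch only: zeppelin_tag and run_zeppelin are hypothetical helpers.

# Print the Zeppelin image tag for a given Glue version.
# Assumption: versions beginning with "0." use 0.7.3; anything else
# (including an empty argument) falls through to 0.8.1.
zeppelin_tag() {
  case "$1" in
    0.*) echo "0.7.3" ;;
    *)   echo "0.8.1" ;;
  esac
}

# Run the official image with the same options as the command above,
# substituting only the tag.
run_zeppelin() {
  docker run -p 8080:8080 --rm \
    -v "$PWD/logs:/logs" -v "$PWD/notebook:/notebook" \
    -e ZEPPELIN_LOG_DIR='/logs' -e ZEPPELIN_NOTEBOOK_DIR='/notebook' \
    --name zeppelin "apache/zeppelin:$(zeppelin_tag "$1")"
}
```

With these defined, `run_zeppelin 1.0` starts `apache/zeppelin:0.8.1` and `run_zeppelin 0.9` starts `apache/zeppelin:0.7.3`.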