Created
August 18, 2024 13:14
Bootstrapping a working Jupyter notebook environment for Spark/ML book (python)
## This git repo contains the example code and solutions to the exercises in the O'Reilly book:
##
## "Scaling Machine Learning with Spark: Distributed ML with MLlib, TensorFlow, and PyTorch" (2023) by Adi Polak

# You will need an Anaconda-style Python install, managed with Mamba:
# https://mamba.readthedocs.io/en/latest/installation/mamba-installation.html

git clone https://github.com/adipolak/ml-with-apache-spark.git

mamba create -n spark-jupyter-openjdk8-py3 -c conda-forge python=3.11 jupyter notebook openjdk=8 findspark
mamba activate spark-jupyter-openjdk8-py3
mamba install pyspark

cd ml-with-apache-spark

# Start Jupyter, paste the tokenized URL it prints into your favorite browser,
# then browse to a book-example notebook.
jupyter-notebook
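
Before launching Jupyter, it can help to confirm that the activated environment actually exposes the tools the steps above install. The following is a minimal sketch, not part of the original gist: `missing_tools` is a hypothetical helper name, and the list of commands to check is an assumption based on the packages named in the `mamba create` line.

```shell
# Hypothetical helper (not from the original gist): print the name of every
# command given as an argument that cannot be found on PATH.
# Prints nothing when all of the commands are available.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Run inside the activated environment, `missing_tools python java jupyter` should print nothing if the install succeeded. As a further check, `python -c 'import pyspark; print(pyspark.__version__)'` should print a version number if the `mamba install pyspark` step worked.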