@pablosjv
Created August 27, 2021 11:58
Large Scale Pytorch Inference Pipeline: Spark vs Dask - Code Examples
FROM amazoncorretto:8
ENV PYSPARK_DRIVER_PYTHON=python3
ENV PYSPARK_PYTHON=python3
# Update the base image, then install build tooling and Python 3 in a single layer.
RUN yum -y update \
    && yum -y groupinstall "Development Tools" \
    && yum -y install yum-utils which hostname python3-devel python-devel python3-pip python3-virtualenv \
    && yum clean all
RUN pip3 install --upgrade pip wheel setuptools
RUN pip3 install pipenv==2018.11.26
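WORKDIR /opt/emr-job/
# Copy only the dependency manifests first so this layer stays cached until they change.
COPY Pipfile Pipfile.lock ./
# --system: install into the system Python instead of a virtualenv.
# --deploy: fail the build if Pipfile.lock is out of date.
# --ignore-pipfile: install exactly what Pipfile.lock pins.
RUN pipenv install --system --deploy --ignore-pipfile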
# Copy the job source, build a wheel (written to dist/), and install the package into the image.
COPY . .
RUN python3 setup.py bdist_wheel
RUN python3 setup.py install