@ohe
Created September 7, 2016 16:17
Dockerfile to build latest pyarrow version
FROM ubuntu:xenial
RUN apt-get update && apt-get install -y \
    automake \
    bison \
    cmake \
    curl \
    flex \
    g++ \
    git \
    libboost-all-dev \
    libevent-dev \
    libssl-dev \
    libtool \
    pkg-config \
    wget
WORKDIR /src
RUN wget https://bootstrap.pypa.io/get-pip.py && python get-pip.py
RUN wget http://apache.crihan.fr/dist/thrift/0.9.3/thrift-0.9.3.tar.gz && tar -xvzf thrift-0.9.3.tar.gz
RUN git clone https://github.com/apache/parquet-cpp.git && git clone git://git.apache.org/arrow.git
RUN cd thrift-0.9.3 && cmake . && make install
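# Download and build the third-party dependencies bundled with parquet-cpp.
RUN cd parquet-cpp && \
    ./thirdparty/download_thirdparty.sh && \
    ./thirdparty/build_thirdparty.sh
# Build and install parquet-cpp with the third-party environment loaded.
RUN cd parquet-cpp && \
    /bin/bash -c ". ./thirdparty/set_thirdparty_env.sh && cmake . && make install"
# Repeat the same two steps for the Arrow C++ library.
RUN cd arrow/cpp && \
    ./thirdparty/download_thirdparty.sh && \
    ./thirdparty/build_thirdparty.sh
RUN cd arrow/cpp && \
    /bin/bash -c ". ./thirdparty/set_thirdparty_env.sh && cmake . && make install"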
ENV ARROW_HOME=/usr/local
RUN pip install pandas cython
RUN cd arrow/python && \
    python setup.py build_ext --inplace

ohe commented Sep 8, 2016

The pyarrow setup script does not yet support an install target, so the package is only built in place. To import and test pyarrow, you first need to cd into /src/arrow/python.
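
One possible way to avoid the manual `cd` is sketched below. It is an untested assumption, not part of the original Dockerfile. Since `build_ext --inplace` leaves the compiled extension modules inside the source tree, making that tree the working directory and adding it to `PYTHONPATH` should let `import pyarrow` resolve from anywhere in the container.

```dockerfile
# Hypothetical additions, appended after the final build_ext step.
# Start containers in the directory that holds the in-place build.
WORKDIR /src/arrow/python
# Let Python find the in-place package from other directories as well.
ENV PYTHONPATH=/src/arrow/python
```

With these lines in place, running `python -c "import pyarrow"` in a container started from the image would serve as a quick smoke test.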
