Recently, I was building out a set of Python packages and services and needed to find a way to pull down packages from an Azure Artifacts feed into a Docker image. It was straightforward to use the tasks to package an artifact, authenticate to the feed, and publish.
I had to do a bit more digging to piece together a flow I was comfortable with for building container images. This post describes some of the challenges involved and how I solved them.
The PipAuthenticate task is great - it authenticates with your Artifacts feed and, per the docs, exposes the location of a generated config file (which pip can use to connect) in the `PYPIRC_PATH` environment variable.
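For context, the task itself is only a few lines of pipeline YAML. Here's a minimal sketch - the feed name `my-feed` and the `onlyAddExtraIndex` setting are my assumptions about a typical setup, not something specific to my pipeline:

```yaml
steps:
  # Log in to the Azure Artifacts feed for subsequent pip steps.
  # 'my-feed' is a placeholder - substitute your own feed name.
  - task: PipAuthenticate@1
    inputs:
      artifactFeeds: 'my-feed'
      # Add the private feed as an *extra* index so that public PyPI
      # remains the primary package source.
      onlyAddExtraIndex: true
```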
That said - by design, containers run in an isolated environment, so we can't directly access that file while building a container image. We need a way to get that config inside the build phase so that our calls to `python -m pip install` are successful. You are using a virtual environment & `python -m pip install` to install packages, right?
Docker doesn't currently support* mounting volumes at build time. So we can't just mount our `PYPIRC_PATH` file from the Azure Pipelines host into the build.
It would be much easier to pass a string as a `--build-arg` to Docker and then consume it. Azure Pipelines tasks are open source on GitHub, so I thought I'd take a look to see how the task worked and possibly extend it. It turns out that the PipAuthenticate task has some undocumented bonus behavior, and it already does what I want! It populates the `PIP_EXTRA_INDEX_URL` environment variable, which is automatically picked up by pip.
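Concretely, here's what that looks like on the agent after the task runs - a sketch only: the URL below is an illustration with the token redacted, and `my-private-package` is a stand-in name:

```bash
# On the Azure Pipelines host, after the PipAuthenticate step:
$ echo "$PIP_EXTRA_INDEX_URL"
https://build:********@pkgs.dev.azure.com/myorg/_packaging/my-feed/pypi/simple/

# pip reads PIP_EXTRA_INDEX_URL automatically - no extra flags needed.
$ python -m pip install my-private-package
```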
*Well, sort of! You can solve this with `--mount=type=secret` when you enable BuildKit. If this were a personal project, I'd have stopped there and said #shipit! In this case, I was looking for something that works for all users and isn't explicitly marked "experimental".
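For completeness, here's roughly what the BuildKit route would look like - a sketch, not what I shipped, with a made-up secret id of `pip_url`:

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.8-slim

# The secret is mounted at /run/secrets/pip_url for this single RUN
# instruction only - it never lands in an image layer. Built with
# something like:
#   echo "$PIP_EXTRA_INDEX_URL" > pip_url.txt
#   DOCKER_BUILDKIT=1 docker build --secret id=pip_url,src=pip_url.txt .
RUN --mount=type=secret,id=pip_url \
    PIP_EXTRA_INDEX_URL="$(cat /run/secrets/pip_url)" \
    python -m pip install my-private-package
```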
Great! We pass in our build arg, set `ENV PIP_EXTRA_INDEX_URL=$PIP_EXTRA_INDEX_URL`, and call it a day, right? Right...?
Not so fast - we want to have `PIP_EXTRA_INDEX_URL` available when we pull packages, but we don't want secret environment variables baked into any of the layers of a runtime image. So we'll combine what we've learned so far with a multi-stage build, and we're off to the races!
In my real container build, I needed to install gcc, musl-dev, python3-dev, and a bunch of other things to pull down my dependencies & build wheels - so a multi-stage build drops my final image size from >1GB down to ~100MB anyway.
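Here's a trimmed sketch of that multi-stage Dockerfile - the base images, requirements layout, and entry point are illustrative stand-ins, not my actual files:

```dockerfile
# --- Build stage: the only place the credential ever exists -----------
FROM python:3.8-alpine AS builder

# Passed in via --build-arg; PipAuthenticate populated it on the host:
#   docker build --build-arg PIP_EXTRA_INDEX_URL=$PIP_EXTRA_INDEX_URL .
ARG PIP_EXTRA_INDEX_URL
ENV PIP_EXTRA_INDEX_URL=$PIP_EXTRA_INDEX_URL

# Toolchain needed to build wheels against musl.
RUN apk add --no-cache gcc musl-dev python3-dev

COPY requirements.txt .
RUN python -m venv /venv \
    && /venv/bin/python -m pip install -r requirements.txt

# --- Runtime stage: no ARG, no ENV, no compiler, no secrets -----------
FROM python:3.8-alpine

COPY --from=builder /venv /venv
COPY . /app
CMD ["/venv/bin/python", "/app/main.py"]
```

Since only the runtime stage's layers ship, the credential passed via `--build-arg` never appears in the final image's `docker history`.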
I've attached a few sample files that I pulled from my working pipeline to get you started with this approach. I hope this helps - and I expect this post to soon be obsolete, once I complete a few pull requests to the Microsoft docs! :)
I believe this is overly complicated and partially wrong (ENV in Dockerfile).
All I did was add `ARG PIP_EXTRA_INDEX_URL` in the Dockerfile, and in the yml file add the PipAuthenticate task and the build arg `PIP_EXTRA_INDEX_URL=$(PIP_EXTRA_INDEX_URL)`.
The use of the `INDEX_URL` is just an unnecessary detour. Reasoning:
What is the goal? To have the `PIP_EXTRA_INDEX_URL` environment variable set during the build and inside the Docker build. This is done by specifying `ARG PIP_EXTRA_INDEX_URL` in the Dockerfile to tell what environment variable can be passed in.

The `ENV` is not required in the Dockerfile and should not be present. ENV sets an environment variable for the running instance, so everybody can see it when attaching a shell to the docker image. This would only be needed when a pip install is executed inside a running container.

Thanks to your finding that PipAuthenticate sets the environment variable in the shell hosting the build, all we have to do is pass it in:
`PIP_EXTRA_INDEX_URL=$(PIP_EXTRA_INDEX_URL)`
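In pipeline YAML, that works out to something like the sketch below - the Docker@2 task and the `my-feed` name are assumptions about a typical setup:

```yaml
steps:
  - task: PipAuthenticate@1
    inputs:
      artifactFeeds: 'my-feed'   # placeholder feed name

  # PIP_EXTRA_INDEX_URL is now set on the agent; forward it as a build arg.
  - task: Docker@2
    inputs:
      command: build
      Dockerfile: Dockerfile
      arguments: '--build-arg PIP_EXTRA_INDEX_URL=$(PIP_EXTRA_INDEX_URL)'
```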