@noelbundick
Last active October 24, 2024 18:04
Consuming packages from a private Azure Pipelines Python artifact feed

Consuming Azure Pipelines Python artifact feeds in Docker

Recently, I was building out a set of Python packages and services and needed to find a way to pull down packages from an Azure Artifacts feed into a Docker image. It was straightforward to use the tasks to package an artifact, authenticate to the feed, and publish.

I had to do a bit more digging to piece together a flow I was comfortable with for building container images. This post describes some of the challenges involved and how I solved for them.

What's the problem?

The PipAuthenticate task is great - it authenticates with your artifacts feed and per the docs, will store the location of a config file that can be used to connect in the PYPIRC_PATH environment variable.

That said, containers run in an isolated environment by design, so a build can't directly read that config file from the Azure Pipelines host. We need a way to get the credentials into the build phase so that our calls to python -m pip install succeed. You are using a virtual environment & python -m pip install to install packages, right?

Challenge 1: No volumes at build time!

Docker doesn't currently support* mounting volumes at build time. So we can't just mount our PYPIRC_PATH file from the Azure Pipelines host into the build.

It would be much easier to pass a string as a --build-arg to Docker and then consume it. Azure Pipelines tasks are open source on GitHub, so I thought I'd take a look to see how the task worked and possibly extend it. It turns out that the PipAuthenticate task has some undocumented behavior (ahem, bonus features) and it already does what I want! It populates the PIP_EXTRA_INDEX_URL environment variable, which pip picks up automatically.

*Well, sort of! You can solve this with --mount=type=secret when you enable BuildKit. If this were a personal project, I'd have stopped there and said #shipit! In this case, I was looking for something that works for all users and isn't explicitly marked "experimental".
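
For reference, here's a rough sketch of what the BuildKit route could look like. It assumes a Docker version with BuildKit available. The secret id pip_index_url, the my-container:buildkit tag, and both function names are placeholders for illustration, not part of the pipeline attached to this post.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Write a Dockerfile whose pip step reads the index URL from a BuildKit secret.
# The secret is mounted at /run/secrets/<id> for that one RUN step only.
write_dockerfile() {
  local target_dir="$1"
  cat > "${target_dir}/Dockerfile" <<'EOF'
# syntax=docker/dockerfile:1
FROM python:3.6-alpine
COPY requirements.txt .
RUN --mount=type=secret,id=pip_index_url \
    PIP_EXTRA_INDEX_URL="$(cat /run/secrets/pip_index_url)" \
    python -m pip install -r requirements.txt
EOF
}

# Build the image, handing the URL to BuildKit through a temporary file.
# Runs in a subshell so the EXIT trap always removes the file afterwards.
build_with_secret() (
  context_dir="$1"
  secret_file="$(mktemp)"
  trap 'rm -f "${secret_file}"' EXIT
  printf '%s' "${PIP_EXTRA_INDEX_URL:?PIP_EXTRA_INDEX_URL must be set}" > "${secret_file}"
  DOCKER_BUILDKIT=1 docker build \
    --secret "id=pip_index_url,src=${secret_file}" \
    -t my-container:buildkit \
    "${context_dir}"
)
```

On an agent where PipAuthenticate has already run, calling write_dockerfile . and then build_with_secret . should build the image without the URL showing up in a build arg or an ENV instruction.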

Challenge 2: Keep it secret, keep it safe!

Great! We pass in our build arg, set ENV PIP_EXTRA_INDEX_URL=$PIP_EXTRA_INDEX_URL and call it a day, right? Right...?

Not so fast - we want to have PIP_EXTRA_INDEX_URL available when we pull packages, but we don't want secret environment variables baked into any of the layers of a runtime image. So we'll combine what we've learned so far with a multi-stage build and we're off to the races!
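
One way to sanity-check that nothing leaked is to look at the finished image's environment and layer history. The sketch below assumes the docker CLI is available wherever you run it; the function names are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Returns 0 when the given text mentions PIP_EXTRA_INDEX_URL, 1 otherwise.
leaks_index_url() {
  [[ "$1" == *PIP_EXTRA_INDEX_URL* ]]
}

# Check the final image's environment and the commands that created its layers.
check_image() {
  local image="$1" env_json layer_history
  env_json="$(docker inspect --format '{{json .Config.Env}}' "${image}")"
  layer_history="$(docker history --no-trunc --format '{{.CreatedBy}}' "${image}")"
  if leaks_index_url "${env_json}" || leaks_index_url "${layer_history}"; then
    echo "PIP_EXTRA_INDEX_URL found in ${image}" >&2
    return 1
  fi
  echo "no PIP_EXTRA_INDEX_URL in ${image}"
}
```

Usage would be something like check_image myregistry.azurecr.io/my-container:latest. Note that this only inspects the final image: the builder stage, including its ENV line, most likely still sits in the local build cache of the machine that ran the build.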

Bonus!

In my real container build, I needed to install gcc, musl-dev, python3-dev and a bunch of other things to pull down my dependencies & build wheels, so a multi-stage build drops my final image size from >1GB down to ~100MB anyway.

Wrapping up

I've attached a few sample files that I pulled from my working pipeline to get you started with this approach. I hope this helps! My plan is for this post to become obsolete soon, once I land a few pull requests into the Microsoft docs. :)

trigger:
- master

pool:
  vmImage: ubuntu-16.04

variables:
  artifactFeed: myfeed # the name of an Azure artifacts feed
  azureSubscription: mysubscription # the name of an Azure Resource Manager Service Connection
  azureContainerRegistry: myregistry # the name of an Azure Container Registry
  imageName: my-container

steps:
- task: AzureCLI@1
  displayName: Login to ACR
  inputs:
    azureSubscription: $(azureSubscription)
    scriptLocation: inlineScript
    inlineScript: |
      az acr login -n $(azureContainerRegistry)

# This task populates the PIP_EXTRA_INDEX_URL environment variable
# https://github.com/microsoft/azure-pipelines-tasks/blob/7eab2bc96011927a971f2613ce6e85d93ee9b3f1/Tasks/PipAuthenticateV0/pipauthenticatemain.ts#L60
- task: PipAuthenticate@0
  displayName: Authenticate with artifact feed
  inputs:
    artifactFeeds: $(artifactFeed)

# Docker build w/ a build arg to pass PIP_EXTRA_INDEX_URL into the build phase
- bash: |
    docker build \
      --build-arg 'INDEX_URL=$(PIP_EXTRA_INDEX_URL)' \
      -t $(azureContainerRegistry).azurecr.io/$(imageName):$(Build.BuildNumber) \
      -t $(azureContainerRegistry).azurecr.io/$(imageName):latest \
      .
    docker push $(azureContainerRegistry).azurecr.io/$(imageName):$(Build.BuildNumber)
    docker push $(azureContainerRegistry).azurecr.io/$(imageName):latest
  displayName: Build and push container

trigger:
- master

pool:
  vmImage: ubuntu-16.04

variables:
  artifactFeed: myfeed # the name of an Azure artifacts feed

steps:
- task: UsePythonVersion@0
  inputs:
    versionSpec: 3.6

- script: python -m pip install -U pip setuptools wheel twine
  displayName: Install build tools

- script: python setup.py bdist_wheel
  displayName: Build package

- task: TwineAuthenticate@0
  displayName: Configure twine authentication
  inputs:
    artifactFeeds: $(artifactFeed)

- script: twine upload -r $(artifactFeed) --config-file $(PYPIRC_PATH) dist/*
  displayName: Publish artifacts

# We set an environment variable in this phase so it gets picked up by pip, but we don't want to bake secrets into our container image
FROM python:3.6-alpine AS builder

ARG INDEX_URL
ENV PIP_EXTRA_INDEX_URL=$INDEX_URL

COPY requirements.txt .
RUN pip install -U pip \
    && pip install --user -r requirements.txt

# We use a multistage build to start fresh (and drop PIP_EXTRA_INDEX_URL from our published image)
FROM python:3.6-alpine

COPY --from=builder /root/.local /root/.local
ENV PATH=/root/.local/bin:$PATH

WORKDIR /app
COPY app .
CMD ["python", "app.py"]

MIT License

Copyright (c) 2019 Noel Bundick

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

@FurcyPin

FurcyPin commented Mar 3, 2021

Security advice:

As demonstrated in this excellent blog article and as recommended in this Azure white paper, it is strongly advised to use PIP_INDEX_URL instead of PIP_EXTRA_INDEX_URL to mitigate the risk of package substitution.

Thanks for the gist, by the way. It was very useful and worked nicely for me.
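
A minimal sketch of that swap for a plain pip call, assuming the variable holds a single feed URL and that the feed has PyPI configured as an upstream source (otherwise public packages would no longer resolve). The function name with_primary_index is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Run a command with the feed as pip's only index: PIP_INDEX_URL is set to the
# given URL and PIP_EXTRA_INDEX_URL is removed from the command's environment.
with_primary_index() {
  local feed_url="$1"
  shift
  env -u PIP_EXTRA_INDEX_URL PIP_INDEX_URL="${feed_url}" "$@"
}
```

For example: with_primary_index "$PIP_EXTRA_INDEX_URL" python -m pip install -r requirements.txt. In the Dockerfile from the gist, the equivalent change would be setting ENV PIP_INDEX_URL=$INDEX_URL in the builder stage instead of PIP_EXTRA_INDEX_URL.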

@axelPalmerin

Don't know if it's an Azure issue or a docker-compose one, but I adapted this example to run a docker-compose build instead, and it keeps failing because something adds an empty space at the beginning of the PIP_EXTRA_INDEX_URL. The task is defined as:

    - task: DockerCompose@0
      displayName: 'Build Docker images'
      inputs:
        azureSubscription: ...
        azureContainerRegistry: ...
        projectName: ...
        dockerComposeCommand: build --build-arg 'INDEX_URL=$(PIP_EXTRA_INDEX_URL)'

But the execution runs (note the space between = and https://build):

[command]/usr/local/bin/docker-compose -f /home/vsts/work/1/s/docker-compose.yml -f /home/vsts/agents/2.153.1/.docker-compose.1560464790310.yml build --build-arg 'INDEX_URL= https://build:***@...'

Do you know what could be the issue?

Add this Bash step to trim the variable.

        - task: Bash@3
          displayName: trim index_url var
          inputs:
            targetType: 'inline'
            script: |
              # Strip the stray space, then publish the result as a pipeline variable
              PIP_INDEX_URL=`echo '$(PIP_EXTRA_INDEX_URL)' | sed 's/ //g'`
              echo $PIP_INDEX_URL
              echo "##vso[task.setvariable variable=PIP_INDEX_URL]$PIP_INDEX_URL"

Then just use the PIP_INDEX_URL var
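
A slightly more conservative variant, as a sketch: sed 's/ //g' deletes every space, which would also glue the entries together if the variable ever held more than one space-separated URL. The hypothetical trim helper below strips only leading and trailing whitespace, using plain shell parameter expansion.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the argument with leading and trailing whitespace removed.
# Whitespace inside the value is left untouched.
trim() {
  local value="$1"
  value="${value#"${value%%[![:space:]]*}"}"   # drop leading whitespace
  value="${value%"${value##*[![:space:]]}"}"   # drop trailing whitespace
  printf '%s' "${value}"
}
```

Inside the inline script this could replace the sed call, for example PIP_INDEX_URL="$(trim "$RAW_URL")", where RAW_URL is an assumed shell variable holding the untrimmed value.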
