@marcinantkiewicz
Last active February 25, 2026 05:56
Create a GitHub issue listing the images used in Dockerfiles in specified or all repositories in a GitHub org.

Note:

  • The default GH token does not allow reads from other repos, so I use a GH App to authenticate the action.
  • The GH search API has vicious rate limits; a 3s sleep is not enough, or I am getting labelled as a bot. WTF Microsoft?
  • This opens one issue listing all the images in a table |repo|dockerfile|image|. It should handle multi-stage Dockerfiles.
  • The way it finds Dockerfiles is dumb: search for anything with "dockerfile" in the name, then look for FROM lines... works fine on my computer.
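
The FROM-line extraction mentioned in the last note can be tried locally before running the workflow. This is a minimal sketch: the awk program is the one the workflow uses, while the helper name `extract_images` and the sample Dockerfile are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# extract_images is a hypothetical helper name; it reads a Dockerfile on stdin
# and prints the image from every FROM line, joined with ", ".
extract_images() {
  # If the second field is a flag such as --platform=..., the image is the third field.
  awk 'toupper($1) == "FROM" { if ($2 ~ /^--/) print $3; else print $2 }' \
    | paste -sd "," - \
    | sed 's/,/, /g'
}

# Illustrative multi-stage Dockerfile (not taken from any real repo).
sample='FROM --platform=linux/amd64 golang:1.22 AS build
RUN go build ./...
FROM alpine:3.20
COPY --from=build /app /app'

result=$(extract_images <<< "$sample")
echo "$result"
```

A later stage written as `FROM build` is reported as if `build` were an image, because the awk program does not track stage aliases.
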
name: List docker images
on:
  schedule:
    - cron: '0 8 * * *' # 08:00 UTC, roughly midnight to 4am across US time zones
  workflow_dispatch:
    inputs:
      repo_scope:
        description: "Which repos to scan: the default repo or all repos in the org"
        required: true
        default: "default"
        type: choice
        options:
          - default
          - all

jobs:
  list-dockerfiles:
    runs-on: ubuntu-latest
    permissions:
      issues: write
      contents: read
    steps:
      - name: generate token
        id: generate_token
        uses: actions/create-github-app-token@v2
        with:
          app-id:      ${{ secrets.SECURITY_REPORTER_APP_ID }}
          private-key: ${{ secrets.SECURITY_REPORTER_PRIVATE_KEY }}
          owner:       ${{ github.repository_owner }}

      - name: Search and List Dockerfiles
        env:
          GH_TOKEN: ${{ steps.generate_token.outputs.token }}
          ORG: ${{ github.repository_owner }}
          SCOPE: ${{ github.event.inputs.repo_scope }}
          #DEFAULT_REPOS: "myreponame"
          DEFAULT_REPOS: ${{ github.event.repository.name }} # repo where the action runs
        run: | 
          set -euo pipefail
          summary() { echo -e "$*" | tee -a "$GITHUB_STEP_SUMMARY"; }
          report()  { echo -e "$*" | tee -a report.md; }
          
          report "## Dockerfile Audit Results"                
          report "| Repository | Path | Base Image (FROM) |" 
          report "|------------|------|-------------------|" 
        
          summary "SCOPE: $SCOPE" 
          
          if [ "$SCOPE" == "all" ]; then
            REPOS=$(gh repo list "$ORG" --limit 200 --json name)
          else
            REPOS=$(printf '{"name":"%s"}' "$DEFAULT_REPOS" | jq -s '.')
          fi
        
          summary "\nREPOS: $REPOS"
          
          echo "$REPOS" | jq -r '.[].name' | while read -r REPO; do
            [ -z "$REPO" ] && continue;
          
            summary "### Processing \`$REPO\`"
          
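            # gh exits non-zero on HTTP errors but still prints the response body,
            # so handle the failure here instead of letting `set -e` abort the script.
            if ! RAW_RESPONSE=$(gh api "search/code?q=filename:Dockerfile+repo:$ORG/$REPO"); then
              if grep -qi "rate limit" <<< "$RAW_RESPONSE"; then
                summary "FATAL: Rate limit hit. Exiting."
              else
                summary "FATAL: code search failed for \`$REPO\`."
              fi
              exit 1
            fi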
          
            {
              echo "<details>"
              echo "<summary>Raw Response from GH Search API</summary>"
              echo "" 
              echo "\`\`\`json"
              echo "$RAW_RESPONSE"
              echo "\`\`\`"
              echo ""
              echo "</details>"
            } >> "$GITHUB_STEP_SUMMARY"
          
            # Open one collapsible section per repo; it is closed after the file loop.
            {
              echo "<details>"
              echo "<summary>Files inspected in $REPO</summary>"
              echo ""
            } >> "$GITHUB_STEP_SUMMARY"

            echo "$RAW_RESPONSE" | jq -r '.items[]?.path' 2>/dev/null | while read -r FILE_PATH; do
              [ -z "$FILE_PATH" ] && continue
          
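              # Skip paths containing braces, brackets, or spaces.
              # The regex is kept in a variable because escaping it inline in [[ =~ ]] is a pain;
              # bash `test` does not support regex, and I don't want to run that through grep.
              # Inside a bracket expression, `]` must come first to be taken literally.
              pattern='[][{} ]'
              [[ "$FILE_PATH" =~ $pattern ]] && continue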
          
              echo "-- processing: $FILE_PATH"
              echo "getting raw file from: repos/$ORG/$REPO/contents/$FILE_PATH"
              CONTENT=$(gh api "repos/$ORG/$REPO/contents/$FILE_PATH" -H "Accept: application/vnd.github.raw" || true)
              [ -z "$CONTENT" ] && { echo "ERROR: failed to retrieve file: $FILE_PATH"; continue; }
              echo "content received, parsing for image name"
              
              IMAGES=$(echo "$CONTENT" | awk 'toupper($1) == "FROM" { if ($2 ~ /^--/) print $3; else print $2 }' | paste -sd "," - | sed 's/,/, /g')
              echo "image parsing done, result: $IMAGES"
              
              summary "* **inspecting file**: \`$FILE_PATH\`" 
             
              if [ -n "$IMAGES" ]; then
                report "| $REPO | \`$FILE_PATH\` | \`$IMAGES\` |" 
                summary "       _found image_: \`$IMAGES\`" 
              fi
          
              sleep $((4 + RANDOM % 2))
            done
            
            {
              echo ""
              echo "</details>"
            } >> "$GITHUB_STEP_SUMMARY"
          
            sleep $((5 + RANDOM % 2))
          done

      - name: Create Issue
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          gh issue create \
            --title "Dockerfile Audit - $(date +'%Y-%m-%d')" \
            --body-file report.md \
            --repo ${{ github.repository }}
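
The note about search rate limits suggests that fixed sleeps are fragile. One alternative is to retry a failed call with a growing delay. This is only a sketch under assumptions: `retry` is a hypothetical helper that the workflow above does not use, and doubling the delay is a common convention rather than a documented GitHub requirement.

```shell
#!/usr/bin/env bash
set -euo pipefail

# retry is a hypothetical helper: run a command up to max_tries times,
# doubling the wait (in seconds) after each failed attempt.
# Usage: retry <max_tries> <initial_delay> <command> [args...]
retry() {
  local max_tries=$1 delay=$2
  shift 2
  local attempt=1
  until "$@"; do
    if (( attempt >= max_tries )); then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed, sleeping ${delay}s" >&2
    sleep "$delay"
    delay=$(( delay * 2 ))
    attempt=$(( attempt + 1 ))
  done
}

# Possible use inside the workflow script (not executed here):
#   RAW_RESPONSE=$(retry 4 5 gh api "search/code?q=filename:Dockerfile+repo:$ORG/$REPO")
```

Progress messages go to stderr, so the command's stdout can still be captured with `$(...)`.
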