Building Consistent Artifacts from Multiple Git Repositories

The following is based on this prompt to Claude:

Design a solution to build artifacts from several Git repos that all use the same compiler flags. We do not want to use a central makefile for this. The repos are all built using a specific Docker image that includes all the build tools. How should we store and cache the artifacts, and retrieve the relevant versions?

I'll design a solution that focuses on consistency, reusability, and efficient artifact management. Here's a comprehensive approach:

  1. Compiler Flags Management:
# compiler-config.yaml (stored in a separate config repo)
compiler:
  version: "gcc-11.2" # We don't need to store tool versions, since they're part of the dev container image
  flags:
    common:
      - "-O2"
      - "-Wall"
      - "-Wextra"
    cpp:
      - "-std=c++17"
    c:
      - "-std=c11"
  defines:
    - "ENABLE_LOGGING=1"
  2. Build Script Template (for each repo):
#!/bin/bash
# build.sh

# Download compiler config
curl -s "https://config-repo/compiler-config.yaml" > compiler-config.yaml

# Generate a build hash from everything that affects the output:
# the compiler config plus all tracked files (git ls-files -s lists
# the blob hash of every tracked file, which also covers committed
# dependency lockfiles)
BUILD_HASH=$( (cat compiler-config.yaml; git ls-files -s) | sha256sum | cut -d' ' -f1)

# Check if artifacts exist in cache (the cache helpers are sketched below)
if check_artifact_cache ${BUILD_HASH}; then
    download_artifacts ${BUILD_HASH}
    exit 0
fi

# Build using Docker
docker run --rm \
    -v "$(pwd):/src" \
    -v "${HOME}/.cache/artifacts:/cache" \
    build-tools:latest \
    ./docker-build.sh ${BUILD_HASH}
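
The build script relies on check_artifact_cache, download_artifacts, and store_artifact_cache, which are never defined in the answer. A minimal sketch of these helpers, assuming the local cache directory used above and an S3 bucket named build-artifacts (matching the Python storage class later on):

# cache-helpers.sh (hypothetical; source from build.sh and docker-build.sh)
CACHE_DIR="${HOME}/.cache/artifacts"
S3_BUCKET="build-artifacts"

check_artifact_cache() {
    local hash=$1
    # Hit if the artifact is present locally or in S3
    [ -f "${CACHE_DIR}/${hash}" ] && return 0
    aws s3api head-object --bucket "${S3_BUCKET}" --key "${hash}" > /dev/null 2>&1
}

download_artifacts() {
    local hash=$1
    mkdir -p "${CACHE_DIR}"
    # Pull from S3 only if not already cached locally
    if [ ! -f "${CACHE_DIR}/${hash}" ]; then
        aws s3 cp "s3://${S3_BUCKET}/${hash}" "${CACHE_DIR}/${hash}"
    fi
    tar xzf "${CACHE_DIR}/${hash}"
}

store_artifact_cache() {
    local hash=$1 artifact=$2
    mkdir -p "${CACHE_DIR}"
    cp "${artifact}" "${CACHE_DIR}/${hash}"
    aws s3 cp "${artifact}" "s3://${S3_BUCKET}/${hash}"
}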
  3. Docker Build Script:
#!/bin/bash
# docker-build.sh

BUILD_HASH=$1

# Load compiler config into shell variables
# (parse_yaml is a placeholder, not a standard tool; see the yq sketch below)
eval $(parse_yaml compiler-config.yaml)

# Set up compiler flags
export CFLAGS="${compiler_flags_common} ${compiler_flags_c}"
export CXXFLAGS="${compiler_flags_common} ${compiler_flags_cpp}"

# Build project
cmake -B build -S .
cmake --build build

# Package artifacts
tar czf artifacts.tar.gz build/

# Store in cache
store_artifact_cache ${BUILD_HASH} artifacts.tar.gz
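
Note that parse_yaml is a placeholder rather than a standard utility. One concrete option, assuming the Go implementation of yq is installed in the build image, is to extract the flag lists directly:

# Hypothetical yq-based replacement for parse_yaml
COMMON_FLAGS=$(yq '.compiler.flags.common | join(" ")' compiler-config.yaml)
C_FLAGS=$(yq '.compiler.flags.c | join(" ")' compiler-config.yaml)
CPP_FLAGS=$(yq '.compiler.flags.cpp | join(" ")' compiler-config.yaml)
DEFINES=$(yq '.compiler.defines | map("-D" + .) | join(" ")' compiler-config.yaml)

export CFLAGS="${COMMON_FLAGS} ${C_FLAGS} ${DEFINES}"
export CXXFLAGS="${COMMON_FLAGS} ${CPP_FLAGS} ${DEFINES}"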
  4. Artifact Storage System:
# artifact_storage.py
import os
import shutil
from typing import Optional

import boto3
from botocore.exceptions import ClientError

class ArtifactStorage:
    def __init__(self):
        self.s3 = boto3.client('s3')
        self.bucket = 'build-artifacts'
        self.local_cache = os.path.expanduser('~/.cache/artifacts')
        os.makedirs(self.local_cache, exist_ok=True)

    def store(self, build_hash: str, artifact_path: str):
        # Store locally
        local_path = os.path.join(self.local_cache, build_hash)
        shutil.copy(artifact_path, local_path)

        # Store in S3
        self.s3.upload_file(artifact_path, self.bucket, build_hash)

    def retrieve(self, build_hash: str) -> Optional[str]:
        local_path = os.path.join(self.local_cache, build_hash)

        # Check local cache first
        if os.path.exists(local_path):
            return local_path

        # Fall back to S3
        try:
            self.s3.download_file(self.bucket, build_hash, local_path)
            return local_path
        except ClientError:
            return None
  5. CI/CD Integration (GitHub Actions example):
# .github/workflows/build.yml
name: Build

on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Cache artifacts
        uses: actions/cache@v4
        with:
          path: ~/.cache/artifacts
          key: ${{ runner.os }}-artifacts-${{ hashFiles('**/*') }}
      
      - name: Build
        run: ./build.sh

This solution provides:

  1. Centralized Compiler Configuration:
  • Stored in a separate repo
  • Easily updateable
  • Version controlled
  • Shared across all repos
  2. Efficient Caching:
  • Two-level cache (local and S3)
  • Hash-based artifact identification
  • Considers source code, dependencies, and compiler config
  3. Docker-based Building:
  • Consistent build environment
  • Isolated builds
  • Reproducible results
  4. CI/CD Integration:
  • Automated builds
  • Cache integration
  • Easy to extend

To retrieve specific versions:

# Retrieve by commit hash
./retrieve_artifact.sh ${COMMIT_HASH}

# Retrieve by tag
./retrieve_artifact.sh ${TAG}

# The retrieve script (sketched below) would:
# 1. Look up the build hash for that version
# 2. Check the local cache
# 3. Check S3 storage
# 4. Download and extract the artifacts
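
retrieve_artifact.sh is referenced but never shown. A minimal sketch, assuming the version-tracking DynamoDB table described next and the cache helpers sketched earlier:

#!/bin/bash
# retrieve_artifact.sh (hypothetical)
source ./cache-helpers.sh
GIT_REF=$1

# 1. Look up the build hash recorded for this commit hash or tag
BUILD_HASH=$(aws dynamodb get-item \
    --table-name version-tracking \
    --key "{\"git_ref\": {\"S\": \"${GIT_REF}\"}}" \
    --query 'Item.build_hash.S' --output text)

if [ -z "${BUILD_HASH}" ] || [ "${BUILD_HASH}" = "None" ]; then
    echo "No build recorded for ${GIT_REF}" >&2
    exit 1
fi

# 2-4. Check the local cache, fall back to S3, then extract
download_artifacts "${BUILD_HASH}"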

For version tracking:

# version_tracker.py
from datetime import datetime
from typing import Optional

import boto3

class VersionTracker:
    def __init__(self):
        # Assumes a DynamoDB table with partition key 'git_ref'
        self.dynamodb = boto3.resource('dynamodb')
        self.table = self.dynamodb.Table('version-tracking')

    def store_version(self, git_ref: str, build_hash: str):
        self.table.put_item(
            Item={
                'git_ref': git_ref,
                'build_hash': build_hash,
                'timestamp': datetime.now().isoformat()
            }
        )

    def get_build_hash(self, git_ref: str) -> Optional[str]:
        response = self.table.get_item(Key={'git_ref': git_ref})
        return response.get('Item', {}).get('build_hash')
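
In CI, the git ref to build hash mapping can be recorded right after a successful build. An aws CLI equivalent of store_version (table name and GitHub Actions environment variables assumed):

# Record git ref -> build hash after a successful build (hypothetical)
aws dynamodb put-item \
    --table-name version-tracking \
    --item "{
        \"git_ref\":    {\"S\": \"${GITHUB_REF_NAME}\"},
        \"build_hash\": {\"S\": \"${BUILD_HASH}\"},
        \"timestamp\":  {\"S\": \"$(date -u +%Y-%m-%dT%H:%M:%SZ)\"}
    }"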

This solution scales well and provides consistent builds across all repositories while maintaining independence between them.
