The following is based on this prompt to Claude:
Design a solution to build artifacts from several Git repos that all use the same compiler flags. We do not want to use a central makefile for this. The repos are all built using a specific Docker image that includes all the build tools. How should we store and cache the artifacts, and retrieve the relevant versions?
I'll design a solution that focuses on consistency, reusability, and efficient artifact management. Here's a comprehensive approach:
- Compiler Flags Management:
# compiler-config.yaml (stored in a separate config repo)
compiler:
  version: "gcc-11.2"  # informational only; the toolchain itself is pinned by the build image
  flags:
    common:
      - "-O2"
      - "-Wall"
      - "-Wextra"
    cpp:
      - "-std=c++17"
    c:
      - "-std=c11"
  defines:
    - "ENABLE_LOGGING=1"
- Build Script Template (for each repo):
#!/bin/bash
# build.sh
set -euo pipefail
# Download the shared compiler config
curl -fsSL "https://config-repo/compiler-config.yaml" -o compiler-config.yaml
# Generate a build hash based on:
# - the compiler config
# - the tracked source files and dependency manifests
#   (git ls-files -s lists their blob hashes, so any content change changes the hash)
BUILD_HASH=$( (cat compiler-config.yaml; git ls-files -s) | sha256sum | cut -d' ' -f1)
# Skip the build if artifacts for this hash already exist
# (check_artifact_cache/download_artifacts wrap the artifact storage layer; see the sketch below)
if check_artifact_cache "${BUILD_HASH}"; then
    download_artifacts "${BUILD_HASH}"
    exit 0
fi
# Build inside the shared tool image
docker run --rm \
    -v "$(pwd):/src" \
    -v "${HOME}/.cache/artifacts:/cache" \
    -w /src \
    build-tools:latest \
    ./docker-build.sh "${BUILD_HASH}"
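The check_artifact_cache and download_artifacts helpers above are left abstract. One way to back them, assuming the ArtifactStorage class shown later in this design, is a small Python CLI that the shell functions delegate to (the script name and layout are illustrative, not part of the original design):

#!/usr/bin/env python3
# artifact_cache_cli.py -- hypothetical helper behind check_artifact_cache/download_artifacts
import sys
import tarfile

from artifact_storage import ArtifactStorage  # the class defined later in this design

def main() -> int:
    command, build_hash = sys.argv[1], sys.argv[2]
    storage = ArtifactStorage()
    if command == "check":
        # Exit 0 if the artifact exists locally or in S3, 1 otherwise
        return 0 if storage.retrieve(build_hash) else 1
    if command == "download":
        path = storage.retrieve(build_hash)
        if path is None:
            return 1
        # The archive already contains the build/ prefix, so extract in place
        with tarfile.open(path, "r:gz") as tar:
            tar.extractall(".")
        return 0
    return 2

if __name__ == "__main__":
    sys.exit(main())

build.sh would then define the shell helpers as thin wrappers, for example check_artifact_cache invoking `python3 artifact_cache_cli.py check "$1"`.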
- Docker Build Script:
#!/bin/bash
# docker-build.sh
set -euo pipefail
BUILD_HASH=$1
# Load the shared compiler config
# (parse_yaml is assumed to be a small helper baked into the build image that
#  exports the YAML keys as shell variables; a Python alternative is sketched below)
eval "$(parse_yaml compiler-config.yaml)"
# Set up compiler flags from the shared config
export CFLAGS="${compiler_flags_common} ${compiler_flags_c}"
export CXXFLAGS="${compiler_flags_common} ${compiler_flags_cpp}"
# Build the project (CMake reads CFLAGS/CXXFLAGS from the environment at configure time)
cmake -B build -S .
cmake --build build
# Package artifacts
tar czf artifacts.tar.gz build/
# Store in cache (store_artifact_cache wraps the artifact storage layer)
store_artifact_cache "${BUILD_HASH}" artifacts.tar.gz
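Because parse_yaml is not a standard tool, one concrete alternative is a small Python helper that flattens the config into flag strings and prints shell export lines for docker-build.sh to eval. This is a sketch, assuming PyYAML is available in the build image; the script name is illustrative:

#!/usr/bin/env python3
# print_flag_exports.py -- hypothetical stand-in for parse_yaml
import shlex
import yaml  # PyYAML, assumed to be installed in the build image

def main() -> None:
    with open("compiler-config.yaml") as f:
        cfg = yaml.safe_load(f)["compiler"]

    common = cfg["flags"]["common"]
    defines = [f"-D{d}" for d in cfg.get("defines", [])]

    cflags = common + cfg["flags"]["c"] + defines
    cxxflags = common + cfg["flags"]["cpp"] + defines

    # Emit lines suitable for: eval "$(python3 print_flag_exports.py)"
    print(f"export CFLAGS={shlex.quote(' '.join(cflags))}")
    print(f"export CXXFLAGS={shlex.quote(' '.join(cxxflags))}")

if __name__ == "__main__":
    main()

This variant also applies the defines section, which the shell version above leaves out.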
- Artifact Storage System:
# artifact_storage.py
import os
import shutil
from typing import Optional

import boto3
from botocore.exceptions import ClientError

class ArtifactStorage:
    def __init__(self):
        self.s3 = boto3.client('s3')
        self.bucket = 'build-artifacts'
        self.local_cache = os.path.expanduser('~/.cache/artifacts')
        os.makedirs(self.local_cache, exist_ok=True)

    def store(self, build_hash: str, artifact_path: str):
        # Store locally
        local_path = f"{self.local_cache}/{build_hash}"
        shutil.copy(artifact_path, local_path)
        # Store in S3
        self.s3.upload_file(artifact_path, self.bucket, build_hash)

    def retrieve(self, build_hash: str) -> Optional[str]:
        local_path = f"{self.local_cache}/{build_hash}"
        # Check local cache first
        if os.path.exists(local_path):
            return local_path
        # Fall back to S3
        try:
            self.s3.download_file(self.bucket, build_hash, local_path)
            return local_path
        except ClientError:
            return None
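For illustration, a typical round trip through this class (the build hash and paths are placeholders) might look like:

from artifact_storage import ArtifactStorage

storage = ArtifactStorage()
build_hash = "<sha256 computed in build.sh>"  # placeholder

# After a successful build: cache the packaged artifacts under the build hash
storage.store(build_hash, "artifacts.tar.gz")

# On any other machine: the local cache is tried first, then S3
cached = storage.retrieve(build_hash)
if cached is None:
    print("cache miss: a full build is required")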
- CI/CD Integration (GitHub Actions example):
# .github/workflows/build.yml
name: Build
on: [push, pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Cache artifacts
        uses: actions/cache@v4
        with:
          path: ~/.cache/artifacts
          key: ${{ runner.os }}-artifacts-${{ hashFiles('**/*') }}
      - name: Build
        run: ./build.sh
This solution provides:
- Centralized Compiler Configuration:
  - Stored in a separate repo
  - Easily updateable
  - Version controlled
  - Shared across all repos
- Efficient Caching:
  - Two-level cache (local and S3)
  - Hash-based artifact identification
  - Considers source code, dependencies, and compiler config
- Docker-based Building:
  - Consistent build environment
  - Isolated builds
  - Reproducible results
- CI/CD Integration:
  - Automated builds
  - Cache integration
  - Easy to extend
To retrieve specific versions:
# Retrieve by commit hash
./retrieve_artifact.sh "${COMMIT_HASH}"
# Retrieve by tag
./retrieve_artifact.sh "${TAG}"
The retrieve script would (see the Python sketch after the VersionTracker class below):
1. Look up the build hash recorded for that Git ref
2. Check the local cache
3. Fall back to S3 storage
4. Download and extract the artifacts
For version tracking:
from datetime import datetime
from typing import Optional

import boto3

class VersionTracker:
    def __init__(self):
        self.dynamodb = boto3.resource('dynamodb')
        self.table = self.dynamodb.Table('version-tracking')

    def store_version(self, git_ref: str, build_hash: str):
        # Record which build hash a given commit/tag produced
        self.table.put_item(
            Item={
                'git_ref': git_ref,
                'build_hash': build_hash,
                'timestamp': datetime.now().isoformat()
            }
        )

    def get_build_hash(self, git_ref: str) -> Optional[str]:
        response = self.table.get_item(Key={'git_ref': git_ref})
        return response.get('Item', {}).get('build_hash')
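Tying the pieces together, here is a sketch of what retrieve_artifact.sh could delegate to; the script and module names are illustrative, and it simply walks the four steps listed above:

#!/usr/bin/env python3
# retrieve_artifact.py -- hypothetical backing for retrieve_artifact.sh
import sys
import tarfile

from artifact_storage import ArtifactStorage
from version_tracker import VersionTracker  # module name assumed for this sketch

def retrieve(git_ref: str) -> bool:
    # 1. Resolve the Git ref (commit hash or tag) to a recorded build hash
    build_hash = VersionTracker().get_build_hash(git_ref)
    if build_hash is None:
        print(f"no recorded build for {git_ref}")
        return False
    # 2/3. Check the local cache, then fall back to S3
    archive = ArtifactStorage().retrieve(build_hash)
    if archive is None:
        print(f"no cached artifacts for build {build_hash}")
        return False
    # 4. Extract the artifacts (the archive contains the build/ directory)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(".")
    return True

if __name__ == "__main__":
    sys.exit(0 if retrieve(sys.argv[1]) else 1)

A CI job would call store_version(git_ref, build_hash) after each successful build so this mapping stays current.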
This solution scales well and provides consistent builds across all repositories while maintaining independence between them.