@brennanMKE
Created December 19, 2024 19:41
Web Server Mapping Bluesky Profiles

With this solution, there is no CGI programming. One inefficiency of CGI is that every request spawns a new process, which is costly when there are many requests. Instead, the work is done up front: a script generates static files from the JSON profile files. Incoming requests are then handled by the web server's rewrite rules, which run in-process and are fast.

To make a change, edit the JSON files and re-run the generation script. It writes its output to a temporary directory and then uses rsync to sync the result into the directory used by the rewrite rules, so removed profiles are also removed from the web server.
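
For illustration, the input layout the script expects can be sketched as follows. The key names url and did-plc come from the jq filters in the script; the scratch directory, handle, URL, and DID value are made-up placeholders, not real data.

```shell
#!/bin/bash
# Sketch: create a sample profile tree in a scratch directory.
# The handle, URL, and DID value below are invented placeholders.
PROFILE_DIR=$(mktemp -d)

# One folder per domain, one JSON file per user
mkdir -p "$PROFILE_DIR/acme.com"
cat > "$PROFILE_DIR/acme.com/user.json" <<'EOF'
{
  "url": "https://bsky.app/profile/user.acme.com",
  "did-plc": "did:plc:abc123examplexyz"
}
EOF

# List the profiles the generation script would pick up
find "$PROFILE_DIR" -name '*.json'
```

With this layout, user.acme.com would map to the file acme.com/user.json.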

(from ChatGPT)

Script: generate_profile_files.sh

#!/bin/bash

# Directory containing the JSON profile files
PROFILE_DIR="/var/www/profiles"

# Base directory for generated files
GENERATED_BASE="/var/www/generated"

# Temporary directory for generating output
TEMP_DIR=$(mktemp -d)

# Temporary directories for redirects and DID files
TEMP_REDIRECT_DIR="$TEMP_DIR/redirects"
TEMP_DID_DIR="$TEMP_DIR/dids"

# Create temporary directories
mkdir -p "$TEMP_REDIRECT_DIR" "$TEMP_DID_DIR"

# Loop through each domain folder
for domain in "$PROFILE_DIR"/*; do
    # Ensure it's a directory
    [[ -d "$domain" ]] || continue

    # Extract domain name (e.g., acme.com)
    domain_name=$(basename "$domain")

    # Create domain-specific directories in temporary folders
    mkdir -p "$TEMP_REDIRECT_DIR/$domain_name" "$TEMP_DID_DIR/$domain_name"

    # Process each JSON file in the domain folder
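    for file in "$domain"/*.json; do
        # Skip the unexpanded pattern when the folder has no JSON files
        [[ -f "$file" ]] || continue

        # Extract the base filename (e.g., user from user.json)
        base=$(basename "$file" .json)

        # Extract properties from the JSON file ("// empty" yields an empty
        # string instead of the literal "null" when a key is missing)
        url=$(jq -r '.url // empty' "$file")
        did=$(jq -r '.["did-plc"] // empty' "$file")

        # Generate the redirect file as CGI-style headers; Apache only turns
        # these into a 302 response when the file is served by mod_asis
        if [[ -n "$url" ]]; then
            printf 'Status: 302 Found\nLocation: %s\n\n' "$url" > "$TEMP_REDIRECT_DIR/$domain_name/$base"
        fi

        # Generate the DID file
        if [[ -n "$did" ]]; then
            echo "$did" > "$TEMP_DID_DIR/$domain_name/$base"
        fi
    done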
done

# Use rsync to update the generated directory, removing outdated files
rsync -av --delete "$TEMP_REDIRECT_DIR/" "$GENERATED_BASE/redirects/"
rsync -av --delete "$TEMP_DID_DIR/" "$GENERATED_BASE/dids/"

# Remove the temporary directory
rm -rf "$TEMP_DIR"

Explanation of Changes

  1. Temporary Directory:

    • All output is generated in a temporary directory ($TEMP_DIR).
    • Prevents partially written files from affecting the live environment.
  2. rsync:

    • Synchronizes the temporary directories with the live directories (/var/www/generated/redirects and /var/www/generated/dids).
    • Uses the --delete option to remove files in the live directory that are not present in the temporary directory.
  3. Cleanup:

    • The temporary directory is deleted after synchronization.

Apache Configuration (Unchanged)

  1. Redirect URL Mapping:

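    <VirtualHost *:80>
        ServerName example.com
        ServerAlias *.example.com
    
        DocumentRoot "/var/www/html"
    
        RewriteEngine On
    
        # Capture the subdomain (%1) and the domain (%2) from the Host header
        RewriteCond %{HTTP_HOST} ^([a-zA-Z0-9_-]+)\.([a-zA-Z0-9.-]+)$
        # The generated files live outside DocumentRoot, so rewrite to the filesystem path
        RewriteRule ^/$ /var/www/generated/redirects/%2/%1 [L]
    
        # Requires mod_asis and redirect files that start with CGI-style
        # headers (Status:, Location:), which are sent as the response
        <Directory "/var/www/generated/redirects">
            Require all granted
            SetHandler send-as-is
        </Directory>
    </VirtualHost>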
  2. DID Property Mapping:

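    # If used together with the redirect mapping, put these directives in the
    # same <VirtualHost> block: Apache uses only the first block that matches
    # a given ServerName.
    <VirtualHost *:80>
        ServerName example.com
        ServerAlias *.example.com
    
        DocumentRoot "/var/www/html"
    
        RewriteEngine On
    
        # Capture the subdomain (%1) and the domain (%2) from the Host header
        RewriteCond %{HTTP_HOST} ^([a-zA-Z0-9_-]+)\.([a-zA-Z0-9.-]+)$
        # The generated files live outside DocumentRoot, so rewrite to the filesystem path
        RewriteRule ^/\.well-known/atproto-did$ /var/www/generated/dids/%2/%1 [L]
    
        # Ensure text/plain content type for DID files
        # (<Files> matches file names only, so a <Directory> section is used)
        <Directory "/var/www/generated/dids">
            Require all granted
            ForceType text/plain
        </Directory>
    </VirtualHost>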
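
To see how the RewriteCond pattern splits a hostname, the same regular expression can be tried in bash. This is only a sketch: map_host is a hypothetical helper, not part of the setup, and it assumes the generated tree lives under /var/www/generated as in the script.

```shell
#!/bin/bash
# Hypothetical helper: mirrors the RewriteCond pattern to show which
# generated file a Host header maps to. Not part of the original setup.
GENERATED_BASE="/var/www/generated"

map_host() {
    local host="$1" kind="$2"   # kind: "redirects" or "dids"
    local pattern='^([a-zA-Z0-9_-]+)\.([a-zA-Z0-9.-]+)$'
    if [[ "$host" =~ $pattern ]]; then
        # BASH_REMATCH[1] corresponds to %1 (subdomain), [2] to %2 (domain)
        echo "$GENERATED_BASE/$kind/${BASH_REMATCH[2]}/${BASH_REMATCH[1]}"
    else
        # No dot in the host name, so the pattern does not match
        return 1
    fi
}

map_host "user.acme.com" redirects
map_host "user.acme.com" dids
```

The first group cannot contain a dot, so it always captures the leftmost label; everything after the first dot becomes the domain folder.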

Advantages of Using rsync

  1. File Synchronization:

    • Ensures that only current profiles are reflected in the generated directories.
    • Removes outdated files automatically.
  2. Efficiency:

    • Only copies files that have changed, minimizing overhead.
  3. Reliability:

    • Prevents disruptions caused by incomplete file writes.

Automating Updates

  1. Schedule the Script:

    • Run the script periodically to update the generated files.

    Example Cron Job:

    # Run the script every hour
    0 * * * * /path/to/generate_profile_files.sh
  2. Reload Apache:

    • Reload Apache to apply any structural changes (if necessary):
      sudo systemctl reload apache2

Testing the Setup

  1. Add, modify, and remove profiles in /var/www/profiles/<domain>.
  2. Run the script manually:
    bash /path/to/generate_profile_files.sh
  3. Verify that:
    • Redirect files and DID files are updated correctly in /var/www/generated.
    • Outdated files are removed.
  4. Test the endpoints:
    • https://user.acme.com/ should issue a 302 redirect.
    • https://user.acme.com/.well-known/atproto-did should return the did-plc value.
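
The file checks in step 3 can be scripted. The following is a minimal sketch, assuming every profile defines the property being checked (a profile without a url, for example, has no redirect file and would be reported as missing); check_generated is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical check: every profile JSON should have a generated file, and
# every generated file should still have a profile behind it.
# Usage: check_generated /var/www/profiles /var/www/generated/dids
check_generated() {
    local profiles="$1" generated="$2" status=0 f rel

    # Each <domain>/<user>.json should have produced <domain>/<user>
    for f in "$profiles"/*/*.json; do
        [[ -f "$f" ]] || continue
        rel="${f#"$profiles"/}"
        rel="${rel%.json}"
        [[ -f "$generated/$rel" ]] || { echo "missing: $rel"; status=1; }
    done

    # Each generated <domain>/<user> should still have <domain>/<user>.json
    for f in "$generated"/*/*; do
        [[ -f "$f" ]] || continue
        rel="${f#"$generated"/}"
        [[ -f "$profiles/$rel.json" ]] || { echo "stale: $rel"; status=1; }
    done

    return "$status"
}
```

For the endpoint checks in step 4, curl -sI https://user.acme.com/ prints the response headers, so the status line and Location header can be inspected directly.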

Final Directory Structure

After running the script:

/var/www/generated/
├── redirects/
│   ├── acme.com/
│   │   ├── user1
│   │   └── user2
│   └── example.com/
│       ├── admin
│       └── guest
└── dids/
    ├── acme.com/
    │   ├── user1
    │   └── user2
    └── example.com/
        ├── admin
        └── guest

This approach keeps the generated files in step with the profiles directory: each run of the script reflects additions, modifications, and deletions, and the web server serves the results without spawning a process per request.
