@jan-krueger
Last active February 8, 2025 15:08
Script to quickly deploy the archive warrior on an Ubuntu or Debian machine

Archive.org Warrior Deployment

Overview

This script automates the deployment of Archive.org Warrior containers for ArchiveTeam web archiving projects. It installs Docker, generates a docker-compose.yml, and starts the requested number of warrior containers for a project such as usgovernment under your own downloader username.

Purpose

The Warrior is used to collect and archive large volumes of web data. By deploying containers, you contribute to preserving important web content.

Prerequisites

  • A Linux system (Ubuntu or Debian)
  • Docker (automatically installed by the script)
  • bash (for executing the script)
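
Before running the script, it can help to confirm that the machine is one of the supported distributions. The following is a minimal sketch and not part of the original script; the function name `is_supported_os` and its optional file argument are assumptions made for illustration. It reads the `ID` field from `/etc/os-release`, which is `ubuntu` on Ubuntu and `debian` on Debian.

```shell
#!/bin/bash
# Hypothetical helper (not part of the gist): succeed only on Ubuntu or Debian.
# The optional argument points at an alternative os-release file, which is
# handy for trying the function on sample data.
is_supported_os() {
    local release_file="${1:-/etc/os-release}"
    local id
    # Source the file inside a command substitution (a subshell) so that its
    # variables do not leak into the calling shell.
    id=$( . "$release_file" 2>/dev/null && echo "$ID" ) || return 1
    case "$id" in
        ubuntu|debian) return 0 ;;
        *) return 1 ;;
    esac
}

if is_supported_os; then
    echo "Supported distribution"
else
    echo "Unsupported distribution"
fi
```

Matching on the lowercase `ID` value avoids depending on `lsb_release`, which is not installed on every minimal image.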

Usage

bash deploy-archive-warrior.sh <PROJECT> <DOWNLOADER> <REPLICAS>

or

curl -sSL https://archive-warrior.krueger-jan.de/quick.sh -o quick.sh && bash quick.sh

The quick.sh script downloads deploy-archive-warrior.sh, then asks for the project you want to contribute to, your downloader username, and how many replicas to create (default 1).
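
The replica count is worth a quick sanity check before deploying. The generated docker-compose.yml publishes each warrior's web port 8001 on a host port taken from the range 8001-9000, so at most 1000 replicas can receive a host port. A minimal sketch of such a check follows; the helper name `validate_replicas` and the `MAX_REPLICAS` variable are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper (not part of the gist): accept only a positive integer
# that fits into the 1000 host ports (8001-9000) published by the compose file.
MAX_REPLICAS=1000

validate_replicas() {
    local n="$1"
    # Digits only, no sign, no leading zero, at most four digits.
    [[ "$n" =~ ^[1-9][0-9]{0,3}$ ]] || return 1
    (( n <= MAX_REPLICAS ))
}

# Example call with a literal value.
if validate_replicas 3; then
    echo "3 replicas: ok"
fi
```

In a wrapper script this could guard the deployment, for example `validate_replicas "$REPLICAS" || { echo "Invalid replica count" >&2; exit 1; }`.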

#!/bin/bash
set -e
usage() {
    echo "Usage: $0 <PROJECT> <DOWNLOADER> <REPLICAS>"
    echo " PROJECT - Name of the project to run (required)"
    echo " DOWNLOADER - Username for the downloader (required)"
    echo " REPLICAS - Number of warrior instances (required)"
    exit 1
}
if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
    echo "Error: Missing required arguments."
    usage
fi
PROJECT=$1
DOWNLOADER=$2
REPLICAS=$3
# Remove conflicting packages
for pkg in docker.io docker-doc docker-compose podman-docker containerd runc; do
    sudo apt-get remove -y "$pkg" || true
done
# Add Docker's official GPG key
sudo apt-get update
sudo apt-get install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
# Detect the distribution from /etc/os-release; ID is lowercase ("ubuntu" or "debian")
OS=$(. /etc/os-release && echo "$ID")
if [[ "$OS" == "ubuntu" ]]; then
    # Fetch the key for the matching distribution, then make it world-readable
    sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
    sudo chmod a+r /etc/apt/keyrings/docker.asc
    echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
elif [[ "$OS" == "debian" ]]; then
    sudo curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
    sudo chmod a+r /etc/apt/keyrings/docker.asc
    echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
else
    echo "Unsupported OS: $OS"
    exit 1
fi
# Update package list
sudo apt-get update -y
# Install Docker Engine and the Compose plugin
echo "Installing Docker with no prompts..."
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
# Create docker-compose.yml
cat > docker-compose.yml <<EOF
networks:
  main:
    enable_ipv6: true
    driver: bridge
services:
  watchtower:
    image: containrrr/watchtower
    restart: on-failure
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    command:
      - --label-enable
      - --include-restarting
      - --cleanup
      - --interval
      - "3600"
  archiveteam-warrior:
    image: atdr.meo.ws/archiveteam/warrior-dockerfile
    restart: on-failure
    networks:
      - main
    ports:
      - "8001-9000:8001"
    labels:
      com.centurylinklabs.watchtower.enable: "true"
    logging:
      driver: json-file
      options:
        max-size: "50m"
    environment:
      DOWNLOADER: "$DOWNLOADER"
      SELECTED_PROJECT: "$PROJECT"
      CONCURRENT_ITEMS: 6
    deploy:
      mode: replicated
      replicas: $REPLICAS
      endpoint_mode: vip
EOF
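# Use sudo because a freshly installed Docker does not add the current user to the docker group
sudo docker compose up -d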
echo "Deployment complete: $REPLICAS warriors running for project '$PROJECT' under downloader '$DOWNLOADER'."
#!/bin/bash
# quick.sh: download deploy-archive-warrior.sh, prompt for its arguments, and run it
set -e
curl -sSL https://gist.githubusercontent.com/jan-krueger/879881d05ab9e8a5bacdc97d8e579857/raw/deploy-archive-warrior.sh -o deploy-archive-warrior.sh
chmod +x deploy-archive-warrior.sh
read -r -p "Enter project name: " PROJECT
read -r -p "Enter downloader username: " DOWNLOADER
read -r -p "Enter number of replicas (default 1): " REPLICAS
# Quote the arguments so that values containing spaces are passed intact
./deploy-archive-warrior.sh "$PROJECT" "$DOWNLOADER" "${REPLICAS:-1}"
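
Once the containers are running, each warrior serves its web interface on container port 8001, which the compose file maps to a host port in the range 8001-9000. The sketch below lists those mappings. It assumes it is run from the directory that contains the generated docker-compose.yml and that your user can talk to the Docker daemon; the helper name `list_warrior_ports` is hypothetical and not part of the original scripts. It relies on `docker compose port` with its `--index` option to select each replica.

```shell
#!/bin/bash
# Hypothetical helper (not part of the gist): print the host address that
# maps to the web UI (container port 8001) of each warrior replica.
list_warrior_ports() {
    local replicas="$1" i
    for (( i = 1; i <= replicas; i++ )); do
        # --index selects the i-th container of a service with several replicas
        echo "warrior $i: $(docker compose port --index "$i" archiveteam-warrior 8001)"
    done
}
```

Call it with the replica count you deployed, for example `list_warrior_ports 3`, then open the printed addresses in a browser to watch each warrior's progress.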