This guide covers setting up Docker with Docker Compose v2 on Amazon Linux 2023 (AL2023). The steps are intended for AL2023 on EC2 but should mostly work for AL2023 VMs running on other hypervisors.
Get the current release:
rpm -q system-release --qf "%{VERSION}\n"
Find out the latest release:
sudo dnf check-release-update --latest-only --version-only
# Use the following for more verbose output
#sudo dnf check-release-update
To upgrade the host for the current release:
sudo dnf check-update --refresh
sudo dnf upgrade --refresh
To upgrade the host to the latest release:
#sudo touch /etc/dnf/vars/releasever && echo 'latest' | sudo tee /etc/dnf/vars/releasever
sudo dnf check-update --refresh --releasever=latest
sudo dnf upgrade --refresh --releasever=latest
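The two version strings printed by the commands above can be compared to decide whether a release upgrade is needed at all. The following is a minimal sketch, assuming GNU `sort -V` is available and that AL2023 version strings (for example `2023.4.20240416`) order correctly under version sort. `needs_release_upgrade` is a hypothetical helper name, not part of dnf.

```shell
# needs_release_upgrade CURRENT LATEST
# Returns 0 when LATEST is non-empty and sorts strictly after CURRENT.
# An empty LATEST is treated as "nothing newer was reported".
needs_release_upgrade() {
  local current="$1" latest="$2" newest
  [ -n "$latest" ] || return 1
  [ "$current" != "$latest" ] || return 1
  newest="$(printf '%s\n%s\n' "$current" "$latest" | sort -V | tail -n 1)"
  [ "$newest" = "$latest" ]
}

# Usage on an AL2023 host (left commented so the sketch has no side effects):
# current="$(rpm -q system-release --qf '%{VERSION}\n')"
# latest="$(sudo dnf check-release-update --latest-only --version-only)"
# if needs_release_upgrade "$current" "$latest"; then
#   sudo dnf upgrade --refresh --releasever="$latest"
# fi
```

Passing the concrete version to `--releasever` instead of `latest` keeps the upgrade target fixed for the duration of the run.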
Install the following packages, which are useful to have on most hosts:
sudo dnf install --allowerasing -y \
kernel-modules-extra \
dnf-plugins-core \
dnf-utils \
dnf-plugin-support-info \
git-core \
git-lfs \
grubby \
kexec-tools \
chrony \
audit \
dbus \
dbus-daemon \
polkit \
systemd-pam \
systemd-container \
udisks2 \
nss-util \
nss-tools \
dmidecode \
nvme-cli \
lvm2 \
dosfstools \
e2fsprogs \
xfsprogs \
xfsprogs-xfs_scrub \
attr \
acl \
shadow-utils \
shadow-utils-subid \
fuse3 \
squashfs-tools \
star \
gzip \
pigz \
bzip2 \
zstd \
xz \
unzip \
p7zip \
numactl \
iproute \
iproute-tc \
iptables-nft \
nftables \
conntrack-tools \
ipset \
ethtool \
net-tools \
iputils \
traceroute \
mtr \
telnet \
whois \
socat \
bind-utils \
tcpdump \
cifs-utils \
nfsv4-client-utils \
nfs4-acl-tools \
libseccomp \
psutils \
python3 \
python3-pip \
python3-policycoreutils \
policycoreutils-python-utils \
bash-completion \
vim-minimal \
wget \
jq \
awscli-2 \
ec2rl \
ec2-utils \
htop \
sysstat \
fio \
inotify-tools \
rsync
sudo dnf install --allowerasing -y ec2-instance-connect ec2-instance-connect-selinux
sudo dnf install --allowerasing -y amazon-efs-utils
Amazon Linux now ships with the smart-restart package. The smart-restart utility restarts systemd services on system updates, whenever a package is installed or deleted using the system's package manager. This occurs whenever a dnf <update|upgrade|downgrade> is executed.
smart-restart uses needs-restarting from the dnf-utils package and a custom denylisting mechanism to determine which services need to be restarted and whether a system reboot is advised. If a system reboot is advised, a reboot hint marker file is generated (/run/smart-restart/reboot-hint-marker).
sudo dnf install --allowerasing -y smart-restart python3-dnf-plugin-post-transaction-actions
After the installation, the subsequent transactions will trigger the smart-restart logic.
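The reboot hint marker described above can be checked from a script, for example at the end of a patching run. This is a minimal sketch that only tests for the presence of the marker file at the path given in this guide. `reboot_advised` is a hypothetical helper name, and the optional argument exists only so the check can be pointed at another path.

```shell
# reboot_advised [MARKER]
# Returns 0 when the smart-restart reboot hint marker exists.
reboot_advised() {
  local marker="${1:-/run/smart-restart/reboot-hint-marker}"
  [ -e "$marker" ]
}

if reboot_advised; then
  echo "smart-restart advises a reboot"
else
  echo "no reboot hint marker found"
fi
```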
Run the following commands to install and enable the kernel live patching feature:
sudo dnf install --allowerasing -y kpatch-dnf kpatch-runtime
sudo dnf kernel-livepatch -y auto
sudo systemctl enable --now kpatch.service
Run the following command to remove the EC2 Hibernation Agent:
sudo dnf remove -y ec2-hibinit-agent
Install the Amazon SSM Agent:
sudo dnf install --allowerasing -y amazon-ssm-agent
The following tweak should resolve the issue reported here:
- https://repost.aws/questions/QU_tj7NQl6ReKoG53zzEqYOw/amazon-linux-2023-issue-with-installing-packages-with-cloud-init
- amazonlinux/amazon-linux-2023#397
Add the following drop-in to make sure networking is up, DNS resolution works, and cloud-init has finished before the Amazon SSM Agent is started.
sudo mkdir -p /etc/systemd/system/amazon-ssm-agent.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/amazon-ssm-agent.service.d/00-override.conf
[Unit]
# To have a service start after cloud-init.target, it requires the
# addition of DefaultDependencies=no. With the default setting,
# DefaultDependencies=yes, the default target (e.g.
# multi-user.target) ends up depending on the service.
#
# See the following for more details: https://serverfault.com/a/973985
Wants=network-online.target
After=network-online.target nss-lookup.target cloud-init.target
DefaultDependencies=no
ConditionFileIsExecutable=/usr/bin/amazon-ssm-agent
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now amazon-ssm-agent.service
sudo systemctl try-reload-or-restart amazon-ssm-agent.service
sudo systemctl status amazon-ssm-agent.service
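A drop-in only takes effect when it sits in a directory named after the full unit name (including the `.service` suffix) followed by `.d`. The following is a small sketch for deriving that directory and for checking that a drop-in file carries an expected directive. `dropin_dir` and `dropin_has` are hypothetical helper names, and the check is a plain text match, not a systemd parser.

```shell
# dropin_dir UNIT
# Prints the drop-in directory for UNIT under /etc/systemd/system.
# A bare name without a unit type suffix is treated as a .service unit.
dropin_dir() {
  local unit="$1"
  case "$unit" in
    *.*) ;;                          # already has a suffix such as .service
    *)   unit="${unit}.service" ;;
  esac
  printf '/etc/systemd/system/%s.d\n' "$unit"
}

# dropin_has FILE KEY
# Returns 0 when FILE contains an uncommented "KEY=" line.
dropin_has() {
  grep -Eq "^[[:space:]]*$2=" "$1"
}

# Usage on the host (left commented so the sketch has no side effects):
# dropin_has "$(dropin_dir amazon-ssm-agent)/00-override.conf" After
# systemctl cat amazon-ssm-agent.service       # shows the unit with drop-ins merged
# systemctl show -p After amazon-ssm-agent.service
```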
Install the Unified CloudWatch Agent:
sudo dnf install --allowerasing -y amazon-cloudwatch-agent collectd
Add the following drop-in to make sure networking is up, DNS resolution works, and cloud-init has finished before the unified CloudWatch agent is started.
sudo mkdir -p /etc/systemd/system/amazon-cloudwatch-agent.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/amazon-cloudwatch-agent.service.d/00-override.conf
[Unit]
# To have a service start after cloud-init.target, it requires the
# addition of DefaultDependencies=no. With the default setting,
# DefaultDependencies=yes, the default target (e.g.
# multi-user.target) ends up depending on the service.
#
# See the following for more details: https://serverfault.com/a/973985
Wants=network-online.target
After=network-online.target nss-lookup.target cloud-init.target
DefaultDependencies=no
ConditionFileIsExecutable=/opt/aws/amazon-cloudwatch-agent/bin/start-amazon-cloudwatch-agent
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now amazon-cloudwatch-agent.service
sudo systemctl try-reload-or-restart amazon-cloudwatch-agent.service
sudo systemctl status amazon-cloudwatch-agent.service
The current version of the CloudWatchAgentServerPolicy:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"cloudwatch:PutMetricData",
"ec2:DescribeVolumes",
"ec2:DescribeTags",
"logs:PutLogEvents",
"logs:DescribeLogStreams",
"logs:DescribeLogGroups",
"logs:CreateLogStream",
"logs:CreateLogGroup"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"ssm:GetParameter"
],
"Resource": "arn:aws:ssm:*:*:parameter/AmazonCloudWatch-*"
}
]
}
Run the following to install ansible on the host:
sudo dnf install -y \
python3-psutil \
ansible \
ansible-core \
sshpass
Locale:
sudo localectl set-locale LANG=en_US.UTF-8
localectl
Hostname:
sudo hostnamectl set-hostname <hostname>
sudo hostnamectl set-chassis vm
hostnamectl
Set the system timezone to UTC and ensure chronyd is enabled and started:
sudo timedatectl set-timezone Etc/UTC
sudo systemctl enable --now chronyd
sudo timedatectl set-ntp true
timedatectl
Logging:
sudo mkdir -p /etc/systemd/journald.conf.d
cat <<'EOF' | sudo tee /etc/systemd/journald.conf.d/00-override.conf
[Journal]
SystemMaxUse=100M
RuntimeMaxUse=100M
RuntimeMaxFileSize=10M
RateLimitIntervalSec=1s
RateLimitBurst=10000
EOF
sudo systemctl daemon-reload
sudo systemctl restart systemd-journald.service
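journald does not refuse to start because of a misspelled option name, so a typo in the drop-in can go unnoticed. The following is a rough lint sketch that lists keys in a drop-in file that are not in a known set. The key list is an assumption: it is copied by hand from the `[Journal]` options in journald.conf(5) and may be incomplete for your systemd version. `journald_unknown_keys` is a hypothetical helper name.

```shell
# journald_unknown_keys FILE
# Prints every key in FILE that is not in the known [Journal] option list.
# Blank lines, comments and section headers are skipped.
journald_unknown_keys() {
  local known=" Storage Compress Seal SplitMode SyncIntervalSec \
RateLimitIntervalSec RateLimitBurst SystemMaxUse SystemKeepFree \
SystemMaxFileSize SystemMaxFiles RuntimeMaxUse RuntimeKeepFree \
RuntimeMaxFileSize RuntimeMaxFiles MaxFileSec MaxRetentionSec \
ForwardToSyslog ForwardToKMsg ForwardToConsole ForwardToWall \
MaxLevelStore MaxLevelSyslog MaxLevelKMsg MaxLevelConsole MaxLevelWall \
ReadKMsg Audit TTYPath LineMax "
  local line key
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      ''|'#'*|';'*|'['*) continue ;;
    esac
    key="${line%%=*}"
    case "$known" in
      *" $key "*) ;;                  # recognised option
      *) printf '%s\n' "$key" ;;      # report anything else
    esac
  done < "$1"
}

# Usage on the host (left commented so the sketch has no side effects):
# journald_unknown_keys /etc/systemd/journald.conf.d/00-override.conf
# journalctl --disk-usage
```

An empty result means every key in the file was recognised by this list. It does not validate the values.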
touch ~/.{profile,bashrc,bash_profile,bash_login,bash_logout,hushlogin}
mkdir -pv "${HOME}/bin"
mkdir -pv "${HOME}/.config/environment.d"
mkdir -pv "${HOME}/.config/systemd/user"
mkdir -pv "${HOME}/.config/systemd/user/sockets.target.wants"
mkdir -pv "${HOME}/.local/share/systemd/user"
mkdir -pv "${HOME}/.local/bin"
#cat <<'EOF' | tee ~/.config/environment.d/environment_vars.conf
#PATH="${HOME}/bin:${HOME}/.local/bin:${PATH}"
#
#EOF
loginctl enable-linger $(whoami)
systemctl --user daemon-reload
If you need to switch to the root user, use the following instead of sudo su - <user>.
# sudo machinectl shell <username>@
sudo machinectl shell root@
Run the following command to install Moby (also known as Docker):
sudo dnf install --allowerasing -y \
docker \
containerd \
runc \
container-selinux \
cni-plugins \
oci-add-hooks \
amazon-ecr-credential-helper \
udica
Configure the following docker daemon settings:
sudo mkdir -p /etc/docker
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
"debug": false,
"experimental": false,
"exec-opts": ["native.cgroupdriver=systemd"],
"userland-proxy": false,
"live-restore": true,
"log-level": "warn",
"log-driver": "json-file",
"log-opts": {
"max-size": "100m",
"max-file": "3"
}
}
EOF
- https://docs.docker.com/reference/cli/dockerd/#daemon-configuration-file
- https://docs.docker.com/config/containers/logging/awslogs/
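With the json-file driver, each container keeps at most `max-file` log files of up to `max-size` each, so the two settings together bound the log space used per container. The following sketch does that multiplication while keeping the unit suffix, which avoids making any assumption about how Docker converts the suffix to bytes. `docker_log_budget` is a hypothetical helper name.

```shell
# docker_log_budget MAX_SIZE MAX_FILE
# Prints MAX_SIZE multiplied by MAX_FILE, keeping the unit suffix (k, m or g).
# Returns 1 when either argument is not in the expected form.
docker_log_budget() {
  local size="$1" files="$2" num unit
  num="${size%[kmg]}"        # strip one trailing unit letter, if present
  unit="${size#"$num"}"      # whatever was stripped (may be empty)
  case "$num" in ''|*[!0-9]*) return 1 ;; esac
  case "$files" in ''|*[!0-9]*) return 1 ;; esac
  printf '%s%s\n' "$(( num * files ))" "$unit"
}

docker_log_budget 100m 3   # prints 300m
```

For the settings above this gives roughly 300m per container, so the total on a host scales with the number of containers.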
Add the current user (e.g. ec2-user) to the docker group:
sudo usermod -aG docker $USER
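Group changes made with usermod apply to new login sessions, so an already open shell does not gain the docker group until you log out and back in (or start a subshell with `newgrp docker`). The following sketch checks membership either for the current session or for a named user as recorded in the group database. `has_group` is a hypothetical helper name.

```shell
# has_group GROUP [USER]
# Without USER: checks the groups of the current process (the live session).
# With USER: checks the groups recorded for USER in the group database.
has_group() {
  local group="$1" user="${2:-}" names g
  if [ -n "$user" ]; then
    names="$(id -nG "$user")" || return 1
  else
    names="$(id -nG)"
  fi
  for g in $names; do
    [ "$g" = "$group" ] && return 0
  done
  return 1
}

# Usage (left commented so the sketch prints nothing by default):
# has_group docker "$USER" && echo "membership recorded"
# has_group docker || echo "current session has not picked it up yet"
```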
Enable and start the docker service:
sudo systemctl enable --now docker
sudo systemctl status docker
Install the Docker Compose plugin with the following commands:
# Install the docker compose plugin for all users
sudo mkdir -p /usr/local/lib/docker/cli-plugins
sudo curl -fsSL https://github.com/docker/compose/releases/latest/download/docker-compose-linux-"$(uname -m)" \
-o /usr/local/lib/docker/cli-plugins/docker-compose
# Set ownership to root and make executable
test -f /usr/local/lib/docker/cli-plugins/docker-compose \
&& sudo chown root:root /usr/local/lib/docker/cli-plugins/docker-compose
test -f /usr/local/lib/docker/cli-plugins/docker-compose \
&& sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-compose
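The download above always follows the latest release. For repeatable installs it can help to pin a version and verify the binary before making it executable. The following is a sketch under assumptions: the version string `v2.27.0` is only an example, and it assumes the release publishes a `.sha256` file next to the binary, which should be confirmed on the release page. `compose_asset_url` and `sha256_matches` are hypothetical helper names.

```shell
# compose_asset_url VERSION [ARCH]
# Prints the release URL of the Compose binary for VERSION.
# ARCH defaults to the output of uname -m (e.g. x86_64 or aarch64).
compose_asset_url() {
  local version="$1" arch="${2:-$(uname -m)}"
  printf 'https://github.com/docker/compose/releases/download/%s/docker-compose-linux-%s\n' \
    "$version" "$arch"
}

# sha256_matches FILE EXPECTED_HEX
# Returns 0 when the SHA-256 digest of FILE equals EXPECTED_HEX.
sha256_matches() {
  local file="$1" expected="$2" actual
  actual="$(sha256sum "$file" | awk '{print $1}')"
  [ "$actual" = "$expected" ]
}

# Usage (left commented so the sketch performs no downloads):
# url="$(compose_asset_url v2.27.0)"
# curl -fsSL "$url" -o /tmp/docker-compose
# expected="$(curl -fsSL "${url}.sha256" | awk '{print $1}')"
# sha256_matches /tmp/docker-compose "$expected" \
#   && sudo install -o root -g root -m 0755 /tmp/docker-compose \
#        /usr/local/lib/docker/cli-plugins/docker-compose
```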
(Optional) To install for the local user, run the following commands:
mkdir -p "${HOME}/.docker/cli-plugins" \
&& touch "${HOME}/.docker/config.json"
cp /usr/local/lib/docker/cli-plugins/docker-compose "${HOME}/.docker/cli-plugins/docker-compose"
cat <<'EOF' | tee -a "${HOME}/.bashrc"
XDG_CONFIG_HOME="${HOME}/.config"
XDG_DATA_HOME="${HOME}/.local/share"
XDG_RUNTIME_DIR="${XDG_RUNTIME_DIR:-/run/user/$(id -u)}"
DBUS_SESSION_BUS_ADDRESS="unix:path=${XDG_RUNTIME_DIR}/bus"
export XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR DBUS_SESSION_BUS_ADDRESS
#DOCKER_CONFIG=/usr/local/lib/docker
DOCKER_CONFIG="${DOCKER_CONFIG:-$HOME/.docker}"
DOCKER_TLS_VERIFY=1
export DOCKER_CONFIG DOCKER_TLS_VERIFY
#DOCKER_HOST="unix:///run/user/$(id -u)/docker.sock"
#export DOCKER_HOST
EOF
Verify the plugin is installed correctly with the following command(s):
docker compose version
(Optional) Install docker scout with the following commands:
<commands goes here>
(Optional) Install the docker buildx plugin with the following commands:
Note: You can safely skip this step, because the version of Moby shipped in AL2023 bundles the buildx plugin by default.
sudo curl -sSfL 'https://github.com/docker/buildx/releases/download/v0.14.0/buildx-v0.14.0.linux-amd64' \
-o /usr/local/lib/docker/cli-plugins/docker-buildx
#sudo curl -sL https://github.com/docker/compose/releases/latest/download/docker-buildx-linux-"$(uname -m)" \
# -o /usr/local/lib/docker/cli-plugins/docker-buildx
# Set ownership to root and make executable
test -f /usr/local/lib/docker/cli-plugins/docker-buildx \
&& sudo chown root:root /usr/local/lib/docker/cli-plugins/docker-buildx
test -f /usr/local/lib/docker/cli-plugins/docker-buildx \
&& sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-buildx
cp /usr/local/lib/docker/cli-plugins/docker-buildx "${HOME}/.docker/cli-plugins/docker-buildx"
docker buildx install
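The buildx download URL above is hard-coded for amd64, while `uname -m` on AL2023 reports `x86_64` or `aarch64`. The following sketch maps the machine name to the suffix used in the buildx asset name shown above. The `arm64` asset name is an assumption based on the same naming pattern, and `buildx_arch` and `buildx_asset_url` are hypothetical helper names.

```shell
# buildx_arch [MACHINE]
# Maps a uname -m value to the architecture suffix used by buildx assets.
buildx_arch() {
  case "${1:-$(uname -m)}" in
    x86_64)  echo amd64 ;;
    aarch64) echo arm64 ;;
    *)       return 1 ;;     # unknown machine type: let the caller decide
  esac
}

# buildx_asset_url VERSION [MACHINE]
# Prints the release URL of the buildx binary for VERSION.
buildx_asset_url() {
  local version="$1" arch
  arch="$(buildx_arch "${2:-}")" || return 1
  printf 'https://github.com/docker/buildx/releases/download/%s/buildx-%s.linux-%s\n' \
    "$version" "$version" "$arch"
}

# Usage (left commented so the sketch performs no downloads):
# sudo curl -fsSL "$(buildx_asset_url v0.14.0)" \
#   -o /usr/local/lib/docker/cli-plugins/docker-buildx
```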
(Optional) Install the AWS Nitro Enclaves CLI. Skip this step if you do not need Nitro Enclaves.
sudo dnf install --allowerasing -y aws-nitro-enclaves-cli aws-nitro-enclaves-cli-devel
sudo usermod -aG ne $USER
sudo systemctl enable --now nitro-enclaves-allocator.service
- https://docs.aws.amazon.com/enclaves/latest/user/nitro-enclave-cli-install.html
- https://github.com/aws/aws-nitro-enclaves-cli
To install the Nvidia drivers:
sudo dnf install -y wget kernel-modules-extra kernel-devel gcc
Download the driver install script, run it and verify:
curl -sL 'https://us.download.nvidia.com/tesla/535.161.08/NVIDIA-Linux-x86_64-535.161.08.run' -O
sudo sh NVIDIA-Linux-x86_64-535.161.08.run -a -s --ui=none -m=kernel-open
nvidia-smi
For the Nvidia container runtime:
curl -sL 'https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo' | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo dnf check-update
sudo dnf install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
To create an Ubuntu based container with access to the host GPUs:
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
# configure region
aws configure set default.region $(curl --noproxy '*' -w "\n" -s -H "X-aws-ec2-metadata-token: $(curl --noproxy '*' -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")" http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r .region)
# use regional endpoints
aws configure set default.sts_regional_endpoints regional
# get credentials from imds
aws configure set default.credential_source Ec2InstanceMetadata
# credentials last for 1 hour
aws configure set default.duration_seconds 3600
# set default pager
aws configure set default.cli_pager ""
# set output to json
aws configure set default.output json
Verify:
aws configure list
aws sts get-caller-identity
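The region lookup used above packs the IMDSv2 token request and the metadata request into one line. The same two steps can be split into small functions, which is easier to read and to reuse. This is a sketch only: it reads the region from the `meta-data/placement/region` path instead of parsing the instance identity document with jq, and `imds_token`, `imds_region` and `IMDS_BASE` are hypothetical names.

```shell
# Base URL of the instance metadata service; overridable for testing.
IMDS_BASE="${IMDS_BASE:-http://169.254.169.254/latest}"

# imds_token
# Requests an IMDSv2 session token valid for 6 hours.
imds_token() {
  curl --noproxy '*' -sf -X PUT "${IMDS_BASE}/api/token" \
    -H 'X-aws-ec2-metadata-token-ttl-seconds: 21600'
}

# imds_region
# Prints the region of the instance, using a fresh session token.
imds_region() {
  local token
  token="$(imds_token)" || return 1
  curl --noproxy '*' -sf -H "X-aws-ec2-metadata-token: ${token}" \
    "${IMDS_BASE}/meta-data/placement/region"
}

# Usage on an EC2 instance (left commented so the sketch makes no requests):
# aws configure set default.region "$(imds_region)"
```

Because curl runs with `-f`, a failed token or metadata request yields a non-zero status instead of an error page being written into the AWS CLI configuration.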
Log in to Amazon ECR Public:
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws
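As an alternative to a manual login, the amazon-ecr-credential-helper package installed earlier can supply registry credentials on demand. The following sketch writes a minimal Docker client config that routes `public.ecr.aws` through the `ecr-login` helper. It assumes the helper binary is on the PATH and that instance credentials are available. `write_ecr_cred_helper_config` is a hypothetical helper name, and it refuses to overwrite a non-empty file so that an existing config is not lost.

```shell
# write_ecr_cred_helper_config TARGET
# Writes a minimal Docker client config to TARGET that uses the
# ecr-login credential helper for public.ecr.aws.
# Refuses to overwrite TARGET when it already has content.
write_ecr_cred_helper_config() {
  local target="$1"
  if [ -s "$target" ]; then
    echo "refusing to overwrite non-empty ${target}" >&2
    return 1
  fi
  mkdir -p "$(dirname "$target")" || return 1
  cat > "$target" <<'EOF'
{
  "credHelpers": {
    "public.ecr.aws": "ecr-login"
  }
}
EOF
}

# Usage (left commented so the sketch does not touch your Docker config):
# write_ecr_cred_helper_config "${HOME}/.docker/config.json"
```

A private registry can be added to the same `credHelpers` map under its own registry hostname.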
To create an AL2023 based container:
docker pull public.ecr.aws/amazonlinux/amazonlinux:2023
docker run -it --security-opt seccomp=unconfined public.ecr.aws/amazonlinux/amazonlinux:2023 /bin/bash
- https://docs.aws.amazon.com/linux/al2023/release-notes/relnotes.html
- https://docs.aws.amazon.com/linux/al2023/ug/deterministic-upgrades-usage.html
- Manage package and operating system updates in AL2023
- https://mobyproject.org/
- https://github.com/docker/docker-install
- https://github.com/docker/docker-ce-packaging
- https://download.docker.com/linux/static/stable/
- https://docs.docker.com/compose/install/linux/
- https://github.com/docker/compose/
- https://github.com/docker/docker-credential-helpers
- https://github.com/docker/buildx