@rwcitek
Last active October 31, 2024 22:24
Data only in Docker image

Creating a data-only Docker image

# Creating the data
echo "Hello, world" > data.txt

# Creating an image with the data
{ cat <<'eof'
from scratch
workdir /data
copy data.txt .
eof
} | docker image build -t data -f - .

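# Docker silliness for creating a volume: creating a container with a new
# named volume mounted at /data makes Docker copy the image's /data into it.
# The ":" is a placeholder command (the container is never started, only removed).
docker container create --volume data:/data data : |
  xargs -r docker container rm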

# Creating a new instance that "attaches" the volume with the data
docker container run --volume data:/data --rm ubuntu cat /data/data.txt

# Removing the volume
docker volume rm data

# Removing the local data file
rm -f data.txt
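
The create-and-remove step above can be wrapped in a small shell function. This is only a sketch and not part of the original gist: the name populate_volume and its three arguments are hypothetical, and it assumes the image keeps its data under a single directory.

```shell
# Hypothetical helper: fill a new named volume from a data-only image by
# creating a throwaway container and removing it again.
populate_volume() {
  image=$1    # data-only image, e.g. "data"
  volume=$2   # named volume to create and fill, e.g. "data"
  path=$3     # directory inside the image that holds the data, e.g. "/data"

  # ":" is a placeholder command; the container is created but never started.
  cid=$(docker container create --volume "$volume:$path" "$image" :) || return 1

  # The container is no longer needed once the volume exists.
  docker container rm "$cid" >/dev/null
}
```

Usage, matching the names in the steps above: populate_volume data data /data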

Using a multi-stage build

# Creating an image with the data
{ cat <<'eof'
# Stage 1: download the data (-f makes curl fail on HTTP errors, so a bad download breaks the build)
FROM ubuntu:22.04 AS data
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y curl
WORKDIR /data
RUN curl -f -L -s -o titanic.csv 'https://ddc-datascience.s3.amazonaws.com/Projects/Example/Data/Titanic.train.csv'
RUN curl -f -L -s -o a-z.01-1k.tsv 'https://ddc-datascience.s3.amazonaws.com/a-z.business/2023-08-21/01.1k.txt'
RUN curl -f -L -s -o a-z.combined.tsv 'https://ddc-datascience.s3.amazonaws.com/a-z.business/2023-08-21/combined.txt'

# Stage 2: keep only the downloaded files
FROM scratch
COPY --from=data /data /data
eof
} | docker image build -t data -f - .

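# Docker silliness for creating a volume: a throwaway container makes Docker
# populate the new "data" volume from the image's /data (":" is a placeholder command)
docker container create --volume data:/data data : |
  xargs -r docker container rm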

# Creating a new instance that "attaches" the volume with the data
docker container run --volume data:/data --rm ubuntu ls -la /data/

# Removing the volume
docker volume rm data
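
If the files are only needed on the host, a volume is not strictly required. The sketch below is an alternative to the volume approach above, not the gist's own method: it copies a directory out of a created (never started) container with docker container cp. The function name extract_data and its arguments are hypothetical.

```shell
# Hypothetical helper: copy a directory out of a data-only image to the host.
extract_data() {
  image=$1   # data-only image, e.g. "data"
  src=$2     # path inside the image, e.g. "/data"
  dest=$3    # destination path on the host, e.g. "./data-copy"

  # ":" is a placeholder command; the container is created but never started.
  cid=$(docker container create "$image" :) || return 1

  # Copy the files from the container's filesystem to the host.
  docker container cp "$cid:$src" "$dest"
  status=$?

  # Remove the throwaway container whether or not the copy succeeded.
  docker container rm "$cid" >/dev/null
  return "$status"
}
```

Usage: extract_data data /data ./data-copy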