This is a description of the mirroring script at tuna/tunasync-scripts#49 (and other related things)
Links:
- NixOS/nixpkgs#32659 (Asking about how to create a mirror)
- tuna/issues#323 (TUNA mirror request)
- tuna/tunasync-scripts#49 (PR with script to TUNA mirror)
- ustclug/mirrorrequest#165 (USTC mirror request)
- ustclug/ustcmirror-images#57 (PR with script to USTC mirror)
- http://www.shlug.org/discuss/2020/01/11/nixos_mirror.html (Discussion by @yuchangyuan, in Chinese)
- NixOS/nixpkgs#32659 (comment) (Request to review this gist)
- https://discourse.nixos.org/t/requesting-review-before-starting-nixos-mirror-in-china/5514 (Discourse)
- For each channel found from https://nixos.org/channels
  - If the channel is too old, skip. (Last updated before `2018-12-01`; there were quite a few channels and even a format change before that.)
  - Follow the channel URL and find the release version from the redirected URL (like `nixos-19.09.1840.f7d050ed4e3`)
  - If the channel is already at this version, skip
  - If there is a `.${channel}.update` symlink, mark this channel for binary cache update and skip the rest of this loop
  - Create a directory `releases/${version}` and download all files (including the most important one, `nixexprs.tar.xz`) that do not end in `.ova` or `.iso` (images go to a separate mirror), checking hashes. Also, replace `binary-cache-url` with the mirror URL, saving the original.
  - Write a `releases/${version}/.released-time` file containing the time of release in `%Y-%m-%d %H:%M` format.
  - If all hashes are fine, create a `.${channel}.update` symlink pointing to `releases/${version}` and mark this channel for binary cache update.
- For each channel that needs a binary cache update
  - Find the original binary cache URL (avoiding hard-coding here)
  - Download (only once for each run) `nix-cache-info` from the binary cache.
  - Download all paths from `store-paths.xz` using `xargs nix copy` into the binary cache (default location: `store`).
  - If successful, move `.${channel}.update` to `${channel}`, possibly replacing the original
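
The per-channel decisions above can be sketched as a few pure functions. This is only an illustration of the control flow, not the code from the PR: the names `plan_channel`, `version_from_redirect` and `wanted_file`, the action strings, and the example URL in the usage note are all made up here.

```python
from datetime import datetime
from urllib.parse import urlparse

# Cutoff taken from the description; channels last updated before it are skipped.
CUTOFF = datetime(2018, 12, 1)
# Format used for the releases/${version}/.released-time file.
RELEASED_TIME_FORMAT = "%Y-%m-%d %H:%M"


def version_from_redirect(url):
    """Return the last path component of a redirected channel URL."""
    return urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]


def plan_channel(last_updated, redirected_url, current_version, update_pending):
    """Decide what to do with one channel. Returns (action, version).

    The action names are hypothetical labels for the steps in the list.
    """
    if last_updated < CUTOFF:
        return ("skip-too-old", None)
    version = version_from_redirect(redirected_url)
    if current_version == version:
        return ("skip-up-to-date", version)
    if update_pending:
        # A .${channel}.update symlink already exists, so the release files
        # are in place and only the binary cache update remains.
        return ("update-cache-only", version)
    return ("download-release", version)


def wanted_file(name):
    """Images (.ova, .iso) go to a separate mirror and are not downloaded."""
    return not name.endswith((".ova", ".iso"))
```

For example, `plan_channel(datetime(2020, 1, 12), "https://example.org/nixos/19.09/nixos-19.09.1840.f7d050ed4e3", None, False)` selects the download step for version `nixos-19.09.1840.f7d050ed4e3`.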
- The script depends on Nix, Python 3 with `pyquery` and `requests`.
  - Nix is used for `nix copy`, which is probably the official way to mirror. Reimplementing these will require reimplementing at least: downloading and parsing metadata, skipping existing packages, recursively finding dependencies, retrying...
  - `pyquery` and `requests` are used to crawl HTML and they were chosen because other scripts on TUNA mirror already use them. (See below for installing Nix by just downloading and copying a few files)
- The script is hopefully relatively straightforward.
- The juggling of the 'channel needs updating' state and symlinks exists because we want the binary cache to be updated before the user can see the new channel release.
- The channels are symlinked rather than redirected because TUNA cannot easily provide dynamically changing redirections.
- Miscellaneous data lives in hidden (dot-prefixed) files in the mirror. For example, `XDG_CACHE_HOME` is set to `<mirror_path>.cache`
- https://github.com/NixIPFS/nixipfs-scripts is not used because it predates `nix copy` and so reimplements a lot of it, which (in the context of Nix 2.0) is unnecessary.
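
To illustrate the `.${channel}.update` symlink handling, here is a minimal sketch of staging and then promoting a release. It assumes a POSIX filesystem and uses `os.replace` for the move; the actual script may do this differently. `stage_update` and `promote_update` are hypothetical names.

```python
import os


def stage_update(mirror_root, channel, version):
    """Create the hidden .<channel>.update symlink pointing at releases/<version>."""
    pending = os.path.join(mirror_root, f".{channel}.update")
    if os.path.lexists(pending):
        os.remove(pending)
    # A relative target keeps the link valid if the mirror root is moved.
    os.symlink(os.path.join("releases", version), pending)
    return pending


def promote_update(mirror_root, channel):
    """Move .<channel>.update onto <channel>, replacing any existing symlink."""
    pending = os.path.join(mirror_root, f".{channel}.update")
    public = os.path.join(mirror_root, channel)
    # os.replace renames the symlink itself (it does not follow it), so users
    # see either the old release or the new one, never a missing channel.
    os.replace(pending, public)
    return public
```

`promote_update` would only be called after `nix copy` has finished successfully, which is what keeps the binary cache ahead of the visible channel.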
Currently we do not do any garbage collection for paths referred to from old releases, but do retain the information to do so.
Proposed changes:
- Track in a database the latest known time a path is referred to (timestamp for short).
- Each time a channel has its paths cloned, each path should have its timestamp updated to the release time of the channel, if either the path has not been seen before or its timestamp was older.
- When freeing old releases is needed, sort store paths by timestamp, and delete oldest store paths.
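
The proposed timestamp tracking could look roughly like the following sketch, which uses SQLite from the Python standard library. The choice of SQLite, the table layout and the function names are assumptions for illustration, and nothing like this exists in the script yet. Timestamps are stored as `%Y-%m-%d %H:%M` strings, which sort chronologically when compared as text.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the database holding one timestamp per store path."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS paths ("
        "path TEXT PRIMARY KEY, last_seen TEXT NOT NULL)"
    )
    return db


def record_release(db, store_paths, released_time):
    """Set each path's timestamp to released_time if it is new or older."""
    for p in store_paths:
        # First sighting of the path: insert it with this release time.
        db.execute(
            "INSERT OR IGNORE INTO paths (path, last_seen) VALUES (?, ?)",
            (p, released_time),
        )
        # Already known: only ever move the timestamp forward.
        db.execute(
            "UPDATE paths SET last_seen = ? WHERE path = ? AND last_seen < ?",
            (released_time, p, released_time),
        )
    db.commit()


def oldest_paths(db, limit):
    """Return up to `limit` store paths, least recently referenced first."""
    rows = db.execute(
        "SELECT path FROM paths ORDER BY last_seen ASC, path ASC LIMIT ?",
        (limit,),
    )
    return [row[0] for row in rows]
```

A GC run would then call `oldest_paths` and delete the returned paths from the binary cache until enough space is freed.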
https://gist.github.com/dramforever/d2ff99318c70f44149db6070a87da5a0#gistcomment-3136152
Some thoughts about garbage collection:
- After each `nix copy`, run the command `xzcat releases/${version}/store-paths.xz | xargs nix path-info -r LOCAL_STORE_URL | sort | uniq` (or another equivalent command) to generate a list of full paths, and save the list as `releases/${version}/full-paths.txt`.
- For each GC run, target a `releases/${version}` instead. We can compare `full-paths.txt` in this `releases/${version}` with the same file in the other `releases/${version}` directories to find out which store paths are referenced only by the `releases/${version}` we need to delete.
- For each store path to delete, first delete the corresponding `nar.xz` file, then delete the `narinfo` file. When all `nar.xz` and `narinfo` files are deleted, we can safely delete `releases/${version}`.
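
The comparison of `full-paths.txt` files described above amounts to a set difference. A minimal sketch, assuming one store path per line in each file (`read_paths` and `paths_to_delete` are hypothetical names):

```python
def read_paths(lines):
    """Parse a full-paths.txt style listing into a set of store paths."""
    return {line.strip() for line in lines if line.strip()}


def paths_to_delete(target_paths, other_releases):
    """Return paths referenced by the target release and by no other release."""
    still_needed = set()
    for paths in other_releases:
        still_needed |= paths
    return target_paths - still_needed
```

Here `target_paths` is the set for the release being collected and `other_releases` is an iterable of sets, one per release that is kept.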
https://gist.github.com/dramforever/d2ff99318c70f44149db6070a87da5a0#gistcomment-3136166
> For each store path to delete, first delete the corresponding `nar.xz` file, then delete the `narinfo` file. When all `nar.xz` and `narinfo` files are deleted, we can safely delete `releases/${version}`.

This should probably be reversed. We should first delete `releases/${version}` and then delete the binary cache files, so that all available releases have binary cache files. Also, do we need to delete `releases/${version}`? Maybe we can ask users to always keep https://cache.nixos.org as a backup cache for files we deleted? This way users can pin nixpkgs to a mirrored version.
(Should work for any Alpine image; just replace `ustcmirror/base:alpine`.)

```dockerfile
# Stage 1: download the Nix binary tarball and unpack its store
FROM ustcmirror/base:alpine as fetcher
RUN wget https://nixos.org/releases/nix/nix-2.3.2/nix-2.3.2-x86_64-linux.tar.xz -O /tmp/nix.tar.xz && \
    mkdir /tmp/nix.unpack && \
    tar xpf /tmp/nix.tar.xz -C /tmp/nix.unpack && \
    mkdir /nix && \
    cp -dpr /tmp/nix.unpack/*/store /nix/store

# Stage 2: copy the store into the final image and link the Nix binaries
FROM ustcmirror/base:alpine
COPY --from=fetcher /nix /nix
RUN ln -s /nix/store/*-nix-*/bin/* /usr/local/bin

# Required for Nix
RUN apk add ca-certificates

# Python dependencies of the mirroring script
RUN apk add python3 py3-requests py3-pip py3-lxml
RUN pip3 install pyquery
```
Found from https://channels.nix.gsc.io/nixos-19.09/history-url.
Timestamp | Version | Paths changed | Delta NarSize / GiB | Delta FileSize / GiB |
---|---|---|---|---|
2019-10-09 06:45 | 19.09.711.25757b66e18 | 44149 | 310.84 | 78.82 |
2019-10-09 10:50 | 19.09.714.2a5bfda3f43 | 1393 | 18.85 | 4.19 |
2019-10-09 16:35 | 19.09.716.88bbb3c8096 | 265 | 0.40 | 0.09 |
2019-10-10 06:55 | 19.09.735.8d0dc8d737c | 2214 | 26.68 | 8.06 |
2019-10-10 12:05 | 19.09.736.9bbad4c6254 | 189 | 0.09 | 0.01 |
2019-10-11 01:30 | 19.09.741.dbad7c7d59f | 195 | 0.10 | 0.01 |
2019-10-13 01:22 | 19.09.766.222004e52e8 | 1910 | 34.93 | 12.00 |
2019-10-13 07:55 | 19.09.789.7952807791d | 1386 | 11.48 | 4.15 |
2019-10-14 19:05 | 19.09.794.28d2548a03f | 43168 | 303.28 | 75.29 |
2019-10-15 04:50 | 19.09.809.5000b1478a1 | 793 | 8.32 | 2.16 |
2019-10-16 06:05 | 19.09.840.8bf142e001b | 1148 | 8.14 | 3.63 |
2019-10-21 19:35 | 19.09.891.80b42e630b2 | 25399 | 231.13 | 60.18 |
2019-10-22 23:35 | 19.09.907.f6dac808387 | 43095 | 302.80 | 75.16 |
2019-10-26 10:13 | 19.09.941.27a5ddcf747 | 391 | 2.05 | 0.44 |
2019-10-28 15:35 | 19.09.976.c75de8bc12c | 16923 | 206.30 | 61.45 |
2019-11-01 09:50 | 19.09.1019.c5aabb0d603 | 5784 | 98.47 | 35.20 |
2019-11-07 13:45 | 19.09.1098.821c7ed030b | 42962 | 301.78 | 74.93 |
2019-11-08 07:25 | 19.09.1125.d628521d0b7 | 1089 | 8.19 | 2.71 |
2019-11-08 22:50 | 19.09.1134.d9a83d34c8d | 695 | 9.55 | 3.26 |
2019-11-09 02:50 | 19.09.1149.107e2b7b29f | 249 | 0.77 | 0.22 |
2019-11-09 14:35 | 19.09.1155.bae4d7daa01 | 192 | 0.09 | 0.01 |
2019-11-10 08:05 | 19.09.1160.a22b0189002 | 341 | 2.38 | 0.75 |
2019-11-10 19:15 | 19.09.1172.2d896998dc9 | 30254 | 265.15 | 66.64 |
2019-11-12 06:50 | 19.09.1197.d493b97b265 | 781 | 5.41 | 1.90 |
2019-11-12 12:50 | 19.09.1208.ef8c34c4721 | 189 | 0.09 | 0.01 |
2019-11-13 12:55 | 19.09.1221.e6a37ef446f | 753 | 5.05 | 1.55 |
2019-11-13 13:55 | 19.09.1223.cb2cdab7136 | 198 | 0.44 | 0.07 |
2019-11-15 12:50 | 19.09.1232.133d836dafa | 238 | 0.94 | 0.24 |
2019-11-15 13:45 | 19.09.1241.259a67ca221 | 244 | 1.59 | 0.30 |
2019-11-15 16:45 | 19.09.1247.851d5bdfb04 | 193 | 0.21 | 0.04 |
2019-11-16 05:20 | 19.09.1254.9104be2ee08 | 245 | 0.35 | 0.06 |
2019-11-16 18:05 | 19.09.1258.07e66484e67 | 215 | 0.20 | 0.03 |
2019-11-19 17:55 | 19.09.1292.e1843646b04 | 1078 | 14.10 | 5.00 |
2019-12-09 12:37 | 19.09.1529.808d3c6d123 | 14951 | 201.92 | 60.46 |
2019-12-09 15:40 | 19.09.1548.3a1861fcabc | 715 | 3.84 | 1.51 |
2019-12-11 01:15 | 19.09.1549.45ea6092203 | 191 | 0.09 | 0.01 |
2019-12-14 12:15 | 19.09.1584.7351aa52acd | 41502 | 296.12 | 73.88 |
2019-12-14 20:35 | 19.09.1589.57b7b019812 | 192 | 0.09 | 0.01 |
2019-12-15 19:50 | 19.09.1590.d85e435b7bd | 191 | 0.09 | 0.01 |
2019-12-17 03:20 | 19.09.1594.fbe321e6669 | 226 | 0.62 | 0.18 |
2019-12-17 23:00 | 19.09.1618.c2ef0cee28a | 204 | 1.00 | 0.22 |
2019-12-18 00:00 | 19.09.1619.c337a7423bc | 284 | 0.71 | 0.22 |
2019-12-18 02:05 | 19.09.1620.d40f024a3ba | 190 | 0.09 | 0.01 |
2019-12-18 08:20 | 19.09.1625.0dc46b0e1c8 | 210 | 0.62 | 0.12 |
2019-12-19 00:50 | 19.09.1629.ce54d9601ea | 584 | 3.37 | 1.37 |
2019-12-19 15:45 | 19.09.1638.6655a13a56f | 288 | 0.77 | 0.22 |
2019-12-19 22:40 | 19.09.1647.2e73f72c87e | 202 | 0.33 | 0.09 |
2019-12-20 15:35 | 19.09.1654.dd26550fda5 | 240 | 0.64 | 0.24 |
2019-12-21 03:35 | 19.09.1662.8e4c9d15456 | 201 | 0.67 | 0.20 |
2019-12-21 18:40 | 19.09.1664.968381812b4 | 192 | 0.20 | 0.03 |
2019-12-22 05:15 | 19.09.1670.36aa728f2cd | 409 | 1.04 | 0.22 |
2019-12-22 19:35 | 19.09.1673.9bcf1148144 | 191 | 0.09 | 0.01 |
2019-12-23 21:15 | 19.09.1682.bfdae0860e4 | 714 | 4.10 | 1.58 |
2019-12-24 18:10 | 19.09.1685.e9ef090eb54 | 190 | 0.09 | 0.01 |
2019-12-26 08:05 | 19.09.1686.69ed29f5f41 | 240 | 0.29 | 0.04 |
2019-12-29 00:30 | 19.09.1687.c5d5561f772 | 196 | 0.40 | 0.10 |
2019-12-29 08:05 | 19.09.1690.0d9055a2ac2 | 189 | 0.09 | 0.01 |
2019-12-30 03:40 | 19.09.1693.eab4ee0c27c | 191 | 0.14 | 0.02 |
2020-01-03 03:40 | 19.09.1748.ad1e1af5ad3 | 11363 | 165.70 | 53.65 |
2020-01-04 10:10 | 19.09.1764.2d9454702e5 | 356 | 1.21 | 0.32 |
2020-01-04 22:40 | 19.09.1772.54c9e1f53a7 | 588 | 4.89 | 1.03 |
2020-01-05 08:35 | 19.09.1774.a3070689aef | 191 | 0.09 | 0.01 |
2020-01-06 01:40 | 19.09.1776.b926503738c | 190 | 0.18 | 0.02 |
2020-01-06 18:55 | 19.09.1778.db3e8325a9b | 528 | 3.05 | 1.26 |
2020-01-07 07:55 | 19.09.1781.d245ff1bb9b | 200 | 0.09 | 0.01 |
2020-01-07 15:50 | 19.09.1784.fd4ccdbe3a6 | 200 | 0.16 | 0.05 |
2020-01-08 20:15 | 19.09.1791.ac218438bdb | 1802 | 15.89 | 4.96 |
2020-01-09 04:40 | 19.09.1803.db5273ce2ab | 341 | 0.76 | 0.21 |
2020-01-09 09:50 | 19.09.1806.b047b7315d8 | 192 | 0.13 | 0.02 |
2020-01-10 04:30 | 19.09.1815.caad1a78c47 | 200 | 0.42 | 0.10 |
2020-01-11 11:35 | 19.09.1821.9f453eb97ff | 586 | 3.30 | 1.35 |
2020-01-12 10:05 | 19.09.1840.f7d050ed4e3 | 948 | 35.08 | 16.79 |
2020-01-13 12:15 | 19.09.1850.5dc4d071ffe | 686 | 4.33 | 1.57 |
2020-01-14 03:00 | 19.09.1861.eb65d1dae62 | 701 | 4.01 | 1.59 |