https://untrusted.website/@mr_daemon
This roughly describes the process I use to dump my old CDs to image files while attempting to retain as much of the original data as possible.
This guide is very terminal-oriented, but it should remain accessible.
This mostly covers data CDs such as games and software, and has mixed results with some copy protection schemes, even if you dump subchannel data.
Some old protection schemes like SafeDisc and SecuROM are not really usable from the image formats most tools expect, so YMMV.
As a general rule, this guide should also work for data DVD media. I don't dump video or audio DVDs much, if at all, so you are on your own there.
All the formats below are intended to be lossless and are suitable for use in things like DOSBox or 86Box, or even to be written back to a CD one day, if you find new old stock.
I'm using a cheap TSSTcorp USB DVD writer attached to my Linux laptop. This is the same controller found in every chinesium drive available on Amazon for the price of a fancy sandwich, and it will biodegrade if you expose it to direct sunlight. If you have something better, feel free to use it.
All of this is likely available in your distribution repositories. Sometimes some are packaged together, even.
disktype: for identifying the type of CD
cdrdao: for dumping hybrid images or anything with a weird layout
readom: for dumping straightforward ISOs, as it does checks; part of the wodim package
ddrescue: for attempting to salvage degraded or otherwise beat up CDs
cdparanoia: for dumping audio tracks as files (optional)
flac: for losslessly compressing the audio tracks (optional)
sha256sum: for checksum generation/validation (optional)
On Debian and Debian-likes (Ubuntu etc.), you can install all of these with:
$ sudo apt install disktype cdrdao wodim gddrescue cdparanoia flac coreutils
If you're on Windows or macOS, all of this probably works, but you'll be on your own to figure out the exact semantics of accessing your optical drive from Cygwin or whatnot. There are likely better guides and nice frontends for this too.
Assuming your optical drive presents as /dev/sr0, you can use disktype to determine the type of CD you're dealing with. For standard stuff with just a data track, even if it has both Apple HFS and ISO9660 sections, you can just straight up dump the ISO.
For anything that is mixed mode, has a weird layout, or especially has an audio track with a lead-out that extends past the data track, you'll need to use cdrdao and also save a TOC or CUE file that documents how to piece it together.
Example of a traditional data CD that can just be dumped with readom:
$ disktype /dev/sr0
--- /dev/sr0
Block device, size 241.9 MiB (253599744 bytes)
CD-ROM, 1 track, CDDB disk ID 02067301
Track 1: Data track, 241.9 MiB (253599744 bytes)
ISO9660 file system
Volume name "AFTERLIFE"
Preparer "OPTICAL MEDIA QUICKTOPIX 2.20"
Data size 241.3 MiB (252981248 bytes, 123526 blocks of 2 KiB)
Example below of a hybrid CD that supports both Mac and Windows. This can also just be straight dumped with readom and will absolutely contain everything:
$ disktype /dev/sr0
--- /dev/sr0
Block device, size 598.1 MiB (627159040 bytes)
CD-ROM, 1 track, CDDB disk ID 020FF301
Track 1: Data track, 598.1 MiB (627159040 bytes)
Apple partition map, 2 entries
Partition 1: 1 KiB (1024 bytes, 2 sectors from 1)
Type "Apple_partition_map"
Partition 2: 384.9 MiB (403588608 bytes, 788259 sectors from 436051)
Type "Apple_HFS"
HFS file system
Volume name "Adobe<A8> Photoshop<A8> 5.0 LE"
Volume size 384.9 MiB (403578880 bytes, 49265 blocks of 8 KiB)
ISO9660 file system
Volume name "PHOTOSLE"
Application "TOAST ISO 9660 BUILDER COPYRIGHT (C) 1997 ADAPTEC, INC. - HAVE A NICE DAY"
Data size 597.8 MiB (626847744 bytes, 306078 blocks of 2 KiB)
Joliet extension, volume name "PHOTOSLE"
Example of a mixed mode CD that needs cdrdao since it contains audio tracks:
$ disktype /dev/sr0
--- /dev/sr0
Block device, size 624.8 MiB (655173632 bytes)
CD-ROM, 11 tracks, CDDB disk ID A010A90B
Track 1: Data track, 106.6 MiB (111755264 bytes)
ISO9660 file system
Volume name "QUAKE101"
Data size 105.7 MiB (110837760 bytes, 54120 blocks of 2 KiB)
Track 2: Audio track, 51.92 MiB (54437040 bytes), 5 min 08 sec
Track 3: Audio track, 24.56 MiB (25756752 bytes), 2 min 26 sec
Track 4: Audio track, 84.16 MiB (88244688 bytes), 8 min 20 sec
Track 5: Audio track, 61.47 MiB (64454208 bytes), 6 min 05 sec
Track 6: Audio track, 74.79 MiB (78425088 bytes), 7 min 24 sec
Track 7: Audio track, 87.18 MiB (91415184 bytes), 8 min 38 sec
Track 8: Audio track, 56.51 MiB (59256288 bytes), 5 min 35 sec
Track 9: Audio track, 65.43 MiB (68605488 bytes), 6 min 28 sec
Track 10: Audio track, 35.76 MiB (37495584 bytes), 3 min 32 sec
Track 11: Audio track, 53.40 MiB (55991712 bytes), 5 min 17 sec
In summary: if it has only one track, even if that track contains multiple volumes, straight up readom or dd will do the trick.
If it has multiple tracks, you will need cdrdao and a TOC or CUE file to indicate how they are structured inside the image blob.
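The rule of thumb above boils down to counting tracks, so it can be sketched as a tiny shell helper. This is a rough sketch, not anything official: pick_dump_tool is a made-up name, and it assumes disktype prints a "CD-ROM, N track(s), ..." line shaped like the ones in the sample outputs above.

```shell
#!/bin/sh
# pick_dump_tool (hypothetical helper): reads disktype output on stdin and
# prints "readom" for single-track discs or "cdrdao" for multi-track ones.
# Assumption: the output contains a line like
#   "CD-ROM, 11 tracks, CDDB disk ID A010A90B"
pick_dump_tool() {
    # Pull the track count out of the first matching "CD-ROM, N track" line.
    tracks=$(sed -n 's/^[[:space:]]*CD-ROM, \([0-9][0-9]*\) track.*/\1/p' | head -n 1)

    if [ -z "$tracks" ]; then
        # No recognizable track line: do not guess.
        echo "unknown"
        return 1
    fi

    if [ "$tracks" -eq 1 ]; then
        echo "readom"
    else
        echo "cdrdao"
    fi
}
```

With a disc in the drive, this would be used as `disktype /dev/sr0 | pick_dump_tool`. It only looks at the track count, so weird layouts still deserve a human look at the full disktype output.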
For a straightforward ISO dump, you can use readom:
$ readom retries=4 dev=/dev/sr0 f=imagename.iso
This will save the entire CD to imagename.iso, with minimal error correction and retries. You can try to increase the number of retries, but it is unlikely to help much. If this fails early due to read errors, you can try ddrescue:
$ ddrescue -b 2048 -r4 -v /dev/sr0 imagename.iso imagename.map
We read using 2048-byte blocks, which is standard for CDs, with 4 retries. The peculiarity here is the map file, which keeps track of what was read and what wasn't. This allows you to try to read only the missing bits again, either after cleaning the media, or in a different drive.
To do another pass, simply run it again with the same iso and map file.
There are a few options you can twiddle to do further passes, but this is beyond the scope of this document.
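To get a feel for whether another pass is worth it, you can total up what the map file still lists as not rescued. A minimal sketch, assuming the GNU ddrescue mapfile layout (comment lines starting with #, one status line, then "pos size status" rows with hexadecimal numbers, where + means rescued). unread_bytes is a hypothetical helper name and not part of ddrescue.

```shell
#!/bin/sh
# unread_bytes (hypothetical helper): prints how many bytes a ddrescue map
# file still marks as anything other than "+" (rescued).
# Assumption: GNU ddrescue mapfile layout, i.e. "#" comments, then one
# status line, then rows of "pos size status" with hex numbers.
unread_bytes() {
    total=0
    seen_status_line=0
    while read -r pos size status _ || [ -n "$pos" ]; do
        # Skip blank lines and comments.
        case "$pos" in
            ''|'#'*) continue ;;
        esac
        # The first non-comment line is the overall status line, not a block.
        if [ "$seen_status_line" -eq 0 ]; then
            seen_status_line=1
            continue
        fi
        # Shell arithmetic understands the 0x... hex sizes directly.
        if [ "$status" != "+" ]; then
            total=$(( total + $size ))
        fi
    done < "$1"
    echo "$total"
}
```

Running `unread_bytes imagename.map` and getting 0 would mean every block is marked as rescued; anything else is the number of bytes another pass could still try for.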
For a mixed mode CD, you'll need to use cdrdao to dump the data track and the audio tracks into a single image file. It will also generate a TOC file for you that describes the layout of the tracks, so it can be reproduced if you ever write it back to a CD.
$ cdrdao read-cd --read-raw --driver generic-mmc:0x20000 --device /dev/sr0 --datafile imagename.bin imagename.toc
Since TOC files, while great, are not really supported by anything of interest these days beyond cdrdao itself, you can also use the toc2cue tool to convert the TOC file to a CUE file that is more widely supported. This tool is part of the cdrdao package, so you should have it already by this point.
$ toc2cue imagename.toc imagename.cue
I would recommend keeping both the TOC and CUE files, as the TOC file is more accurate and complete, but the CUE file is more widely supported.
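A CUE file refers to the image by name, so if you rename or move things after converting, it is worth checking that the CUE still points at a file that exists. A small sketch, assuming the usual `FILE "name" BINARY` lines in the CUE; cue_missing_files is a hypothetical helper name.

```shell
#!/bin/sh
# cue_missing_files (hypothetical helper): prints every file referenced by
# a CUE sheet's FILE lines that cannot be found on disk.
# Assumption: references look like:  FILE "imagename.bin" BINARY
# Relative names are resolved against the CUE file's own directory.
cue_missing_files() {
    cue=$1
    dir=$(dirname "$cue")
    # Extract whatever sits between the quotes on each FILE line.
    sed -n 's/^[[:space:]]*FILE[[:space:]][[:space:]]*"\([^"]*\)".*/\1/p' "$cue" |
    while IFS= read -r name; do
        case "$name" in
            /*) path=$name ;;
            *)  path=$dir/$name ;;
        esac
        [ -f "$path" ] || printf '%s\n' "$name"
    done
}
```

`cue_missing_files imagename.cue` printing nothing means every referenced file was found next to the CUE.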
IMPORTANT: By default, the audio tracks will be saved with some whack byte order [1] that is also not really understood by anything except cdrdao, again. Make sure to pay special attention to the --driver option and especially the 0x20000 parameter, otherwise the audio tracks will come out as static garbage when played back or written from the CUE file. You could omit the parameter to store an image that is very accurate but also of very limited practical use. Unless your thing eats TOC files, this is probably not what you want.
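A cheap sanity check on a finished dump is that its size is a whole number of sectors: 2048 bytes per sector for a plain ISO, and 2352 bytes per sector for a raw dump made with --read-raw as above. The sketch below assumes no subchannel data was stored in the image, and sector_count is a hypothetical helper name.

```shell
#!/bin/sh
# sector_count (hypothetical helper): prints the number of sectors in an
# image, or fails if the file size is not a multiple of the sector size.
# Usage: sector_count FILE SECTOR_SIZE
#   2048 for plain ISO images, 2352 for raw (--read-raw) images.
sector_count() {
    file=$1
    sector_size=$2
    # wc -c is portable; the arithmetic expansion strips any padding.
    bytes=$(( $(wc -c < "$file") ))
    if [ $(( bytes % sector_size )) -ne 0 ]; then
        echo "$file: $bytes bytes is not a multiple of $sector_size" >&2
        return 1
    fi
    echo $(( bytes / sector_size ))
}
```

For instance, `sector_count imagename.iso 2048` or `sector_count imagename.bin 2352`. The disktype samples earlier are consistent with this: the 253599744-byte data track is exactly 123828 sectors of 2048 bytes, and the 54437040-byte audio track is exactly 23145 sectors of 2352 bytes.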
If you want to dump the audio tracks as files, you can use cdparanoia.
I tend to do this to also keep these as useful files for listening to, or for whenever a modern source port of a game can use them directly for the soundtrack.
Just make sure not to dump the first track, which contains data. We can do this by specifying the range like so:
$ mkdir audio
$ cd audio/
$ cdparanoia -B "2-"
You'll end up with a directory full of WAV files. If you want, you can losslessly compress them to FLAC (or something else) and keep those instead.
$ flac *.wav
$ rm -f *.wav
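The two commands above delete the WAV files even if flac bailed out halfway through. A slightly more careful sketch only removes each WAV once flac reports success for it. It assumes the flac encoder's -8 (best compression), --verify, and --silent options, and compress_tracks is a hypothetical helper name.

```shell
#!/bin/sh
# compress_tracks (hypothetical helper): encodes each given WAV file to
# FLAC and deletes the WAV only if flac exited successfully.
# Assumption: flac's -8, --verify and --silent options; --verify makes the
# encoder decode its own output and compare it to the input.
compress_tracks() {
    for wav in "$@"; do
        # Skip unmatched globs and anything that is not a regular file.
        [ -f "$wav" ] || continue
        if flac -8 --verify --silent "$wav"; then
            rm -f -- "$wav"
        else
            echo "keeping $wav: flac failed" >&2
            return 1
        fi
    done
}
```

Inside the audio directory this would be called as `compress_tracks *.wav`. It stops at the first failure, so nothing after the bad track gets deleted either.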
To ensure the dumps remain fresh and uncorrupted, I would recommend saving checksums of the images. I use SHA256 because we're in the modern era.
$ sha256sum * | tee SHA256
fa723eabc28fd8fdc3333034b70c7a9f459608b40af119b137dd64f6bceadc57 quake106.bin
a78aa83f731affbb886ae831598016e5369cc28586baf1235c0f55dda09c1496 quake106.cue
422b90fc21a30f9b0ba2a57f8866122070643bbf4fd1eaaff656dcfbc6695570 quake106.toc
This can be verified later on like so:
$ sha256sum -c SHA256
quake106.bin: OK
quake106.cue: OK
quake106.toc: OK
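If you keep one directory per disc, each with its own SHA256 file, re-checking the whole collection is a short loop. This is a sketch under that assumed layout, which the guide itself does not prescribe. verify_dumps is a hypothetical helper name, and --quiet is the GNU coreutils option that only prints files that fail.

```shell
#!/bin/sh
# verify_dumps (hypothetical helper): runs "sha256sum -c" in every direct
# subdirectory of ROOT that contains a SHA256 file.
# Assumed layout: ROOT/<disc name>/{image files, SHA256}
# Returns non-zero if any directory fails verification.
verify_dumps() {
    root=${1:-.}
    failed=0
    for sums in "$root"/*/SHA256; do
        # Handles the case where the glob matched nothing.
        [ -f "$sums" ] || continue
        dir=$(dirname "$sums")
        # Subshell so the cd does not leak into the caller.
        if (cd "$dir" && sha256sum --quiet -c SHA256); then
            echo "OK: $dir"
        else
            echo "FAILED: $dir" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

Something like `verify_dumps ~/dumps` (the path is just an example) could then be run every once in a while, or from a cron job.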
For a mixed mode CD, everything above can be chained into a single command:
$ cdrdao read-cd --read-raw --driver generic-mmc:0x20000 --device /dev/sr0 --datafile imagefile.bin imagefile.toc && toc2cue imagefile.toc imagefile.cue && mkdir audio && pushd audio && cdparanoia -B "2-" && flac -8 track*.wav && rm -f *.wav && popd && (sha256sum imagefile.* | tee SHA256) && eject
Hope this helps! Don't forget to upload cool weird stuff you find to the Internet Archive if it isn't already there. Someone's trash driver CD is another's treasure.
Footnotes
[1] By default, it's big-endian, just like how physical CDs present.