I've compared mozjpeg, webp, jpeg2000 (openjpeg and a proprietary encoder), jpeg-xl and avif on a text-like image containing circles of different widths and tints.
The generated 920×920 px images (with different noise levels) are compressed with
/opt/mozjpeg/bin/cjpeg -quant-table 4 -q 75
and the quality level of each of the other encoders is adjusted so that its file size matches that of the mozjpeg output.
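For anyone who wants to reproduce the size matching, here is a minimal sketch of how the quality level could be searched automatically. It is not the procedure used for the results above, just one way to do it: `match_quality` and `webp_size` are hypothetical helper names, the search assumes that file size does not decrease as quality increases, and the `cwebp -q ... -o ...` call is only an example of plugging in a real encoder.

```python
import os
import subprocess
import tempfile
from typing import Callable


def match_quality(encode_size: Callable[[int], int], target_bytes: int,
                  lo: int = 1, hi: int = 100) -> int:
    """Binary-search the quality in [lo, hi] whose output size is closest
    to target_bytes. Assumes size is non-decreasing in quality."""
    best_q = lo
    best_diff = abs(encode_size(lo) - target_bytes)
    while lo <= hi:
        mid = (lo + hi) // 2
        size = encode_size(mid)
        diff = abs(size - target_bytes)
        if diff < best_diff:
            best_q, best_diff = mid, diff
        if size < target_bytes:
            lo = mid + 1      # output too small: try higher quality
        elif size > target_bytes:
            hi = mid - 1      # output too large: try lower quality
        else:
            return mid        # exact match
    return best_q


def webp_size(png_path: str, quality: int) -> int:
    """Hypothetical example encoder hook: encode with cwebp at the given
    quality and return the resulting file size in bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        out = os.path.join(tmp, "out.webp")
        subprocess.run(["cwebp", "-quiet", "-q", str(quality), png_path, "-o", out],
                       check=True)
        return os.path.getsize(out)
```

Usage would be something like `match_quality(lambda q: webp_size("test1.png", q), os.path.getsize("test1.jpg"))`, where `test1.jpg` stands for the mozjpeg output. Encoders with a non-integer or inverted quality scale would need a small adapter around the callback.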
The original (excerpt stripe) of each noise level is testX.png.png .
All encoders are from Nov. 2022, except the proprietary jpeg2000 encoder, which is from 2020.
To me, it seems that webp gives the best result.
The 920×920 px test images are: https://digi.ub.uni-heidelberg.de/diglitData/v/test1.png , https://digi.ub.uni-heidelberg.de/diglitData/v/test2.png , https://digi.ub.uni-heidelberg.de/diglitData/v/test3.png , https://digi.ub.uni-heidelberg.de/diglitData/v/test4.png