The Illustrated Transformer is fantastic, but I'd suggest that anyone going into it first read the earlier articles in the series to build a foundation, then follow up with the later articles that cover GPT and BERT. Here's the list:
- A Visual and Interactive Guide to the Basics of Neural Networks
- A Visual And Interactive Look at Basic Neural Network Math
- Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention)
- The Illustrated Transformer
- The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning)
- The Illustrated GPT-2 (Visualizing Transformer Language Models)
- How GPT3 Works - Visualizations and Animations
- The Illustrated Retrieval Transformer
- The Illustrated Stable Diffusion
If you want to learn how to code them, this book is great