- The AI Hierarchy of Needs
- The Rise of the Data Engineer
- The Downfall of the Data Engineer
- A Beginner’s Guide to Data Engineering
- Functional Data Engineering — a modern paradigm for batch data processing
- Algorithmic Toolbox in Russian
- Data Structures in Russian
- Data Structures & Algorithms Specialization on Coursera
- Algorithms Specialization from Stanford on Coursera
- Comprehensive SQL Tutorial by Mode Analytics
- SQL Practice on Leetcode
- Modern SQL, a website about modern SQL syntax
- Scala School by Twitter
- Fluent Python, an intermediate-level book about Python
- Intro to Scala in Russian on Stepik by Tinkoff Bank
- The Hitchhiker’s Guide to Python by Kenneth Reitz & Tanya Schlusser
- Intro to Database Systems by Carnegie Mellon University
- Advanced Database Systems by Carnegie Mellon University
- On Disk IO
- Distributed systems for fun and profit by Mikito Takada
- Distributed Systems by Maarten van Steen & Andrew S. Tanenbaum
- CS 436: Distributed Computer Systems by University of Waterloo
- Designing Data-Intensive Applications by Martin Kleppmann
- Introduction to Algorithms by Thomas Cormen
- The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling
- Star Schema: The Complete Reference
- Big Data for Data Engineers Specialization by Yandex
- Data Engineering on Google Cloud Platform Specialization by Google
- Data Engineer Nanodegree by Udacity
- Martin Kleppmann, author of Designing Data-Intensive Applications
- BaseDS by Vaidehi Joshi, a series about distributed systems
- Apache Airflow is a platform to programmatically author, schedule and monitor workflows in Python
- Apache Spark is a unified analytics engine for large-scale data processing
- Apache Kafka is a distributed streaming platform
- Luigi is a Python package that helps you build complex pipelines of batch jobs