Topics to read
- Scalars, Vectors, Matrices
- Matrix addition and multiplication
- Dot product
- Norms (L1, L2)
- Linear transformations
- Shapes of matrices
- Dense (Fully Connected) layer equation
Correct Reference
- Essence of Linear Algebra — 3Blue1Brown (not “shlu one brown”)
Checkpoint
- Understand this equation and shapes
y = W x + b
# Example:
# x : input vector of shape (3, 1)
# W : weight matrix of shape (1, 3)
# b : bias of shape (1,)
# Output y has shape (1, 1), i.e. a single number
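To check the shapes by hand, here is a minimal NumPy sketch of the same equation. The numeric values of `x`, `W`, and `b` are made up for illustration; only the shapes come from the checkpoint.

```python
import numpy as np

# Illustrative values only; the shapes match the checkpoint.
x = np.array([[1.0], [2.0], [3.0]])   # input vector, shape (3, 1)
W = np.array([[0.5, -1.0, 2.0]])      # weight matrix, shape (1, 3)
b = np.array([0.1])                   # bias, shape (1,)

# (1, 3) @ (3, 1) gives (1, 1); the bias broadcasts onto that result.
y = W @ x + b

print(y.shape)    # (1, 1)
print(y.item())   # the single output number
```

The inner dimensions (3 and 3) must match for `W @ x` to work; swapping the order to `x @ W` would give a (3, 3) matrix instead of one number.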
Topics to read
- Random variables
- Mean, Median, Mode
- Variance, Standard Deviation
- Probability distributions
  - Bernoulli
  - Gaussian (Normal)
  - Categorical
- Bayes Theorem
- Correlation
- Histogram
- Confidence Interval
- Overfitting vs Underfitting
Corrections
- “barnoli” → Bernoulli
- “gshian” → Gaussian
- “base rule” → Bayes Rule
- “quantiz” → Quantiles
Checkpoint
# Example:
Marks = [40, 50, 60, 70, 80]
Mean = 60
Histogram shows how many students fall in each range
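The checkpoint can be reproduced with the Python standard library. This is a small sketch; the bin width of 20 is an arbitrary choice made here, not something the checkpoint specifies.

```python
from collections import Counter
from statistics import mean, median

marks = [40, 50, 60, 70, 80]

# Bin width is an assumption for this sketch.
bin_width = 20

# Map each mark to the start of its bin, then count marks per bin.
histogram = Counter((m // bin_width) * bin_width for m in marks)

print(mean(marks))     # 60
print(median(marks))   # 60
for start in sorted(histogram):
    print(f"{start}-{start + bin_width - 1}: {histogram[start]} student(s)")
```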
Topics to read
- Derivative as rate of change
- Partial derivatives
- Gradient
- Chain rule
- Gradient Descent (intuition only)
Correct Reference
- Essence of Calculus — 3Blue1Brown
Checkpoint
# Example:
Loss = (prediction - actual)^2
Gradient descent:
Move the weights a small step in the direction that reduces the loss (opposite to the gradient)
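A minimal sketch of that idea with one weight and one training example. The data point, starting weight, and learning rate are all made-up values chosen so the loop converges quickly.

```python
# Fit a single weight w so that prediction = w * x matches the target.
x, actual = 2.0, 8.0     # one training example (made-up values)
w = 0.0                  # initial weight
learning_rate = 0.05     # step size (an assumption for this sketch)

for step in range(50):
    prediction = w * x
    loss = (prediction - actual) ** 2
    # Chain rule: dLoss/dw = 2 * (prediction - actual) * x
    gradient = 2 * (prediction - actual) * x
    # Step against the gradient, the direction in which the loss falls.
    w -= learning_rate * gradient

print(round(w, 3))   # close to 4.0, since 4.0 * 2.0 == 8.0
```

With a much larger learning rate the same loop would overshoot and diverge, which is the intuition behind tuning the step size.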
Topics to read
- Variables, data types
- Lists, tuples, dictionaries, sets
- Loops and conditions
- Functions
- Classes (basic OOP)
- File input/output
- Virtual environments
References
- Programming with Mosh
- freeCodeCamp (YouTube)
Example
data = [1, 2, 3, 4]
total = sum(data)
- Arrays
- Indexing & slicing
- Broadcasting
- DataFrame
- Filtering
- GroupBy
- Joins
- Missing values
- Matplotlib (basic plots)
- PyTorch tensors
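As a quick illustration of the first three topics (arrays, indexing and slicing, broadcasting), here is a short NumPy sketch. The array values are arbitrary.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])              # array of shape (2, 3)

single = a[0, 1]                       # indexing: row 0, column 1
right_columns = a[:, 1:]               # slicing: every row, columns 1 onward

# Broadcasting: a (3,) vector is added to each row of the (2, 3) array.
column_offsets = np.array([10, 20, 30])
shifted = a + column_offsets

print(single)
print(right_columns)
print(shifted)
```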
Correction
- “Mattplot lily” → Matplotlib
Example
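import pandas as pd

# Small example table: one row per person
df = pd.DataFrame({
    "age": [20, 25, 30],
    "salary": [20000, 30000, 40000]
})

# Filtering: keep only the rows where age is greater than 22
older = df[df["age"] > 22]
print(older)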
- Train / Test split
- Cross-validation
- Bias–Variance trade-off
- Overfitting vs Underfitting
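To make cross-validation concrete, here is a pure-Python sketch of how k-fold splitting divides sample indices. The helper name `k_fold_indices` is hypothetical, and the sketch does not shuffle; in practice the data is usually shuffled first (or a library routine such as scikit-learn's `KFold` is used).

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(6, 3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Each sample lands in the test set exactly once, so the averaged score uses all the data for evaluation without ever testing on a row the model trained on in that fold.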
Regression
- MSE
- MAE
- R²
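The three regression metrics can be computed by hand in a few lines. This sketch uses made-up actual and predicted values.

```python
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.0, 8.0, 8.5]   # made-up predictions

n = len(actual)
errors = [p - a for p, a in zip(predicted, actual)]

mse = sum(e ** 2 for e in errors) / n   # mean squared error
mae = sum(abs(e) for e in errors) / n   # mean absolute error

# R^2 compares the model's squared error with that of always predicting the mean.
mean_actual = sum(actual) / n
ss_res = sum(e ** 2 for e in errors)
ss_tot = sum((a - mean_actual) ** 2 for a in actual)
r2 = 1 - ss_res / ss_tot

print(mse, mae, r2)
```

MSE punishes the one large error (1.0) more heavily than MAE does, which is why the two can rank models differently.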
Classification
- Accuracy
- Precision
- Recall
- F1-score
- ROC-AUC
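Most of these metrics follow directly from the four confusion-matrix counts. A small sketch with made-up binary labels is shown here; ROC-AUC is left out because it needs predicted scores rather than hard 0/1 predictions.

```python
actual    = [1, 0, 1, 1, 0, 0, 1, 0]   # made-up true labels
predicted = [1, 0, 0, 1, 0, 1, 1, 1]   # made-up model predictions

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)   # true positives
tn = sum(1 for a, p in pairs if a == 0 and p == 0)   # true negatives
fp = sum(1 for a, p in pairs if a == 0 and p == 1)   # false positives
fn = sum(1 for a, p in pairs if a == 1 and p == 0)   # false negatives

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)   # of the predicted positives, how many were right
recall = tp / (tp + fn)      # of the actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, round(f1, 3))
```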
- Linear Regression
- Ridge Regression
- Lasso Regression
- Logistic Regression
- K-Nearest Neighbors
- Decision Trees
- Random Forest
- Gradient Boosted Trees
- XGBoost / LightGBM
- Support Vector Machines (conceptual)
Practice Rule
- Use scikit-learn
- Apply each on 2–3 datasets
- Tune hyperparameters
- Plot learning curves
Reference
- Machine Learning Specialization — Andrew Ng (Coursera)
Topics
- K-Means Clustering
- PCA (Dimensionality Reduction)
- t-SNE (visualization only)
- UMAP (visualization only)
Focus
- When to use clustering vs classification
- PCA is a preprocessing step, not a solution in itself
Example
Customer data without labels
Use K-Means to group similar customers
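The customer-grouping idea can be sketched as a bare-bones K-Means on a single feature. The spend values, the choice of k = 2, and the naive initialization are all assumptions for illustration; a real project would typically use scikit-learn's `KMeans` on several features.

```python
spend = [12, 15, 14, 80, 85, 90, 13, 88]   # made-up spend per customer, no labels

# Naive initialization: pick two data points as starting centroids (k = 2).
centroids = [spend[0], spend[3]]

for _ in range(10):
    # Assignment step: attach each customer to the nearest centroid.
    clusters = [[] for _ in centroids]
    for value in spend:
        nearest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
        clusters[nearest].append(value)
    # Update step: move each centroid to the mean of its cluster.
    # (This sketch assumes no cluster ends up empty, which holds for this data.)
    centroids = [sum(c) / len(c) for c in clusters]

print(centroids)
print(clusters)
```

No labels are used anywhere: the low spenders and high spenders separate purely because they sit close to each other.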
Example Projects
- Loan default prediction
- Fraud detection
Steps
- Define problem
- Choose metric
- EDA
- Baseline model (Logistic / Linear)
- Strong model (Random Forest / XGBoost)
- Compare results
- Reflect on why performance changed
Why
- Interviewers ask why the model improved, not just what accuracy it reached
Core Topics
- Perceptron
- Multi-Layer Perceptron (MLP)
- Activation functions (ReLU, Sigmoid)
- Loss functions (MSE, Cross Entropy)
- Backpropagation (conceptual)
- Gradient Descent, SGD, Adam
- Train / Validation / Test
- Regularization
- Dropout
- Early stopping
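To connect a few of these topics, here is a small pure-Python sketch of a forward pass: one ReLU hidden unit, a sigmoid output, and a binary cross-entropy loss. The weights and the input are made up, and PyTorch provides all of these pieces ready-made.

```python
import math

def relu(z):
    """ReLU activation: pass positives through, clamp negatives to zero."""
    return max(0.0, z)

def sigmoid(z):
    """Sigmoid activation: squash any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, y_prob):
    """Cross-entropy loss for one example with a 0/1 label."""
    return -(y_true * math.log(y_prob) + (1 - y_true) * math.log(1 - y_prob))

# Forward pass through one hidden unit and one output unit (made-up weights).
x = [1.0, -2.0]
hidden = relu(0.5 * x[0] + 0.25 * x[1] + 0.1)
prob = sigmoid(2.0 * hidden - 0.2)
loss = binary_cross_entropy(1, prob)

print(hidden, prob, loss)
```

Backpropagation is the chain rule applied backwards through exactly these steps, and an optimizer such as SGD or Adam then uses the resulting gradients to update the weights.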
Framework
- PyTorch (recommended)
Correction
- “relig functions” → ReLU functions
- “atom” → Adam optimizer
Reference
- Deep Learning Specialization — Coursera
- CNN
- Pretrained ResNet
- Image classification (Cats vs Dogs, CIFAR-10)
Topics
- Tokenization
- Embeddings
- RNN, LSTM (high level)
- Transformers
- Pretrained models (Hugging Face)
Tasks
- Sentiment analysis
- Text classification
Reference
- NLP course by IIT Professors (free)
Topics
- Lag features
- Rolling mean
- Time-aware train/test split
Example
Today's sales depend on the sales of the previous 7 days
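A minimal sketch of the three topics above on a made-up daily sales series: lag features and a rolling mean built only from past days, followed by a split that keeps the test rows later in time than the training rows.

```python
sales = [100, 120, 130, 125, 140, 150, 160, 155, 170, 180]   # made-up daily sales
window = 7

rows = []
for t in range(window, len(sales)):
    past = sales[t - window:t]                 # the previous 7 days only
    rows.append({
        "lag_1": past[-1],                     # yesterday's sales
        "rolling_mean_7": sum(past) / window,  # mean of the last 7 days
        "target": sales[t],                    # today's sales, the value to predict
    })

# Time-aware split: train on earlier rows, test on later ones (no shuffling).
split = len(rows) - 1
train, test = rows[:split], rows[split:]

print(train)
print(test)
```

A random shuffle before splitting would let the model train on days that come after the test day, which leaks future information.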
Rules
- 3 to 5 projects
- Clear problem
- Clear metric
- Baseline vs improved model
- Interpretability
Topics
- Feature importance
- SHAP values
Steps
- Train model
- Save model
- Load in API
- Deploy using Flask or FastAPI
Example
User sends input → API → Model → Prediction
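The flow above can be sketched with the standard library alone. The "model" is a stand-in dictionary of linear weights, and `predict_handler` is a hypothetical function; in Flask or FastAPI the same body would sit inside a route, and a trained estimator would be saved instead of the dictionary.

```python
import json
import pickle

# Stand-in "model": made-up linear weights in place of a trained estimator.
model = {"weights": [0.5, 2.0], "bias": 1.0}

# Save model (to bytes here; writing to a file on disk works the same way).
saved = pickle.dumps(model)

# Load in API: done once at startup, not on every request.
loaded = pickle.loads(saved)

def predict_handler(request_body: str) -> str:
    """Hypothetical request handler: JSON in, JSON out."""
    features = json.loads(request_body)["features"]
    score = sum(w * f for w, f in zip(loaded["weights"], features)) + loaded["bias"]
    return json.dumps({"prediction": score})

# User sends input -> API -> Model -> Prediction
response = predict_handler('{"features": [2.0, 3.0]}')
print(response)
```

Only unpickle files you created yourself, since loading a pickle can run arbitrary code.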
Actions
- GitHub (organized repos)
- Blog posts / notebooks
- LinkedIn write-ups
Why
- Recruiters notice clear explanations of your work, not certificates
- Watching too many courses without projects
- Waiting to “master math” before ML
- Jumping to GANs, RL, LLM fine-tuning too early
- Over-focusing on tools and frameworks
Fundamentals + Projects matter more than tools. Build, reflect, explain, and show your work.