Principal Components Analysis (PCA) tries to identify the subspace in which the data approximately lies. PCA is computationally efficient: it requires only an eigenvector calculation. It is also an unsupervised learning technique used for dimensionality reduction.
Imagine you have a dataset in three dimensions: height, weight, and age of a group of people. Each person in the dataset is a data point represented by a vector (height, weight, age). Now, PCA aims to find the directions in which the data varies the most.
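In this case, let's say that height and weight have high variance compared to age. PCA would then identify the principal components: the first is the direction along which the data varies the most, and each subsequent one is the direction of greatest remaining variance that is orthogonal to the earlier ones. The principal components are therefore orthogonal to each other.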
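As a rough illustration of the eigenvector calculation, here is a minimal sketch in Python with NumPy. The data is synthetic and invented purely for this example, and the helper name principal_components is hypothetical. It centers the data, forms the sample covariance matrix, and takes its eigenvectors as the principal directions. It works on the raw units; in practice the features are often standardized first so that no feature dominates simply because of its scale.

```python
import numpy as np

# Synthetic, made-up data for illustration only: 200 people described by
# (height in cm, weight in kg, age in years). Age is given a small spread
# so that height and weight carry most of the variance.
rng = np.random.default_rng(0)
height = rng.normal(170.0, 10.0, size=200)
weight = 0.9 * (height - 170.0) + 70.0 + rng.normal(0.0, 5.0, size=200)
age = rng.normal(35.0, 2.0, size=200)
X = np.column_stack([height, weight, age])


def principal_components(X):
    """Hypothetical helper: return (variances, directions) for the rows of X.

    variances  -- eigenvalues of the sample covariance, largest first
    directions -- matching unit eigenvectors, stored as columns
    """
    X_centered = X - X.mean(axis=0)           # remove the mean of each feature
    cov = np.cov(X_centered, rowvar=False)    # 3x3 sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh: symmetric input, ascending order
    order = np.argsort(eigvals)[::-1]         # reorder to descending variance
    return eigvals[order], eigvecs[:, order]


variances, components = principal_components(X)
print("variance along each principal component:", variances)
print("principal directions (columns):")
print(components)
```

Because the covariance matrix is symmetric, the eigenvectors returned by np.linalg.eigh are orthonormal, which is the orthogonality property described above. The eigenvalues give the variance of the data along each direction.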
Now, instead of representing each person in the original three-dimensional space, PCA allows you to project them onto a new subspace defined by these principal components. This new subspace retains the most important inf