🧠 Are There Basis Vectors in Concept Embedding Spaces?

Yes — while not literal basis vectors like in linear algebra, there are useful analogues in neural embedding spaces that serve similar purposes.


🧭 Quick Linear Algebra Refresher

  • A basis is a minimal set of vectors that span a space.
  • Any vector can be written as a linear combination of basis vectors.
  • In 3D space: v = a·x̂ + b·ŷ + c·ẑ
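A minimal sketch of that decomposition, assuming NumPy (the vector values are illustrative):

```python
import numpy as np

# Standard basis of R^3: the rows are x-hat, y-hat, z-hat.
basis = np.eye(3)

# Any vector is a linear combination of the basis vectors.
v = np.array([2.0, -1.0, 3.0])
coeffs = basis @ v            # a, b, c in v = a*x_hat + b*y_hat + c*z_hat
reconstructed = coeffs @ basis

assert np.allclose(v, reconstructed)
print(coeffs)                 # [ 2. -1.  3.]
```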

🧠 Embedding Spaces and Concept Vectors

In LLMs, words and concepts live in high-dimensional vector spaces (e.g., 768+ dimensions). These spaces encode semantic relationships geometrically.
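As a hedged illustration of producing such vectors (assuming the sentence-transformers package; the model name, and its 384 output dimensions, are just one example):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Load a small pretrained embedding model (illustrative choice).
model = SentenceTransformer("all-MiniLM-L6-v2")

# Each word or phrase becomes a point in a high-dimensional vector space.
words = ["king", "queen", "man", "woman"]
vectors = model.encode(words)          # shape: (4, 384) for this model

# Semantic relatedness shows up geometrically, e.g. via cosine similarity.
def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(vectors[0], vectors[1]))  # king vs. queen
print(cosine(vectors[0], vectors[3]))  # king vs. woman
```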

So what’s the equivalent of basis vectors here?


✅ 1. Latent Semantic Directions

You can find interpretable directions such as:

  • vec("man") - vec("woman")gender axis
  • vec("king") - vec("queen")royalty axis

These act like semantic axes you can project onto:

$$projection(vec("doctor"), vec("man") - vec("woman"))$$

They’re not formal basis vectors, but they behave similarly.
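A minimal sketch of such a projection, assuming NumPy and random stand-in vectors (real embeddings would come from a trained model):

```python
import numpy as np

# Stand-in embeddings; replace with real vectors from a model.
rng = np.random.default_rng(0)
embeddings = {w: rng.normal(size=300) for w in ["man", "woman", "doctor"]}

# Candidate "gender" direction: the difference of two contrasting words.
direction = embeddings["man"] - embeddings["woman"]
direction /= np.linalg.norm(direction)        # normalize to a unit vector

# Scalar projection of "doctor" onto that direction.
projection = embeddings["doctor"] @ direction
print(f"projection of 'doctor' onto the gender axis: {projection:.3f}")
```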


✅ 2. Principal Components as Data-Driven Bases

By applying PCA or SVD to a matrix of embeddings, you recover its dominant axes (see the sketch below):

  • The first few components often align with interpretable properties (e.g. word frequency or sentiment).
  • These can serve as an orthogonal basis for dimensionality reduction and interpretation.
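A minimal PCA sketch, assuming scikit-learn and using a random stand-in matrix in place of real embeddings:

```python
import numpy as np
from sklearn.decomposition import PCA

# Rows = words, columns = embedding dimensions (random stand-in data).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 768))

# The principal components form an orthonormal, data-driven set of axes,
# ordered by how much variance each one explains.
pca = PCA(n_components=10)
X_reduced = pca.fit_transform(X)

print(pca.components_.shape)            # (10, 768): the new axes
print(pca.explained_variance_ratio_)    # variance captured by each axis
print(X_reduced.shape)                  # (1000, 10): embeddings in the new basis
```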

✅ 3. Probeable Subspaces

  • Linear classifiers (a.k.a. probes) trained on embeddings can recover grammatical, syntactic, or semantic properties.
  • These learned directions form subspaces useful for analysis and interpretation (a minimal probe sketch follows below).
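The sketch below assumes scikit-learn; the embeddings and labels are synthetic, so this probe should score near chance — with real annotated data, high held-out accuracy is the signal that a property is linearly encoded:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-ins: real probes use model embeddings plus annotated labels
# (e.g. singular vs. plural, past vs. present tense).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 768))
y = rng.integers(0, 2, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A linear probe: held-out accuracy measures how linearly separable
# the property is in the embedding space.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))

# The probe's weight vector is the learned "direction" for the property.
property_direction = probe.coef_[0] / np.linalg.norm(probe.coef_[0])
print(property_direction.shape)         # (768,)
```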

⚠️ Limitations vs. True Bases

  • No fixed or universal basis vectors
  • Directions may not be orthogonal
  • Embedding spaces may be nonlinear manifolds, not strict vector spaces

📌 Summary Table

| Conceptual Role | Analogue in Embedding Space |
| --- | --- |
| Basis vectors | Latent directions, semantic analogies |
| Orthogonal components | PCA / SVD axes |
| Projection | Dot product or cosine along latent axes |
| Vector decomposition | Probing via linear classifiers |

🛠 Want to Explore More?

You can:

  • Use cosine similarity to find projection strength along semantic directions
  • Apply PCA to visualize embeddings
  • Build linear probes to test for specific properties