In MySQL, a Virtual Column (specifically known as a Generated Column) is a column whose value is automatically calculated from an expression or other columns in the same table, rather than being explicitly inserted or updated by a user.
Think of it like a formula in an Excel spreadsheet: you define the rule once, and the database ensures the data stays in sync.
MySQL offers two "flavors" of these columns, and the distinction is mostly about storage space versus CPU performance:
| Type | Stored on Disk? | Performance Impact |
|---|---|---|
| VIRTUAL | No | Calculated on-the-fly when the row is read. Uses no disk space but costs a bit of CPU during SELECT. |
| STORED | Yes | Calculated when the row is written/updated. Uses disk space but is faster to read (like a normal column). |
Instead of writing complex logic in every SELECT statement, you bake the logic into the table schema.
- Example: If you have
first_nameandlast_name, you can create a virtualfull_namecolumn. You'll never have to manually concatenate them in your code again.
This is the "killer feature" of Generated Columns. You cannot directly index the result of a function in a WHERE clause in older versions of MySQL.
- The Problem: Querying
WHERE JSON_EXTRACT(data, "$.id") = 10is slow because it requires a full table scan. - The Solution: Create a Generated Column that extracts that JSON field and index that column. Now, your JSON lookups are lightning-fast.
Because the database handles the calculation, there is zero risk of the generated value getting "out of sync" with the source data. If the price or tax_rate changes, the total_price column updates automatically and atomically.
You can use Generated Columns to enforce business logic. For example, you could create a generated column that returns TRUE if a complex set of conditions is met, and then place a UNIQUE index on it to prevent invalid data combinations.
Here is a breakdown of the syntax:
ALTER TABLE products -- Tell the database we are modifying the structure of the existing products table.
ADD COLUMN discount_price DECIMAL(10,2) -- Create a new column named discount_price.
GENERATED ALWAYS AS (original_price * 0.9) -- Don't let users input data here; calculate it using this rule instead.
STORED; -- This defines how the data is kept.- VIRTUAL: The value is calculated "on the fly" only when you query the table.
- STORED: The result of the calculation is written to disk whenever
original_priceis updated.
In routing systems (URL matching), you often want the "most specific" match to win. Since a longer pattern is usually more specific than a shorter one. The following example shows how an index can allow the database to instantly find the longest applicable pattern for a specific version_id and match_type.
You'll see we are using a Generated Column (path_pattern_length). This is necessary because of a fundamental rule in database indexing: You cannot directly index the result of a function unless you use a Generated Column (or a Functional Index); and by indexing this in descending order, we are trying to prioritize longer path patterns first.
This is useful for when you frequently run queries like ORDER BY path_pattern_length DESC LIMIT 10. By having the index already sorted in descending order allows the database to read the values directly from the "start" of the index without having to reverse-scan.
ALTER TABLE path_groups
ADD COLUMN path_pattern_length INT
GENERATED ALWAYS AS (LENGTH(path_pattern)) STORED;
CREATE INDEX idx_version_matchtype_length
ON path_groups (version_id, match_type, path_pattern_length DESC);It eliminates the need for a "Sort" operation in the execution plan if your query asks for the longest paths first.
It allows the database to find the "Maximum" value for that column (within a specific version/match type) in O(1) or O(log n) time.
If we simply created an index on path_pattern, the database would store the strings (e.g., /user/*, /admin/settings/logs). If we then ran a query asking for the length of those strings, the database would have to:
- Scan every single row.
- Calculate the length for each one on the fly.
- Sort them.
This is a "Full Table Scan," which is extremely slow as your data grows. By creating a generated column, you give the database a static value that it can physically put into an index.